GUFE Serialization API

class gufe.tokenization.GufeTokenizable(*args, **kwargs)

Base class for all tokenizable gufe objects.

Subclassing from this provides sorting, equality, and hashing operators, provided that the class implements the _to_dict and _from_dict methods.

This extra work in serializing is important for hashes that are stable across different Python sessions.
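To illustrate why a dict-based representation helps here: Python's built-in hash() of a string is salted per process unless PYTHONHASHSEED is fixed, so it cannot serve as a persistent identifier. The sketch below derives a digest from a canonical JSON dump of a dict instead. The function name stable_token and the choice of SHA-256 are assumptions made for illustration; this is not gufe's actual key scheme.

```python
import hashlib
import json


def stable_token(dct):
    """Return a digest that depends only on the dict's content.

    Hypothetical helper: keys are sorted before dumping, so insertion
    order does not affect the result, and the digest is identical in
    every Python session.
    """
    payload = json.dumps(dct, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


a = stable_token({"name": "benzene", "charge": 0})
b = stable_token({"charge": 0, "name": "benzene"})
print(a == b)
```

Two dicts with the same content produce the same digest regardless of key order, which is the property a session-independent hash needs.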

classmethod serialization_migration(old_dict: dict, version: int) -> dict

Migrate old serialization dicts to the current form.

The input dict old_dict comes from some previous serialization version, given by version. The output dict should be in the format of the current serialization dict.

The recommended pattern to use looks like this:

@classmethod
def serialization_migration(cls, old_dict, version):
    if version == 1:
        ...  # do things for migrating version 1->2
    if version <= 2:
        ...  # do things for migrating version 2->3
    if version <= 3:
        ...  # do things for migrating version 3->4
    # etc
    return old_dict  # now in the current serialization format

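This approach steps through each old serialization model on its way to the current version: a dict at version 1 passes through every block, while a dict at version 3 passes through only the last one. It keeps the code relatively minimal and readable.

As a convenience, the following functions are available to simplify the kinds of changes that are likely to occur as serialization versions change: new_key_added, old_key_removed, key_renamed, and nested_key_moved.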

Parameters:
  • old_dict (dict) – dict as received from a serialized form

  • version (int) – the serialization version of old_dict

Returns:

serialization dict suitable for the current implementation of from_dict.

Return type:

dict
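The cumulative pattern recommended for serialization_migration can be sketched without gufe installed. The class Example and the keys tolerance, steps, and n_steps are hypothetical; the point is only that each `if version <= N` block upgrades the dict by one version, so older dicts fall through every later block.

```python
class Example:
    """Hypothetical stand-in for a GufeTokenizable subclass at version 3."""

    @classmethod
    def serialization_migration(cls, old_dict, version):
        dct = dict(old_dict)  # work on a copy so the caller's dict is untouched
        if version <= 1:
            # 1 -> 2: a new parameter was added, so it needs a default
            dct.setdefault("tolerance", 1e-6)
        if version <= 2:
            # 2 -> 3: the parameter "steps" was renamed to "n_steps"
            dct["n_steps"] = dct.pop("steps")
        return dct


from_v1 = Example.serialization_migration({"steps": 10}, version=1)
from_v2 = Example.serialization_migration({"steps": 5, "tolerance": 0.1}, version=2)
print(from_v1)
print(from_v2)
```

A version 1 dict receives both changes, while a version 2 dict receives only the rename and keeps its stored tolerance.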

property logger

Return logger adapter for this instance

property key

classmethod defaults()

Dict of default key-value pairs for this GufeTokenizable object.

When this object is serialized with to_dict(include_defaults=False), any key whose value equals its default is stripped from the resulting dict.

to_dict(include_defaults=True) -> dict

Generate full dict representation, with all referenced GufeTokenizable objects also given in full dict representations.

Parameters:

include_defaults (bool) – If False, strip keys from dict representation with values equal to those in defaults.
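The effect of include_defaults=False can be sketched with plain dicts. The helper strip_defaults and the example keys are hypothetical and only mirror the behavior described above; they are not taken from the gufe source.

```python
def strip_defaults(dct, defaults):
    """Drop every key whose value equals the corresponding default.

    Hypothetical helper: keys without a default, or with a value that
    differs from the default, are kept.
    """
    return {
        key: value
        for key, value in dct.items()
        if key not in defaults or defaults[key] != value
    }


full = {"name": "benzene", "charge": 0, "tolerance": 0.5}
defaults = {"charge": 0, "tolerance": 1e-6}
print(strip_defaults(full, defaults))
```

Here charge is removed because it matches its default, while tolerance is kept because it was set to a non-default value.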

classmethod from_dict(dct: Dict)

Generate an instance from full dict representation.

Parameters:

dct (Dict) – A dictionary produced by to_dict to instantiate from. If an identical instance already exists in memory, it will be returned. Otherwise, a new instance will be returned.

to_keyed_dict(include_defaults=True) -> Dict

Generate keyed dict representation, with all referenced GufeTokenizable objects given in keyed representations.

A keyed representation of an object is a dict of the form:

{':gufe-key:': <GufeTokenizable.key>}

These function as stubs to allow for serialization and storage of GufeTokenizable objects with minimal duplication.

The original object can be re-assembled with from_keyed_dict.
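The idea of key stubs can be sketched with a toy object model. Everything below (the Item class, the store and load helpers, the registry dict, and the key strings) is hypothetical and only illustrates how replacing nested objects with {':gufe-key:': key} stubs avoids storing the same object twice; it does not reproduce gufe's implementation.

```python
GUFE_KEY = ":gufe-key:"


class Item:
    """Hypothetical stand-in for a tokenizable object with a unique key."""

    def __init__(self, key, **attrs):
        self.key = key
        self.attrs = attrs

    def to_keyed_dict(self):
        # Nested Items are replaced by a stub that holds only their key.
        return {
            name: {GUFE_KEY: value.key} if isinstance(value, Item) else value
            for name, value in self.attrs.items()
        }


def store(item, registry):
    """Store an item and everything it references, each exactly once."""
    for value in item.attrs.values():
        if isinstance(value, Item):
            store(value, registry)
    registry[item.key] = item.to_keyed_dict()


def load(key, registry):
    """Re-assemble an item by resolving stubs recursively."""
    attrs = {}
    for name, value in registry[key].items():
        if isinstance(value, dict) and set(value) == {GUFE_KEY}:
            value = load(value[GUFE_KEY], registry)
        attrs[name] = value
    return Item(key, **attrs)


ligand = Item("Ligand-abc", name="benzene")
system = Item("System-123", ligand=ligand, label="demo")
registry = {}
store(system, registry)
print(registry["System-123"])
```

The stored form of the outer object contains only a stub for the ligand, and load follows that stub to rebuild the nested object.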

classmethod from_keyed_dict(dct: Dict)

Generate an instance from keyed dict representation.

Parameters:

dct (Dict) – A dictionary produced by to_keyed_dict to instantiate from. If an identical instance already exists in memory, it will be returned. Otherwise, a new instance will be returned.

to_shallow_dict() -> Dict

Generate shallow dict representation, with all referenced GufeTokenizable objects left intact.

classmethod from_shallow_dict(dct: Dict)

Generate an instance from shallow dict representation.

Parameters:

dct (Dict) – A dictionary produced by to_shallow_dict to instantiate from. If an identical instance already exists in memory, it will be returned. Otherwise, a new instance will be returned.

copy_with_replacements(**replacements)

Make a modified copy of this object.

Since GufeTokenizables are immutable, this is essentially a shortcut to mutate the object. Note that the keyword arguments it takes are based on the keys of the dictionaries used in the _to_dict/_from_dict cycle for this object; in most cases these are the same as the parameters to __init__, but not always.

This will always return a new object in memory. So using obj.copy_with_replacements() (with no keyword arguments) is a way to create a shallow copy: the object is different in memory, but its attributes will be the same objects in memory as the original.

Parameters:

replacements (Dict) – keyword arguments with keys taken from the keys given by the output of this object’s to_dict method.
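The behavior described for copy_with_replacements can be sketched on a small class that follows the _to_dict/_from_dict convention. The Point class and its attributes are hypothetical, and the method body is an assumption about one straightforward way to get this behavior, not the gufe implementation.

```python
class Point:
    """Hypothetical immutable-style object with a dict round trip."""

    def __init__(self, coords, label="origin"):
        self._coords = coords
        self._label = label

    def _to_dict(self):
        return {"coords": self._coords, "label": self._label}

    @classmethod
    def _from_dict(cls, dct):
        return cls(**dct)

    def copy_with_replacements(self, **replacements):
        # Start from the current dict form, override the requested keys,
        # and build a new object from the result.
        dct = self._to_dict()
        dct.update(replacements)
        return self._from_dict(dct)


p = Point((0.0, 0.0))
moved = p.copy_with_replacements(label="moved")
shallow = p.copy_with_replacements()
print(moved is p, shallow is p, shallow._coords is p._coords)
```

Both calls return new objects, and the copy made with no keyword arguments shares its attribute objects with the original, which is what makes it a shallow copy.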

Serialization migration

gufe.tokenization.new_key_added(dct, new_key, default)

Serialization migration: Add a new key to the dictionary.

This can be used when writing a serialization migration (see GufeTokenizable.serialization_migration()) where a new key has been added to the object’s representation (e.g., a new parameter has been added). In order to be migratable, the new key must have an associated default value.

Parameters:
  • dct (dict) – dictionary based on the old serialization version

  • new_key (str) – name of the new key

  • default (Any) – default value for the new key

Returns:

input dictionary modified to add the new key

Return type:

dict

gufe.tokenization.old_key_removed(dct, old_key, should_warn)

Serialization migration: Remove an old key from the dictionary.

This can be used when writing a serialization migration (see GufeTokenizable.serialization_migration()) where a key has been removed from the object’s serialized representation (e.g., an old parameter is no longer allowed). If a parameter has been removed, you will likely want to warn the user that the parameter is no longer used: the should_warn option allows that.

Parameters:
  • dct (dict) – dictionary based on the old serialization version

  • old_key (str) – name of the key that has been removed

  • should_warn (bool) – whether to issue a warning for this (generally recommended)

Returns:

input dictionary modified to remove the old key

Return type:

dict

gufe.tokenization.key_renamed(dct, old_name, new_name)

Serialization migration: Rename a key in the dictionary.

This can be used when writing a serialization migration (see GufeTokenizable.serialization_migration()) where a key has been renamed (e.g., a parameter name has changed).

Parameters:
  • dct (dict) – dictionary based on the old serialization version

  • old_name (str) – name of the key in the old serialization representation

  • new_name (str) – name of the key in the new serialization representation

Returns:

input dictionary modified to rename the key from the old name to the new one

Return type:

dict
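The three flat-key helpers documented above (add, remove, rename) can be sketched as plain dict operations. The functions add_key, remove_key, and rename_key below are hypothetical stand-ins written only from the descriptions here; they are deliberately given different names because they are not the gufe source, and the warning text is invented.

```python
import warnings


def add_key(dct, new_key, default):
    """Stand-in for the 'new key added' case: insert the default value."""
    dct[new_key] = default
    return dct


def remove_key(dct, old_key, should_warn):
    """Stand-in for the 'old key removed' case, optionally warning."""
    if should_warn:
        warnings.warn(f"'{old_key}' is no longer used and was dropped")
    del dct[old_key]
    return dct


def rename_key(dct, old_name, new_name):
    """Stand-in for the 'key renamed' case: move the value to the new name."""
    dct[new_name] = dct.pop(old_name)
    return dct


dct = {"steps": 10, "legacy": True}
dct = add_key(dct, "tolerance", 1e-6)
dct = remove_key(dct, "legacy", should_warn=False)
dct = rename_key(dct, "steps", "n_steps")
print(dct)
```

Because each helper returns the dict it modified, calls chain naturally inside the version blocks of a migration method.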

gufe.tokenization.nested_key_moved(dct, old_name, new_name)

Serialization migration: Move nested key to a new location.

This can be used when writing a serialization migration (see GufeTokenizable.serialization_migration()) where a key that is nested in a structure of dicts/lists has been moved elsewhere. It uses labels that match Python attribute and list-index notation. That is, if dct is the following dict:

{'first': {'inner': ['list', 'of', 'words']}}

then the label 'first.inner[1]' would refer to the word 'of'.

In that case, the following call:

nested_key_moved(dct, 'first.inner[1]', 'second')

would result in the dictionary:

{'first': {'inner': ['list', 'words']}, 'second': 'of'}

This is particularly useful for things like protocol settings, which present as nested objects like this.

Parameters:
  • dct (dict) – dictionary based on the old serialization version

  • old_name (str) – label for the old location (see above for description of label format)

  • new_name (str) – label for the new location (see above for description of label format)

Returns:

input dictionary modified to move the value at the old location to the new location

Return type:

dict
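The label format described for nested_key_moved can be made concrete with a small parser. The functions parse_label and move_nested below are hypothetical and cover only the simple case where every parent container of the new location already exists; they illustrate the notation rather than reproduce the gufe implementation.

```python
import re


def parse_label(label):
    """Split a label such as 'first.inner[1]' into ['first', 'inner', 1]."""
    path = []
    for token in label.split("."):
        match = re.fullmatch(r"([^\[\]]+)((?:\[\d+\])*)", token)
        if match is None:
            raise ValueError(f"cannot parse label part {token!r}")
        path.append(match.group(1))  # dict key
        path.extend(int(i) for i in re.findall(r"\[(\d+)\]", match.group(2)))  # list indices
    return path


def move_nested(dct, old_label, new_label):
    """Pop the value at old_label and store it at new_label."""
    old_path = parse_label(old_label)
    new_path = parse_label(new_label)

    parent = dct
    for step in old_path[:-1]:
        parent = parent[step]
    # pop() accepts a key for dicts and an index for lists
    value = parent.pop(old_path[-1])

    target = dct
    for step in new_path[:-1]:
        target = target[step]
    target[new_path[-1]] = value
    return dct


dct = {"first": {"inner": ["list", "of", "words"]}}
print(move_nested(dct, "first.inner[1]", "second"))
```

Run on the example from the description, the sketch removes 'of' from the inner list and stores it under the top-level key 'second'.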