GUFE Serialization API#
- class gufe.tokenization.GufeTokenizable(*args, **kwargs)#
Base class for all tokenizable gufe objects.
Subclassing from this provides sorting, equality and hashing operators, provided that the class implements the _to_dict and _from_dict methods.
This extra work in serializing is important for hashes that are stable across different Python sessions.
- classmethod serialization_migration(old_dict: dict, version: int) → dict#
Migrate old serialization dicts to the current form.
The input dict old_dict comes from some previous serialization version, given by version. The output dict should be in the format of the current serialization dict.
The recommended pattern to use looks like this:
@classmethod
def serialization_migration(cls, old_dict, version):
    if version == 1:
        ...  # do things for migrating version 1->2
    if version <= 2:
        ...  # do things for migrating version 2->3
    if version <= 3:
        ...  # do things for migrating version 3->4
    # etc
    return old_dict  # now in the current serialization format
This approach steps through each old serialization model on its way to the current version. It keeps code relatively minimal and readable.
As a convenience, the following functions are available to simplify the various kinds of changes that are likely to occur as serialization versions change: new_key_added, old_key_removed, key_renamed, and nested_key_moved.
- Parameters:
old_dict (dict) – dict as received from a serialized form
version (int) – the serialization version of old_dict
- Returns:
serialization dict suitable for the current implementation of from_dict.
- Return type:
dict
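The stepwise pattern can be sketched with plain dicts, independent of gufe. Everything named below is hypothetical: `migrate_widget` stands in for a `serialization_migration` classmethod on an imagined class, and the keys `size`, `colour`, `color`, and `label` are invented for illustration.

```python
# Hypothetical illustration of the stepwise migration pattern.
# Not gufe's implementation; names and keys are made up.

def migrate_widget(old_dict: dict, version: int) -> dict:
    dct = dict(old_dict)  # work on a copy so the caller's dict is untouched
    if version <= 1:
        # version 1 -> 2: a new key "label" was added, with a default value
        dct.setdefault("label", "")
    if version <= 2:
        # version 2 -> 3: the key "colour" was renamed to "color"
        dct["color"] = dct.pop("colour")
    return dct


from_v1 = migrate_widget({"size": 3, "colour": "red"}, version=1)
from_v2 = migrate_widget({"size": 3, "colour": "red", "label": "a"}, version=2)
print(from_v1)
print(from_v2)
```

Because each `if` uses `<=`, a dict from any older version passes through every later step in order, so only one migration per version bump has to be written.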
- property logger#
Return logger adapter for this instance
- property key#
- classmethod defaults()#
Dict of default key-value pairs for this GufeTokenizable object.
These defaults are stripped from the dict form of this object produced with to_dict(include_defaults=False) where default values are present.
- to_dict(include_defaults=True) → dict#
Generate full dict representation, with all referenced GufeTokenizable objects also given in full dict representations.
- Parameters:
include_defaults (bool) – If False, strip keys from dict representation with values equal to those in defaults.
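The effect of include_defaults=False can be pictured with plain dicts. This is a sketch of the documented behaviour only, not gufe's code; `strip_defaults` and the example keys are hypothetical.

```python
# Sketch only: mimics "strip keys whose values equal the defaults".
# `strip_defaults` and the keys below are invented for illustration.

def strip_defaults(full: dict, defaults: dict) -> dict:
    """Return a copy of `full` without entries that match `defaults`."""
    return {
        key: value
        for key, value in full.items()
        if key not in defaults or value != defaults[key]
    }


defaults = {"temperature": 298.15, "name": ""}
full = {"temperature": 310.0, "name": "", "n_steps": 500}

compact = strip_defaults(full, defaults)
print(compact)

# In this sketch, layering the compact dict over the defaults recovers the full dict.
restored = {**defaults, **compact}
```

Only `name` is dropped here: `temperature` differs from its default and `n_steps` has no default at all.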
- classmethod from_dict(dct: Dict)#
Generate an instance from full dict representation.
- Parameters:
dct (Dict) – A dictionary produced by to_dict to instantiate from. If an identical instance already exists in memory, it will be returned. Otherwise, a new instance will be returned.
- to_keyed_dict(include_defaults=True) → Dict#
Generate keyed dict representation, with all referenced GufeTokenizable objects given in keyed representations.
A keyed representation of an object is a dict of the form:
{':gufe-key:': <GufeTokenizable.key>}
These function as stubs to allow for serialization and storage of GufeTokenizable objects with minimal duplication.
The original object can be re-assembled with from_keyed_dict.
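A toy model may help show why keyed stubs reduce duplication. The class `Node`, its key format, and the `registry` dict are all hypothetical; only the `':gufe-key:'` stub shape is taken from the description above.

```python
# Toy illustration of the keyed-dict idea; not gufe's implementation.
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    name: str
    children: tuple = ()

    @property
    def key(self) -> str:
        # Hypothetical key format, chosen only for this example.
        return f"Node-{self.name}"

    def to_keyed_dict(self) -> dict:
        # Referenced objects are replaced by stubs that hold only their key.
        return {
            "name": self.name,
            "children": [{":gufe-key:": child.key} for child in self.children],
        }


leaf = Node("leaf")
root = Node("root", children=(leaf,))

# Each object is stored once under its key; parents refer to it by stub.
registry = {node.key: node.to_keyed_dict() for node in (leaf, root)}
print(registry["Node-root"])
```

An object referenced by many parents is written once in such a store, and each parent carries only the small stub.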
- classmethod from_keyed_dict(dct: Dict)#
Generate an instance from keyed dict representation.
- Parameters:
dct (Dict) – A dictionary produced by to_keyed_dict to instantiate from. If an identical instance already exists in memory, it will be returned. Otherwise, a new instance will be returned.
- to_shallow_dict() → Dict#
Generate shallow dict representation, with all referenced GufeTokenizable objects left intact.
- classmethod from_shallow_dict(dct: Dict)#
Generate an instance from shallow dict representation.
- Parameters:
dct (Dict) – A dictionary produced by to_shallow_dict to instantiate from. If an identical instance already exists in memory, it will be returned. Otherwise, a new instance will be returned.
- copy_with_replacements(**replacements)#
Make a modified copy of this object.
Since GufeTokenizables are immutable, this is essentially a shortcut that takes the place of mutating the object. Note that the keyword arguments it takes are based on keys of the dictionaries used in the _to_dict/_from_dict cycle for this object; in most cases that is the same as parameters to __init__, but not always.
This will always return a new object in memory. So using obj.copy_with_replacements() (with no keyword arguments) is a way to create a shallow copy: the object is different in memory, but its attributes will be the same objects in memory as the original.
- Parameters:
replacements (Dict) – keyword arguments with keys taken from the keys given by the output of this object’s to_dict method.
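The dict round trip behind this method can be sketched with a small stand-alone class. `Point` and its keys are hypothetical, and rejecting unknown keys is a choice made for this sketch rather than a statement about gufe's behaviour.

```python
# Sketch of the copy-with-replacements idea; not gufe's implementation.

class Point:
    def __init__(self, x, y, tags=()):
        self._x, self._y, self._tags = x, y, tags

    def to_dict(self) -> dict:
        return {"x": self._x, "y": self._y, "tags": self._tags}

    @classmethod
    def from_dict(cls, dct: dict) -> "Point":
        return cls(**dct)

    def copy_with_replacements(self, **replacements) -> "Point":
        dct = self.to_dict()
        unknown = set(replacements) - set(dct)
        if unknown:  # assumption of this sketch: unknown keys are an error
            raise ValueError(f"unknown keys: {sorted(unknown)}")
        dct.update(replacements)
        return type(self).from_dict(dct)


p = Point(1, 2, tags=("a",))
q = p.copy_with_replacements(y=5)  # modified copy; p itself is unchanged
r = p.copy_with_replacements()     # shallow copy: new object, shared attributes
print(q.to_dict(), r is p)
```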
Serialization migration#
- gufe.tokenization.new_key_added(dct, new_key, default)#
Serialization migration: Add a new key to the dictionary.
This can be used when writing a serialization migration (see GufeTokenizable.serialization_migration()) where a new key has been added to the object’s representation (e.g., a new parameter has been added). In order to be migratable, the new key must have an associated default value.
- Parameters:
dct (dict) – dictionary based on the old serialization version
new_key (str) – name of the new key
default (Any) – default value for the new key
- Returns:
input dictionary modified to add the new key
- Return type:
dict
- gufe.tokenization.old_key_removed(dct, old_key, should_warn)#
Serialization migration: Remove an old key from the dictionary.
This can be used when writing a serialization migration (see GufeTokenizable.serialization_migration()) where a key has been removed from the object’s serialized representation (e.g., an old parameter is no longer allowed). If a parameter has been removed, it is likely that you will want to warn the user that the parameter is no longer used: the should_warn option allows that.
- Parameters:
dct (dict) – dictionary based on the old serialization version
old_key (str) – name of the key that has been removed
should_warn (bool) – whether to issue a warning for this (generally recommended)
- Returns:
input dictionary modified to remove the old key
- Return type:
dict
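The warn-then-remove behaviour described above might look roughly like the following. `old_key_removed_sketch` is a hypothetical stand-in with the same parameters as the documented helper; its body and warning text are assumptions, not gufe's source.

```python
# Stand-in for illustration; the body is an assumption, not gufe's source.
import warnings


def old_key_removed_sketch(dct: dict, old_key: str, should_warn: bool) -> dict:
    if should_warn and old_key in dct:
        warnings.warn(
            f"Outdated serialization: '{old_key}' is no longer used"
        )
    dct.pop(old_key, None)  # modifies the input dict in place
    return dct


# Capture the warning so the example can show that it was issued.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = old_key_removed_sketch({"a": 1, "legacy": 2}, "legacy", should_warn=True)

print(result, len(caught))
```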
- gufe.tokenization.key_renamed(dct, old_name, new_name)#
Serialization migration: Rename a key in the dictionary.
This can be used when writing a serialization migration (see GufeTokenizable.serialization_migration()) where a key has been renamed (e.g., a parameter name has changed).
- Parameters:
dct (dict) – dictionary based on the old serialization version
old_name (str) – name of the key in the old serialization representation
new_name (str) – name of the key in the new serialization representation
- Returns:
input dictionary modified to rename the key from the old name to the new one
- Return type:
dict
- gufe.tokenization.nested_key_moved(dct, old_name, new_name)#
Serialization migration: Move nested key to a new location.
This can be used when writing a serialization migration (see GufeTokenizable.serialization_migration()) where a key that is nested in a structure of dicts/lists has been moved elsewhere. It uses labels that match Python namespace/list notations. That is, if dct is the following dict:
{'first': {'inner': ['list', 'of', 'words']}}
then the label 'first.inner[1]' would refer to the word 'of'.
In that case, the following call:
nested_key_moved(dct, 'first.inner[1]', 'second')
would result in the dictionary:
{'first': {'inner': ['list', 'words']}, 'second': 'of'}
This is particularly useful for things like protocol settings, which present as nested objects like this.
- Parameters:
dct (dict) – dictionary based on the old serialization version
old_name (str) – label for the old location (see above for description of label format)
new_name (str) – label for the new location (see above for description of label format)
- Returns:
input dictionary modified to move the value at the old location to the new location
- Return type:
dict
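One way the label format could be resolved is sketched below. The functions `parse_label` and `move_nested` are hypothetical and are not gufe's source; the sketch also assumes that every parent container of the new location already exists.

```python
# Sketch of resolving labels such as 'first.inner[1]'; an assumption,
# not gufe's implementation.
import re


def parse_label(label: str) -> list:
    """Split 'first.inner[1]' into ['first', 'inner', 1]."""
    path = []
    for name, index in re.findall(r"([^.\[\]]+)|\[(\d+)\]", label):
        path.append(name if name else int(index))
    return path


def move_nested(dct: dict, old_label: str, new_label: str) -> dict:
    # Walk to the container holding the old value and remove it.
    *parents, last = parse_label(old_label)
    container = dct
    for step in parents:
        container = container[step]
    value = container.pop(last)  # pop works for dict keys and list indices

    # Walk to the container for the new location and store the value.
    *parents, last = parse_label(new_label)
    container = dct
    for step in parents:
        container = container[step]
    container[last] = value
    return dct


dct = {"first": {"inner": ["list", "of", "words"]}}
moved = move_nested(dct, "first.inner[1]", "second")
print(moved)
```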