Note
Click here to download the full example code
Defining Serializables¶
This example describes the various ways in which one can construct new serializable objects.
Introduction¶
To run the example you will need the following imports:
import dman
import numpy as np
from dataclasses import dataclass, asdict
from dman.core.serializables import serializable
The base objects in dman
are serializables
. Any serializable
instance is defined implicitly such that the following operations
return the original object.
ser: dict = dman.serialize(obj)
res: str = dman.sjson.dumps(ser)
ser: dict = dman.sjson.loads(res)
assert(obj == dman.deserialize(res))
The goal is that the string res
can be stored in a human readable file.
Note
We used sjson
instead of json
for dumping the dictionary
to a string. This replaces any unserializable objects with
a placeholder string. This corresponds with the ideals behind dman
,
one of which is that (some) serialization should always be produced.
By default, several types are serializable. Specifically: str
, int
,
float
, bool
, NoneType
, list
, dict
, tuple
. Collections
can be nested. Note that tuple
is somewhat of an exception since it
is deserialized as a list. We are however able to extend upon these
basic types, which is the topic of this example.
Creating Serializables¶
There are several ways of creating a serializable
class from scratch.
You can either do it manually or use some code generation functionality
build into dman
.
Manual Definition¶
The standard way of defining a serializable is as follows:
@dman.serializable(name='manual')
class Manual:
def __init__(self, value: str):
self.value = value
def __repr__(self):
return f'Manual(value={self.value})'
def __serialize__(self):
return {'value': self.value}
@classmethod
def __deserialize__(cls, ser: dict):
return cls(**ser)
We can serialize the object
test = Manual(value='hello world!')
ser = dman.serialize(test)
res = dman.sjson.dumps(ser, indent=4)
print(res)
{
"_ser__type": "manual",
"_ser__content": {
"value": "hello world!"
}
}
Note how the dictionary under _ser__content
is the output of our __serialize__
method. The type name is also added such that the dictionary can be interpreted
correctly. We can deserialize
a dictionary created like this as follows:
ser = dman.sjson.loads(res)
test = dman.deserialize(ser)
print(test)
Manual(value=hello world!)
Note
It is possible to not include the serializable type and deserialize by specifying the type manually using the following syntax
ser = dman.serialize(test, content_only=True)
reconstructed: Manual = dman.deserialize(ser, ser_type=Manual)
Warning
The name provided to @serializable
should be unique within
your library. It is used as the identifier of the class by dman
when deserializing.
Automatic Generation¶
Of course it would not be convenient to manually specify the __serialize__
and __deserialize__
methods. Hence, the serializable
decorator
has been implemented to automatically generate them whenever
the class is an instance of Enum
or a dataclass
(and when no prior __serialize__
and __deserialize__
methods are specified).
So in the case of enums:
from enum import Enum
@dman.serializable(name='mode')
class Mode(Enum):
RED = 1
BLUE = 2
ser = dman.serialize(Mode.RED)
print(dman.sjson.dumps(ser, indent=4))
{
"_ser__type": "mode",
"_ser__content": "Mode.RED"
}
In the case of dataclasses
we get the following:
from dataclasses import dataclass
@dman.serializable(name='dcl_basic')
@dataclass
class DCLBasic:
value: str
test = DCLBasic(value='hello world!')
ser = dman.serialize(test)
print(dman.sjson.dumps(ser, indent=4))
{
"_ser__type": "dcl_basic",
"_ser__content": {
"value": "hello world!"
}
}
As long as all of the fields in the dataclass are serializable, the whole will be as well.
Warning
Be careful when specifying the name that it is unique. It
is used to reconstruct an instance of a class based on the
_ser__type
string. If a name is left unspecified, the value
under __name__
in the class will be used.
Warning
In almost all cases it will be better to use @dman.modelclass
when converting a dataclass
into a serializable
.
This is mostly important when some fields are storable
,
in which case they will be handled automatically. See Model Types
for an overview of the modelclass
decorator.
Note
It is possible to have fields in your dataclass that you don’t want serialized. ‘’
from dataclasses import dataclass
@serializable(name='dcl_basic')
@dataclass
class DCLBasic:
__no_serialize__ = ['hidden']
value: str
hidden: int = 0
The field names in __no_serialize__
will not be included
in the serialized dict
. Note that this means that you should
specify a default value for these fields to support deserialization.
Serializing Existing Types¶
Often you will already have some objects in a library that should
also be made serializable. In dman
we provide some functionality
that makes this process simpler.
Registered Definition¶
The most flexible way of making a class serializable is by registering it
manually. This is especially useful when the original class definition
cannot be manipulated (for example for numpy.ndarray
).
Say we have some frozen class definition:
class Frozen:
def __init__(self, data: int):
self.data = data
def __repr__(self):
return f'{self.__class__.__name__}(data={self.data})'
We can make it serializable without touching the original class definition as follows:
dman.register_serializable(
'frozen',
Frozen,
serialize=lambda frozen: frozen.data,
deserialize=lambda data: Frozen(data)
)
Now we can serialize frozen itself:
frozen = Frozen(data=42)
ser = dman.serialize(frozen)
res = dman.sjson.dumps(ser, indent=4)
print(res)
{
"_ser__type": "frozen",
"_ser__content": 42
}
And deserialize it
ser = dman.sjson.loads(res)
frozen = dman.deserialize(ser)
print(frozen)
Frozen(data=42)
You can take a look at dman.numerics
to see an example of this
in practice.
Templates¶
In many cases however it will be possible to alter the original class.
So say we have some user class that is used all throughout your library:
class User:
def __init__(self, name: int):
self.name = name
def __repr__(self):
return f'{self.__class__.__name__}(id={self.name})'
We would like to make User
serializable without
defining __serialize__
and __deserialize__
manually.
We can do so using a template:
@dman.serializable
@dataclass
class UserTemplate:
name: str
@classmethod
def __convert__(cls, other: 'User'):
return cls(other.name)
def __de_convert__(self):
return User(self.name)
A template has a method that allows conversion from the original class to the template and a method to undo that conversion.
Using a template we can then make User
itself serializable like this:
serializable(User, name='user', template=UserTemplate)
Now we can serialize a user:
user = User(name='Thomas Anderson')
ser = dman.serialize(user)
res = dman.sjson.dumps(ser, indent=4)
print(res)
{
"_ser__type": "user",
"_ser__content": {
"name": "Thomas Anderson"
}
}
However this does make an adjustment to the class.
Specifically a field _ser__type
is added:
print(getattr(User, '_ser__type'))
user
Using templates can also be useful when
you are able to work with subclasses of some Base
class
instead.
So say you start with some Base
class:
class Base:
def __init__(self, data: int, computation: int = None):
self.data = data
self.computation = computation
def compute(self):
self.computation = self.data**2
def __repr__(self):
return f'{self.__class__.__name__}(data={self.data}, computation={self.computation})'
We want to create a subtype of this class that is serializable without
defining the __serialize__
method manually.
@dman.serializable
@dataclass
class Template:
data: int
computation: int
@classmethod
def __convert__(cls, other: 'SBase'):
return cls(other.data, other.computation)
@dman.serializable(name='base', template=Template)
class SBase(Base): ...
So we defined a template class with a convert method from Base
and
similarly we defined a serializable subclass of Base
that
can be converted from Template
. Now we can serialize an instance of SBase
as follows:
base = SBase(data=25)
base.compute()
ser = dman.serialize(base)
res = dman.sjson.dumps(ser, indent=4)
print(res)
{
"_ser__type": "base",
"_ser__content": {
"data": 25,
"computation": 625
}
}
And we can deserialize it too
ser = dman.sjson.loads(res)
base = dman.deserialize(ser)
print(base)
SBase(data=25, computation=625)
Note how we did not specify in the above example how to
go from an instance of Template
to one of SBase
.
Such a __convert__
method was actually generated automatically.
We could have instead specified the same behavior manually as follows:
@dman.serializable(name='base', template=Template)
class SBase(Base):
@classmethod
def __convert__(cls, other: Template):
return cls(**asdict(other))
Specifying this conversion manually could be
relevant if the fields of the Template
dataclass
do not match the ones for the __init__
method of Base
.
For example we could have had:
class Base:
def __init__(self, data: int):
self.data = data
self.computation = None
def compute(self):
self.computation = self.data**2
def __repr__(self):
return f'{self.__class__.__name__}(data={self.data}, computation={self.computation})'
So the value of computation
cannot be passed to the constructor.
We can however compensate for this in the __convert__
method:
@dman.serializable(name='base', template=Template)
class SBase(Base):
@classmethod
def __convert__(cls, other: Template):
res = cls(other.data)
res.computation = other.computation
return res
Serializing Instances¶
In some settings it is useful to serialize instances directly. One common example is methods.
from math import sqrt
@dman.register_instance(name='ell1')
def ell1(x, y):
return abs(x) + abs(y)
@dman.register_instance(name='ell2')
def ell2(x, y):
return sqrt(x**2 + y**2)
When serializing the result looks as follows:
ser = dman.serialize([ell1, ell2])
dman.tui.print_serialized(ser)
[
{
"_ser__type": "__instance",
"_ser__content": "ell1"
},
{
"_ser__type": "__instance",
"_ser__content": "ell2"
}
]
Deserialization then works as expected.
dser = dman.deserialize(ser)
print(dser)
[<function ell1 at 0x7fe4a1aa7ac0>, <function ell2 at 0x7fe4a1aa5b40>]
For specific instances we can also call register_instance inline.
class Auto: ...
AUTO = Auto()
dman.register_instance(AUTO, name='auto')
dman.tui.print_serialized(dman.serialize(AUTO))
{
"_ser__type": "__instance",
"_ser__content": "auto"
}
Total running time of the script: ( 0 minutes 0.026 seconds)