|
| 1 | +# [Hynek Schlawack](https://hynek.me/) |
| 2 | + |
| 3 | +## Pythonista, Gopher, C hacker, JavaScript dabbler, and speaker from |
| 4 | +Berlin/Germany. |
| 5 | + |
| 10 | +# Better Python Object Serialization |
| 11 | + |
| 12 | +22 August 2016 |
| 13 | + |
| 14 | +The Python standard library is full of underappreciated gems. One of them |
| 15 | +allows for simple and elegant function dispatching based on argument types. |
| 16 | +This makes it perfect for serialization of arbitrary objects – for example to |
| 17 | +JSON in web APIs and structured logs. |
| 18 | + |
| 19 | +Who hasn’t seen it: |
| 20 | + |
| 21 | +[code] |
| 22 | + |
| 23 | + TypeError: datetime.datetime(...) is not JSON serializable |
| 24 | + |
| 25 | +[/code] |
| 26 | + |
| 27 | +While this shouldn’t be a big deal, it is. The `json` module – that inherited |
| 28 | +its API from `simplejson` – offers two ways to serialize objects: |
| 29 | + |
| 30 | + 1. Implement a `default()` _function_ that takes an object and returns something that [`JSONEncoder`](https://docs.python.org/3/library/json.html#json.JSONEncoder) understands. |
| 31 | + 2. Implement or subclass a `JSONEncoder` yourself and pass it as `cls` to the dump methods. You can implement it on your own or just override the `JSONEncoder.default()` _method_. |
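The two options above can be sketched roughly as follows. This is a minimal illustration using only the standard library; `DatetimeEncoder` is a hypothetical name, and handling only `datetime` is an assumption made to keep the example short.

```python
import json
from datetime import datetime


def to_serializable(val):
    """Option 1: a plain function passed as ``default=``."""
    if isinstance(val, datetime):
        return val.isoformat()
    raise TypeError(f"{val!r} is not JSON serializable")


class DatetimeEncoder(json.JSONEncoder):
    """Option 2: a JSONEncoder subclass (hypothetical name) passed as ``cls=``."""

    def default(self, o):
        if isinstance(o, datetime):
            return o.isoformat()
        # Let the base class raise the usual TypeError for unknown types.
        return super().default(o)


payload = {"ts": datetime(2016, 8, 20, 13, 8, 59)}
via_function = json.dumps(payload, default=to_serializable)
via_class = json.dumps(payload, cls=DatetimeEncoder)
```

Both calls should produce the same string; either way, a single fallback has to know every type it is ever asked to encode.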
| 32 | + |
| 33 | +And since alternative implementations want to be drop-in, they imitate the |
| 34 | +`json` module’s API to various degrees[1]. |
| 35 | + |
| 36 | +## Expandability |
| 37 | + |
| 38 | +What both approaches have in common is that they’re not expandable: adding |
| 39 | +support for new types is not provided for. Your single `default()` fallback |
| 40 | +has to know about all custom types you want to serialize. Which means you |
| 41 | +either write functions like: |
| 42 | + |
| 43 | +[code] |
| 44 | + |
| 45 | + def to_serializable(val): |
| 46 | +     if isinstance(val, datetime): |
| 47 | +         return val.isoformat() + "Z" |
| 48 | +     elif isinstance(val, enum.Enum): |
| 49 | +         return val.value |
| 50 | +     elif attr.has(val.__class__): |
| 51 | +         return attr.asdict(val) |
| 52 | +     elif isinstance(val, Exception): |
| 53 | +         return { |
| 54 | +             "error": val.__class__.__name__, |
| 55 | +             "args": val.args, |
| 56 | +         } |
| 57 | +     return str(val) |
| 58 | + |
| 59 | +[/code] |
| 60 | + |
| 61 | +Which is painful since you have to add serialization for all objects in one |
| 62 | +place[2]. |
| 63 | + |
| 64 | +Alternatively, you can try to come up with a general solution on your own, as |
| 65 | +Pyramid’s JSON renderer did with |
| 66 | +[`JSON.add_adapter`](http://docs.pylonsproject.org/projects/pyramid/en/latest/narr/renderers.html#using-the-add-adapter-method-of-a-custom-json-renderer), |
| 67 | +which uses the widely underappreciated `zope.interface` adapter |
| 68 | +registry[3]. |
| 69 | + |
| 70 | +Django on the other hand satisfies itself with a `DjangoJSONEncoder` that is a |
| 71 | +subclass of `json.JSONEncoder` and knows how to encode dates, times, UUIDs, |
| 72 | +and promises. But other than that, you’re on your own again. If you want to go |
| 73 | +further with Django and web APIs, you’re probably already using the Django |
| 74 | +REST framework anyway. They came up with a whole [serialization |
| 75 | +system](http://www.django-rest-framework.org/api-guide/serializers/) that does |
| 76 | +a lot more than just making data `json.dumps()`-ready. |
| 77 | + |
| 78 | +Finally for the sake of completeness I feel like I have to mention my own |
| 79 | +solution in [`structlog`](http://www.structlog.org/en/stable/) that I fiercely |
| 80 | +hated from day one: adding a `__structlog__` method to your classes that |
| 81 | +returns a serializable representation in the tradition of `__str__`. Please |
| 82 | +don’t repeat my mistake; hashtag [software clown](https://softwareclown.com). |
| 83 | + |
| 84 | +* * * |
| 85 | + |
| 86 | +Given how prevalent JSON is, it’s surprising that we have only siloed |
| 87 | +solutions so far. What _I_ personally would like to have is a way to register |
| 88 | +serializers in a central place but in a decentralized fashion that doesn’t |
| 89 | +require any changes to my (or worse: third party) classes. |
| 90 | + |
| 91 | +## Enter PEP 443 |
| 92 | + |
| 93 | +Turns out, Python 3.4 came with a nice solution to this problem in the form of |
| 94 | +[PEP 443](https://www.python.org/dev/peps/pep-0443/): |
| 95 | +[`functools.singledispatch`](https://docs.python.org/3/library/functools.html#functools.singledispatch) |
| 96 | +(also available on [PyPI](https://pypi.org/project/singledispatch/) for |
| 97 | +legacy Python versions). |
| 98 | + |
| 99 | +Put simply, you define a default function and then register additional |
| 100 | +versions of that function depending on the type of the first argument: |
| 101 | + |
| 102 | +[code] |
| 103 | + |
| 104 | + from datetime import datetime |
| 105 | + from functools import singledispatch |
| 106 | + |
| 107 | + @singledispatch |
| 108 | + def to_serializable(val): |
| 109 | +     """Used by default.""" |
| 110 | +     return str(val) |
| 111 | + |
| 112 | + @to_serializable.register(datetime) |
| 113 | + def ts_datetime(val): |
| 114 | +     """Used if *val* is an instance of datetime.""" |
| 115 | +     return val.isoformat() + "Z" |
| 116 | + |
| 117 | +[/code] |
| 118 | + |
| 119 | +Now you can call `to_serializable()` on `datetime` instances too and single |
| 120 | +dispatch will pick the correct function: |
| 121 | + |
| 122 | +[code] |
| 123 | + |
| 124 | + >>> json.dumps({"msg": "hi", "ts": datetime.now()}, |
| 125 | + ... default=to_serializable) |
| 126 | + '{"ts": "2016-08-20T13:08:59.153864Z", "msg": "hi"}' |
| 127 | + |
| 128 | +[/code] |
| 129 | + |
| 130 | +This gives you the power to put your serializers wherever you want: along with |
| 131 | +the classes, in a separate module, or along with JSON-related code? _You_ |
| 132 | +choose! But your _classes_ stay clean and you don’t have a huge `if-elif-else` |
| 133 | +branch that you cargo-cult between your projects. |
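To make the contrast with the `if-elif-else` version concrete, here is a hedged sketch that re-expresses a few of its branches as separate registrations. It sticks to the standard library (the `attrs` branch is left out), and `Color`, `ts_enum`, and `ts_exception` are hypothetical names chosen for illustration.

```python
import enum
import json
from datetime import datetime
from functools import singledispatch


@singledispatch
def to_serializable(val):
    """Fallback for types without a registered serializer."""
    return str(val)


@to_serializable.register(datetime)
def ts_datetime(val):
    return val.isoformat() + "Z"


@to_serializable.register(enum.Enum)
def ts_enum(val):
    # Registered for the base class; Enum subclasses match through the MRO.
    return val.value


@to_serializable.register(Exception)
def ts_exception(val):
    return {"error": val.__class__.__name__, "args": val.args}


class Color(enum.Enum):  # hypothetical example type
    RED = "red"


result = json.dumps(
    {"color": Color.RED, "err": ValueError("boom")},
    default=to_serializable,
)
```

Each registration could live in a different module; importing that module is enough to make the serializer known to `to_serializable()`.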
| 134 | + |
| 135 | +## Going Further |
| 136 | + |
| 137 | +Obviously the utility of `@singledispatch` goes far beyond JSON. Binding |
| 138 | +different behaviors to different types in general and object serialization in |
| 139 | +particular are universally useful[4]. Some of my proofreaders mentioned they |
| 140 | +tried a ghetto approximation using `dict`s of classes to callables and other |
| 141 | +similar atrocities. |
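One likely weakness of a hand-rolled registry is sketched below. I don't know what those implementations actually looked like, so this assumes the simplest variant: a `dict` keyed by exact type. `Base`, `Child`, `HANDLERS`, `dict_dispatch`, and `describe` are all hypothetical names.

```python
from functools import singledispatch


class Base:
    pass


class Child(Base):
    pass


# Hand-rolled registry: a dict from class to callable.
HANDLERS = {Base: lambda val: "base"}


def dict_dispatch(val):
    # Exact-type lookup only; a subclass instance is not found.
    handler = HANDLERS.get(type(val), lambda val: "default")
    return handler(val)


@singledispatch
def describe(val):
    return "default"


@describe.register(Base)
def describe_base(val):
    return "base"


dict_result = dict_dispatch(Child())  # falls through to the default
single_result = describe(Child())     # resolved via the class hierarchy
```

The dict version can of course be taught to walk `type(val).__mro__`, but at that point you are reimplementing part of what `@singledispatch` already does.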
| 142 | + |
| 143 | +In other words, `@singledispatch` just may be the function that you’ve been |
| 144 | +missing although it was there all along. |
| 145 | + |
| 146 | +P.S. Of course there’s also a `multipledispatch` on |
| 147 | +[PyPI](https://pypi.org/project/multipledispatch/). |
| 148 | + |
| 149 | +## Footnotes |
| 150 | + |
| 151 | +* * * |
| 152 | + |
| 153 | + 1. However, from the popular ones: [UltraJSON](https://github.com/esnme/ultrajson) doesn’t support custom object serialization at all and [`python-rapidjson`](https://github.com/kenrobbins/python-rapidjson) only supports the `default()` function. ↩︎ |
| 154 | + 2. Although as you can see it’s manageable with `attrs`; maybe [you should use `attrs`](https://glyph.twistedmatrix.com/2016/08/attrs.html)! ↩︎ |
| 155 | + 3. Unfortunately the API Pyramid uses is currently [undocumented](https://github.com/zopefoundation/zope.interface/issues/41) after being transplanted from [`zope.component`](https://docs.zope.org/zope.component/). ↩︎ |
| 156 | + 4. I’ve been told the original incentive for adding single dispatch to the standard library was a more elegant reimplementation of [`pprint`](https://docs.python.org/3.5/library/pprint.html) (that never happened). ↩︎ |
| 157 | + |
| 176 | +(C) 2016 Hynek Schlawack |
| 177 | +All Rights Reserved |