python-rapidjson

Python wrapper around RapidJSON

RapidJSON is an extremely fast C++ JSON serialization library.

Legacy Python versions are not supported; you need Python 3 to use this library.

Latest version documentation is automatically rendered by Read the Docs.

Getting Started

First install python-rapidjson:

$ pip install python-rapidjson

python-rapidjson tries to be compatible with the standard library json module, so in most cases it should work as a drop-in replacement. Basic usage looks like this:

>>> import rapidjson
>>> data = {'foo': 100, 'bar': 'baz'}
>>> rapidjson.dumps(data)
'{"bar":"baz","foo":100}'
>>> rapidjson.loads('{"bar":"baz","foo":100}')
{'bar': 'baz', 'foo': 100}
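
Because dumps and loads mirror the json module, one possible pattern is to import rapidjson under a generic alias and fall back to the standard library when it is not installed. This is only a sketch: the alias json_impl is a hypothetical name chosen for illustration, and it relies on nothing beyond the two calls shown above.

```python
# Prefer python-rapidjson when available, otherwise use the stdlib json module.
try:
    import rapidjson as json_impl
except ImportError:  # python-rapidjson is not installed
    import json as json_impl

data = {'foo': 100, 'bar': 'baz'}

# Round-trip the data: serialize to a JSON string, then parse it back.
encoded = json_impl.dumps(data)
decoded = json_impl.loads(encoded)

# Compare parsed objects, not raw strings (see the note below).
assert decoded == data
```

The whitespace of the encoded string can differ between the two modules (the output above is compact, while json inserts spaces after separators by default), so comparing parsed objects is safer than comparing raw strings.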

If you want to install the development version (maybe to contribute fixes or enhancements) you may clone the repository:

$ git clone --recursive https://github.com/python-rapidjson/python-rapidjson.git

Note

The --recursive option is needed because we use a submodule to include RapidJSON sources. Alternatively you can do a plain clone immediately followed by a git submodule update --init.

If you already have a compatible version of the RapidJSON headers installed, you can instead compile the module against them by passing their location with the --rj-include-dir option, for example:

$ python3 setup.py build --rj-include-dir=/usr/include/rapidjson

Performance

python-rapidjson tries to be as performant as possible while staying compatible with the json module.

The following tables compare this module with other libraries on different data sets. The last row (“overall”) is the total time taken by all the benchmarks.

Each number shows the ratio between the time taken by a contender and the time taken by python-rapidjson (in other words, times are normalized against a value of 1.0 for python-rapidjson): the lower the number, the faster the contender.
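
The normalization described here can be sketched in a few lines. The timings and contender names below are hypothetical placeholders, not measured results, and the normalize helper is an illustration rather than the code used by the benchmark scripts.

```python
# Hypothetical timings in seconds; these are NOT measured values.
timings = {
    'rapidjson': 2.0,
    'contender_a': 3.0,
    'contender_b': 1.0,
}


def normalize(timings, baseline='rapidjson'):
    """Divide every timing by the baseline's timing."""
    base = timings[baseline]
    return {name: elapsed / base for name, elapsed in timings.items()}


factors = normalize(timings)

# The baseline is always 1.0; values below 1.0 were faster than the baseline.
for name, factor in sorted(factors.items(), key=lambda item: item[1]):
    print(f'{name}: {factor:.2f}')
```

With these placeholder numbers, contender_b gets a factor of 0.50 (twice as fast as the baseline) and contender_a gets 1.50 (50% slower).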

The winner is shown in bold.

Serialization

serialize	native [1]	ujson [2]	simplejson [3]	stdlib [4]	yajl [5]
100 arrays dict	0.67	1.31	6.28	2.88	1.74
100 dicts array	0.79	1.19	7.16	2.92	1.69
256 Trues array	1.19	1.41	3.02	2.19	1.20
256 ascii array	1.02	0.92	1.90	1.77	2.05
256 doubles array	1.06	7.55	8.30	7.65	4.39
256 unicode array	0.87	0.72	0.82	0.88	0.53
complex object	0.82	1.41	5.17	3.39	2.87
composite object	0.68	0.93	3.01	1.92	1.85
overall	0.67	1.30	6.27	2.88	1.74

[1]	rapidjson with `number_mode=NM_NATIVE`

[2]	ujson 1.35

[3]	simplejson 3.11.1

[4]	Python 3.6 standard library

[5]	yajl 0.3.5

Deserialization

deserialize	native	ujson	simplejson	stdlib	yajl
100 arrays dict	0.90	0.97	1.48	1.25	1.20
100 dicts array	0.88	0.96	1.99	1.58	1.34
256 Trues array	1.22	1.31	2.08	1.93	2.08
256 ascii array	1.05	1.37	1.14	1.25	1.56
256 doubles array	0.16	0.33	0.72	0.70	0.47
256 unicode array	0.89	0.79	4.12	4.50	1.90
complex object	0.72	0.88	1.36	1.28	1.24
composite object	0.83	0.85	1.94	1.43	1.26
overall	0.90	0.97	1.49	1.25	1.20

DIY

To run these tests yourself, clone the repo and run:

$ tox -e py36 -- -m benchmark --compare-other-engines

Without the --compare-other-engines option, the benchmark focuses only on RapidJSON. This is particularly handy when coupled with the compare-past-runs functionality of pytest-benchmark:

$ tox -e py36 -- -m benchmark --benchmark-autosave
# hack, hack, hack!
$ tox -e py36 -- -m benchmark --benchmark-compare=0001

----------------------- benchmark 'deserialize': 18 tests ------------------------
Name (time in us)                                                            Min…
----------------------------------------------------------------------------------
test_loads[rapidjson-256 Trues array] (NOW)                         5.2320 (1.0)…
test_loads[rapidjson-256 Trues array] (0001)                        5.4180 (1.04)…
…

To reproduce the tables above, use the --benchmark-json option so that the results are written to the specified file, then run the benchmark-tables.py script, giving that filename as the only argument:

$ tox -e py36 -- -m benchmark --compare-other-engines --benchmark-json=comparison.json
$ python3 benchmark-tables.py comparison.json

Incompatibility

Here are things that the standard json library supports and that we have decided not to support:

separators argument. This is mostly used for pretty printing and is not supported by RapidJSON, so it isn't a high priority. We do support the indent kwarg, which gets you nice-looking JSON anyway.
Coercing keys when dumping. json will turn the key True into the string 'true' when you dump it, and when you load it back in it will still be a string. We want dump and load to return the exact same objects, so we have decided not to do this coercion.
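
To make the key-coercion point concrete, the sketch below uses only the standard library json module to show how non-string keys come back as strings after a round trip. It demonstrates the stdlib behaviour that python-rapidjson chooses not to replicate; it does not show what python-rapidjson does with such keys.

```python
import json

original = {True: 'yes', 2: 'two'}

# The stdlib silently converts non-string keys to JSON strings.
encoded = json.dumps(original)
print(encoded)  # {"true": "yes", "2": "two"}

# Loading the result yields string keys, not the original bool/int keys.
restored = json.loads(encoded)
print(restored)  # {'true': 'yes', '2': 'two'}

# The round trip therefore does not return an equal object.
print(restored == original)  # False
```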
