Ben Chuanlong Du's Blog

Hands on the json Module in Python

Tips and Traps

  1. JSON is a poor default for serializing and deserializing data; see Shortcomings of JSON for a detailed discussion. TOML and YAML are better text-based alternatives to JSON. If serialization and deserialization are done in Python only, pickle is preferred.

  2. Even if you do want to use JSON in Python, the json module in the standard library is not necessarily the best choice. See JSON Parsing Libraries in Python for more discussion.

  3. The json library neither raises an error nor gives any warning when the same key appears more than once in a JSON file; the value of the last occurrence silently wins. This is error-prone when handling large JSON configuration files.
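
The duplicate-key trap in item 3 can be reproduced, and guarded against, with the object_pairs_hook parameter of json.loads. The following is a minimal sketch; the helper name reject_duplicates is hypothetical and not part of the json module.

```python
import json

text = '{"x": 1, "x": 2}'

# By default the value of the last duplicate key silently wins.
silent = json.loads(text)


def reject_duplicates(pairs):
    """Hypothetical hook: build a dict but raise on a repeated key."""
    result = {}
    for key, value in pairs:
        if key in result:
            raise ValueError(f"duplicate key: {key!r}")
        result[key] = value
    return result


# object_pairs_hook receives each JSON object as a list of (key, value) pairs.
try:
    json.loads(text, object_pairs_hook=reject_duplicates)
    duplicate_error = None
except ValueError as err:
    duplicate_error = str(err)

print(silent)  # {'x': 2}
print(duplicate_error)  # duplicate key: 'x'
```

The hook is called for every JSON object in the document, so nested objects are checked as well.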

In [1]:
import json
In [3]:
json.load(open("data.json"))
Out[3]:
{'x': 2}

json.dumps

Serialize a Python object to a JSON-formatted string.
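
As a small illustration (the record below is made-up sample data), json.dumps returns a str, and json.loads turns it back into Python objects:

```python
import json

# Made-up sample data for illustration.
record = {"name": "example", "values": [1, 2.5, True, None]}

text = json.dumps(record)
print(text)  # {"name": "example", "values": [1, 2.5, true, null]}

# json.loads reverses the conversion for these basic types.
restored = json.loads(text)
print(restored == record)  # True
```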

json.dump

Serialize obj as a JSON-formatted stream to fp (a file-like object that supports .write()), following the conversion table in the documentation of the json module.
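
A minimal sketch of json.dump follows. It writes to an in-memory io.StringIO buffer, which is an assumption made to keep the example self-contained; a file opened with open(path, "w") works the same way.

```python
import io
import json

# Any object with a .write() method works; StringIO stands in for a real file.
buffer = io.StringIO()
json.dump({"x": 2}, buffer)

written = buffer.getvalue()
print(written)  # {"x": 2}

# Rewind and read the data back with json.load.
buffer.seek(0)
loaded = json.load(buffer)
print(loaded)  # {'x': 2}
```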

Formatting

In [3]:
json.dumps(json.loads("[{}]"), indent=4, sort_keys=True)
Out[3]:
'[\n    {}\n]'
In [4]:
json.dumps(json.loads(" [{ }]\n"), indent=4, sort_keys=True)
Out[4]:
'[\n    {}\n]'

Serializable Types in JSON Format

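  • int, float
  • str
  • bool, None
  • list, tuple (a tuple is written as a JSON array)
  • dict (keys must be str, int, float, bool or None)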
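
A quick round trip shows how basic types behave. The sample dictionary is made up for illustration; note that a tuple comes back as a list and a non-string key comes back as a string, so the round trip is not always lossless.

```python
import json

# Made-up sample data: int, float and str values, plus an int key and a tuple.
original = {"count": 3, "ratio": 0.5, "name": "abc", 1: ("a", "b")}

restored = json.loads(json.dumps(original))
print(restored)
# {'count': 3, 'ratio': 0.5, 'name': 'abc', '1': ['a', 'b']}

# The int key became the str "1" and the tuple became a list.
print(restored == original)  # False
```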

Non-serializable Types

JSON is not a good serialization format for arbitrary Python objects, as many of them cannot be serialized to JSON. For Python-only use, pickle is a much better alternative for serializing objects.

datetime.datetime, etc.

datetime.datetime, datetime.date and datetime.timedelta objects are not JSON serializable. A possible workaround is to store a timestamp (a float) instead.
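
The sketch below shows two possible workarounds: storing a POSIX timestamp, and passing a fallback function through the default parameter of json.dumps. The helper name encode_extra and the choice of ISO 8601 strings and seconds are assumptions made for illustration, not the only reasonable encoding.

```python
import datetime
import json

moment = datetime.datetime(2020, 1, 2, 3, 4, 5, tzinfo=datetime.timezone.utc)

# Option 1: store a POSIX timestamp, which is a plain float.
as_timestamp = json.dumps({"time": moment.timestamp()})


# Option 2: json.dumps calls default= for objects it cannot serialize.
def encode_extra(obj):
    """Hypothetical fallback: ISO strings for dates, seconds for timedeltas."""
    if isinstance(obj, (datetime.datetime, datetime.date)):
        return obj.isoformat()
    if isinstance(obj, datetime.timedelta):
        return obj.total_seconds()
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")


as_iso = json.dumps(
    {"time": moment, "delta": datetime.timedelta(minutes=1)},
    default=encode_extra,
)
print(as_iso)  # {"time": "2020-01-02T03:04:05+00:00", "delta": 60.0}
```

Either way, the reader of the JSON has to know which fields hold dates, since JSON itself carries no type information for them.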

A dataclass is not JSON serializable by default.

In [1]:
from dataclasses import dataclass


@dataclass
class Query:
    query: str = ""
    timestamp: float = 0.0
    table: str = ""
In [5]:
json.dumps([Query()])
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-5-dfcdf91d85a9> in <module>
----> 1 json.dumps([Query()])

/usr/local/Cellar/python@3.8/3.8.5/Frameworks/Python.framework/Versions/3.8/lib/python3.8/json/__init__.py in dumps(obj, skipkeys, ensure_ascii, check_circular, allow_nan, cls, indent, separators, default, sort_keys, **kw)
    229         cls is None and indent is None and separators is None and
    230         default is None and not sort_keys and not kw):
--> 231         return _default_encoder.encode(obj)
    232     if cls is None:
    233         cls = JSONEncoder

/usr/local/Cellar/python@3.8/3.8.5/Frameworks/Python.framework/Versions/3.8/lib/python3.8/json/encoder.py in encode(self, o)
    197         # exceptions aren't as detailed.  The list call should be roughly
    198         # equivalent to the PySequence_Fast that ''.join() would do.
--> 199         chunks = self.iterencode(o, _one_shot=True)
    200         if not isinstance(chunks, (list, tuple)):
    201             chunks = list(chunks)

/usr/local/Cellar/python@3.8/3.8.5/Frameworks/Python.framework/Versions/3.8/lib/python3.8/json/encoder.py in iterencode(self, o, _one_shot)
    255                 self.key_separator, self.item_separator, self.sort_keys,
    256                 self.skipkeys, _one_shot)
--> 257         return _iterencode(o, 0)
    258 
    259 def _make_iterencode(markers, _default, _encoder, _indent, _floatstr,

/usr/local/Cellar/python@3.8/3.8.5/Frameworks/Python.framework/Versions/3.8/lib/python3.8/json/encoder.py in default(self, o)
    177 
    178         """
--> 179         raise TypeError(f'Object of type {o.__class__.__name__} '
    180                         f'is not JSON serializable')
    181 

TypeError: Object of type Query is not JSON serializable
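
One possible workaround, sketched below, is to convert the dataclass instance to a dict with dataclasses.asdict before serializing. The helper name encode_dataclass and the sample query string are hypothetical; the Query class is repeated so that the example is self-contained.

```python
import json
from dataclasses import asdict, dataclass, is_dataclass


@dataclass
class Query:
    query: str = ""
    timestamp: float = 0.0
    table: str = ""


def encode_dataclass(obj):
    """Hypothetical fallback for json.dumps: turn dataclass instances into dicts."""
    # is_dataclass is also true for the class itself, so exclude types.
    if is_dataclass(obj) and not isinstance(obj, type):
        return asdict(obj)
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")


text = json.dumps([Query(query="select 1")], default=encode_dataclass)
print(text)  # [{"query": "select 1", "timestamp": 0.0, "table": ""}]
```

Loading this JSON back yields plain dicts rather than Query objects; Query(**item) can rebuild an instance when the fields match.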