Hands on dict in Python

Tips and Traps¶

Starting from Python 3.7, dict preserves insertion order (i.e., dict is ordered), so in most cases there is no need to use OrderedDict in Python 3.7+. However, set in Python is implemented as an unordered hash set and is therefore neither ordered nor sorted. A trick to deduplicate the values of an iterable while preserving the order of first occurrences is to use dict instead of set.
```
{v: None for v in values}.keys()
```
This is also a good way to get reproducible deduplicated results without sorting.
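As a minimal sketch of the trick above, dict.fromkeys builds the same kind of dict without a comprehension. The helper name `dedup_keep_order` and the sample list are made up for illustration, and the items are assumed to be hashable.

```python
def dedup_keep_order(values):
    """Return the distinct items of `values` in order of first occurrence."""
    # dict.fromkeys maps every item to None; duplicate keys collapse,
    # and the dict keeps the position of each item's first occurrence.
    return list(dict.fromkeys(values))


values = [3, 1, 3, 2, 1]  # hypothetical sample data
deduped = dedup_keep_order(values)
print(deduped)  # [3, 1, 2]
```

Unlike list(set(values)), the result is the same on every run and does not depend on hash ordering.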

Construct `dict`¶

Dictionary Comprehension¶

d = {i: i * i for i in range(3)}
d

{0: 0, 1: 1, 2: 4}

Tuple to Dict¶

You can pass an iterable of pairs (a pair is a tuple with 2 elements) to dict to create a dict object. The first element of each pair becomes a key and the second element becomes the corresponding value.

d = dict((row[0], (row[1], row[2])) for row in [("Ben", 1, 2), ("Lisa", 2, 3)])
d

{'Ben': (1, 2), 'Lisa': (2, 3)}

[k for k in d]

['Ben', 'Lisa']

d.items()

dict_items([('Ben', (1, 2)), ('Lisa', (2, 3))])

Passing an iterable of tuples whose length differs from 2 raises a ValueError.

dict([("abc", 1, 2)])

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-2-19adc35cd255> in <module>()
      1 dict([
----> 2     ('abc', 1, 2)
      3 ])

ValueError: dictionary update sequence element #0 has length 3; 2 is required

Passing an empty iterable to dict generates an empty dict object.

dict([])

{}
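A common source of such pairs is zip, which pairs up two aligned sequences. The following is a small sketch; the variable names and data are illustrative only.

```python
names = ["Ben", "Lisa"]      # hypothetical keys
scores = [(1, 2), (2, 3)]    # hypothetical values, aligned with names

# zip yields ("Ben", (1, 2)) and ("Lisa", (2, 3)): exactly the pairs dict expects.
paired = dict(zip(names, scores))
print(paired)  # {'Ben': (1, 2), 'Lisa': (2, 3)}
```

Note that zip stops at the shorter input, so extra keys or values are silently dropped.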

pandas.Index is a dict-like Object¶

import pandas as pd

df = pd.DataFrame({"x": [1, 2, 3, 4, 5], "y": [5, 4, 3, 2, 1], "z": [1, 1, 1, 1, 1]})

df.head()

# d is the dict with integer keys from the comprehension example
d = {i: i * i for i in range(3)}
df.index.intersection(d.keys())

Int64Index([0, 1, 2], dtype='int64')

A KeyError is raised if the key is not found. A collections.defaultdict does not raise an exception when a key is not found; instead, it inserts the key with the default value and returns that value.

d[3]

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-3-d787ddb7dc0e> in <module>()
----> 1 d[3]

KeyError: 3
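For contrast, here is a short sketch of how a defaultdict behaves on the same lookup. The names `plain` and `with_default` are chosen for this example only.

```python
from collections import defaultdict

plain = {0: 0, 1: 1, 2: 4}
with_default = defaultdict(int, plain)  # int() supplies 0 for missing keys

try:
    plain[3]
except KeyError as err:
    print("plain dict raised KeyError:", err)

print(with_default[3])    # 0, the default value
print(3 in with_default)  # True: the lookup itself inserted the key
print(3 in plain)         # False: the plain dict is unchanged
```

The silent insertion on lookup is the main behavioral difference to keep in mind when choosing between the two.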

get is the safe version: it returns None instead of raising an exception when the key is missing. d.get(3) is equivalent to the following code.

d[3] if 3 in d else None

d.get(3)
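get also accepts an optional second argument that is returned in place of None. A minimal sketch, with -1 as an arbitrary fallback picked for illustration:

```python
d = {0: 0, 1: 1, 2: 4}

found = d.get(1)         # the key exists, so its value is returned
missing = d.get(3)       # the key is missing, so None is returned
fallback = d.get(3, -1)  # the key is missing, so the given fallback is returned

print(found, missing, fallback)  # 1 None -1
print(3 in d)                    # False: get never inserts the key
```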

Merge Two Dictionaries¶

x = {"a": 1, "b": 2}

y = {"b": 3, "c": 4}

{**x, **y}

{'a': 1, 'b': 3, 'c': 4}
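When keys overlap, the value from the right-hand dict wins, as with "b" above. An equivalent spelling that works on older Python versions is to copy and then update, sketched below. On Python 3.9+ the expression x | y gives the same result.

```python
x = {"a": 1, "b": 2}
y = {"b": 3, "c": 4}

# Copy first so that x itself is left untouched, then overlay y on the copy.
merged = dict(x)
merged.update(y)

print(merged)  # {'a': 1, 'b': 3, 'c': 4}
print(x)       # {'a': 1, 'b': 2}
```

Calling x.update(y) directly would also work, but it modifies x in place.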

keys¶

d = {"a": 1, "b": 2}
d.keys()

dict_keys(['a', 'b'])
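The object returned by keys is a set-like view, so it supports set operations without an explicit conversion. The two dicts below are hypothetical examples.

```python
x = {"a": 1, "b": 2}
y = {"b": 3, "c": 4}

common = x.keys() & y.keys()  # keys present in both dicts
only_x = x.keys() - y.keys()  # keys present only in x

print(common)  # {'b'}
print(only_x)  # {'a'}
```

The results are ordinary set objects, so they do not preserve the insertion order of the dicts.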

in¶

d = {"a": 1, "b": 2}
"a" in d

True

values¶

d = dict((row[0], (row[1], row[2])) for row in [("Ben", 1, 2), ("Lisa", 2, 3)])

d.values()

dict_values([(1, 2), (2, 3)])

list(v[0] for v in d.values())

[1, 2]

max(v[0] for v in d.values())

2

Iterate Dictionary¶

Iterate Keys¶

d = {"a": 1, "b": 2}
for k in d:
    print(str(k) + ": " + str(d[k]))

a: 1
b: 2

Iterate Key/Value Pairs¶

for k, v in d.items():
    print(str(k) + ": " + str(v))

a: 1
b: 2

You cannot unpack (key, value) pairs by iterating over the dict directly, because iterating over a dict yields only its keys.

for k, v in d:
    print(str(k))

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-13-21b7a9439b1d> in <module>
----> 1 for k, v in d:
      2     print(str(k))

ValueError: not enough values to unpack (expected 2, got 1)

Iterate Values Directly¶

for v in d.values():
    print(v)

1
2

setdefault - Set Default Value for a Key¶

With the method dict.setdefault, you often do not need defaultdict. A plain dict is arguably the safer choice, because looking up a missing key in a defaultdict silently inserts that key, which can hide bugs.

dic = {"x": 10, "y": 20}
dic

{'x': 10, 'y': 20}

dic.setdefault("x", 0)
dic["x"] += 1
dic

{'x': 11, 'y': 20}

dic.setdefault("z", 0)
dic["z"] += 1
dic

{'x': 11, 'y': 20, 'z': 1}

dic.setdefault("list", [])
dic["list"].append("how")
dic

{'x': 11, 'y': 20, 'z': 1, 'list': ['how']}
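Since setdefault returns the value stored under the key, the two steps above can be chained into one expression. The sketch below uses that to group words by their first letter; the word list is made up for illustration.

```python
words = ["how", "are", "you", "hello", "apple"]  # hypothetical sample data

groups = {}
for word in words:
    # setdefault returns the existing list for this letter, or inserts
    # and returns a new empty list the first time the letter is seen.
    groups.setdefault(word[0], []).append(word)

print(groups)  # {'h': ['how', 'hello'], 'a': ['are', 'apple'], 'y': ['you']}
```

One caveat: the default argument is evaluated on every call, even when the key already exists, so avoid passing an expensive expression as the default.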

Remove Elements¶

d = {"a": 1, "b": 2}
d

{'a': 1, 'b': 2}

del d["a"]
d

{'b': 2}

del d["non_exist_key"]

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-4-8875b36fb7df> in <module>
----> 1 del d['non_exist_key']

KeyError: 'non_exist_key'

d.pop("non_exist_key", None)
d

{'b': 2}
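Unlike del, pop also returns the removed value, and the optional second argument is what makes it safe for missing keys. A short sketch:

```python
d = {"a": 1, "b": 2}

value = d.pop("a")          # removes "a" and returns its value
missing = d.pop("a", None)  # "a" is already gone, so the default is returned

print(value)    # 1
print(missing)  # None
print(d)        # {'b': 2}
```

Without the second argument, d.pop("a") on a missing key raises a KeyError, just like del.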

Dedup List and Preserve Order¶

Converting a list to a set may change the order of its elements.

words = [
    "how",
    "are",
    "how",
    "are",
    "how",
    "you",
    "are",
    "how",
    "you",
    "are",
    "you",
    "how",
]

You can use set or numpy.unique to deduplicate a list, but neither preserves the order of first occurrence of the elements: a set has no defined order, and numpy.unique returns the unique elements sorted.

list(set(words))

['how', 'you', 'are']

import numpy as np

np.unique(words)

array(['are', 'how', 'you'], dtype='<U3')

One way to deduplicate while preserving the order of first occurrences is to use a dictionary (which preserves insertion order).

" ".join({word: None for word in words})

'how are you'

Sort a Dict¶

Sort a dict object by its keys.

dic = {"how": 2, "are": 4, "you": 3, "doing": 1, "today": 0}
sorted(dic)

['are', 'doing', 'how', 'today', 'you']

sorted(dic.items())

[('are', 4), ('doing', 1), ('how', 2), ('today', 0), ('you', 3)]

Sort a dict object by its values.

x = {"how": 2, "are": 4, "you": 3, "doing": 1, "today": 0}
dict(sorted(x.items(), key=lambda item: item[1]))

{'today': 0, 'doing': 1, 'how': 2, 'you': 3, 'are': 4}
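The same pattern sorts in descending order with reverse=True. The sketch below also swaps the lambda for operator.itemgetter, which is a stylistic choice and not a requirement.

```python
from operator import itemgetter

x = {"how": 2, "are": 4, "you": 3, "doing": 1, "today": 0}

# itemgetter(1) picks the value out of each (key, value) pair.
by_value_desc = dict(sorted(x.items(), key=itemgetter(1), reverse=True))
print(by_value_desc)  # {'are': 4, 'you': 3, 'how': 2, 'doing': 1, 'today': 0}
```

Rebuilding a dict from the sorted pairs works only because dict preserves insertion order (Python 3.7+).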

Ref vs Copy¶

d = {"Ben": [1, 2], "Lisa": [2, 3]}

ben = d["Ben"]
ben