Tips and Traps¶

Starting from Python 3.7, dict preserves insertion order (i.e., dict is ordered), so there is no need to use OrderedDict just to keep insertion order in Python 3.7+. However, set in Python is implemented as an unordered hash set and thus is neither ordered nor sorted. A trick to deduplicate the values of an iterable while preserving the order of first occurrences is to leverage dict instead of set.
```
 {v: None for v in values}.keys()
```
This is also a good trick to get reproducible deduplicated results without sorting.
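As a small illustration of the trick above, the sketch below deduplicates a list with a dict comprehension and with `dict.fromkeys`, which builds the same keys in one call. The sample list `values` is made up for this example.

```python
# Sample data made up for illustration.
values = [3, 1, 3, 2, 1]

# Dict keys keep the order of first insertion (Python 3.7+),
# so repeated values collapse onto their first occurrence.
deduped = list({v: None for v in values})

# dict.fromkeys builds the same mapping without an explicit comprehension.
deduped_fromkeys = list(dict.fromkeys(values))

print(deduped)  # [3, 1, 2]
print(deduped_fromkeys)  # [3, 1, 2]
```

Both forms require the values to be hashable, just like set does.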

Construct `dict`¶

Dictionary Comprehension¶

In [1]:

d = {i: i * i for i in range(3)}
d

Out[1]:

{0: 0, 1: 1, 2: 4}

Tuple to Dict¶

You can pass an iterable of pairs (a pair is a tuple with 2 elements) to dict to create a dict object. The first element of each pair is the key and the second element is the corresponding value.

In [3]:

d = dict((row[0], (row[1], row[2])) for row in [("Ben", 1, 2), ("Lisa", 2, 3)])

In [4]:

d

Out[4]:

{'Ben': (1, 2), 'Lisa': (2, 3)}

In [6]:

[k for k in d]

Out[6]:

['Ben', 'Lisa']

In [7]:

d.items()

Out[7]:

dict_items([('Ben', (1, 2)), ('Lisa', (2, 3))])
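Another common source of pairs is `zip`, which pairs up two parallel sequences. The following is a minimal sketch; the names `names` and `scores` are made up for illustration.

```python
# Two parallel sequences (illustrative data).
names = ["Ben", "Lisa"]
scores = [(1, 2), (2, 3)]

# zip yields (name, score) pairs, which dict consumes as (key, value).
d_zip = dict(zip(names, scores))

print(d_zip)  # {'Ben': (1, 2), 'Lisa': (2, 3)}
```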

Passing an iterable of tuples whose lengths differ from 2 causes a ValueError.

In [2]:

dict([("abc", 1, 2)])

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-2-19adc35cd255> in <module>()
      1 dict([
----> 2     ('abc', 1, 2)
      3 ])

ValueError: dictionary update sequence element #0 has length 3; 2 is required

Passing an empty iterable to dict generates an empty dict object.

In [1]:

dict([])

Out[1]:

{}

pandas.Index is a dict-like Object¶

In [8]:

import pandas as pd

df = pd.DataFrame({"x": [1, 2, 3, 4, 5], "y": [5, 4, 3, 2, 1], "z": [1, 1, 1, 1, 1]})

df.head()

In [12]:

df.index.intersection(d.keys())

Out[12]:

Int64Index([0, 1, 2], dtype='int64')

A KeyError exception is raised if the key is not found. A collections.defaultdict does not raise an exception when a key is not found but instead returns (and stores) the default value.

In [3]:

d[3]

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-3-d787ddb7dc0e> in <module>()
----> 1 d[3]

KeyError: 3
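For comparison, here is a minimal sketch of how `collections.defaultdict` behaves on a missing key. The variable name `counts` is made up for illustration. Note that the lookup itself inserts the key, which can be surprising.

```python
from collections import defaultdict

# int() returns 0, so 0 serves as the default value.
counts = defaultdict(int)

# No KeyError is raised; the default value is returned.
value = counts["missing"]
print(value)  # 0

# Side effect: the lookup inserted the key into the dict.
print("missing" in counts)  # True
print(dict(counts))  # {'missing': 0}
```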

get is the safe version: it returns None instead of raising a KeyError when the key is not found. It's equivalent to the following code.

d[3] if 3 in d else None

In [6]:

d.get(3)
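`get` also accepts a second argument that is returned instead of None when the key is missing. A short sketch, using a made-up dict named `squares`:

```python
squares = {0: 0, 1: 1, 2: 4}

# Missing key without a fallback: None is returned.
print(squares.get(3))  # None

# Missing key with a fallback: the fallback is returned.
print(squares.get(3, -1))  # -1

# Existing key: the fallback is ignored.
print(squares.get(2, -1))  # 4

# Unlike a defaultdict lookup, get never inserts the key.
print(3 in squares)  # False
```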

Merge Two Dictionaries¶

In [1]:

x = {"a": 1, "b": 2}

In [2]:

y = {"b": 3, "c": 4}

In [3]:

{**x, **y}

Out[3]:

{'a': 1, 'b': 3, 'c': 4}
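On Python 3.9 and later, the `|` operator gives the same result as the unpacking form above. The sketch below assumes Python 3.9+ and also shows that the right-hand operand wins on duplicate keys.

```python
x = {"a": 1, "b": 2}
y = {"b": 3, "c": 4}

# Union operator (Python 3.9+): values from y override those from x.
merged = x | y
print(merged)  # {'a': 1, 'b': 3, 'c': 4}

# Swapping the operands changes which value wins and the key order.
reverse = y | x
print(reverse)  # {'b': 2, 'c': 4, 'a': 1}

# In-place alternative that works on older versions too.
x_copy = dict(x)
x_copy.update(y)
print(x_copy)  # {'a': 1, 'b': 3, 'c': 4}
```

Neither `x` nor `y` is modified by `x | y`; only `update` (or `|=`) changes a dict in place.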

keys¶

In [9]:

d = {"a": 1, "b": 2}
d.keys()

Out[9]:

dict_keys(['a', 'b'])
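The object returned by `keys` is a view rather than a list: it supports set operations and reflects later changes to the dict. A small sketch, where `other` is a made-up second dict:

```python
d = {"a": 1, "b": 2}
other = {"b": 20, "c": 30}

# Key views support set operations; the results are ordinary sets.
common = d.keys() & other.keys()
only_in_d = d.keys() - other.keys()
print(common)  # {'b'}
print(only_in_d)  # {'a'}

# The view is live: it sees keys added after it was created.
keys = d.keys()
d["z"] = 26
print(list(keys))  # ['a', 'b', 'z']
```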

in¶

In [1]:

d = {"a": 1, "b": 2}
"a" in d

Out[1]:

True

values¶

In [10]:

d = dict((row[0], (row[1], row[2])) for row in [("Ben", 1, 2), ("Lisa", 2, 3)])

In [11]:

d.values()

Out[11]:

dict_values([(1, 2), (2, 3)])

In [12]:

list(v[0] for v in d.values())

Out[12]:

[1, 2]

In [13]:

max(v[0] for v in d.values())

Out[13]:

2

Iterate Dictionary¶

Iterate Keys¶

In [10]:

d = {"a": 1, "b": 2}
for k in d:
    print(str(k) + ": " + str(d[k]))

a: 1
b: 2

Iterate Key/Value Pairs¶

In [12]:

for k, v in d.items():
    print(str(k) + ": " + str(v))

a: 1
b: 2

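You cannot iterate over (key, value) pairs by looping over the dict itself. Iterating a dict yields only its keys, so unpacking each item into two names fails.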

In [13]:

for k, v in d:
    print(str(k))

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-13-21b7a9439b1d> in <module>
----> 1 for k, v in d:
      2     print(str(k))

ValueError: not enough values to unpack (expected 2, got 1)
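The error above is the lucky case. If every key happens to be an iterable of length 2, the unpacking silently succeeds and splits the key itself. The sketch below uses a made-up dict `pairs` with two-character string keys to show this.

```python
pairs = {"ab": 1, "cd": 2}

# Iterating the dict yields keys; each 2-character key is unpacked into k and v.
unpacked = [(k, v) for k, v in pairs]
print(unpacked)  # [('a', 'b'), ('c', 'd')]

# Using items() yields the intended (key, value) pairs.
correct = [(k, v) for k, v in pairs.items()]
print(correct)  # [('ab', 1), ('cd', 2)]
```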

Iterate Values Directly¶

In [14]:

for v in d.values():
    print(v)

1
2

setdefault - Set Default Value for a Key¶

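With the method dict.setdefault, you often do not need defaultdict. As a matter of fact, it is recommended that you use dict over defaultdict for safety reasons: a defaultdict silently inserts a missing key whenever it is looked up with [], while a plain dict still raises a KeyError.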

In [5]:

dic = {"x": 10, "y": 20}
dic

Out[5]:

{'x': 10, 'y': 20}

In [6]:

dic.setdefault("x", 0)
dic["x"] += 1
dic

Out[6]:

{'x': 11, 'y': 20}

In [7]:

dic.setdefault("z", 0)
dic["z"] += 1
dic

Out[7]:

{'x': 11, 'y': 20, 'z': 1}

In [8]:

dic.setdefault("list", [])
dic["list"].append("how")
dic

Out[8]:

{'x': 11, 'y': 20, 'z': 1, 'list': ['how']}
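A typical use of `setdefault` is grouping items into lists. The following is a minimal sketch; the word list and the name `groups` are made up for illustration.

```python
words = ["apple", "avocado", "banana", "blueberry", "cherry"]

# Group words by first letter. setdefault returns the existing list,
# or inserts and returns a new empty list when the key is missing.
groups = {}
for word in words:
    groups.setdefault(word[0], []).append(word)

print(groups)
# {'a': ['apple', 'avocado'], 'b': ['banana', 'blueberry'], 'c': ['cherry']}

# Because groups is a plain dict, an unknown key still raises KeyError.
try:
    groups["z"]
    raised = False
except KeyError:
    raised = True
print(raised)  # True
```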

Remove Elements¶

In [2]:

d = {"a": 1, "b": 2}
d

Out[2]:

{'a': 1, 'b': 2}

In [3]:

del d["a"]
d

Out[3]:

{'b': 2}

In [4]:

del d["non_exist_key"]

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-4-8875b36fb7df> in <module>
----> 1 del d['non_exist_key']

KeyError: 'non_exist_key'

In [5]:

d.pop("non_exist_key", None)
d

Out[5]:

{'b': 2}
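Unlike `del`, `pop` returns the removed value, and it only raises a KeyError when no default is given. A short sketch of the three cases:

```python
d = {"a": 1, "b": 2}

# pop removes the key and returns its value.
removed = d.pop("a")
print(removed)  # 1

# With a default, a missing key returns the default instead of raising.
missing = d.pop("non_exist_key", None)
print(missing)  # None

# Without a default, a missing key raises KeyError, just like del.
try:
    d.pop("non_exist_key")
    error_key = None
except KeyError as err:
    error_key = err.args[0]
print(error_key)  # non_exist_key

print(d)  # {'b': 2}
```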

Dedup List and Preserve Order¶

Converting a list to a set may change the element order.

In [5]:

words = [
    "how",
    "are",
    "how",
    "are",
    "how",
    "you",
    "are",
    "how",
    "you",
    "are",
    "you",
    "how",
]

You can use set or numpy.unique to deduplicate a list, but neither preserves the order of first occurrence of the elements. A set has no defined order, and numpy.unique returns the unique elements sorted.

In [27]:

list(set(words))

Out[27]:

['how', 'you', 'are']

In [29]:

import numpy as np

np.unique(words)

Out[29]:

array(['are', 'how', 'you'], dtype='<U3')

One possible way to deduplicate while preserving the order of first occurrences of the elements is to use a dictionary (which preserves insertion order).

In [17]:

" ".join({word: None for word in words})

Out[17]:

'how are you'

Sort a Dict¶

Sort a dict object by its keys.

In [1]:

dic = {"how": 2, "are": 4, "you": 3, "doing": 1, "today": 0}
sorted(dic)

Out[1]:

['are', 'doing', 'how', 'today', 'you']

In [2]:

sorted(dic.items())

Out[2]:

[('are', 4), ('doing', 1), ('how', 2), ('today', 0), ('you', 3)]

Sort a dict object by its values.

In [3]:

x = {"how": 2, "are": 4, "you": 3, "doing": 1, "today": 0}
dict(sorted(x.items(), key=lambda item: item[1]))

Out[3]:

{'today': 0, 'doing': 1, 'how': 2, 'you': 3, 'are': 4}
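The same pattern extends to descending order with `reverse=True`, and `sorted(x, key=x.get)` returns just the keys ordered by value. A short sketch reusing the dict from above:

```python
x = {"how": 2, "are": 4, "you": 3, "doing": 1, "today": 0}

# Sort by value, largest first, and rebuild a dict in that order.
by_value_desc = dict(sorted(x.items(), key=lambda item: item[1], reverse=True))
print(by_value_desc)
# {'are': 4, 'you': 3, 'how': 2, 'doing': 1, 'today': 0}

# Only the keys, ordered by their values (smallest first).
keys_by_value = sorted(x, key=x.get)
print(keys_by_value)  # ['today', 'doing', 'how', 'you', 'are']
```

The rebuilt dict keeps the sorted order only because dict preserves insertion order (Python 3.7+).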

Ref vs Copy¶

In [18]:

d = {"Ben": [1, 2], "Lisa": [2, 3]}

In [20]:

ben = d["Ben"]
ben

Out[20]:

[1, 2]

In [21]:

ben[0] = 10000

In [22]:

ben

Out[22]:

[10000, 2]

In [23]:

d

Out[23]:

{'Ben': [10000, 2], 'Lisa': [2, 3]}

In [25]:

d = {"Ben": 1, "Lisa": 2}

In [26]:

ben = d["Ben"]

In [27]:

ben

Out[27]:

1

In [28]:

ben = 10000

In [29]:

ben

Out[29]:

10000

In [30]:

d

Out[30]:

{'Ben': 1, 'Lisa': 2}

In [31]:

x = 1

In [32]:

x += 10

In [33]:

x

Out[33]:

11
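The cells above show that indexing a dict returns a reference to the stored object: mutating a list obtained from the dict changes the dict, while rebinding a name to a new integer does not. When an independent copy is needed, the standard library offers `dict.copy` (shallow) and `copy.deepcopy` (deep). A minimal sketch of the difference, with made-up variable names `shallow` and `deep`:

```python
import copy

d = {"Ben": [1, 2], "Lisa": [2, 3]}

# A shallow copy creates a new dict that shares the same list objects.
shallow = d.copy()

# A deep copy also copies the nested lists.
deep = copy.deepcopy(d)

# Mutate a list through the original dict.
d["Ben"][0] = 10000

print(shallow["Ben"])  # [10000, 2]  (shared list, so the change is visible)
print(deep["Ben"])  # [1, 2]  (independent list, unchanged)
```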

Ben Chuanlong Du's Blog

And let it direct your passion with reason.

Hands on dict in Python
