Things on this page are fragmentary and immature notes/thoughts of the author. Please read with your own judgement!
Comment¶
You can specify (via left_on and right_on) which columns to join on in each data frame.
Columns that appear in both data frames but are not used in the join are distinguished using suffixes.
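To make the suffix behavior concrete, here is a minimal sketch. The frames `left` and `right` and their column names are hypothetical, chosen only so that a column (`v`) appears on both sides without being a join key.

```python
import pandas as pd

# Hypothetical frames: "v" exists in both but is not used as a join key.
left = pd.DataFrame({"id": [1, 2, 3], "v": [5, 4, 3]})
right = pd.DataFrame({"key": [1, 2, 3], "v": ["a", "b", "c"]})

# With no suffixes argument, pandas appends "_x" (left) and "_y" (right).
default_suffix = left.merge(right, left_on="id", right_on="key")
print(default_suffix.columns.tolist())  # ['id', 'v_x', 'key', 'v_y']

# The suffixes parameter lets you pick more descriptive names.
custom_suffix = left.merge(
    right, left_on="id", right_on="key", suffixes=("_left", "_right")
)
print(custom_suffix.columns.tolist())  # ['id', 'v_left', 'key', 'v_right']
```

Explicit suffixes tend to be easier to read later than the default `_x`/`_y`, especially after several merges in a row.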
import pandas as pd
df1 = pd.DataFrame({"x": [1, 2, 3], "y": [5, 4, 3]})
print(df1)
df2 = pd.DataFrame({"x": [10, 20, 30], "z": ["a", "b", "c"]})
print(df2)
   x  y
0  1  5
1  2  4
2  3  3
    x  z
0  10  a
1  20  b
2  30  c
Default Join¶
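Columns that appear in both data frames (x in this case) are used for joining. Note that merge performs an inner join by default, and the x values of df1 and df2 do not overlap here, so the result is empty.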
df1.merge(df2)
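As a quick check of the default behavior, the sketch below rebuilds the same two frames and compares the default (inner) join with an outer join. The `how="outer"` variant is not part of the original notes; it is included only to show where the rows go.

```python
import pandas as pd

df1 = pd.DataFrame({"x": [1, 2, 3], "y": [5, 4, 3]})
df2 = pd.DataFrame({"x": [10, 20, 30], "z": ["a", "b", "c"]})

# Default: join on the shared column "x" with how="inner".
inner = df1.merge(df2)
print(inner.columns.tolist())  # ['x', 'y', 'z']
print(len(inner))              # 0, because no x value occurs in both frames

# An outer join keeps every row from both sides and fills the gaps with NaN.
outer = df1.merge(df2, how="outer")
print(len(outer))              # 6
```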
Join on Index¶
df1.merge(df2, left_index=True, right_index=True)
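When joining on the index, the shared column x is no longer a key, so it shows up twice with suffixes. The sketch below illustrates this and, as an aside that is not in the original notes, shows `DataFrame.join`, which aligns on the index by default but requires explicit suffixes for overlapping columns.

```python
import pandas as pd

df1 = pd.DataFrame({"x": [1, 2, 3], "y": [5, 4, 3]})
df2 = pd.DataFrame({"x": [10, 20, 30], "z": ["a", "b", "c"]})

# Rows are matched by index label; "x" is an ordinary overlapping column here.
by_index = df1.merge(df2, left_index=True, right_index=True)
print(by_index.columns.tolist())  # ['x_x', 'y', 'x_y', 'z']

# DataFrame.join also matches on the index (left join by default).
# Overlapping column names must be given suffixes explicitly.
joined = df1.join(df2, lsuffix="_x", rsuffix="_y")
print(joined.columns.tolist())    # ['x_x', 'y', 'x_y', 'z']
```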
Join on Specified Columns¶
import pandas as pd
df1 = pd.DataFrame({"id": [1, 2, 3], "v": [5, 4, 3]})
print(df1)
df2 = pd.DataFrame({"x": [1, 2, 3], "y": ["a", "b", "c"]})
print(df2)
   id  v
0   1  5
1   2  4
2   3  3
   x  y
0  1  a
1  2  b
2  3  c
df1.merge(df2, left_on="id", right_on="x")
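One thing worth noting about left_on/right_on is that both key columns are kept in the result, even though they hold the same values. The sketch below shows two possible ways to end up with a single key column; the variable names `merged`, `deduped` and `renamed` are illustrative only.

```python
import pandas as pd

df1 = pd.DataFrame({"id": [1, 2, 3], "v": [5, 4, 3]})
df2 = pd.DataFrame({"x": [1, 2, 3], "y": ["a", "b", "c"]})

# Both key columns ("id" and "x") survive the merge.
merged = df1.merge(df2, left_on="id", right_on="x")
print(merged.columns.tolist())   # ['id', 'v', 'x', 'y']

# Option 1: drop the redundant key afterwards.
deduped = merged.drop(columns="x")
print(deduped.columns.tolist())  # ['id', 'v', 'y']

# Option 2: rename the right key first so a plain on="id" works.
renamed = df1.merge(df2.rename(columns={"x": "id"}), on="id")
print(renamed.columns.tolist())  # ['id', 'v', 'y']
```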
Cartesian/Cross Join¶
import pandas as pd
df1 = pd.DataFrame({"id": [1, 2], "v": [5, 4]})
df1
df2 = pd.DataFrame({"x": [10, 20], "y": ["a", "b"]})
df2
df1.assign(key=1).merge(df2.assign(key=1))
df1.assign(key=1).merge(df2.assign(key=1)).drop("key", axis=1)
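The constant-key trick above works on any pandas version. As an alternative sketch, assuming pandas 1.2 or later, `merge` accepts `how="cross"` directly, which avoids creating and dropping the temporary column.

```python
import pandas as pd

df1 = pd.DataFrame({"id": [1, 2], "v": [5, 4]})
df2 = pd.DataFrame({"x": [10, 20], "y": ["a", "b"]})

# Assumes pandas >= 1.2, where how="cross" was introduced.
cross = df1.merge(df2, how="cross")
print(cross.shape)  # (4, 4): every row of df1 paired with every row of df2

# The dummy-key approach from the notes, for comparison.
via_key = df1.assign(key=1).merge(df2.assign(key=1)).drop("key", axis=1)
print(cross.values.tolist() == via_key.values.tolist())  # True
```

Note that no join keys (`on`, `left_on`, `right_on`) may be passed together with `how="cross"`.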