pd.merge(left,
left_index=False, right_index=False, sort=True)```
```merge``` and ```update``` have a lot in common.
[There's a whole family of methods.](https://pandas.pydata.org/pandas-docs/stable/merging.html)
I can understand where a busy IT worker, with the job of helping numerous IT workers, might get frustrated with the redundancy of it all. How many times must we reinvent the same wheels? That's the skeptic's view.
The good news the same as the bad news: redundancy means generic fluency is possible. Some background in SQL helps with pandas. Some background in pandas helps with SQL.
import pandas as pd
import numpy as np
dfA = pd.DataFrame({'A':[1,1,1,1,1,1,1,1],
'B':[2,2,2,2,2,2,2,2]})
dfA
dfB = pd.DataFrame({'C':[3,3,3,3,3,3,3,3],
'D':[4,4,4,4,4,4,4,4]})
You get to choose which columns, left and right, serve as "gear teeth" for synchronizing rows (sewing them together). Or choose the index, not a column.
In the expression below, we go with the one synchronizing element: the index, on both input tables.
pd.merge(dfA, dfB, left_index=True, right_index=True)
import string
dfA.index = list(string.ascii_lowercase[:8]) # new index, of letters instead
dfA
dfB.index = list(string.ascii_lowercase[5:8+5]) # overlapping letters
dfB
pd.merge(dfA, dfB, left_index=True, right_index=True) # intersection, not the union
pd.merge(dfA, dfB, left_index=True, right_index=True, how="left") # left side governs
pd.merge(dfA, dfB, left_index=True, right_index=True, how="right") # right side governs
pd.merge(dfA, dfB, left_index=True, right_index=True, how="outer") # the full union
pd.merge(dfA, dfB, left_index=True, right_index=True, how="inner") # same as intersection
dfA = pd.DataFrame({'A':[1,2,3,4,5,6,7,8],
'B':[2,2,2,2,2,2,2,2],
'key':['dog', 'pig', 'rooster', 'monkey',
'hen', 'cat', 'slug', 'human']})
from numpy.random import shuffle
keys = dfA.key.values.copy() # copy or dfA key will also reorder
shuffle(keys) # in place
dfB = pd.DataFrame({'C':[1,2,3,4,5,6,7,8],
'D':[4,4,4,4,4,4,4,4],
'key': keys})
keys
dfA
dfB
pd.merge(dfA, dfB, on='key') # like "zipping together" on a common column
dfB.rename({"C":"A", "D":"B"}, axis=1, inplace = True)
dfB
dfA
pd.merge(dfA, dfB, on='key', sort=False) # sort on key if sort is True
pd.merge(dfA, dfB, on='key', sort=True) # sort on key if sort is True
pd.merge(dfA, dfB, on='A')
pd.merge(dfA, dfB, left_index=True, right_on="A")
merged = pd.merge(dfA, dfB, left_index=True, right_on="A")
merged.to_json("merged.json") # save for later
How might one simply swap the axes? That's a hint.