
Co-occurrence Matrix From Pandas DataFrame

Problem: I have a pandas DataFrame, and I need to count, for each pair of unique entries, how many rows contain both of them. Related but d

Solution 1:

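We can stack the DataFrame, one-hot encode the values with get_dummies, collapse the result back to one indicator row per original row, take the dot product of that indicator matrix with itself, and then blank out the diagonal: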

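import numpy as np

# 0/1 indicator per original row for each unique value (df is the question's DataFrame)
s = df.stack().str.get_dummies().groupby(level=0).sum().ne(0).astype(int)
# entry (i, j) = number of rows that contain both value i and value j
s = s.T.dot(s).astype(float)
# the diagonal holds each value's own row count; set it to NaN
s = s.mask(np.eye(len(s), dtype=bool))
s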
Out[33]: 
     A    B    C    D
A  NaN  2.0  2.0  1.0
B  2.0  NaN  2.0  0.0
C  2.0  2.0  NaN  1.0
D  1.0  0.0  1.0  NaN
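
To try the approach end to end, here is a minimal, self-contained sketch. The sample DataFrame and the helper name `cooccurrence_matrix` are made up for illustration and are not the data from the question, so the numbers differ from the output above. The third sample row repeats a value on purpose, to show why the counts are reduced to 0/1 before the dot product.

```python
import numpy as np
import pandas as pd


def cooccurrence_matrix(df):
    """Count, for each pair of values, the rows in which both appear.

    Hypothetical helper: assumes every cell of df is a string (or NaN).
    """
    # Long format: one entry per (row, column) cell.
    stacked = df.stack()
    # One-hot encode each cell, then add the cells of each original row.
    per_row = stacked.str.get_dummies().groupby(level=0).sum()
    # A value repeated within a row should count once, so reduce to 0/1.
    indicator = per_row.ne(0).astype(int)
    # indicator.T @ indicator: entry (i, j) is the number of shared rows.
    counts = indicator.T.dot(indicator).astype(float)
    # The diagonal is each value's own row count; replace it with NaN.
    return counts.mask(np.eye(len(counts), dtype=bool))


# Hypothetical sample data; the last row contains "B" twice.
sample = pd.DataFrame(
    {
        "x": ["A", "A", "B"],
        "y": ["B", "C", "B"],
        "z": ["C", "D", "C"],
    }
)

result = cooccurrence_matrix(sample)
print(result)
```

In this sample, A and C appear together in two rows, while B and D never share a row, so those entries come out as 2.0 and 0.0 respectively.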
