Co-occurrence Matrix From Pandas DataFrame
Problem: I have a pandas DataFrame, and I need to count, for each pair of unique entries in the DataFrame, how many rows contain both entries. Related but d
Solution 1:
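We can use stack, then str.get_dummies to build a 0/1 indicator per row and unique entry, then dot to count the pairs, and finally blank out the diagonal through the underlying values: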
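import numpy as np
# 0/1 indicator per row and unique entry (sum(level=0) was removed in pandas 2.0)
s = df.stack().str.get_dummies().groupby(level=0).sum().ne(0).astype(int)
# pairwise co-occurrence counts
s = s.T.dot(s).astype(float)
# blank out the diagonal (an entry paired with itself)
np.fill_diagonal(s.values, np.nan)
s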
Out[33]:
     A    B    C    D
A  NaN  2.0  2.0  1.0
B  2.0  NaN  2.0  0.0
C  2.0  2.0  NaN  1.0
D  1.0  0.0  1.0  NaN
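The question's input frame is not shown here, so the following is only a self-contained sketch of the same stack / get_dummies / dot idea on made-up data. The frame `df`, its column names `c1` to `c3`, and the helper `cooccurrence` are all hypothetical. The sketch assumes every cell is a string, and it fills the diagonal on a copied NumPy array instead of writing through `s.values`, which may not be writable in every pandas version.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data (not the frame from the original question).
# Rows 1 and 2 repeat a value on purpose to show that duplicates
# within a row are counted only once.
df = pd.DataFrame({
    "c1": ["A", "A", "B", "A"],
    "c2": ["B", "B", "C", "C"],
    "c3": ["C", "A", "B", "D"],
})


def cooccurrence(frame):
    """Count, for each pair of distinct values, the rows containing both."""
    # Long Series of cell values indexed by (row label, column label).
    stacked = frame.stack()
    # One indicator column per distinct value; taking the max per original
    # row collapses repeated values in a row to a single 1.
    indicators = stacked.str.get_dummies().groupby(level=0).max().astype(int)
    # (values x rows) dot (rows x values) gives pairwise row counts.
    counts = indicators.T.dot(indicators).astype(float)
    # Work on an explicit copy so the fill does not depend on .values
    # returning a writable view.
    arr = counts.to_numpy(copy=True)
    np.fill_diagonal(arr, np.nan)
    return pd.DataFrame(arr, index=counts.index, columns=counts.columns)


result = cooccurrence(df)
print(result)
```

Using `max()` per row plays the same role as `sum().ne(0)` in the answer above: both reduce each row to presence or absence of a value before the dot product.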