Function To Select From Columns Pandas Df

I have this test table in a pandas DataFrame:

   Leaf_category_id  session_id  product_id
0               111           1         987
3               111           4         987
4

Solution 1:

You can filter the rows with boolean indexing first, then use groupby with apply and join to concatenate the product_id values of each session_id:

import pandas as pd

# sample data matching the table in the question
df = pd.DataFrame({'Leaf_category_id':[111,111,111,222,333],
                   'session_id':[1,4,1,2,3],
                   'product_id':[987,987,741,654,321]},
                   columns =['Leaf_category_id','session_id','product_id'])

print (df)
   Leaf_category_id  session_id  product_id
0               111           1         987
1               111           4         987
2               111           1         741
3               222           2         654
4               333           3         321


print (df[df.Leaf_category_id == 111]
            .groupby('session_id')['product_id']
            .apply(lambda x: ','.join(x.astype(str))))
session_id
1    987,741
4        987
Name: product_id, dtype: object
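
Since the question asks for a function, the filter-then-group step above could be wrapped as in the sketch below. The name products_by_session and its parameters are hypothetical and not part of the original answer; the column names are assumed to match the example DataFrame.

```python
import pandas as pd

def products_by_session(df, category_id):
    """Return comma-joined product_id strings per session_id for one category.

    Hypothetical helper: assumes the columns Leaf_category_id, session_id
    and product_id exist, as in the example DataFrame.
    """
    # boolean indexing: keep only the rows of the requested category
    subset = df[df['Leaf_category_id'] == category_id]
    # join the product ids of each session into one string
    return (subset.groupby('session_id')['product_id']
                  .apply(lambda x: ','.join(x.astype(str))))

df = pd.DataFrame({'Leaf_category_id': [111, 111, 111, 222, 333],
                   'session_id': [1, 4, 1, 2, 3],
                   'product_id': [987, 987, 741, 654, 321]})

result = products_by_session(df, 111)
print(result)
```

The result is a Series indexed by session_id, so result.loc[1] gives the joined string for session 1.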

EDIT by comment:

print (df.groupby(['Leaf_category_id','session_id'])['product_id'].apply(lambda x: ','.join(x.astype(str)))
         .reset_index())
   Leaf_category_id  session_id product_id
0               111           1    987,741
1               111           4        987
2               222           2        654
3               333           3        321

Or, if you need a separate DataFrame for each unique value in Leaf_category_id:

for i in df.Leaf_category_id.unique():
    # the enclosing parentheses already allow line breaks, so no backslashes are needed
    print (df[df.Leaf_category_id == i]
                .groupby('session_id')['product_id']
                .apply(lambda x: ','.join(x.astype(str)))
                .reset_index())

   session_id product_id
0           1    987,741
1           4        987
   session_id product_id
0           2        654
   session_id product_id
0           3        321
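
As a possible variation on the loop above, iterating over a groupby on Leaf_category_id avoids re-filtering the full DataFrame for every unique value, and the results can be kept instead of only printed. This is a sketch, not part of the original answer; the name per_category is hypothetical.

```python
import pandas as pd

df = pd.DataFrame({'Leaf_category_id': [111, 111, 111, 222, 333],
                   'session_id': [1, 4, 1, 2, 3],
                   'product_id': [987, 987, 741, 654, 321]})

# hypothetical name: maps each Leaf_category_id to its own result DataFrame
per_category = {}
for category_id, group in df.groupby('Leaf_category_id'):
    per_category[category_id] = (group.groupby('session_id')['product_id']
                                      .apply(lambda x: ','.join(x.astype(str)))
                                      .reset_index())

for category_id, frame in per_category.items():
    print(category_id)
    print(frame)
```

Each value in per_category is a DataFrame with the columns session_id and product_id, the same shape as the frames printed by the loop above.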