
How To Split A Dataframe Based On A Column Names Row

I have an Excel file whose dataframe has 20 rows. After a few rows, the column names row appears again. I want to split the dataframe at each column names row. Here is an example: x 0 1 2 3 4

Solution 1:

Assuming your column is named col, you can group the dataframe on a cumulative sum of the rows where col equals 'x', using df['col'].eq('x').cumsum(). The counter increases at every repeated header row, so each block of rows gets its own group. Then, for each group, build a dataframe that takes its column names from the first row of the group and its values from the second row onward (using iloc), and store the results in a dictionary:

d={f'df{i}':pd.DataFrame(g.iloc[1:].values,columns=g.iloc[0].values) 
                   for i,g in df.groupby(df['col'].eq('x').cumsum())}

print(d['df1'])
   x
0  0
1  1
2  2
3  3
4  4

print(d['df2'])
    x
0  23
1  34
2   5
3   6
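For anyone who wants to run the idea above end to end, here is a minimal sketch. The sample dataframe is an assumption reconstructed from the printed output (a single column named col in which 'x' reappears), and the infer_objects() call is an optional addition, not part of the original answer, that turns the object-typed values back into integers.

```python
import pandas as pd

# Assumed sample data: one column named 'col' where the header value 'x'
# shows up again partway down, as described in the question.
df = pd.DataFrame({'col': ['x', 0, 1, 2, 3, 4, 'x', 23, 34, 5, 6]})

# Every 'x' row bumps the counter, so each header row and the data rows
# below it share one group id (1, 2, ...).
group_id = df['col'].eq('x').cumsum()

d = {
    # First row of the group supplies the column names, the rest the values.
    # infer_objects() is an optional extra that restores numeric dtypes.
    f'df{i}': pd.DataFrame(g.iloc[1:].values, columns=g.iloc[0].values).infer_objects()
    for i, g in df.groupby(group_id)
}

print(list(d))
print(d['df2'])
```

Note that this sketch relies on the very first row being a header row. If the data started with plain values, the first group would have id 0 and its first data row would be used as the column names.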

Solution 2:

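Use df.index[df['x'] == 'x'] to find the index of the row where the column name appears again, then split the dataframe in two at that index. This handles a single repeated header row and, because the index label is passed to iloc, it assumes the default integer index: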

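import pandas as pd

df = pd.DataFrame(columns=['x'], data=[[0], [1], [2], [3], [4], ['x'], [23], [34], [5], [6]])

# Index of the first row where the column name reappears
split = df.index[df['x'] == 'x'][0]
df1 = df.iloc[:split]       # rows before the repeated header
df2 = df.iloc[split + 1:]   # rows after the repeated header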
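If the header row can repeat more than once, the same index-based idea can be extended. The following is only a sketch under assumptions: the third block of values (7, 8) is made up for illustration, and the names header_pos, starts, stops and parts are hypothetical. It uses positions rather than index labels, so it should not depend on the default integer index.

```python
import numpy as np
import pandas as pd

# Sample data from above plus a hypothetical third block (7, 8)
# to show more than one repeated header row.
df = pd.DataFrame(columns=['x'],
                  data=[[0], [1], [2], [3], [4], ['x'],
                        [23], [34], [5], [6], ['x'], [7], [8]])

# Positions (not labels) of the rows that repeat the column name.
header_pos = np.flatnonzero((df['x'] == 'x').to_numpy())

# Each piece runs from just after one header row up to the next header row.
starts = [0] + [p + 1 for p in header_pos]
stops = list(header_pos) + [len(df)]

parts = [df.iloc[a:b].reset_index(drop=True) for a, b in zip(starts, stops)]

for part in parts:
    print(part)
```

Here parts is a plain list, so parts[0] plays the role of df1 and parts[1] the role of df2.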

Solution 3:

You didn't mention whether this is only a sample of your dataset. If it is the whole dataset, you can try this:

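import pandas as pd

# Build the two dataframes directly from the known values
df1 = pd.DataFrame({'df1': ['x', 0, 1, 2, 3, 4]})

df2 = pd.DataFrame({'df2': ['x', 23, 34, 5, 6]})

# display() is provided by Jupyter/IPython; use print() in a plain script
display(df1, df2)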
