np.select With More Than Two Pandas Columns
I am trying to solve a pandas problem. The pandas DataFrame looks like this:
import numpy as np
np.random.seed(0)
import time
import pandas as pd
dataframe = pd.DataFr
Solution 1:
You can vectorize this with boolean masks, so that each operation runs only on the rows that need it while you still get the speed of vectorization. For brevity, df is your dataframe:
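# Start with NaN so rows that match no operation stay missing
df['new_column'] = np.nan
# Rows whose operation is 'data_a' copy the data_a value
mask = df['operation'] == 'data_a'
df.loc[mask, 'new_column'] = df.loc[mask, 'data_a']
# Rows whose operation is 'data_b' copy the data_b value
mask = df['operation'] == 'data_b'
df.loc[mask, 'new_column'] = df.loc[mask, 'data_b']
# Rows whose operation is 'avg' take the mean of data_a and data_b
mask = df['operation'] == 'avg'
df.loc[mask, 'new_column'] = (df.loc[mask, 'data_a'] + df.loc[mask, 'data_b']) / 2
# etc. for any remaining operations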
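The title mentions np.select, which can express the same logic in a single call by pairing a list of conditions with a list of choices. The following is a minimal sketch, not the original poster's code: the column names come from the answer above, while the sample values and the extra 'other' operation are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: column names follow the answer above,
# the values and the 'other' operation are invented for illustration.
df = pd.DataFrame({
    "operation": ["data_a", "data_b", "avg", "other"],
    "data_a": [1.0, 2.0, 3.0, 4.0],
    "data_b": [10.0, 20.0, 30.0, 40.0],
})

# One boolean condition per operation, checked in order.
conditions = [
    df["operation"] == "data_a",
    df["operation"] == "data_b",
    df["operation"] == "avg",
]

# The value to use where the condition at the same position is True.
choices = [
    df["data_a"],
    df["data_b"],
    (df["data_a"] + df["data_b"]) / 2,
]

# Rows that match no condition receive the default (NaN here).
df["new_column"] = np.select(conditions, choices, default=np.nan)
print(df)
```

Note that np.select computes every choice for every row before picking one, whereas the masked assignments compute each expression only on its matching rows. For cheap expressions the single call is usually more convenient; for expensive ones the masking approach may do less work.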