Pandas: Create A Column Based On Applying String Conditions On Other Column
I have a dataframe df as follows:
KPI Tata JSW
Gross Margin % 0.582 0.476
EBITDA Margin % 0.191 0.23
EBIT Margin % 0.145 0.183
SG
Solution 1:
Use np.select for multiple conditions and values:
import numpy as np

conditions = [
    df['KPI'].str.contains('Margin| Revenue|Revenue/|ROE|ROA'),
    df['KPI'].str.contains('/Revenue|Current|Quick|Turnover')
]
values = ['Max', 'Min']
# pd.np was deprecated and later removed from pandas; call NumPy directly.
df['scope'] = np.select(conditions, values, default='Min/Max')
Set the default parameter to the value you want when none of the conditions match.
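To see how the order of the conditions and the default interact, here is a minimal, self-contained sketch. The rows are a few KPI names from the question, plus one hypothetical row ("Headcount") added only to show a case where neither condition matches.

```python
import numpy as np
import pandas as pd

# A few KPI names from the question, plus a hypothetical "Headcount" row
# that matches neither pattern.
df = pd.DataFrame({'KPI': ['Gross Margin %', 'SG&A/Revenue',
                           'Revenue/Employee $K', 'Quick Ratio', 'Headcount']})

conditions = [
    df['KPI'].str.contains('Margin| Revenue|Revenue/|ROE|ROA'),
    df['KPI'].str.contains('/Revenue|Current|Quick|Turnover'),
]
values = ['Max', 'Min']

# np.select checks the conditions in order: the first one that is True
# decides the value, and rows where none is True get the default.
df['scope'] = np.select(conditions, values, default='Min/Max')
print(df)
```

Here "SG&A/Revenue" falls through to the second condition because the first pattern only matches "Revenue" when it is preceded by a space or followed by a slash.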
Or, if you have only one condition, use np.where:
import numpy as np

condition = df['KPI'].str.contains('Margin| Revenue|ROE|ROA')
df['scope'] = np.where(condition, 'Max', 'Min')
The first parameter to np.where is the condition, the second is the value to use where the condition is True, and the third is the value to use where it is False.
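A small runnable sketch of the single-condition case follows. The sample rows come from the question, except for the None entry, which is an assumption added to show what happens with missing values. The na=False argument is also an addition not in the original answer: without it, str.contains returns NaN for missing values instead of False.

```python
import numpy as np
import pandas as pd

# KPI names from the question; the trailing None is a hypothetical
# missing value, added only for illustration.
df = pd.DataFrame({'KPI': ['EBIT Margin %', 'ROE', 'Current Ratio',
                           'AR Turnover', None]})

# na=False treats missing KPI names as "no match" so the mask is purely boolean.
condition = df['KPI'].str.contains('Margin| Revenue|ROE|ROA', na=False)

# Where the mask is True use 'Max', otherwise 'Min'.
df['scope'] = np.where(condition, 'Max', 'Min')
print(df)
```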
Solution 2:
I think you are looking for something like this:
import pandas as pd
import re

def fn(row):
    # Check the '/Revenue' ratios first: they also contain 'Revenue'.
    if re.search('/Revenue|Current|Quick|Turnover', row['KPI']):
        return 'Min'
    elif re.search('Margin|Revenue|ROA|ROE', row['KPI']):
        return 'Max'
    # Rows matching neither pattern get None.

df = pd.read_csv('so.csv')
df['scope'] = df.apply(fn, axis=1)
print(df)
This simply uses the df.apply() function, which, with axis=1, passes each row to the provided function.
This gives the following result on the given data:
KPI Tata JSW scope
0 Gross Margin % 0.5820 0.4760 Max
1 EBITDA Margin % 0.1910 0.2300 Max
2 EBIT Margin % 0.1450 0.1830 Max
3 SG&A/Revenue 0.1410 0.0300 Min
4 COGS/Revenue 0.4180 0.5240 Min
5 CapE/Revenue 0.0577 0.1204 Min
6 ROA 0.0640 0.0930 Max
7 ROE 0.1380 0.2430 Max
8 Revenue/Employee $K 290.9000 934.4000 Max
9 Inventory Turnover 2.2000 3.2700 Min
10 AR Turnover 13.0200 14.2900 Min
11 Tot Asset Turnover 0.6800 0.7400 Min
12 Current Ratio 0.9000 0.8000 Min
13 Quick Ratio 0.3000 0.4000 Min
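The order of the two checks matters, which the sketch below tries to make explicit. It builds a small DataFrame in memory instead of reading so.csv. The function name scope_for, the "Headcount" row, and the 'Min/Max' fallback (borrowed from the default in Solution 1) are assumptions for illustration, not part of the original answer.

```python
import re
import pandas as pd

def scope_for(kpi):
    # Test the more specific '/Revenue' pattern first: 'COGS/Revenue' also
    # contains 'Revenue', so testing 'Revenue' first would label it 'Max'.
    if re.search('/Revenue|Current|Quick|Turnover', kpi):
        return 'Min'
    if re.search('Margin|Revenue|ROA|ROE', kpi):
        return 'Max'
    return 'Min/Max'  # assumed fallback label for unmatched rows

# KPI names from the question, plus a hypothetical unmatched "Headcount" row.
df = pd.DataFrame({'KPI': ['COGS/Revenue', 'Revenue/Employee $K',
                           'ROA', 'Headcount']})
df['scope'] = df['KPI'].apply(scope_for)
print(df)
```

Because only the KPI column is needed, this applies the function to that single column rather than to whole rows.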
Hope this helps!
Solution 3:
- You can use the apply function on the column you want.
import pandas as pd
import re
d = pd.DataFrame({'a':['a b c','b c d','p q r','d e f','c b a'],'b':[1,2,3,4,5]})
d['scope'] = d['a'].apply(lambda x: 'MAX' if re.search('a|b|e', x) else 'MIN')
d
Output:
a b scope
0 a b c 1 MAX
1 b c d 2 MAX
2 p q r 3 MIN
3 d e f 4 MAX
4 c b a 5 MAX
- For your data, this should work:
df['Scope'] = df['KPI'].apply(lambda x: 'MAX' if re.search('Margin| Revenue|ROE|ROA', x) else 'MIN')