
Pandas: Create A Column Based On Applying String Conditions On Other Column

I have a dataframe df as follows:

KPI                Tata     JSW
Gross Margin %     0.582    0.476
EBITDA Margin %    0.191    0.23
EBIT Margin %      0.145    0.183
SG

Solution 1:

Use np.select for multiple conditions and values

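import numpy as np
import pandas as pd

# Conditions are checked in order; the first one that matches wins.
conditions = [
    df['KPI'].str.contains('Margin| Revenue|Revenue/|ROE|ROA'),
    df['KPI'].str.contains('/Revenue|Current|Quick|Turnover')
]
values = ['Max', 'Min']
# pd.np was removed in pandas 2.0, so call NumPy directly.
df['scope'] = np.select(conditions, values, default='Min/Max')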

Set the default parameter to the value you want when none of the conditions match.
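As a quick check of how the conditions and the default interact, here is a minimal sketch on a small sample frame. The KPI names are taken from the question's data, except 'Net Debt', which is a hypothetical row added only to show the default being used.

```python
import numpy as np
import pandas as pd

# Small sample; 'Net Debt' is a made-up KPI that matches neither pattern.
sample = pd.DataFrame({
    'KPI': ['Gross Margin %', 'COGS/Revenue', 'Revenue/Employee $K',
            'Current Ratio', 'Net Debt']
})

conditions = [
    sample['KPI'].str.contains('Margin| Revenue|Revenue/|ROE|ROA'),
    sample['KPI'].str.contains('/Revenue|Current|Quick|Turnover'),
]
values = ['Max', 'Min']

# Rows matching the first condition get 'Max', rows matching the second
# get 'Min', and everything else falls back to the default.
sample['scope'] = np.select(conditions, values, default='Min/Max')
print(sample)
```

Note that 'COGS/Revenue' only matches the second pattern, because the first one looks for ' Revenue' (with a leading space) or 'Revenue/'.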

OR

If you have only one condition, use np.where:

import numpy as np

condition = df['KPI'].str.contains('Margin| Revenue|ROE|ROA')
# pd.np was removed in pandas 2.0, so call NumPy directly.
df['scope'] = np.where(condition, 'Max', 'Min')

The first parameter to np.where is the condition, the second is the value to use where the condition is True, and the third is the value to use where it is False.
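If the KPI column may contain missing values, it can help to pass na=False to str.contains so the mask stays strictly boolean. A minimal sketch, assuming a small hypothetical series with one missing entry:

```python
import numpy as np
import pandas as pd

# Hypothetical sample with one missing KPI name.
kpis = pd.Series(['EBIT Margin %', 'Quick Ratio', None], name='KPI')

# na=False makes missing entries count as "no match".
condition = kpis.str.contains('Margin| Revenue|ROE|ROA', na=False)

# True -> 'Max', False -> 'Min'
scope = np.where(condition, 'Max', 'Min')
print(scope)
```

With this choice the missing row ends up as 'Min'; whether that is the right label for your data is a judgment call.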


Solution 2:

I think you are looking for something like this:

import pandas as pd
import re

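def fn(row):
    # Check the 'Min' patterns first: 'COGS/Revenue' also contains 'Revenue'.
    if re.search('/Revenue|Current|Quick|Turnover', row['KPI']):
        return 'Min'
    elif re.search('Margin|Revenue|ROA|ROE', row['KPI']):
        return 'Max'
    # Rows matching neither pattern get None.

df = pd.read_csv('so.csv')

# axis=1 passes each row to fn.
df['scope'] = df.apply(fn, axis=1)
print(df)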

This simply uses the df.apply() function, which, with axis=1, passes each row to the provided function and collects the return values.

This gives the following result on the given data:

                    KPI      Tata       JSW scope
0        Gross Margin %    0.5820    0.4760   Max
1       EBITDA Margin %    0.1910    0.2300   Max
2         EBIT Margin %    0.1450    0.1830   Max
3          SG&A/Revenue    0.1410    0.0300   Min
4          COGS/Revenue    0.4180    0.5240   Min
5          CapE/Revenue    0.0577    0.1204   Min
6                   ROA    0.0640    0.0930   Max
7                   ROE    0.1380    0.2430   Max
8   Revenue/Employee $K  290.9000  934.4000   Max
9    Inventory Turnover    2.2000    3.2700   Min
10          AR Turnover   13.0200   14.2900   Min
11   Tot Asset Turnover    0.6800    0.7400   Min
12        Current Ratio    0.9000    0.8000   Min
13          Quick Ratio    0.3000    0.4000   Min
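The order of the checks in fn matters, because some names match both patterns. Here is a small self-contained sketch of the same function on a sample frame (no CSV needed); 'Net Debt' is a hypothetical row added to show what happens when nothing matches.

```python
import re
import pandas as pd

def fn(row):
    # 'Min' patterns are tested first, so 'COGS/Revenue' is not
    # caught by the plain 'Revenue' alternative below.
    if re.search('/Revenue|Current|Quick|Turnover', row['KPI']):
        return 'Min'
    elif re.search('Margin|Revenue|ROA|ROE', row['KPI']):
        return 'Max'
    return None  # no pattern matched

# 'Net Debt' is a made-up KPI that matches neither pattern.
sample = pd.DataFrame({'KPI': ['COGS/Revenue', 'Revenue/Employee $K', 'Net Debt']})
sample['scope'] = sample.apply(fn, axis=1)
print(sample)
```

Swapping the two branches would label 'COGS/Revenue' as 'Max', since it also contains 'Revenue'.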

Hope this helps!


Solution 3:

  • You can use the apply function on the column you want.
import pandas as pd
import re
d = pd.DataFrame({'a':['a b c','b c d','p q r','d e f','c b a'],'b':[1,2,3,4,5]})

d['scope'] = d['a'].apply(lambda x: 'MAX' if re.search('a|b|e', x) else 'MIN')

d

Output:

      a     b   scope
0   a b c   1   MAX
1   b c d   2   MAX
2   p q r   3   MIN
3   d e f   4   MAX
4   c b a   5   MAX
  • For your data, this should work:
df['Scope'] = df['KPI'].apply(lambda x: 'MAX' if re.search('Margin| Revenue|ROE|ROA', x) else 'MIN')
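One thing worth checking with this pattern is the leading space in ' Revenue'. A minimal sketch on a few KPI names from the question (the dictionary name labels is only for illustration):

```python
import re

pattern = 'Margin| Revenue|ROE|ROA'
kpis = ['EBITDA Margin %', 'COGS/Revenue', 'Revenue/Employee $K', 'ROA']

# Same rule as the lambda above, applied to plain strings.
labels = {kpi: 'MAX' if re.search(pattern, kpi) else 'MIN' for kpi in kpis}
for kpi, label in labels.items():
    print(kpi, '->', label)
```

Here 'Revenue/Employee $K' comes out as 'MIN', because it has no space before 'Revenue'. Solution 2's output labels that row as Max, so if that is what you want, adding 'Revenue/' to the pattern (as in Solution 1) may be needed.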
