Attempting To Find The 5 Largest Values Per Month Using Groupby
I am attempting to show the top three values of nc_type for each month. I tried using nlargest, but that doesn't do it by date. Original Data: area
Solution 1:
Scenario 1: MultiIndex series
occurred_date  nc_type
1.0            x           3
               y           4
               z          13
               w          24
               f          34
12.0           d          18
               g          10
               w          44
               a          27
               g          42
Name: test, dtype: int64
Call sort_values + groupby + head:
df.sort_values(ascending=False).groupby(level=0).head(2)
occurred_date  nc_type
12.0           w          44
               g          42
1.0            f          34
               w          24
Name: test, dtype: int64
Change head(2) to head(5) for your situation.
Or, expanding upon my comment with nlargest, you could do:
df.groupby(level=0).nlargest(2).reset_index(level=0, drop=True)
occurred_date  nc_type
1.0            f          34
               w          24
12.0           w          44
               g          42
Name: test, dtype: int64
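To try both Scenario 1 approaches side by side, here is a minimal sketch that rebuilds the sample Series from the values shown above. The variable names s, top_sorted and top_nlargest are illustrative only and are not from the original answer, which calls the Series df.

```python
import pandas as pd

# Rebuild the sample Series from Scenario 1 (values copied from the answer).
index = pd.MultiIndex.from_tuples(
    [(1.0, "x"), (1.0, "y"), (1.0, "z"), (1.0, "w"), (1.0, "f"),
     (12.0, "d"), (12.0, "g"), (12.0, "w"), (12.0, "a"), (12.0, "g")],
    names=["occurred_date", "nc_type"],
)
s = pd.Series([3, 4, 13, 24, 34, 18, 10, 44, 27, 42], index=index, name="test")

# Approach 1: sort the whole Series, then keep the first 2 rows of each group.
# Rows stay in globally sorted order, so the 12.0 group comes first here.
top_sorted = s.sort_values(ascending=False).groupby(level=0).head(2)

# Approach 2: take the 2 largest per group, then drop the group key that
# groupby prepends as an extra index level.
top_nlargest = s.groupby(level=0).nlargest(2).reset_index(level=0, drop=True)

print(top_sorted)
print(top_nlargest)
```

Both approaches select the same four rows; they differ only in row order, because head keeps the global sort order while nlargest returns the groups in key order.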
Scenario 2: 3-column DataFrame
   occurred_date nc_type  value
0            1.0       x      3
1            1.0       y      4
2            1.0       z     13
3            1.0       w     24
4            1.0       f     34
5           12.0       d     18
6           12.0       g     10
7           12.0       w     44
8           12.0       a     27
9           12.0       g     42
You can use sort_values + groupby + head:
df.sort_values(['occurred_date', 'value'],
ascending=[True, False]).groupby('occurred_date').head(2)
   occurred_date nc_type  value
4            1.0       f     34
3            1.0       w     24
7           12.0       w     44
9           12.0       g     42
Change head(2) to head(5) for your scenario.
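A self-contained version of the Scenario 2 approach might look like the sketch below. The frame is rebuilt from the sample rows above, and top2 is just an illustrative name.

```python
import pandas as pd

# Rebuild the 3-column sample frame from Scenario 2.
df = pd.DataFrame({
    "occurred_date": [1.0] * 5 + [12.0] * 5,
    "nc_type": list("xyzwf") + list("dgwag"),
    "value": [3, 4, 13, 24, 34, 18, 10, 44, 27, 42],
})

# Sort by month ascending and value descending, so the largest values come
# first within each month; head(2) then keeps the first 2 rows per month.
top2 = (
    df.sort_values(["occurred_date", "value"], ascending=[True, False])
      .groupby("occurred_date")
      .head(2)
)

print(top2)
```

Because head only filters rows, the original integer index (4, 3, 7, 9) is preserved, which makes it easy to trace each result back to its source row.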
Scenario 3: MultiIndex DataFrame
                       test
occurred_date nc_type
1.0           x           3
              y           4
              z          13
              w          24
              f          34
12.0          d          18
              g          10
              w          44
              a          27
              g          42
With nlargest on the test column:
df.groupby(level=0).test.nlargest(2)\
    .reset_index(level=0, drop=True)
occurred_date  nc_type
1.0            f          34
               w          24
12.0           w          44
               g          42
Name: test, dtype: int64
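As a rough sketch of Scenario 3, the MultiIndex DataFrame can be produced from a flat frame with set_index, after which the nlargest call applies unchanged. The names flat and top2 are illustrative, and ["test"] is used in place of the attribute access .test; the two select the same column.

```python
import pandas as pd

# Start from a flat frame and move the two key columns into the index,
# which gives the MultiIndex DataFrame shown in Scenario 3.
flat = pd.DataFrame({
    "occurred_date": [1.0] * 5 + [12.0] * 5,
    "nc_type": list("xyzwf") + list("dgwag"),
    "test": [3, 4, 13, 24, 34, 18, 10, 44, 27, 42],
})
df = flat.set_index(["occurred_date", "nc_type"])

# Select the column first, so nlargest runs on a grouped Series, then drop
# the group key that groupby prepends as an extra index level.
top2 = (
    df.groupby(level=0)["test"]
      .nlargest(2)
      .reset_index(level=0, drop=True)
)

print(top2)
```

The result is a Series rather than a DataFrame, which is why the output above ends with a Name/dtype line.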
Solution 2:
I'd include group_keys=False, so the group key is not prepended to the index of the result:
df.groupby('occurred_date', group_keys=False).nlargest(3)
occurred_date  nc_type
1.0            f          34
               w          24
               z          13
12.0           w          44
               g          42
               a          27
Name: value, dtype: int64
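The sketch below illustrates the effect of group_keys=False. It assumes the input is a Series named value with a (occurred_date, nc_type) MultiIndex, which is what the output above suggests; the answer does not show how its df was built. It also groups with level="occurred_date" instead of passing the name positionally.

```python
import pandas as pd

# Assumed input: a Series named "value" indexed by (occurred_date, nc_type).
s = pd.DataFrame({
    "occurred_date": [1.0] * 5 + [12.0] * 5,
    "nc_type": list("xyzwf") + list("dgwag"),
    "value": [3, 4, 13, 24, 34, 18, 10, 44, 27, 42],
}).set_index(["occurred_date", "nc_type"])["value"]

# Default behaviour: the group key is prepended, giving a 3-level index.
with_keys = s.groupby(level="occurred_date").nlargest(3)

# group_keys=False: the original 2-level index is kept, so no
# reset_index(level=0, drop=True) step is needed afterwards.
top3 = s.groupby(level="occurred_date", group_keys=False).nlargest(3)

print(with_keys.index.nlevels, top3.index.nlevels)
print(top3)
```

This is the same selection as in Solution 1, just without the cleanup step for the duplicated index level.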