Trying To Understand .apply() In Pandas
Solution 1:
See the documentation for pd.DataFrame.apply:
Notes
In the current implementation apply calls func twice on the first column/row to decide whether it can take a fast or slow code path. This can lead to unexpected behavior if func has side-effects, as they will take effect twice for the first column/row.
Your function check_fruit does have side effects, namely asking the user for input, so the prompt appears once more than you would expect.
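If you want to see how often apply really calls your function on your installed version, you can record the calls instead of prompting. This is only a minimal sketch: record_call and the calls list are made-up names, and as far as I know newer pandas releases no longer make the extra call, so the count you get depends on your version.

```python
import pandas as pd

calls = []  # every row label the function was invoked with, in call order

def record_call(row):
    # Side effect: remember that we were called for this row.
    calls.append(row.name)
    return row["fruit"].upper()

df = pd.DataFrame({"fruit": ["apple", "pear", "plum"]})
upper = df.apply(record_call, axis=1)

# On versions that probe the first row, label 0 shows up twice here.
print(calls)
print(len(calls), "calls for", len(df), "rows")
```

If the number of calls is larger than the number of rows, any side effect in the function (a prompt, a print, a write to a file) has happened more often than the data suggests.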
In general, apply and other DataFrame methods are meant to be used with functions that transform the data in some way, not with application logic. You gain nothing by avoiding an explicit loop in this case, so the simplest approach is probably to go through each row by hand:
import pandas as pd

def check_fruit(row):
    # ...

df = pd.DataFrame({'fruit': ['apple', 'apple', 'apple', 'apple', 'apple'],
                   'result': [''] * 5})

for _, row in df.iterrows():  # iterrows yields (index, row) pairs
    check_fruit(row)
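One thing to watch with the explicit loop: the row that iterrows hands you is generally a copy, so assigning to it may not change df. A minimal sketch of writing the result back through the frame is below. Here judge is a hypothetical, non-interactive stand-in for whatever check_fruit does, and the sample data is made up.

```python
import pandas as pd

def judge(fruit):
    # Hypothetical stand-in for the interactive check_fruit logic.
    return "Correct" if fruit == "apple" else "Wrong"

df = pd.DataFrame({"fruit": ["apple", "pear", "apple"],
                   "result": [""] * 3})

for idx, row in df.iterrows():
    # Write through the frame, not through `row`, which may be a copy.
    df.at[idx, "result"] = judge(row["fruit"])

print(df)
```

df.at sets a single cell by row label and column name, which fits a loop that handles one row at a time.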
Solution 2:
@jdehesa explained why the first row was being repeated.
My second question was why the new data wasn't being returned. I found the problem, a very noob mistake: I had row['result']=='Correct' instead of row['result']='Correct', i.e. == vs =.
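For anyone hitting the same thing, here is a small sketch of the difference. The function names buggy and fixed are made up for illustration, and both work on a copy of the row, since mutating the object that apply passes in is not something pandas promises to support.

```python
import pandas as pd

def buggy(row):
    row = row.copy()
    row["result"] == "Correct"  # comparison: evaluated, then thrown away
    return row

def fixed(row):
    row = row.copy()
    row["result"] = "Correct"   # assignment: actually sets the value
    return row

df = pd.DataFrame({"fruit": ["apple"] * 3, "result": [""] * 3})

print(df.apply(buggy, axis=1)["result"].tolist())  # ['', '', '']
print(df.apply(fixed, axis=1)["result"].tolist())  # ['Correct', 'Correct', 'Correct']
```

Python raises no error for the == version because a bare comparison is a valid statement, which is why the mistake is easy to miss.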