What's The Best Way To Use Fuzzywuzzy To Compare Each Value Of A Column With All The Values Of A Separate Dataframe's Column?
Having a really tough time with this one. Say I have two dataframes, one that has fruits and another one that has types of fruit candy. There's lots of other data in each of the da
Solution 1:
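It's slow because you're comparing every row of one dataframe against every row of the other, which is O(N^2), and iterrows adds a lot of overhead to each of those iterations.
Rather than using iterrows, convert each dataframe to a dictionary once and iterate over the dictionaries instead. This can be done with the following: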
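# orient="index" gives {row_label: {column: value}}, i.e. one entry per row
candydict = candy.to_dict("index")
fruitdict = fruit.to_dict("index")
for k, v in candydict.items():
    for k2, v2 in fruitdict.items():
        # do the rest of your comparisons here; v and v2 are the two rows as dicts
        pass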
This should speed it up significantly.
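To make the "rest of your comparisons" concrete, here is a minimal sketch of what the inner loop might look like when the goal is to find the best fruit match for each candy. It rests on several assumptions: the `name` column is hypothetical, the two dictionaries are written out by hand in the shape `to_dict("index")` produces so the example runs without pandas, and `fuzz.ratio` is just one of fuzzywuzzy's scorers (a standard-library fallback is included in case fuzzywuzzy isn't installed).

```python
from difflib import SequenceMatcher

try:
    # fuzzywuzzy's simple ratio scorer: returns an int from 0 to 100
    from fuzzywuzzy import fuzz
    score = fuzz.ratio
except ImportError:
    # Fallback (assumption, so the sketch runs without fuzzywuzzy installed):
    # a comparable 0-100 similarity score from the standard library.
    def score(a, b):
        return int(round(100 * SequenceMatcher(None, a, b).ratio()))

# Hand-written stand-ins for candy.to_dict("index") and fruit.to_dict("index").
# The "name" column is hypothetical; use whichever column holds your strings.
candydict = {
    0: {"name": "apple chews"},
    1: {"name": "banana taffy"},
    2: {"name": "cherry drops"},
}
fruitdict = {
    0: {"name": "apple"},
    1: {"name": "banana"},
    2: {"name": "cherry"},
}

# For every candy, keep the fruit with the highest similarity score.
matches = {}
for k, v in candydict.items():
    best_name, best_score = None, -1
    for k2, v2 in fruitdict.items():
        s = score(v["name"], v2["name"])
        if s > best_score:
            best_name, best_score = v2["name"], s
    matches[v["name"]] = (best_name, best_score)

for candy_name, (fruit_name, s) in matches.items():
    print(candy_name, "->", fruit_name, s)
```

With real dataframes, the two literal dictionaries would be replaced by the `to_dict("index")` calls above. The resulting `matches` dictionary could then be turned back into a column, for example by mapping it over the candy names.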