
What's The Best Way To Use Fuzzywuzzy To Compare Each Value Of A Column With All The Values Of A Separate Dataframe's Column?

Having a really tough time with this one. Say I have two dataframes, one that has fruits and another one that has types of fruit candy. There's lots of other data in each of the da

Solution 1:

It's slow because you're comparing every value in one column against every value in the other, which is O(N^2).

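Rather than using iterrows, convert each dataframe to a dictionary and iterate over that instead. This can be done with the following:

# orient="index" gives {row_label: {column: value}}, i.e. one entry per row
candydict = candy.to_dict(orient="index")
fruitdict = fruit.to_dict(orient="index")

for k, v in candydict.items():
    for k2, v2 in fruitdict.items():
        # k/k2 are row labels, v/v2 are dicts of that row's column values
        # do the rest of your comparisons here
        pass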

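This should speed it up significantly, because iterating over plain dictionaries avoids the overhead of building a Series for every row, as iterrows does. The number of comparisons is still O(N^2).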
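
As a rough sketch of what the inner comparison could look like: the example below uses plain dictionaries shaped like the output of `to_dict(orient="index")`, with a hypothetical `name` column (the real column names are not given in the question). To keep it runnable without third-party packages, `difflib.SequenceMatcher` from the standard library stands in for a fuzzywuzzy scorer; the `similarity` and `best_matches` helpers are illustrative names, not part of any library.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Return a 0-100 similarity score.

    Stand-in scorer built on difflib; a fuzzywuzzy scorer such as
    fuzz.ratio could be passed to best_matches instead.
    """
    return round(100 * SequenceMatcher(None, a.lower(), b.lower()).ratio())


def best_matches(candydict, fruitdict, column="name", scorer=similarity):
    """For each candy row, find the fruit row with the highest score.

    Both arguments are {row_label: {column: value}} dictionaries.
    Returns {candy_label: (best_fruit_label, best_score)}.
    """
    results = {}
    for k, v in candydict.items():
        best_key, best_score = None, -1
        for k2, v2 in fruitdict.items():
            score = scorer(v[column], v2[column])
            if score > best_score:
                best_key, best_score = k2, score
        results[k] = (best_key, best_score)
    return results


# Hypothetical data; the "name" column is an assumption for illustration.
fruitdict = {
    0: {"name": "apple"},
    1: {"name": "banana"},
    2: {"name": "cherry"},
}
candydict = {
    0: {"name": "apple rings"},
    1: {"name": "banana chews"},
    2: {"name": "cherry drops"},
}

matches = best_matches(candydict, fruitdict)
for candy_label, (fruit_label, score) in matches.items():
    print(candydict[candy_label]["name"], "->",
          fruitdict[fruit_label]["name"], score)
```

With fuzzywuzzy installed, passing `scorer=fuzz.ratio` (after `from fuzzywuzzy import fuzz`) should work the same way, since it also takes two strings and returns a score from 0 to 100.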
