Python: Associate URL IDs and URL Titles in Lists

Question

Continuation of this question: Python beautifulsoup how to get the line after 'href'

I have this HTML code , 'Rubin_Steiner']

In [112]: url = ['http://pluzz.francetv.fr/videos/monte_le_son_live_,101973832.html']

In [113]: get_id = re.findall(r'\d+', url[0]) # find consecutive digits

In [114]: results = [x for x in titles] + get_id

In [115]: results

Out[115]: ['Monte le son', 'Rubin_Steiner', '101973832']

As I said in my comments, when you add titles to your titles list, group the titles that belong to the same URL in a sublist. With a flat list it is impossible to tell which title belongs to which URL, because nothing indexes the groupings. I have grouped them in sublists to show you how it works.

In [3]: url = ['http://pluzz.francetv.fr/videos/monte_le_son_live_,101973832.html', 'http://pluzz.francetv.fr/videos/fare_maohi_,102103928.html']

In [4]: titles = [['Monte le son', 'Rubin_Steiner'], ['Fare maohi']] # one sublist per url, in the same position as that url

In [5]: get_ids = [re.findall(r'\d+', x) for x in url] # one list of ids per url; its position matches the sublist position in titles

In [6]: results = [t + i for t, i in zip(titles, get_ids)] # this is why sublists are useful: each sublist lines up with the ids of its url

In [7]: results

Out[7]: [['Monte le son', 'Rubin_Steiner', '101973832'], ['Fare maohi', '102103928']]

In [11]: final_results = [" ".join(y) for y in results] # join strings in each sublist

In [12]: final_results

Out[12]: ['Monte le son Rubin_Steiner 101973832', 'Fare maohi 102103928']
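The session above can also be collected into a standalone script. This is only a sketch: the helper names extract_id and associate are made up for illustration, it uses re.search to take the first run of digits rather than re.findall, and it assumes every URL contains exactly one id and that the titles list has one sublist per URL. Keying a dictionary by id is one possible way to keep the association explicit.

```python
import re

# URLs and grouped titles copied from the session above.
urls = [
    'http://pluzz.francetv.fr/videos/monte_le_son_live_,101973832.html',
    'http://pluzz.francetv.fr/videos/fare_maohi_,102103928.html',
]
titles = [['Monte le son', 'Rubin_Steiner'], ['Fare maohi']]


def extract_id(url):
    """Return the first run of digits in url (hypothetical helper)."""
    match = re.search(r'\d+', url)
    if match is None:
        raise ValueError('no id found in %r' % url)
    return match.group()


def associate(urls, titles):
    """Map each url's id to the sublist of titles at the same position."""
    if len(urls) != len(titles):
        raise ValueError('urls and titles must have the same length')
    return {extract_id(u): t for u, t in zip(urls, titles)}


titles_by_id = associate(urls, titles)

# Rebuild the joined strings from the session: titles first, then the id.
final_results = [' '.join(t + [i]) for i, t in titles_by_id.items()]

print(titles_by_id)
print(final_results)
```

With the dictionary, the titles for a given id can be looked up directly, for example titles_by_id['102103928'], instead of relying on list positions. The order of final_results relies on dictionaries keeping insertion order, which holds from Python 3.7 on.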
