Continuation of this question: Python beautifulsoup how to get the line after 'href'. I have this HTML code
, 'Rubin_Steiner']
In [112]: url = ['http://pluzz.francetv.fr/videos/monte_le_son_live_,101973832.html']
In [113]: get_id = re.findall(r'\d+', url[0])  # find consecutive digits
In [114]: results = [x for x in titles] + get_id
In [115]: results
Out[115]: ['Monte le son', 'Rubin_Steiner', '101973832']
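Note that `\d+` matches every run of digits in the URL, so a URL containing other numbers would return more than one id. Here is a minimal sketch of a stricter match, assuming the id always sits between a comma and a trailing `.html`, as in the URL above. `extract_id` is a hypothetical helper name used only for illustration.

```python
import re

def extract_id(url):
    """Return the digits between a comma and a trailing '.html', or None."""
    # Assumption: URLs end in ',<digits>.html' like the pluzz.francetv.fr example.
    match = re.search(r',(\d+)\.html$', url)
    return match.group(1) if match else None

print(extract_id('http://pluzz.francetv.fr/videos/monte_le_son_live_,101973832.html'))
```

If your URLs do not follow that shape, the plain `re.findall(r'\d+', ...)` approach is the safer starting point.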
As I said in my comments, when you add titles to your titles list, group the corresponding titles in sublists. Otherwise it is impossible to tell which titles belong to which URL, because nothing indexes the groupings. I have grouped them in sublists here to show you how it works.
In [3]: url = ['http://pluzz.francetv.fr/videos/monte_le_son_live_,101973832.html', 'http://pluzz.francetv.fr/videos/fare_maohi_,102103928.html']
In [4]: titles = [['Monte le son', 'Rubin_Steiner'], ['Fare maohi']]
In [5]: get_ids = [re.findall(r'\d+', x) for x in url]
In [6]: results = [t + i for t, i in zip(titles, get_ids)]
In [7]: results
Out[7]: [['Monte le son', 'Rubin_Steiner', '101973832'], ['Fare maohi', '102103928']]
In [11]: final_results = [" ".join(y) for y in results]
In [12]: final_results
Out[12]: ['Monte le son Rubin_Steiner 101973832', 'Fare maohi 102103928']
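The session above can be wrapped into one function for reuse. This is only a sketch under the same assumptions: `titles` is already grouped into one sublist per URL, in the same order as the URLs. The name `associate` is hypothetical.

```python
import re

def associate(urls, titles):
    """Join each group of titles with the digit runs found in its URL."""
    results = []
    # zip pairs the n-th URL with the n-th title group; it stops at the
    # shorter of the two lists, so mismatched lengths drop items silently.
    for url, group in zip(urls, titles):
        ids = re.findall(r'\d+', url)
        results.append(" ".join(group + ids))
    return results

urls = ['http://pluzz.francetv.fr/videos/monte_le_son_live_,101973832.html',
        'http://pluzz.francetv.fr/videos/fare_maohi_,102103928.html']
titles = [['Monte le son', 'Rubin_Steiner'], ['Fare maohi']]
print(associate(urls, titles))
```

Because `zip` truncates to the shorter input, it is worth checking `len(urls) == len(titles)` first if the two lists are built separately.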