
Extracting Comments From News Articles

My question is similar to the one asked here: https://stackoverflow.com/questions/14599485/news-website-comment-analysis I am trying to extract comments from any news article. E.g.

Solution 1:

The comments are inside an iframe. Look for a frame with id="dsq2".

The iframe has a src attribute, which links to the actual page that holds the comments.

So in Beautiful Soup, use css_soup.select("#dsq2") and read the URL from the src attribute. It leads to a page that contains only the comments.
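The lookup above can be sketched roughly as follows. This is a minimal illustration, not a tested scraper for any particular site: the helper name `find_comment_frame_url` and the sample HTML are made up for the example, and it assumes the iframe is already present in the HTML you have (if the page injects it with JavaScript, a plain HTTP fetch may not contain it).

```python
from bs4 import BeautifulSoup


def find_comment_frame_url(article_html, frame_id="dsq2"):
    """Return the src of the iframe with the given id, or None if it is absent.

    Hypothetical helper; "dsq2" is the id mentioned in the answer.
    """
    soup = BeautifulSoup(article_html, "html.parser")
    frame = soup.select_one(f"iframe#{frame_id}")
    if frame is None:
        return None
    return frame.get("src")


# Made-up sample markup standing in for a downloaded article page.
sample = (
    '<html><body><p>Article text</p>'
    '<iframe id="dsq2" src="https://example.com/embed/comments"></iframe>'
    '</body></html>'
)
print(find_comment_frame_url(sample))
```

Returning None instead of raising makes it easy to detect pages where the frame is missing from the downloaded HTML.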

To get the actual comments, fetch the page that src points to and apply this CSS selector: .post-message p
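As a rough sketch of that step, the selector can be applied like this once the comments page has been downloaded. The function name `extract_comments` and the sample markup are assumptions for illustration; the real page structure may differ or change over time.

```python
from bs4 import BeautifulSoup


def extract_comments(comments_html):
    """Return the text of every <p> inside an element with class post-message."""
    soup = BeautifulSoup(comments_html, "html.parser")
    # The " " separator keeps words apart when a paragraph contains nested tags.
    return [p.get_text(" ", strip=True) for p in soup.select(".post-message p")]


# Made-up markup imitating the comments-only page.
comments_page = (
    '<div class="post-message"><p>First comment</p></div>'
    '<div class="post-message"><p>Second comment</p></div>'
    '<div class="other"><p>Not a comment</p></div>'
)
for text in extract_comments(comments_page):
    print(text)
```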

If you want to load more comments, clicking the "more comments" button appears to send this request:

http://disqus.com/api/3.0/threads/listPostsThreaded?limit=50&thread=1660715220&forum=cnn&order=popular&cursor=2%3A0%3A0&api_key=E8Uh5l5fHZ6gD8U3KycjAIAk46f68Zw7C6eW8WSjZvCLXebZ7p0r1yrYDrLilk2F
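Judging from the query string, the cursor parameter is probably what selects the page of results, so requesting further pages would likely mean repeating the call with a new cursor value. The sketch below only rewrites that parameter; `with_cursor` is a hypothetical helper, YOUR_API_KEY is a placeholder, and where the next cursor value comes from (presumably the previous response) is an assumption not confirmed by the answer.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_cursor(url, cursor):
    """Return url with its cursor query parameter replaced by the given value."""
    parts = urlsplit(url)
    params = dict(parse_qsl(parts.query))  # percent-decodes values, e.g. %3A -> :
    params["cursor"] = cursor
    return urlunsplit(parts._replace(query=urlencode(params)))


# Same shape as the request above, with a placeholder instead of a real key.
first_page = (
    "http://disqus.com/api/3.0/threads/listPostsThreaded"
    "?limit=50&thread=1660715220&forum=cnn&order=popular"
    "&cursor=2%3A0%3A0&api_key=YOUR_API_KEY"
)
# "3:0:0" is an invented cursor value used only to show the rewrite.
next_page = with_cursor(first_page, "3:0:0")
print(next_page)
```

Parsing and re-encoding the query keeps the other parameters (thread, forum, order, limit) intact while handling the percent-encoding of the colons in the cursor.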

