Extracting Comments From News Articles
My question is similar to the one asked here: https://stackoverflow.com/questions/14599485/news-website-comment-analysis . I am trying to extract the comments from any news article. E.g.
Solution 1:
It's inside an iframe. Check for a frame with id="dsq2". The iframe has a src attribute, which is a link to the actual site that hosts the comments.
So in Beautiful Soup, use css_soup.select("#dsq2") and get the URL from the src attribute of the result. It will lead you to a page that contains only the comments.
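A minimal sketch of that lookup, assuming the article HTML is already available as a string. The sample markup, the example.com URL, and the helper name find_comment_frame_url are made up for illustration; only the id="dsq2" selector comes from the answer. If the iframe is inserted by a script after the page loads, the HTML returned by a plain HTTP request may not contain it, so the helper returns None in that case.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the article HTML; a real page would be fetched first.
ARTICLE_HTML = """
<html><body>
  <div id="comments">
    <iframe id="dsq2" src="https://example.com/embed/comments"></iframe>
  </div>
</body></html>
"""

def find_comment_frame_url(html):
    """Return the src of the element with id="dsq2", or None if it is absent."""
    soup = BeautifulSoup(html, "html.parser")
    frames = soup.select("#dsq2")  # select() returns a list of matching tags
    if not frames:
        return None
    return frames[0].get("src")  # None if the tag has no src attribute

frame_url = find_comment_frame_url(ARTICLE_HTML)
print(frame_url)
```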
To get the actual comments, fetch the page that src points to and apply this CSS selector to it: .post-message p
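A small sketch of applying the .post-message p selector, assuming the comments page has already been downloaded. The sample HTML below is invented to mimic that structure, and extract_comment_paragraphs is a hypothetical helper name; the real page's markup may differ.

```python
from bs4 import BeautifulSoup

# Invented sample of a comments-only page; the real markup is an assumption.
COMMENTS_HTML = """
<ul>
  <li class="post"><div class="post-message"><p>First comment.</p></div></li>
  <li class="post"><div class="post-message">
    <p>Second comment,</p><p>in two paragraphs.</p>
  </div></li>
</ul>
"""

def extract_comment_paragraphs(html):
    """Return the text of every <p> nested inside an element with class post-message."""
    soup = BeautifulSoup(html, "html.parser")
    return [p.get_text(strip=True) for p in soup.select(".post-message p")]

for text in extract_comment_paragraphs(COMMENTS_HTML):
    print(text)
```

Note that the selector matches paragraphs, not comments, so a comment written in several paragraphs yields several entries. To keep one entry per comment, select .post-message instead and call get_text() on each match.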
If you want to load more comments: clicking the "more comments" button seems to send this request:
http://disqus.com/api/3.0/threads/listPostsThreaded?limit=50&thread=1660715220&forum=cnn&order=popular&cursor=2%3A0%3A0&api_key=E8Uh5l5fHZ6gD8U3KycjAIAk46f68Zw7C6eW8WSjZvCLXebZ7p0r1yrYDrLilk2F
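The query string of that request can be rebuilt with the standard library, which makes it easier to change the thread, forum, or cursor. This is only a sketch: build_list_posts_url is a hypothetical helper, YOUR_API_KEY is a placeholder, and how the server hands out later cursor values is not shown in the request above, so any other cursor would have to be taken from what the site actually sends.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Endpoint copied from the request observed above.
BASE_URL = "http://disqus.com/api/3.0/threads/listPostsThreaded"

def build_list_posts_url(thread, forum, api_key,
                         cursor="2:0:0", limit=50, order="popular"):
    """Rebuild the observed request URL with the same parameter order."""
    params = {
        "limit": limit,
        "thread": thread,
        "forum": forum,
        "order": order,
        "cursor": cursor,    # urlencode turns "2:0:0" into "2%3A0%3A0"
        "api_key": api_key,  # placeholder; supply your own key
    }
    return BASE_URL + "?" + urlencode(params)

url = build_list_posts_url("1660715220", "cnn", "YOUR_API_KEY")
print(url)

# Going the other way: split an observed URL back into its parameters.
observed = parse_qs(urlsplit(url).query)
print(observed["cursor"])
```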