Scraping Dynamic Web Page Comments with Selenium

xiaoxiao2021-02-28  82

Target page: http://www.santostang.com/2017/03/02/hello-world/

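First, locate the frame:

Press Ctrl+Shift+C to open the element inspector, then search for "frame" to find where the comment frame sits in the page. This leads to the following HTML: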

<iframe title="livere" scrolling="no" src="https://livere.me/comment/city?id=city&refer=www.santostang.com/2017/03/02/hello-world/&uid=MTAyMC8yODU4My81MTU0&site=http://www.santostang.com/2017/03/02/hello-world/&title=Hello world! - 数据科学@唐松Santos" style="min-width: 100%; width: 100px; height: 6177px; overflow: hidden; border: 0px none; z-index: 124212;" id="lv-comment-567" frameborder="0"></iframe>
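The src attribute shows why a frame switch is needed: the comments are served from livere.me rather than from the blog page itself. As a small illustration using only the standard library (not part of the original walkthrough), the sketch below parses a shortened copy of the tag above and prints the frame's title and the host it loads from. The class name IframeCollector and the trimmed src value are assumptions made for this example.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


class IframeCollector(HTMLParser):
    """Collect the attributes of every <iframe> start tag it sees."""

    def __init__(self):
        super().__init__()
        self.iframes = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag == "iframe":
            self.iframes.append(dict(attrs))


# Shortened copy of the tag found in the inspector (query string trimmed).
snippet = (
    '<iframe title="livere" scrolling="no" '
    'src="https://livere.me/comment/city?id=city" '
    'id="lv-comment-567" frameborder="0"></iframe>'
)

collector = IframeCollector()
collector.feed(snippet)

frame = collector.iframes[0]
host = urlsplit(frame["src"]).netloc  # host that serves the comment frame
print(frame["title"], host)
```

The title attribute printed here is the same value used to select the frame in Selenium.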

In Selenium, locate the iframe by its title attribute and switch into it:

from selenium.webdriver.common.by import By

driver.switch_to.frame(driver.find_element(By.CSS_SELECTOR, "iframe[title='livere']"))

Next, locate the div that holds each comment.

Press Ctrl+Shift+C and click a comment to find its div:

<div class="reply-content"><p> 哪里哪里在哪里? </p></div>

In Selenium, find the comments by selecting those divs (By comes from selenium.webdriver.common.by):

comments = driver.find_elements(By.CSS_SELECTOR, 'div.reply-content')

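The comment text sits inside the <p></p> element of each div, so loop over the comments and print it:

for eachcomment in comments:
    # By is imported from selenium.webdriver.common.by
    content = eachcomment.find_element(By.TAG_NAME, 'p')
    print(content.text)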
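The snippets above assume a driver that has already loaded the page. Because the comments are loaded dynamically, the frame may also not be present the instant the page opens. The following is a minimal sketch, not the original author's script, that adds the missing setup and uses Selenium's explicit waits. The choice of Firefox, the 10-second timeout, and the helper names comment_texts and scrape_comments are assumptions.

```python
URL = "http://www.santostang.com/2017/03/02/hello-world/"


def comment_texts(paragraphs):
    """Return the stripped, non-empty text of each element in `paragraphs`.

    Works on anything with a `.text` attribute, such as Selenium WebElements.
    """
    texts = (p.text.strip() for p in paragraphs)
    return [t for t in texts if t]


def scrape_comments(url=URL, timeout=10):
    """Open `url`, switch into the livere iframe and return the comment texts."""
    # Imported here so comment_texts stays usable without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    driver = webdriver.Firefox()  # assumption: Firefox and its driver are available
    try:
        driver.get(url)
        wait = WebDriverWait(driver, timeout)
        # Wait for the comment frame to appear, then switch into it.
        wait.until(EC.frame_to_be_available_and_switch_to_it(
            (By.CSS_SELECTOR, "iframe[title='livere']")))
        # Wait until at least one comment paragraph is present in the frame.
        paragraphs = wait.until(EC.presence_of_all_elements_located(
            (By.CSS_SELECTOR, "div.reply-content p")))
        # Read the text before the browser is closed.
        return comment_texts(paragraphs)
    finally:
        driver.quit()  # always close the browser, even on errors
```

Calling scrape_comments() opens a browser window and returns a list of strings; printing each item reproduces the loop above. Note that this sketch collects every <p> inside a comment div, so a comment with several paragraphs yields several entries.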

Run the script; it prints the text of each comment.

