Python: finding the body text and links in an HTML file

xiaoxiao, 2021-02-28

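Python exercise book: one small program a day

Problem 0008: given an HTML file, find its body text.

Problem 0009: given an HTML file, find its links.

Links to problems 0000-0010

The code is as follows: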

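# coding=utf-8
from bs4 import BeautifulSoup


def sechBodyUrl(path):
    # Parse the file with the lxml parser (requires the lxml package)
    with open(path, encoding='utf-8') as fp:
        text = BeautifulSoup(fp, 'lxml')
    # Problem 0009: print the target of every <a> tag that has an href
    for u in text.find_all('a', href=True):
        print(u['href'])
    # Problem 0008: return the document text with surrounding newlines stripped
    content = text.get_text().strip('\n')
    return content


sechBodyUrl('0007.html')
# print(sechBodyUrl('0007.html'))  # uncomment to also print the body text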
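For comparison with the BeautifulSoup version, here is a minimal sketch of the same two tasks using only the standard library's html.parser module. It is an alternative technique, not the method used in the original post. The names TextAndLinkParser and extract_text_and_links, and the sample HTML string, are hypothetical and exist only for illustration. The sketch assumes that text inside script and style tags should not count as body text.

```python
from html.parser import HTMLParser


class TextAndLinkParser(HTMLParser):
    """Hypothetical helper: collects text chunks and <a href> values."""

    SKIPPED = {"script", "style"}  # assumption: their contents are not body text

    def __init__(self):
        super().__init__()
        self.links = []
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1
        elif tag == "a":
            # attrs is a list of (name, value) pairs; anchors without href are ignored
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-blank text that is outside script/style
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())


def extract_text_and_links(html):
    """Return (text, links) for an HTML string."""
    parser = TextAndLinkParser()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks), parser.links


# Made-up sample document, not the 0007.html file from the post
sample = (
    "<html><head><title>Demo</title><style>p {color: red}</style></head>"
    '<body><p>Hello <a href="https://example.com">world</a></p>'
    '<a name="top">no href</a></body></html>'
)
text, links = extract_text_and_links(sample)
print(links)
print(text)
```

Returning the links instead of printing them inside the function keeps the two problems separate, so the caller decides what to do with each result.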

The test results are as follows:

When reposting, please cite the original address: https://www.6miu.com/read-45159.html
