[MATLAB Crawler] Getting Started with Web Crawling in MATLAB, Part 1

xiaoxiao  2021-02-28

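1. Requirements Analysis

Content to scrape: the numeric table cells (<td> elements) of a local HTML page, sheet.html, which holds 15 values per table row.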

2. Implementation Code

clc, clear
%%
% Display numbers without scientific notation
format short g
%%
% Read the page source
sourcefile = urlread('file:///D:/Program Files/MATLAB/R2013a/gui3/sheet.html');

% Regular expression for the first data row (<td> tags with attributes); store as data1
expr1 = '<td .*?>(.*?)</td>';
[datafile1, data_tokens1] = regexp(sourcefile, expr1, 'match', 'tokens');
data1 = zeros(size(data_tokens1));
for idx1 = 1:length(data_tokens1)
    data1(idx1) = str2double(data_tokens1{idx1}{1});
end

% Regular expression for the second through last rows (plain <td> tags); store as data2
expr2 = '<td>(.*?)</td>';
[datafile2, data_tokens2] = regexp(sourcefile, expr2, 'match', 'tokens');
data2 = zeros(size(data_tokens2));
% The first 15 plain-<td> matches are skipped here and dropped below
for idx2 = 16:length(data_tokens2)
    data2(idx2) = str2double(data_tokens2{idx2}{1});
end
data2 = data2(1,16:end);

% Concatenate data1 and data2
data3 = [data1 data2];

%% Extract each variable: every 15th value belongs to the same column
number = data3(1,1:15:end)';
Month  = data3(1,2:15:end)';
Day    = data3(1,3:15:end)';
Time   = data3(1,4:15:end)';
p      = data3(1,5:15:end)';
a      = data3(1,6:15:end)';
v      = data3(1,7:15:end)';
T      = data3(1,8:15:end)';
Cp     = data3(1,9:15:end)';
s      = data3(1,10:15:end)';
t1     = data3(1,11:15:end)';
S      = data3(1,12:15:end)';
H      = data3(1,13:15:end)';
P      = data3(1,14:15:end)';
n      = data3(1,15:15:end)';

% Assemble the columns into one matrix (no semicolon, so the result is displayed)
data4 = [number Month Day Time p a v T Cp s t1 S H P n]
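The token-extraction step can be tried without the local file. The following is a minimal sketch, not the article's exact method: it assumes a made-up inline string html in place of sheet.html, a hypothetical table of 3 columns instead of 15, a single pattern '<td[^>]*>' that matches cells with or without attributes, and reshape in place of the strided column slices.

```matlab
% Hypothetical sample page: 2 rows x 3 columns (the article's table has 15 columns)
html = ['<table>' ...
        '<tr><td class="x">1</td><td class="x">2.5</td><td class="x">30</td></tr>' ...
        '<tr><td>2</td><td>3.5</td><td>40</td></tr>' ...
        '</table>'];

ncols = 3;  % assumed column count for this sample only

% One pattern for <td> with or without attributes; 'tokens' returns the captured text
tokens = regexp(html, '<td[^>]*>(.*?)</td>', 'tokens');

% Each token is a 1x1 cell; concatenating them gives a flat cell array of strings
cellText = [tokens{:}];

% str2double accepts a cell array of strings; non-numeric cells become NaN
values = str2double(cellText);

% Cells arrive in row-major order: reshape to ncols-by-nrows, then transpose
data = reshape(values, ncols, []).';
disp(data)
```

With 15 columns, setting ncols = 15 would give the same layout as data4, one table row per matrix row. A header row of text cells, if present, would show up as a row of NaN values and would need to be dropped separately.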
When reposting, please cite the original URL: https://www.6miu.com/read-51868.html
