1. Common HTTP request methods
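OPTIONS: returns the HTTP request methods the server supports for a given resource.
HEAD: asks for the same response as a GET request, except that the response body is not returned.
GET: requests the specified resource.
PUT: uploads the latest content to the specified resource location.
POST: submits data to the specified resource for processing.
DELETE: asks the server to delete the resource identified by the URI.
PATCH: applies a partial modification to a resource.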
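To make the difference between GET, HEAD and OPTIONS concrete, here is a minimal sketch that runs entirely on the local machine. It uses the standard library (http.server and http.client) instead of requests so that it needs no network access. DemoHandler, probe and the response body are made-up names and values for illustration only.

```python
import threading
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, HTTPServer

BODY = b"hello"  # illustrative payload, not from any real site


class DemoHandler(BaseHTTPRequestHandler):
    """Toy handler that answers GET, HEAD and OPTIONS for any path."""

    def _send_headers(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(BODY)))
        self.end_headers()

    def do_GET(self):
        self._send_headers()
        self.wfile.write(BODY)

    def do_HEAD(self):
        # Same status line and headers as GET, but no body is written.
        self._send_headers()

    def do_OPTIONS(self):
        # Advertise the methods this toy server supports.
        self.send_response(204)
        self.send_header("Allow", "OPTIONS, HEAD, GET")
        self.end_headers()

    def log_message(self, format, *args):
        pass  # keep the demo output quiet


def probe(method):
    """Send one request with the given method to a throwaway local server."""
    server = HTTPServer(("127.0.0.1", 0), DemoHandler)  # port 0: pick a free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        conn = HTTPConnection("127.0.0.1", server.server_port, timeout=5)
        conn.request(method, "/")
        resp = conn.getresponse()
        result = (resp.status, resp.getheader("Allow"), resp.read())
        conn.close()
        return result
    finally:
        server.shutdown()
        server.server_close()


for method in ("GET", "HEAD", "OPTIONS"):
    print(method, probe(method))
```

GET returns the body, HEAD returns the same status and headers with an empty body, and OPTIONS reports the supported methods in the Allow header.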
2. Common HTTP status codes
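200/OK: the request succeeded.
201/Created: the request was fulfilled and a new resource was created; its URI is returned in the Location header.
202/Accepted: the server has accepted the request but has not processed it yet.
400/Bad Request: the server could not understand the request.
401/Unauthorized: the request requires user authentication.
403/Forbidden: the server understood the request but refuses to carry it out.
404/Not Found: the requested resource was not found on the server.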
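The codes above fall into ranges: 2xx means success and 4xx means a client-side error. As a small sketch, the standard library's http.HTTPStatus enum can look up the reason phrase for a code. The describe helper is a hypothetical name used only for this example.

```python
from http import HTTPStatus


def describe(code):
    """Return 'code/phrase: outcome' for a numeric HTTP status code."""
    status = HTTPStatus(code)  # raises ValueError for codes the enum does not know
    if 200 <= code < 300:
        outcome = "success"
    elif 400 <= code < 500:
        outcome = "client error"
    else:
        outcome = "other"
    return f"{code}/{status.phrase}: {outcome}"


for code in (200, 201, 202, 400, 401, 403, 404):
    print(describe(code))
```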
3. requests.get(url): getting the response object for a URL
The response object holds what the server sent back for the request: the status code, the headers, the encoding and the body.
    import json
    from io import BytesIO

    import requests
    from PIL import Image

    url = 'http://www.baidu.com'
    r = requests.get(url)    # get the response object
    print(r)
    print(r.status_code)     # status code; 200 means success
    print(r.encoding)        # encoding used to decode the body into text

Output:

    <Response [200]>
    200
    ISO-8859-1

4. Passing parameters, for example http://aaa.com?pageId=1&type=content
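Passing parameters means encoding key/value pairs into the query string of the URL. requests does this for you when you supply params. The sketch below uses urllib.parse from the standard library only to make that step visible without any network access. build_url is a hypothetical helper, not part of requests.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_url(base, params):
    """Append params to base as a URL-encoded query string."""
    query = urlencode(params)
    if not query:
        return base
    # Use '&' if the base URL already carries a query string, otherwise '?'.
    separator = "&" if urlsplit(base).query else "?"
    return f"{base}{separator}{query}"


url = build_url("http://httpbin.org/get", {"k1": "v1", "k2": "v2"})
print(url)                             # http://httpbin.org/get?k1=v1&k2=v2
print(parse_qs(urlsplit(url).query))   # {'k1': ['v1'], 'k2': ['v2']}
```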
    params = {'k1': 'v1', 'k2': 'v2'}
    r = requests.get('http://httpbin.org/get', params=params)
    print(r.url)

Output:

    http://httpbin.org/get?k1=v1&k2=v2

5. Binary data, using fetching and saving an image as the example. Note:
response.text returns the body as Unicode text (str).
response.content returns bytes, that is, the raw binary data.
response.iter_content(chunk_size=1024) returns the binary data in chunks (here 1024 bytes at a time); chunk_size defaults to 1.

    r = requests.get(
        'https://timgsa.baidu.com/timg?image&quality=80&size=b9999_10000&sec'
        '=1530370070095&di=42cc5a0b3201fe4e89f05ca243a9502b&imgtype=0&src=http'
        '://ww2.sinaimg.cn/bmiddle/0067Ewosgw1f5xwpkn00nj30ij0rsth2.jpg'
    )
    image = Image.open(BytesIO(r.content))   # content holds binary data, text holds text data
    image.save('meinv.png')

    with open('meinv2.png', 'wb') as fw:
        for chunk in r.iter_content(1024):   # write 1024 bytes at a time
            fw.write(chunk)

6. JSON handling
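Handling JSON comes down to decoding the response text into Python objects. As a sketch that runs without a network connection, the example below applies the standard json module to a hard-coded body (the same text httpbin returns in the cookie example of this post) and shows what happens when the body is not JSON. parse_body is a hypothetical helper name.

```python
import json


def parse_body(text):
    """Decode a JSON body, returning None when the text is not valid JSON."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return None


body = '{"cookies":{"c1":"v1","c2":"v2"}}'   # sample body, hard-coded for the demo
data = parse_body(body)
print(type(data))               # <class 'dict'>
print(data["cookies"]["c1"])    # v1
print(parse_body("<html>not json</html>"))   # None
```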
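    r = requests.get('https://github.com/timeline.json')
    print(type(r.json))   # r.json is a method; without parentheses it is not called
    print(r.json)

Output:

    <class 'method'>
    <bound method Response.json of <Response [410]>>

The output shows the bound method rather than parsed data because json was referenced without being called, and status 410 means this endpoint is gone. To get the decoded body, call r.json() on a response that actually contains JSON, for example one from http://httpbin.org/get.

7. Submitting forms: commonly used to simulate a login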
requests.post(url, data=form, headers=headers) sends the form data and any custom headers along with the request to url.
    form = {'username': 'user', 'password': 'pass'}   # a dict passed as data is treated as a form
    r = requests.post('http://httpbin.org/post', data=form)
    print(r.text)

Output:

    {"args":{},"data":"","files":{},"form":{"password":"pass","username":"user"},
     "headers":{"Accept":"*/*","Accept-Encoding":"gzip, deflate","Connection":"close",
     "Content-Length":"27","Content-Type":"application/x-www-form-urlencoded",
     "Host":"httpbin.org","User-Agent":"python-requests/2.18.4"},"json":null,
     "origin":"114.213.252.227","url":"http://httpbin.org/post"}

In the next call, json.dumps(form) converts the dict into a str, so it is sent as a raw request body instead of a form.
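The two ways of posting the same dict differ only in how the request body is serialized. As a rough sketch using the standard library rather than requests, the form body can be reproduced with urlencode and the JSON body with json.dumps. Their lengths match the Content-Length values (27 and 40) reported in the httpbin output in this section.

```python
import json
from urllib.parse import urlencode

form = {"username": "user", "password": "pass"}

form_body = urlencode(form)    # application/x-www-form-urlencoded style body
json_body = json.dumps(form)   # JSON text body

print(form_body, len(form_body))   # username=user&password=pass 27
print(json_body, len(json_body))   # {"username": "user", "password": "pass"} 40
```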
    r = requests.post('http://httpbin.org/post', data=json.dumps(form))
    print(r.text)

Output:

    {"args":{},"data":"{\"username\": \"user\", \"password\": \"pass\"}","files":{},"form":{},
     "headers":{"Accept":"*/*","Accept-Encoding":"gzip, deflate","Connection":"close",
     "Content-Length":"40","Host":"httpbin.org","User-Agent":"python-requests/2.18.4"},
     "json":{"password":"pass","username":"user"},"origin":"114.213.252.227",
     "url":"http://httpbin.org/post"}

8. Cookies: data that some websites store on the user's local device to identify the user and track the session (sometimes encrypted)
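On the wire, a server hands out a cookie in a Set-Cookie response header and the client sends it back in a Cookie request header. The sketch below shows both directions with the standard library's http.cookies module, without any network access. The header value (including its Max-Age and Path attributes) is made up for illustration, and parse_set_cookie and build_cookie_header are hypothetical helper names.

```python
from http.cookies import SimpleCookie


def parse_set_cookie(header_value):
    """Extract name/value pairs from a Set-Cookie header value."""
    jar = SimpleCookie()
    jar.load(header_value)
    return {name: morsel.value for name, morsel in jar.items()}


def build_cookie_header(cookies):
    """Format a dict of cookies as the value of a Cookie request header."""
    return "; ".join(f"{name}={value}" for name, value in cookies.items())


# Illustrative header value; the attributes are assumptions, not captured traffic.
received = parse_set_cookie("BDORZ=27315; Max-Age=86400; Path=/")
print(received)                                         # {'BDORZ': '27315'}
print(build_cookie_header({"c1": "v1", "c2": "v2"}))    # c1=v1; c2=v2
```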
    url = 'http://www.baidu.com'
    r = requests.get(url)
    cookies = r.cookies   # cookies set by the server
    for k, v in cookies.get_dict().items():
        print(k, v)

Output:

    BDORZ 27315

    cookies = {'c1': 'v1', 'c2': 'v2'}
    r = requests.get('http://httpbin.org/cookies', cookies=cookies)   # send our own cookies
    print(r.text)

Output:

    {"cookies":{"c1":"v1","c2":"v2"}}

9. Redirects and redirect history
Put simply, HTTPS is HTTP layered on top of SSL. It provides encrypted transport and identity authentication, which makes it more secure than plain HTTP.
A redirect sends a network request on to a different location than the one originally requested (for example, a web page redirect).
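To illustrate what following a redirect involves, here is a minimal sketch that follows Location headers by hand and records each intermediate status, similar in spirit to a redirect history. It runs against a throwaway local server built with the standard library, so no network access is needed. RedirectHandler, head_following_redirects, demo and the /old and /new paths are all made up for this example and are not part of requests.

```python
import threading
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import urljoin, urlsplit


class RedirectHandler(BaseHTTPRequestHandler):
    """Toy handler: HEAD /old answers 301 pointing to /new, anything else 200."""

    def do_HEAD(self):
        if self.path == "/old":
            self.send_response(301)
            self.send_header("Location", "/new")
        else:
            self.send_response(200)
        self.send_header("Content-Length", "0")
        self.end_headers()

    def log_message(self, format, *args):
        pass  # keep the demo output quiet


def head_following_redirects(url, max_hops=5):
    """Issue HEAD requests, following redirects; return (final_url, status, history)."""
    history = []
    for _ in range(max_hops + 1):
        parts = urlsplit(url)
        conn = HTTPConnection(parts.hostname, parts.port, timeout=5)
        conn.request("HEAD", parts.path or "/")
        resp = conn.getresponse()
        status, location = resp.status, resp.getheader("Location")
        conn.close()
        if status in (301, 302, 303, 307, 308) and location:
            history.append(status)           # remember the intermediate response
            url = urljoin(url, location)     # resolve a relative Location
            continue
        return url, status, history
    raise RuntimeError("too many redirects")


def demo():
    server = HTTPServer(("127.0.0.1", 0), RedirectHandler)  # port 0: pick a free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        start = f"http://127.0.0.1:{server.server_port}/old"
        return head_following_redirects(start)
    finally:
        server.shutdown()
        server.server_close()


final_url, final_status, hops = demo()
print(final_url, final_status, hops)
```

The final URL ends in /new, the final status is 200, and the history holds the single 301 that was followed.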
    r = requests.head('http://github.com', allow_redirects=True)
    print(r.url)
    print(r.status_code)
    print(r.history)

Output:

    https://github.com/     # github is redirected to the HTTPS page
    200
    [<Response [301]>]