This is the first of the Notebook Examples in the Caffe documentation: http://nbviewer.jupyter.org/github/BVLC/caffe/blob/master/examples/00-classification.ipynb
The example uses the CaffeNet model (trained on ImageNet) to classify the cat image that ships with Caffe, visualizes the image features layer by layer, and compares running the network on the CPU with running it on the GPU.
1. Prepare the model and import the required modules
Since CaffeNet is only used in its test (deploy) phase here, the pretrained weights need to be downloaded first. From the Caffe root directory run:
    ./scripts/download_model_binary.py models/bvlc_reference_caffenet

Alternatively, download bvlc_reference_caffenet.caffemodel directly from http://dl.caffe.berkeleyvision.org/ and place it in $CAFFE_ROOT/models/bvlc_reference_caffenet, where $CAFFE_ROOT is the Caffe root directory.
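If you prefer to script this step, here is a minimal sketch (not part of the original tutorial) that fetches the weights only when they are missing; it assumes the download script is invoked with the model directory as its argument, as in the upstream notebook, and that caffe_root points at your own checkout.

    import os
    import subprocess

    caffe_root = '/home/username/caffe-master'   # adjust to your own Caffe root (assumption)
    weights = os.path.join(caffe_root,
                           'models/bvlc_reference_caffenet/bvlc_reference_caffenet.caffemodel')

    if not os.path.exists(weights):
        # run the download script from the Caffe root, passing the model directory
        subprocess.check_call(['./scripts/download_model_binary.py',
                               'models/bvlc_reference_caffenet'],
                              cwd=caffe_root)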
Then run the following in IPython:
    import numpy as np                # NumPy, imported as np
    import matplotlib.pyplot as plt   # matplotlib.pyplot, imported as plt
    import sys

    caffe_root = '/home/username/caffe-master'            # Caffe root directory
    sys.path.append('/usr/lib/python2.7/dist-packages')   # location of the installed pycaffe

    model_file = caffe_root + '/models/bvlc_reference_caffenet/deploy.prototxt'                     # CaffeNet network definition
    pretrained = caffe_root + '/models/bvlc_reference_caffenet/bvlc_reference_caffenet.caffemodel'  # pretrained weights
    image_file = caffe_root + '/examples/images/cat.jpg'                                            # test image
    npload     = caffe_root + '/python/caffe/imagenet/ilsvrc_2012_mean.npy'                         # ImageNet mean file

    import caffe

    plt.rcParams['figure.figsize'] = (10, 10)        # figure size 10 x 10
    plt.rcParams['image.interpolation'] = 'nearest'  # nearest-neighbor interpolation for images
    plt.rcParams['image.cmap'] = 'gray'              # grayscale colormap
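Note that the upstream notebook adds Caffe's own python directory to sys.path rather than dist-packages. If `import caffe` fails with the setup above, something along these lines (adjust to your own installation) usually works:

    import sys
    sys.path.insert(0, caffe_root + '/python')   # use the pycaffe bundled with this Caffe checkout
    import caffe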
2. Set CPU mode and load the model and weights

    caffe.set_mode_cpu()
    net = caffe.Net(model_file, pretrained, caffe.TEST)   # build the network in its test phase

3. Configure input preprocessing

    transformer = caffe.io.Transformer({'data': net.blobs['data'].data.shape})
    transformer.set_transpose('data', (2, 0, 1))                   # move the channel dimension to the front (H x W x C -> C x H x W)
    transformer.set_mean('data', np.load(npload).mean(1).mean(1))  # subtract the per-channel ImageNet mean
    transformer.set_raw_scale('data', 255)                         # rescale pixel values from [0, 1] to [0, 255]
    transformer.set_channel_swap('data', (2, 1, 0))                # swap channels from RGB to BGR (the reference model expects BGR)
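For intuition only, here is a rough NumPy equivalent of what the Transformer does to a single image (not part of the original tutorial; the real preprocess method also handles resizing and broadcasting details internally). caffe.io.load_image returns an RGB float image in [0, 1]:

    img = caffe.io.load_image(image_file)                  # H x W x 3, RGB, float in [0, 1]
    img = caffe.io.resize_image(img, (227, 227))           # resize to the network's input size
    img = img.transpose(2, 0, 1)                           # set_transpose: H x W x C -> C x H x W
    img = img[[2, 1, 0], :, :]                             # set_channel_swap: RGB -> BGR
    img = img * 255.0                                      # set_raw_scale: [0, 1] -> [0, 255]
    mean = np.load(npload).mean(1).mean(1)                 # per-channel ImageNet mean
    img = img - mean[:, np.newaxis, np.newaxis]            # set_mean: subtract the mean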
4. Classify the test image

    net.blobs['data'].reshape(50, 3, 227, 227)
    net.blobs['data'].data[...] = transformer.preprocess('data', caffe.io.load_image(image_file))   # load and preprocess the image
    out = net.forward()
    print("Predicted class is #{}.".format(out['prob'][0].argmax()))

5. Display the image:

    plt.imshow(transformer.deprocess('data', net.blobs['data'].data[0]))
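Step 4 reshaped the data blob to a batch of 50 but filled only slot 0, so one forward pass can just as well classify several images at once. A small sketch (not in the original notebook; it assumes there are .jpg files under examples/images):

    import glob
    image_paths = glob.glob(caffe_root + '/examples/images/*.jpg')[:50]
    for i, path in enumerate(image_paths):
        net.blobs['data'].data[i] = transformer.preprocess('data', caffe.io.load_image(path))
    out = net.forward()
    for path, probs in zip(image_paths, out['prob']):
        print path, probs.argmax()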
6. Get the labels
From the Caffe root directory, run on the command line:
    ./data/ilsvrc12/get_ilsvrc_aux.sh

Then, back in Python:
    imagenet_labels_filename = caffe_root + '/data/ilsvrc12/synset_words.txt'
    labels = np.loadtxt(imagenet_labels_filename, str, delimiter='\t')
    top_k = net.blobs['prob'].data[0].flatten().argsort()[-1:-6:-1]   # indices of the five highest probabilities
    print labels[top_k]

which prints:

    ['n02123045 tabby, tabby cat' 'n02123159 tiger cat' 'n02124075 Egyptian cat'
     'n02119022 red fox, Vulpes vulpes' 'n02127052 lynx, catamount']

7. Time one forward pass on the CPU

    net.forward()  # call once for allocation
    %timeit net.forward()

Output: 1 loops, best of 3: 4.53 s per loop

8. Run on the GPU

    caffe.set_device(0)
    caffe.set_mode_gpu()
    net.forward()  # call once for allocation
    %timeit net.forward()

Output: 1 loops, best of 3: 397 ms per loop
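%timeit is an IPython magic; outside IPython, a plain-Python alternative along these lines gives a rough figure (a sketch, not part of the original notebook):

    import time
    net.forward()                                  # call once for allocation
    start = time.time()
    for _ in range(10):
        net.forward()
    print '%.3f s per forward pass' % ((time.time() - start) / 10)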
9. Feature shapes of each layer

    [(k, v.data.shape) for k, v in net.blobs.items()]

Output: each entry gives the blob name and its shape; for the convolutional layers the four numbers are batch size, number of filters, and the height and width of the feature map:

    [('data', (50, 3, 227, 227)),
     ('conv1', (50, 96, 55, 55)),
     ('pool1', (50, 96, 27, 27)),
     ('norm1', (50, 96, 27, 27)),
     ('conv2', (50, 256, 27, 27)),
     ('pool2', (50, 256, 13, 13)),
     ('norm2', (50, 256, 13, 13)),
     ('conv3', (50, 384, 13, 13)),
     ('conv4', (50, 384, 13, 13)),
     ('conv5', (50, 256, 13, 13)),
     ('pool5', (50, 256, 6, 6)),
     ('fc6', (50, 4096)),
     ('fc7', (50, 4096)),
     ('fc8', (50, 1000)),
     ('prob', (50, 1000))]

10. Shapes of the network parameters

    [(k, v[0].data.shape) for k, v in net.params.items()]

Output:

    [('conv1', (96, 3, 11, 11)),
     ('conv2', (256, 48, 5, 5)),
     ('conv3', (384, 256, 3, 3)),
     ('conv4', (384, 192, 3, 3)),
     ('conv5', (256, 192, 3, 3)),
     ('fc6', (4096, 9216)),
     ('fc7', (4096, 4096)),
     ('fc8', (1000, 4096))]
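As a quick added check (not in the original notebook), the shapes above can be summed to count the learnable parameters per layer; params[0] holds the weights and params[1] the biases:

    total = 0
    for name, params in net.params.items():
        n = sum(p.data.size for p in params)       # weights plus biases
        total += n
        print '%-6s %12d parameters' % (name, n)
    print 'total  %12d parameters' % total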
11. A helper function for visualization

    # take an array of shape (n, height, width) or (n, height, width, channels)
    # and visualize each (height, width) thing in a grid of size approx. sqrt(n) by sqrt(n)
    def vis_square(data, padsize=1, padval=0):
        # normalize the data for display
        data -= data.min()
        data /= data.max()

        # force the number of filters to be square
        n = int(np.ceil(np.sqrt(data.shape[0])))
        padding = ((0, n ** 2 - data.shape[0]), (0, padsize), (0, padsize)) + ((0, 0),) * (data.ndim - 3)
        data = np.pad(data, padding, mode='constant', constant_values=(padval, padval))

        # tile the filters into an image
        data = data.reshape((n, n) + data.shape[1:]).transpose((0, 2, 1, 3) + tuple(range(4, data.ndim + 1)))
        data = data.reshape((n * data.shape[1], n * data.shape[3]) + data.shape[4:])

        plt.imshow(data)

12. Show the conv1 filters (96 in total)

    # the parameters are a list of [weights, biases]
    filters = net.params['conv1'][0].data
    vis_square(filters.transpose(0, 2, 3, 1))
13. Output of the first convolutional layer, conv1: the first 36 channels
    feat = net.blobs['conv1'].data[0, :36]
    vis_square(feat, padval=1)

14. Filters of the second convolutional layer, conv2
This layer has 256 filters, each of size 5 x 5 x 48; only the first 48 are shown here. The input channels are displayed separately, so each filter occupies one row of the grid.
    filters = net.params['conv2'][0].data
    vis_square(filters[:48].reshape(48 ** 2, 5, 5))
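To see why each filter ends up as one row, it helps to look at the shapes involved (an added check, not in the original notebook): the first 48 filters of shape (48, 48, 5, 5) are flattened into 48 * 48 = 2304 single-channel tiles, which vis_square lays out in a 48 x 48 grid.

    print net.params['conv2'][0].data.shape                                # (256, 48, 5, 5): 256 filters, 48 input channels
    print net.params['conv2'][0].data[:48].reshape(48 ** 2, 5, 5).shape    # (2304, 5, 5)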
15. The first 36 outputs of the second convolutional layer (rectified; conv2 has 256 channels in total, only the first 36 are shown)

    feat = net.blobs['conv2'].data[0, :36]
    vis_square(feat, padval=1)
16. Output of the third convolutional layer, conv3 (rectified, all 384 channels)
    feat = net.blobs['conv3'].data[0]
    vis_square(feat, padval=0.5)

17. Output of the fourth convolutional layer, conv4 (rectified, all 384 channels)
    feat = net.blobs['conv4'].data[0]
    vis_square(feat, padval=0.5)

18. Output of the fifth convolutional layer, conv5 (rectified, all 256 channels)

    feat = net.blobs['conv5'].data[0]
    vis_square(feat, padval=0.5)

19. Output of the fifth pooling layer, pool5
    feat = net.blobs['pool5'].data[0]
    vis_square(feat, padval=1)

20. Output values and histogram of the first fully connected layer, fc6 (rectified)
    feat = net.blobs['fc6'].data[0]
    plt.subplot(2, 1, 1)
    plt.plot(feat.flat)
    plt.subplot(2, 1, 2)
    _ = plt.hist(feat.flat[feat.flat > 0], bins=100)

21. Output of the second fully connected layer, fc7 (rectified); note that the distribution is no longer as even
    feat = net.blobs['fc7'].data[0]
    plt.subplot(2, 1, 1)
    plt.plot(feat.flat)
    plt.subplot(2, 1, 2)
    _ = plt.hist(feat.flat[feat.flat > 0], bins=100)
22. The final probability output, prob
    feat = net.blobs['prob'].data[0]
    plt.plot(feat.flat)

23. Look at the labels with the highest probabilities

    # load labels
    imagenet_labels_filename = caffe_root + '/data/ilsvrc12/synset_words.txt'
    try:
        labels = np.loadtxt(imagenet_labels_filename, str, delimiter='\t')
    except:
        !../data/ilsvrc12/get_ilsvrc_aux.sh   # path relative to the examples directory, as in the original notebook
        labels = np.loadtxt(imagenet_labels_filename, str, delimiter='\t')

    # sort top k predictions from softmax output
    top_k = net.blobs['prob'].data[0].flatten().argsort()[-1:-6:-1]
    print labels[top_k]

which prints:
    ['n02123045 tabby, tabby cat' 'n02123159 tiger cat' 'n02124075 Egyptian cat'
     'n02119022 red fox, Vulpes vulpes' 'n02127052 lynx, catamount']
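As a small extension (not in the notebook), the softmax probabilities themselves can be printed alongside the labels, reusing labels and top_k from step 23:

    for i in top_k:
        print '%.4f  %s' % (net.blobs['prob'].data[0][i], labels[i])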
Reference: http://nbviewer.jupyter.org/github/BVLC/caffe/blob/master/examples/00-classification.ipynb