Win10 + Theano setup: of 100 posts online, maybe one is actually useful

xiaoxiao2021-02-28 37

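The most important point first: Theano stopped being updated once it supported cuDNN 5.1, and was abandoned completely after about a year of maintenance, so these days it is mostly used in labs on toy data. I have a 1060 graphics card and had installed both CUDA 9 and CUDA 8 (CUDA 8 was downloaded along the way when I set up openpose). Only after finishing the whole setup with CUDA 9 did I learn that CUDA 9 simply cannot work here, because the cuDNN that goes with it is 7.1. Try it yourself if you don't believe me.

I installed Anaconda3, to make setting up TensorFlow easier later on (on Windows, TensorFlow only supports Python 3).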

Next, set up the environment. Add Anaconda's Scripts directory to the system PATH, and add the Anaconda directory itself as well. Then run the following in cmd:

    conda install numpy scipy mkl-service libpython m2w64-toolchain nose
    conda config --add channels https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free/
    conda config --set show_channel_urls yes
    conda config --add channels https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/msy

The last line adds one more mirror, to avoid unnecessary detours. Then:

    conda install numpy scipy mkl-service libpython m2w64-toolchain nose
    conda install pygpu theano
    theano-cache purge
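Before going further, it can help to confirm that the packages from the commands above actually landed in the environment that cmd is using. The following is only a minimal sketch: the helper name `missing_packages` is made up for illustration, and it checks nothing beyond whether Python can locate each package.

```python
import importlib.util

# Package names taken from the conda install commands above.
REQUIRED = ["numpy", "scipy", "theano", "pygpu"]


def missing_packages(names):
    """Return the names that cannot be found in the current environment."""
    return [n for n in names if importlib.util.find_spec(n) is None]


if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    print("missing:", missing if missing else "none")
```

If anything shows up as missing, the most likely cause is that cmd is picking up a different Python than the Anaconda one, which points back at the PATH entries.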

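There is also a pile of environment variables:

The top-to-bottom order of the entries in the figure below decides whether CUDA 8 or CUDA 9 gets used (on my machine, at least).

After that, put the following file in the directory that cmd opens in.

The main trouble comes down to the .theanorc.txt file: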
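The ordering matters because Windows searches PATH from front to back and uses the first match. A rough sketch of how to see which CUDA toolkit comes first is shown here; the function name `first_cuda_version` is hypothetical, and the directory pattern assumes the default install location that appears in the .theanorc.txt of this post.

```python
import os
import re


def first_cuda_version(path_value, sep=";"):
    """Return the version of the first CUDA toolkit entry in a PATH string, or None."""
    # Assumes the default install layout: ...\NVIDIA GPU Computing Toolkit\CUDA\vX.Y\...
    pattern = re.compile(
        r"NVIDIA GPU Computing Toolkit\\CUDA\\v(\d+\.\d+)", re.IGNORECASE
    )
    for entry in path_value.split(sep):
        match = pattern.search(entry)
        if match:
            return match.group(1)
    return None


if __name__ == "__main__":
    # Inspect the PATH of the current process.
    print(first_cuda_version(os.environ.get("PATH", ""), os.pathsep))
```

If this prints a 9.x version while the config points at v8.0, moving the CUDA 8 entries above the CUDA 9 ones is the fix described above.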

[global]
cxx=E:\AnaConda\AnaConda\Library\mingw-w64\bin\g++.exe
device=cuda
floatX=float32

[gpuarray]
preallocate=0.75

[dnn]
enabled = True
include_path=C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v8.0\include
library_path=C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v8.0\lib\x64
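Since most failures come from a wrong path in this file, a quick check that the three paths exist can save some time. This is a sketch only: it reads the file as a generic INI file with Python's standard `configparser`, the helper names `path_settings` and `missing_paths` are invented for illustration, and the file location in the last part assumes the file sits in the user's home directory.

```python
import configparser
import os

# The same settings as above, used here as sample input.
SAMPLE = r"""
[global]
cxx=E:\AnaConda\AnaConda\Library\mingw-w64\bin\g++.exe
device=cuda
floatX=float32

[gpuarray]
preallocate=0.75

[dnn]
enabled = True
include_path=C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v8.0\include
library_path=C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v8.0\lib\x64
"""


def path_settings(text):
    """Extract the path-valued settings from the config text."""
    # Only '=' is a delimiter, so the ':' in drive letters is left alone.
    parser = configparser.ConfigParser(interpolation=None, delimiters=("=",))
    parser.read_string(text)
    return {
        "cxx": parser.get("global", "cxx"),
        "include_path": parser.get("dnn", "include_path"),
        "library_path": parser.get("dnn", "library_path"),
    }


def missing_paths(settings):
    """Return the setting names whose path does not exist on this machine."""
    return sorted(k for k, v in settings.items() if not os.path.exists(v))


if __name__ == "__main__":
    # Assumed location: .theanorc.txt in the home directory.
    rc = os.path.join(os.path.expanduser("~"), ".theanorc.txt")
    if os.path.exists(rc):
        with open(rc) as fh:
            print("missing paths:", missing_paths(path_settings(fh.read())))
    else:
        print("no file at", rc)
```

An empty list means all three paths resolve; any name that is printed is the line to fix first.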

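That's all. As for the pile of advice online telling you to add this and add that, I won't say it's wrong, but it didn't work for me. Once it succeeds, it looks like this:

Let's run some test code to prove the setup works. Compare the following two pieces of code: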

    from theano import function, config, shared, tensor
    import numpy
    import time

    vlen = 10 * 30 * 768  # 10 x #cores x # threads per core
    iters = 1000

    rng = numpy.random.RandomState(22)
    x = shared(numpy.asarray(rng.rand(vlen), config.floatX))
    f = function([], tensor.exp(x))
    print(f.maker.fgraph.toposort())
    t0 = time.time()
    for i in range(iters):
        r = f()
    t1 = time.time()
    print("Looping %d times took %f seconds" % (iters, t1 - t0))
    print("Result is %s" % (r,))
    if numpy.any([isinstance(x.op, tensor.Elemwise) and
                  ('Gpu' not in type(x.op).__name__)
                  for x in f.maker.fgraph.toposort()]):
        print('Used the cpu')
    else:
        print('Used the gpu')

That gives the first result. The second version:

    from theano import function, config, shared, tensor
    import numpy
    import time

    vlen = 10 * 30 * 768  # 10 x #cores x # threads per core
    iters = 1000

    rng = numpy.random.RandomState(22)
    x = shared(numpy.asarray(rng.rand(vlen), config.floatX))
    f = function([], tensor.exp(x).transfer(None))
    print(f.maker.fgraph.toposort())
    t0 = time.time()
    for i in range(iters):
        r = f()
    t1 = time.time()
    print("Looping %d times took %f seconds" % (iters, t1 - t0))
    print("Result is %s" % (numpy.asarray(r),))
    if numpy.any([isinstance(x.op, tensor.Elemwise) and
                  ('Gpu' not in type(x.op).__name__)
                  for x in f.maker.fgraph.toposort()]):
        print('Used the cpu')
    else:
        print('Used the gpu')

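My results: you can see the gap between 0.38 and 0.041359 seconds, roughly a factor of 10. It seems a standard-voltage CPU really is fast; the official site shows about 100x. Either way, it works.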
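For the record, the "roughly a factor of 10" is just the ratio of the two timings. A tiny sketch of the arithmetic, using only the two numbers quoted above (the function name `speedup` is made up for illustration):

```python
def speedup(slow_seconds, fast_seconds):
    """Ratio of the slower timing to the faster one."""
    return slow_seconds / fast_seconds


# The two timings reported above, in seconds.
ratio = speedup(0.38, 0.041359)
print("speedup: %.1fx" % ratio)
```

So the measured gap is a little over 9x, which rounds to the 10x mentioned above.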

Time is life, so please don't waste it!

If you repost this, please credit the original URL: https://www.6miu.com/read-2627594.html
