pip-installed TensorFlow does not use the CPU's instructions; building TensorFlow (machine learning) from source runs out of memory

xiaoxiao2021-02-28  28

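Boss Lin sent me the following message:

2018-06-29 14:40:11.224113: I tensorflow/core/platform/cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA

In other words, the pip-installed binary was not compiled to use these CPU instructions.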
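As a rough illustration of how the message above relates to a source build, the sketch below pulls the instruction names out of the log line and turns them into `--copt` options for bazel. The name-to-flag table is my own assumption based on the usual gcc `-m` options, and `copts_from_log` is a hypothetical helper, not part of TensorFlow or bazel.

```python
import re

# Assumed mapping from the names TensorFlow prints to gcc -m flags.
GCC_FLAG = {
    "SSE3": "-msse3",
    "SSE4.1": "-msse4.1",
    "SSE4.2": "-msse4.2",
    "AVX": "-mavx",
    "AVX2": "-mavx2",
    "AVX512F": "-mavx512f",
    "FMA": "-mfma",
}


def copts_from_log(line):
    """Return bazel --copt options for the instructions named in the log line."""
    match = re.search(r"was not compiled to use:\s*(.*)$", line)
    if not match:
        return []
    names = match.group(1).split()
    # Names missing from the table are skipped rather than guessed.
    return ["--copt=" + GCC_FLAG[n] for n in names if n in GCC_FLAG]


log = (
    "2018-06-29 14:40:11.224113: I tensorflow/core/platform/cpu_feature_guard.cc:140] "
    "Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA "
)
print(" ".join(copts_from_log(log)))  # --copt=-mavx2 --copt=-mfma
```

These two options are among the ones passed to `bazel build` later in this post.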

There are no detailed steps here and the notes are messy. The main topics are running out of memory, CPU instructions, and GPU use.

root@ubuntu:~# uname -a
Linux ubuntu 4.15.0-23-generic #25-Ubuntu SMP Wed May 23 18:02:16 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
root@ubuntu:~# cat /etc/issue
Ubuntu 18.04 LTS \n \l

The official site says to use gcc 4.8, so I downgraded to 4.8. CUDA, cuDNN, and NCCL are all installed.

Source archive:

-rw-r--r--  1 root root   22649439 Jun 29 18:32 tensorflow-1.8.0.tar.gz

root@ubuntu:~/tensorflow-1.8.0# ./configure
WARNING: --batch mode is deprecated. Please instead explicitly shut down your Bazel server using the command "bazel shutdown".
You have bazel 0.15.0 installed.
Please specify the location of python. [Default is /usr/bin/python]: /usr/bin/python3
Found possible Python library paths:
  /usr/lib/python3/dist-packages
  /usr/local/lib/python3.6/dist-packages
Please input the desired Python library path to use.  Default is [/usr/lib/python3/dist-packages]
Do you wish to build TensorFlow with jemalloc as malloc support? [Y/n]: y
jemalloc as malloc support will be enabled for TensorFlow.
Do you wish to build TensorFlow with Google Cloud Platform support? [Y/n]: n
No Google Cloud Platform support will be enabled for TensorFlow.
Do you wish to build TensorFlow with Hadoop File System support? [Y/n]: y
Hadoop File System support will be enabled for TensorFlow.
Do you wish to build TensorFlow with Amazon S3 File System support? [Y/n]: y
Amazon S3 File System support will be enabled for TensorFlow.
Do you wish to build TensorFlow with Apache Kafka Platform support? [Y/n]: y
Apache Kafka Platform support will be enabled for TensorFlow.
Do you wish to build TensorFlow with XLA JIT support? [y/N]: y
XLA JIT support will be enabled for TensorFlow.
Do you wish to build TensorFlow with GDR support? [y/N]: y
GDR support will be enabled for TensorFlow.
Do you wish to build TensorFlow with VERBS support? [y/N]: y
VERBS support will be enabled for TensorFlow.
Do you wish to build TensorFlow with OpenCL SYCL support? [y/N]: n
No OpenCL SYCL support will be enabled for TensorFlow.
Do you wish to build TensorFlow with CUDA support? [y/N]: y
CUDA support will be enabled for TensorFlow.
Please specify the CUDA SDK version you want to use, e.g. 7.0. [Leave empty to default to CUDA 9.0]:
Please specify the location where CUDA 9.0 toolkit is installed. Refer to README.md for more details. [Default is /usr/local/cuda]:
Please specify the cuDNN version you want to use. [Leave empty to default to cuDNN 7.0]:
Please specify the location where cuDNN 7 library is installed. Refer to README.md for more details. [Default is /usr/local/cuda]:
Do you wish to build TensorFlow with TensorRT support? [y/N]: n
No TensorRT support will be enabled for TensorFlow.
Please specify the NCCL version you want to use. [Leave empty to default to NCCL 1.3]:
Please specify a list of comma-separated Cuda compute capabilities you want to build with.
You can find the compute capability of your device at: https://developer.nvidia.com/cuda-gpus.
Please note that each additional compute capability significantly increases your build time and binary size. [Default is: 6.1,6.1]
Do you want to use clang as CUDA compiler? [y/N]:
nvcc will be used as CUDA compiler.
Please specify which gcc should be used by nvcc as the host compiler. [Default is /usr/bin/gcc]:
Do you wish to build TensorFlow with MPI support? [y/N]: n
No MPI support will be enabled for TensorFlow.
Please specify optimization flags to use during compilation when bazel option "--config=opt" is specified [Default is -march=native]:
Would you like to interactively configure ./WORKSPACE for Android builds? [y/N]: n
Not configuring the WORKSPACE for Android builds.
Preconfigured Bazel build configs. You can use any of the below by adding "--config=<>" to your build command. See tools/bazel.rc for more details.
        --config=mkl            # Build with MKL support.
        --config=monolithic     # Config for mostly static monolithic build.
Configuration finished
root@ubuntu:~/tensorflow-1.8.0#

This build (CPU only) completes:

  bazel build --jvmopt="-server -Xms20480m"  -c opt --copt=-msse3 --copt=-msse4.1 --copt=-msse4.2 --copt=-mavx --copt=-mavx2 --copt=-mfma //tensorflow/tools/pip_package:build_pip_package

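This one does not complete. Adding --config=cuda to enable the GPU produces the error below: the build runs out of memory. That is 30 GB of RAM plus swap; is something wrong somewhere?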

  bazel build --jvmopt="-server -Xms20480m"  -c opt --copt=-mavx --copt=-mavx2 --copt=-mfma --copt=-mfpmath=both --copt=-msse4.2 --config=cuda -k //tensorflow/tools/pip_package:build_pip_package

[13,053 / 14,131] 32 actions running
    Compiling tensorflow/core/kernels/strided_slice_op_gpu.cu.cc; 320s local
    Compiling tensorflow/core/kernels/strided_slice_op_gpu.cu.cc; 319s local
    Compiling tensorflow/core/kernels/batch_matmul_op_real.cc; 277s local
    Compiling tensorflow/core/kernels/batch_matmul_op_complex.cc; 277s local
    Compiling tensorflow/core/kernels/batch_matmul_op_real.cc; 277s local
    Compiling tensorflow/core/kernels/argmax_op.cc; 277s local
    Compiling tensorflow/core/kernels/argmax_op.cc; 277s local
    Compiling tensorflow/core/kernels/conv_ops.cc; 232s local ...
Server terminated abruptly (error code: 14, error message: '', log file: '/root/.cache/bazel/_bazel_root/a1181ec4a71ba55a9d58ce400c896b81/server/jvm.out')

root@ubuntu:~# free -g
              total        used        free      shared  buff/cache   available
Mem:             15          15           0           0           0           0
Swap:            14          14           0
root@ubuntu:~#
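The bazel log above shows 32 actions running at once on a machine with 15 GB of RAM, which leaves under 0.5 GB per action. I did not try this, but one common way to reduce peak memory is to cap parallelism with bazel's `--jobs` option. The sketch below only does the arithmetic; the 2 GB per compile action and the 2 GB reserve are guesses, not measurements, and `suggest_jobs` is a hypothetical helper.

```python
def suggest_jobs(total_ram_gb, per_action_gb=2.0, reserve_gb=2.0, max_jobs=32):
    """Estimate a bazel --jobs value from available RAM.

    per_action_gb and reserve_gb are assumptions, not measured values.
    """
    usable = max(total_ram_gb - reserve_gb, 0)
    return max(1, min(max_jobs, int(usable // per_action_gb)))


# 15 GB is the Mem total reported by free -g above.
print("--jobs=%d" % suggest_jobs(15))  # --jobs=6
```

Whether a lower job count alone would have avoided the crash here is untested; adding swap is what I actually did.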

root@ubuntu:~/tensorflow-1.8.0# swapoff /tmp/swap    # disable the swap file added earlier
dd if=/dev/zero of=/tmp/swap bs=1MB count=27648    # allocate a 27 GB file
mkswap /tmp/swap    # format it as swap
root@ubuntu:~# swapon /tmp/swap    # enable the swap file
swapon: /tmp/swap: insecure permissions 0644, 0600 suggested.
root@ubuntu:~# free -g    # verify; there was 3 GB of swap before
              total        used        free      shared  buff/cache   available
Mem:             15           0           0           0          14          14
Swap:            29           0          29
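A small check of the size that the dd command above actually writes. In GNU dd the suffix `MB` means 1000*1000 bytes while `M` means 1024*1024 bytes, so `bs=1MB count=27648` is slightly less than 27 GiB. The helper name `dd_size_bytes` is made up for this sketch.

```python
MB = 1000 * 1000    # GNU dd suffix "MB"
MIB = 1024 * 1024   # GNU dd suffix "M"
GIB = 1024 ** 3


def dd_size_bytes(block_size_bytes, count):
    """Total bytes written by dd for a given bs and count."""
    return block_size_bytes * count


size_mb = dd_size_bytes(MB, 27648)    # what bs=1MB count=27648 writes
size_mib = dd_size_bytes(MIB, 27648)  # what bs=1M count=27648 would write

print(size_mb)                     # 27648000000
print(round(size_mb / GIB, 2))     # 25.75
print(round(size_mib / GIB, 2))    # 27.0
```

So `bs=1M` would give exactly 27 GiB with the same count. The swapon warning in the transcript suggests mode 0600 for the file, which `chmod 600 /tmp/swap` would set.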

It finally succeeded:

INFO: Elapsed time: 576.296s, Critical Path: 467.47s
INFO: 766 processes: 766 local.

INFO: Build completed successfully, 898 total actions

Generate the tensorflow-1.8.0-cp36-cp36m-linux_x86_64.whl package:

bazel-bin/tensorflow/tools/pip_package/build_pip_package tensorflow_pkg_GPU
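The generated file name tensorflow-1.8.0-cp36-cp36m-linux_x86_64.whl follows the standard wheel naming scheme (name, version, Python tag, ABI tag, platform tag). The sketch below splits the name and compares the Python tag with the interpreter that will run pip; `parse_wheel_name` is a hypothetical helper written for this post, not a pip API.

```python
import sys


def parse_wheel_name(filename):
    """Split a wheel file name into its tags (optional build tag is ignored)."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel: " + filename)
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 5:
        name, version, py_tag, abi_tag, plat_tag = parts
    elif len(parts) == 6:
        name, version, _build, py_tag, abi_tag, plat_tag = parts
    else:
        raise ValueError("unexpected wheel name: " + filename)
    return {
        "name": name,
        "version": version,
        "python": py_tag,
        "abi": abi_tag,
        "platform": plat_tag,
    }


info = parse_wheel_name("tensorflow-1.8.0-cp36-cp36m-linux_x86_64.whl")
print(info)

# CPython tag of the running interpreter, e.g. "cp36" for Python 3.6.
running = "cp%d%d" % sys.version_info[:2]
print("wheel built for", info["python"], "- running", running)
```

If the two tags differ, pip would normally refuse the wheel as unsupported on that interpreter.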

Uninstall the previous version:

root@ubuntu:~/tensorflow-1.8.0/tensorflow_pkg_GPU# pip3 uninstall tensorflow

Cannot uninstall requirement tensorflow, not installed

Install:

root@ubuntu:~/tensorflow-1.8.0/tensorflow_pkg_GPU# pip3 install tensorflow-1.8.0-cp36-cp36m-linux_x86_64.whl
Processing ./tensorflow-1.8.0-cp36-cp36m-linux_x86_64.whl
Requirement already satisfied: gast>=0.2.0 in /usr/local/lib/python3.6/dist-packages (from tensorflow==1.8.0)
Requirement already satisfied: termcolor>=1.1.0 in /usr/local/lib/python3.6/dist-packages (from tensorflow==1.8.0)
Collecting protobuf>=3.4.0 (from tensorflow==1.8.0)
  Downloading https://files.pythonhosted.org/packages/fc/f0/db040681187496d10ac50ad167a8fd5f953d115b16a7085e19193a6abfd2/protobuf-3.6.0-cp36-cp36m-manylinux1_x86_64.whl (7.1MB)
    100% |████████████████████████████████| 7.1MB 30kB/s
Requirement already satisfied: numpy>=1.13.3 in /usr/local/lib/python3.6/dist-packages (from tensorflow==1.8.0)
Requirement already satisfied: six>=1.10.0 in /usr/lib/python3/dist-packages (from tensorflow==1.8.0)
Requirement already satisfied: tensorboard<1.9.0,>=1.8.0 in /usr/local/lib/python3.6/dist-packages (from tensorflow==1.8.0)
Requirement already satisfied: absl-py>=0.1.6 in /usr/local/lib/python3.6/dist-packages (from tensorflow==1.8.0)
Requirement already satisfied: astor>=0.6.0 in /usr/local/lib/python3.6/dist-packages (from tensorflow==1.8.0)
Requirement already satisfied: wheel>=0.26 in /usr/local/lib/python3.6/dist-packages (from tensorflow==1.8.0)
Requirement already satisfied: grpcio>=1.8.6 in /usr/local/lib/python3.6/dist-packages (from tensorflow==1.8.0)
Requirement already satisfied: setuptools in /usr/local/lib/python3.6/dist-packages (from protobuf>=3.4.0->tensorflow==1.8.0)
Requirement already satisfied: bleach==1.5.0 in /usr/local/lib/python3.6/dist-packages (from tensorboard<1.9.0,>=1.8.0->tensorflow==1.8.0)
Requirement already satisfied: werkzeug>=0.11.10 in /usr/local/lib/python3.6/dist-packages (from tensorboard<1.9.0,>=1.8.0->tensorflow==1.8.0)
Requirement already satisfied: html5lib==0.9999999 in /usr/local/lib/python3.6/dist-packages (from tensorboard<1.9.0,>=1.8.0->tensorflow==1.8.0)
Requirement already satisfied: markdown>=2.6.8 in /usr/local/lib/python3.6/dist-packages (from tensorboard<1.9.0,>=1.8.0->tensorflow==1.8.0)
Installing collected packages: protobuf, tensorflow
Successfully installed protobuf-3.6.0 tensorflow-1.8.0

Test file:

root@ubuntu:~# cat tf.py
import tensorflow as tf
hello = tf.constant('Hello, TensorFlow!')
sess = tf.Session()
print(sess.run(hello))

Test run:

root@ubuntu:~# python3 tf.py
2018-07-04 17:40:04.645445: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1356] Found device 0 with properties:
name: GeForce GTX 1080 Ti major: 6 minor: 1 memoryClockRate(GHz): 1.582
pciBusID: 0000:02:00.0
totalMemory: 10.92GiB freeMemory: 10.76GiB
2018-07-04 17:40:04.815237: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1356] Found device 1 with properties:
name: GeForce GTX 1080 Ti major: 6 minor: 1 memoryClockRate(GHz): 1.582
pciBusID: 0000:03:00.0
totalMemory: 10.92GiB freeMemory: 10.76GiB
2018-07-04 17:40:04.819954: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1435] Adding visible gpu devices: 0, 1
2018-07-04 17:40:05.475453: I tensorflow/core/common_runtime/gpu/gpu_device.cc:923] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-07-04 17:40:05.475524: I tensorflow/core/common_runtime/gpu/gpu_device.cc:929]      0 1
2018-07-04 17:40:05.475533: I tensorflow/core/common_runtime/gpu/gpu_device.cc:942] 0:   N Y
2018-07-04 17:40:05.475537: I tensorflow/core/common_runtime/gpu/gpu_device.cc:942] 1:   Y N
2018-07-04 17:40:05.476107: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1053] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 10413 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1080 Ti, pci bus id: 0000:02:00.0, compute capability: 6.1)
2018-07-04 17:40:05.680812: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1053] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:1 with 10413 MB memory) -> physical GPU (device: 1, name: GeForce GTX 1080 Ti, pci bus id: 0000:03:00.0, compute capability: 6.1)

b'Hello, TensorFlow!'
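Instead of reading the log by eye, the GPUs that TensorFlow sees can also be listed from Python. This is a minimal sketch, assuming TensorFlow 1.x where `tensorflow.python.client.device_lib.list_local_devices()` is available. `gpu_device_names` and `list_local_gpus` are names made up for this post, and I did not run this on the machine above.

```python
def gpu_device_names(devices):
    """Return the names of entries whose device_type is 'GPU'."""
    return [d.name for d in devices if d.device_type == "GPU"]


def list_local_gpus():
    """Ask TensorFlow for its local devices and keep only the GPUs."""
    # Imported here so gpu_device_names() works even without TensorFlow installed.
    from tensorflow.python.client import device_lib
    return gpu_device_names(device_lib.list_local_devices())


# Usage on a machine with the freshly built wheel installed:
# print(list_local_gpus())
```

An empty list from `list_local_gpus()` would suggest the installed build has no usable GPU support.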

Chinese-language guide:

https://www.tensorflow.org/install/install_sources

A translation of a foreign guide, which I did not follow:

https://www.52cv.net/?p=511

