Thursday, March 7, 2019

Building TensorFlow 1.13.1 (CUDA 10.0, cuDNN 7.5) on Fedora 29

TensorFlow 1.13.1 has been released, so here is a source build.
Note, however, that it does not build unless XLA is disabled (see the issue below).

On Fedora 29 it should now be possible to install with pip without building from source.
If you don't particularly care, installing with pip or setting things up on nvidia-docker is the better option.
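For reference, the pip route is a one-liner; a minimal sketch, assuming you want the GPU wheel pinned to the same version as this article:

```shell
# Inside a Python 3 virtualenv: install the prebuilt GPU wheel
# instead of building from source.
pip3 install tensorflow-gpu==1.13.1
```

This pulls the official manylinux wheel, so you still need a matching CUDA/cuDNN installed on the host.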

Environment

  • Fedora 29 x86_64
  • Python 3.7 (virtualenv)
  • CUDA 10.0 + cuDNN 7.5
  • GPU => NVIDIA GTX 1080

Preparation


Building Bazel

Bazel also has to be built from source (the repository version does not match the version TensorFlow expects).
Any Bazel version from 0.19.2 through 0.22.0 can build it.
This time I source-built 0.19.2.
See the following for how to do the source build.
Once the build finishes, add the output folder to PATH:

$ export PATH=$PATH:"<bazel dir>/output"
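In outline, a bootstrap from the Bazel distribution archive looks roughly like this (the URL and directory names here are my own sketch based on the version mentioned above, not taken from the missing link):

```shell
# Download the self-contained dist archive for Bazel 0.19.2 and bootstrap it.
wget https://github.com/bazelbuild/bazel/releases/download/0.19.2/bazel-0.19.2-dist.zip
mkdir bazel-0.19.2 && cd bazel-0.19.2
unzip ../bazel-0.19.2-dist.zip
bash ./compile.sh   # the resulting binary lands in output/bazel
```

The dist archive is required for bootstrapping; building from a plain git checkout needs an existing Bazel.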

Building TensorFlow

As on the official page: first create a dedicated virtual Python environment for TensorFlow with virtualenv (virtualenvwrapper) and install the required modules.

$ mkvirtualenv tf -p python3
$ pip3 install pip six numpy wheel mock
$ pip3 install -U keras_applications==1.0.6 --no-deps
$ pip install -U keras_preprocessing==1.0.5 --no-deps


Fetch the source from GitHub and extract it:

$ wget https://github.com/tensorflow/tensorflow/archive/v1.13.1.tar.gz
$ tar xf v1.13.1.tar.gz
$ cd tensorflow-1.13.1

In configure, the points to watch are: enable CUDA support, and give the path to a GCC 7 gcc as the host compiler.

$ ./configure 
WARNING: --batch mode is deprecated. Please instead explicitly shut down your Bazel server using the command "bazel shutdown".
INFO: Invocation ID: 557cf704-5d53-4a73-8118-153c5d42f71e
You have bazel 0.21.0- (@non-git) installed.
Please specify the location of python. [Default is /home/xxxxx/.virtualenvs/tf/bin/python]: 




Traceback (most recent call last):
  File "", line 1, in 
AttributeError: module 'site' has no attribute 'getsitepackages'
Found possible Python library paths:
  /home/xxxxx/.virtualenvs/tf/lib/python3.7/site-packages
Please input the desired Python library path to use.  Default is [/home/xxxxx/.virtualenvs/tf/lib/python3.7/site-packages]


Do you wish to build TensorFlow with XLA JIT support? [Y/n]: 
XLA JIT support will be enabled for TensorFlow.


Do you wish to build TensorFlow with OpenCL SYCL support? [y/N]: 
No OpenCL SYCL support will be enabled for TensorFlow.


Do you wish to build TensorFlow with ROCm support? [y/N]: 
No ROCm support will be enabled for TensorFlow.


Do you wish to build TensorFlow with CUDA support? [y/N]: Y
CUDA support will be enabled for TensorFlow.


Please specify the CUDA SDK version you want to use. [Leave empty to default to CUDA 10.0]:   




Please specify the location where CUDA 10.0 toolkit is installed. Refer to README.md for more details. [Default is /usr/local/cuda]: 




Please specify the cuDNN version you want to use. [Leave empty to default to cuDNN 7]: 




Please specify the location where cuDNN 7 library is installed. Refer to README.md for more details. [Default is /usr/local/cuda]: 




Do you wish to build TensorFlow with TensorRT support? [y/N]: 
No TensorRT support will be enabled for TensorFlow.


Please specify the locally installed NCCL version you want to use. [Default is to use https://github.com/nvidia/nccl]: 




Please specify a list of comma-separated Cuda compute capabilities you want to build with.
You can find the compute capability of your device at: https://developer.nvidia.com/cuda-gpus.
Please note that each additional compute capability significantly increases your build time and binary size. [Default is: 6.1]: 




Do you want to use clang as CUDA compiler? [y/N]: 
nvcc will be used as CUDA compiler.


Please specify which gcc should be used by nvcc as the host compiler. [Default is /usr/lib64/ccache/gcc]: /home/xxxxx/gcc/7.3/bin/gcc




Do you wish to build TensorFlow with MPI support? [y/N]: 
No MPI support will be enabled for TensorFlow.


Please specify optimization flags to use during compilation when bazel option "--config=opt" is specified [Default is -march=native -Wno-sign-compare]: 




Would you like to interactively configure ./WORKSPACE for Android builds? [y/N]: 
Not configuring the WORKSPACE for Android builds.


Preconfigured Bazel build configs. You can use any of the below by adding "--config=<>" to your build command. See .bazelrc for more details.
    --config=mkl             # Build with MKL support.
    --config=monolithic      # Config for mostly static monolithic build.
    --config=gdr             # Build with GDR support.
    --config=verbs           # Build with libverbs support.
    --config=ngraph          # Build with Intel nGraph support.
    --config=dynamic_kernels    # (Experimental) Build kernels into separate shared objects.
Preconfigured Bazel build configs to DISABLE default on features:
    --config=noaws           # Disable AWS S3 filesystem support.
    --config=nogcp           # Disable GCP support.
    --config=nohdfs          # Disable HDFS support.
    --config=noignite        # Disable Apacha Ignite support.
    --config=nokafka         # Disable Apache Kafka support.
    --config=nonccl          # Disable NVIDIA NCCL support.
Configuration finished
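The prompts above can also be pre-answered non-interactively: configure reads environment variables. A scripted run might look like the following (variable names as I understand them from TF 1.13's configure.py; verify against your checkout, and adjust the paths, which are placeholders):

```shell
# Pre-answer the configure prompts via environment variables.
export PYTHON_BIN_PATH="$HOME/.virtualenvs/tf/bin/python"
export TF_NEED_CUDA=1
export TF_CUDA_VERSION=10.0
export TF_CUDNN_VERSION=7
export TF_CUDA_COMPUTE_CAPABILITIES=6.1
export TF_ENABLE_XLA=0          # disable XLA, per the build issue mentioned above
export GCC_HOST_COMPILER_PATH="$HOME/gcc/7.3/bin/gcc"
./configure
```

Any variable left unset is still asked for interactively.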

Build:

$ bazel build --config=opt //tensorflow/tools/pip_package:build_pip_package

Then wait about two hours.
Once the build completes, create the pip package and install it:

$ ./bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg
$ pip3 install /tmp/tensorflow_pkg/tensorflow-1.13.1-cp37-cp37m-linux_x86_64.whl

Post-install checks

  • tf.__version__ is 1.13.1.
  • tf.Session() recognizes the GPU device.


$ python3
Python 3.7.2 (default, Jan 16 2019, 19:49:22) 
[GCC 8.2.1 20181215 (Red Hat 8.2.1-6)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf
>>> tf.__version__
'1.13.1'
>>> tf.Session()
2019-03-06 22:14:48.432753: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:998] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-03-06 22:14:48.433565: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1433] Found device 0 with properties: 
name: GeForce GTX 1070 major: 6 minor: 1 memoryClockRate(GHz): 1.7085
pciBusID: 0000:09:00.0
totalMemory: 7.93GiB freeMemory: 7.54GiB
2019-03-06 22:14:48.433588: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1512] Adding visible gpu devices: 0
2019-03-06 22:14:48.434683: I tensorflow/core/common_runtime/gpu/gpu_device.cc:984] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-03-06 22:14:48.434699: I tensorflow/core/common_runtime/gpu/gpu_device.cc:990]      0 
2019-03-06 22:14:48.434708: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1003] 0:   N 
2019-03-06 22:14:48.435178: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 7339 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1070, pci bus id: 0000:09:00.0, compute capability: 6.1)


That's all.
