Saturday, March 28, 2020

Building TensorFlow 1.15.2 (CUDA 10.2, cuDNN 7.6.5) on Fedora 31

Purpose


Continuing from the previous post, this builds TensorFlow 1.15.2 from source. Having both the 1.x and 2.x series available is convenient. Docker would do the job, but a natively installed build is easier to use day to day, so I build it myself.


Environment


Same as the previous post.
  • Fedora 31 x86_64
  • python 3.7.6(virtualenv)
  • CUDA 10.2 + cuDNN 7.6.5
  • CPU AMD Ryzen 7 1700
  • GPU GeForce GTX 1070


Prerequisites


These are the same as in the previous post, so they are omitted here.
See that post for details. Note, however, that Bazel 0.26.1 is used this time.
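
Before starting, it can be worth confirming that the Bazel on PATH really is that release. A minimal sketch; the expected version string is specific to this build:

# Check that "bazel version" reports the 0.26.1 release expected by TensorFlow 1.15.x.
import subprocess

out = subprocess.check_output(["bazel", "version"]).decode()
label = next(line for line in out.splitlines() if line.startswith("Build label:"))
print(label)  # e.g. "Build label: 0.26.1"
assert "0.26.1" in label, "unexpected Bazel version: " + label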


Building TensorFlow


Now for the main topic: building TensorFlow 1.15.2, including the points to watch that are specific to the 1.x series.

Setting up virtualenv


$ mkvirtualenv -p python3 tf1.15.2
$ pip install pip six numpy wheel setuptools mock 'future>=0.17.1'
$ pip install keras_applications --no-deps
$ pip install keras_preprocessing --no-deps


Build


After unpacking the source, delete line 116 of third_party/nccl/build_defs.bzl.tpl (the "--bin2c-path" argument shown in the diff below). With this change, the build works with CUDA 10.2 + cuDNN 7.6.5.

$ diff -up third_party/nccl/build_defs.bzl.tpl third_party/nccl/build_defs.bzl.tpl.org
--- third_party/nccl/build_defs.bzl.tpl 2020-03-28 21:06:18.313022179 +0900
+++ third_party/nccl/build_defs.bzl.tpl.org 2020-03-28 21:06:05.835143967 +0900
@@ -113,6 +113,7 @@ def _device_link_impl(ctx):
"--cmdline=--compile-only",
"--link",
"--compress-all",
+ "--bin2c-path=%s" % bin2c.dirname,
"--create=%s" % tmp_fatbin.path,
"--embedded-fatbin=%s" % fatbin_h.path,
] + images,
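
If you prefer to script the change instead of editing by hand, something along these lines works. A sketch; it assumes the current directory is the unpacked tensorflow-1.15.2 tree:

# Drop the '--bin2c-path' argument, which the fatbinary tool shipped with CUDA 10.2 rejects.
path = "third_party/nccl/build_defs.bzl.tpl"
with open(path) as f:
    lines = f.readlines()
with open(path, "w") as f:
    f.writelines(line for line in lines if "--bin2c-path" not in line)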


Run the build. Partway through, it fails with a gettid conflict (the sys_gettid fix is described below).
Note that /home/xxx/.cache/bazel/_bazel_xxx/aca0f394050ca263374306622f61644f is Bazel's cache (its output base); the exact path varies, so substitute the one shown in your own error message. Running "bazel info output_base" inside the source tree prints it.

$ wget https://github.com/tensorflow/tensorflow/archive/v1.15.2.tar.gz
$ tar xf v1.15.2.tar.gz
$ cd tensorflow-1.15.2/
$ bazel build \
--config=opt \
--config=v1 \
--cxxopt="-D_GLIBCXX_USE_CXX11_ABI=0" \
--config=cuda \
--config=nonccl \
--verbose_failures \
//tensorflow/tools/pip_package:build_pip_package

ERROR: /home/xxx/.cache/bazel/_bazel_xxx/aca0f394050ca263374306622f61644f/external/grpc/BUILD:507:1: C++ compilation of rule '@grpc//:gpr_base' failed (Exit 1): crosstool_wrapper_driver_is_not_gcc failed: error executing command
(cd /home/xxx/.cache/bazel/_bazel_xxx/aca0f394050ca263374306622f61644f/execroot/org_tensorflow && \
exec env - \
PATH=/home/xxx/.virtualenvs/tf2.2-rc1/bin:/home/xxx/.local/bin:/home/xxx/bin:/usr/share/Modules/bin:/usr/local/bin:/usr/local/sbin:/usr/bin:/usr/sbin \
PWD=/proc/self/cwd \
external/local_config_cuda/crosstool/clang/bin/crosstool_wrapper_driver_is_not_gcc -MD -MF bazel-out/host/bin/external/grpc/_objs/gpr_base/log_linux.pic.d '-frandom-seed=bazel-out/host/bin/external/grpc/_objs/gpr_base/log_linux.pic.o' '-DGRPC_ARES=0' -iquote external/grpc -iquote bazel-out/host/bin/external/grpc -isystem external/grpc/include -isystem bazel-out/host/bin/external/grpc/include '-std=c++11' -Wno-builtin-macro-redefined '-D__DATE__="redacted"' '-D__TIMESTAMP__="redacted"' '-D__TIME__="redacted"' -fPIC -U_FORTIFY_SOURCE '-D_FORTIFY_SOURCE=1' -fstack-protector -Wall -fno-omit-frame-pointer -no-canonical-prefixes -fno-canonical-system-headers -DNDEBUG -g0 -O2 -ffunction-sections -fdata-sections -g0 '-march=native' -g0 -c external/grpc/src/core/lib/gpr/log_linux.cc -o bazel-out/host/bin/external/grpc/_objs/gpr_base/log_linux.pic.o)
Execution platform: @bazel_tools//platforms:host_platform
external/grpc/src/core/lib/gpr/log_linux.cc:43:13: error: ambiguating new declaration of ‘long int gettid()’
static long gettid(void) { return syscall(__NR_gettid); }
^~~~~~
In file included from /usr/include/unistd.h:1170,
from external/grpc/src/core/lib/gpr/log_linux.cc:41:
/usr/include/bits/unistd_ext.h:34:16: note: old declaration ‘__pid_t gettid()’
extern __pid_t gettid (void) __THROW;
^~~~~~
external/grpc/src/core/lib/gpr/log_linux.cc:43:13: warning: ‘long int gettid()’ defined but not used [-Wunused-function]
static long gettid(void) { return syscall(__NR_gettid); }
^~~~~~
Target //tensorflow/tools/pip_package:build_pip_package failed to build
INFO: Elapsed time: 2333.653s, Critical Path: 275.29s
INFO: 7193 processes: 7193 local.
FAILED: Build did NOT complete successfully

Once the build error appears, edit the offending external/grpc/src/core/lib/gpr/log_linux.cc in the Bazel cache: rename "gettid" to "sys_gettid" so it no longer conflicts with the gettid() declared by glibc.

--- /home/nobuo/.cache/bazel/_bazel_xxx/aca0f394050ca263374306622f61644f/external/grpc/src/core/lib/gpr/log_linux.cc.org 2020-03-28 21:51:12.510625621 +0900
+++ /home/nobuo/.cache/bazel/_bazel_xxx/aca0f394050ca263374306622f61644f/external/grpc/src/core/lib/gpr/log_linux.cc 2020-03-28 21:50:36.429895235 +0900
@@ -40,7 +40,7 @@
#include <time.h>
#include <unistd.h>
-static long gettid(void) { return syscall(__NR_gettid); }
+static long sys_gettid(void) { return syscall(__NR_gettid); }
void gpr_log(const char* file, int line, gpr_log_severity severity,
const char* format, ...) {
@@ -70,7 +70,7 @@ void gpr_default_log(gpr_log_func_args*
gpr_timespec now = gpr_now(GPR_CLOCK_REALTIME);
struct tm tm;
static __thread long tid = 0;
- if (tid == 0) tid = gettid();
+ if (tid == 0) tid = sys_gettid();
timer = static_cast<time_t>(now.tv_sec);
final_slash = strrchr(args->file, '/');
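
The same rename can be scripted. A sketch; substitute the output_base path from your own error message for the placeholder below:

# Rename gRPC's local gettid() helper to sys_gettid so it no longer clashes with
# the gettid() that glibc 2.30+ declares in unistd.h; \b leaves __NR_gettid intact.
import re

cache = "/home/xxx/.cache/bazel/_bazel_xxx/aca0f394050ca263374306622f61644f"  # path from the error message
path = cache + "/external/grpc/src/core/lib/gpr/log_linux.cc"
with open(path) as f:
    src = f.read()
with open(path, "w") as f:
    f.write(re.sub(r"\bgettid\b", "sys_gettid", src))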

After the fix, rerun the build; once it completes, create the pip package and install it.

$ bazel build \
--config=opt \
--config=v1 \
--cxxopt="-D_GLIBCXX_USE_CXX11_ABI=0" \
--config=cuda \
--config=nonccl \
--verbose_failures \
//tensorflow/tools/pip_package:build_pip_package
$ ./bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg
$ pip install /tmp/tensorflow_pkg/tensorflow-1.15.2-cp37-cp37m-linux_x86_64.whl

Verifying the installation



  • tf.__version__ reports 1.15.2.
  • The GPU device is recognized (a quick check follows; the full interactive session is shown after it).
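
Besides the interactive session below, a quick check can be done with the TF 1.x test API (a sketch):

    # Quick TF 1.x sanity check: version string and GPU availability.
    import tensorflow as tf
    print(tf.__version__)              # expect '1.15.2'
    print(tf.test.is_gpu_available())  # expect True when CUDA and cuDNN are set up correctly
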
    $ python
    Python 3.7.6 (default, Jan 30 2020, 09:44:41)
    [GCC 9.2.1 20190827 (Red Hat 9.2.1-1)] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import tensorflow as tf
    >>> tf.__version__
    '1.15.2'
    >>> from tensorflow.python.client import device_lib
    >>> device_lib.list_local_devices()
    2020-03-28 21:40:55.825351: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2993890000 Hz
    2020-03-28 21:40:55.827348: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x55a6744ec540 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
    2020-03-28 21:40:55.827545: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
    2020-03-28 21:40:55.878251: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
    2020-03-28 21:40:56.375602: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
    2020-03-28 21:40:56.379531: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x55a67459c1d0 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
    2020-03-28 21:40:56.379602: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): GeForce GTX 1070, Compute Capability 6.1
    2020-03-28 21:40:56.379917: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
    2020-03-28 21:40:56.380660: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1639] Found device 0 with properties:
    name: GeForce GTX 1070 major: 6 minor: 1 memoryClockRate(GHz): 1.7085
    pciBusID: 0000:09:00.0
    2020-03-28 21:40:56.381097: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.2
    2020-03-28 21:40:56.383815: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
    2020-03-28 21:40:56.390800: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
    2020-03-28 21:40:56.391482: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
    2020-03-28 21:40:56.400449: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
    2020-03-28 21:40:56.407601: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10
    2020-03-28 21:40:56.424543: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
    2020-03-28 21:40:56.424919: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
    2020-03-28 21:40:56.426401: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
    2020-03-28 21:40:56.430032: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1767] Adding visible gpu devices: 0
    2020-03-28 21:40:56.432332: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.2
    2020-03-28 21:40:56.439218: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1180] Device interconnect StreamExecutor with strength 1 edge matrix:
    2020-03-28 21:40:56.439362: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1186] 0
    2020-03-28 21:40:56.439453: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1199] 0: N
    2020-03-28 21:40:56.442431: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
    2020-03-28 21:40:56.444038: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
    2020-03-28 21:40:56.444975: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1325] Created TensorFlow device (/device:GPU:0 with 6983 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1070, pci bus id: 0000:09:00.0, compute capability: 6.1)
    [name: "/device:CPU:0"
    device_type: "CPU"
    memory_limit: 268435456
    locality {
    }
    incarnation: 1257751299983349578
    , name: "/device:XLA_CPU:0"
    device_type: "XLA_CPU"
    memory_limit: 17179869184
    locality {
    }
    incarnation: 7465901702980950830
    physical_device_desc: "device: XLA_CPU device"
    , name: "/device:XLA_GPU:0"
    device_type: "XLA_GPU"
    memory_limit: 17179869184
    locality {
    }
    incarnation: 14187726965059283561
    physical_device_desc: "device: XLA_GPU device"
    , name: "/device:GPU:0"
    device_type: "GPU"
    memory_limit: 7323051623
    locality {
    bus_id: 1
    links {
    }
    }
    incarnation: 12656403030500723883
    physical_device_desc: "device: 0, name: GeForce GTX 1070, pci bus id: 0000:09:00.0, compute capability: 6.1"
    ]
    >>>

    OK!

    Building TensorFlow 2.2-rc1 (CUDA 10.2, cuDNN 7.6.5) on Fedora 31

    Purpose


    Build TensorFlow 2.2-rc1 from source on Fedora 31.
    This is preparation for the upcoming official 2.2 release, and a memo for future reference.


    Environment


    • Fedora 31 x86_64
    • python 3.7.6(virtualenv)
    • CUDA 10.2 + cuDNN 7.6.5
    • CPU AMD Ryzen 7 1700
    • GPU GeForce GTX 1070


    Prerequisites


    Building GCC 8


    As noted in the previous article, the newest GCC that CUDA 10.2 supports is GCC 8, so the build does not work with Fedora 31's GCC 9. Therefore, build GCC 8 first.
    See the earlier article for details on building GCC.


    Downloading and building GCC 8.4 from source


    Build it as follows (download_prerequisites is run inside the source tree, and configure from a separate build directory):
    # GCC build from source
    $ wget https://ftp.gnu.org/gnu/gcc/gcc-8.4.0/gcc-8.4.0.tar.gz
    $ tar xf gcc-8.4.0.tar.gz
    $ cd gcc-8.4.0/
    $ ./contrib/download_prerequisites
    $ mkdir build && cd build
    $ ../configure \
    --enable-bootstrap \
    --enable-languages=c,c++ \
    --prefix=/home/xxx/gcc/8.4 \
    --enable-shared \
    --enable-threads=posix \
    --enable-checking=release \
    --disable-multilib \
    --with-system-zlib \
    --enable-__cxa_atexit \
    --disable-libunwind-exceptions \
    --enable-gnu-unique-object \
    --enable-linker-build-id \
    --with-gcc-major-version-only \
    --with-linker-hash-style=gnu \
    --enable-plugin \
    --enable-initfini-array \
    --with-isl \
    --enable-libmpx \
    --enable-gnu-indirect-function \
    --build=x86_64-redhat-linux
    $ make -j$(nproc)
    $ make install
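
    Once "make install" finishes, a quick sanity check that the new compiler is in place (a sketch; /home/xxx/gcc/8.4 is the prefix used above):

    # Print the version banner of the freshly installed GCC 8.4.
    import subprocess

    out = subprocess.check_output(["/home/xxx/gcc/8.4/bin/gcc", "--version"]).decode()
    print(out.splitlines()[0])  # expect something like "gcc (GCC) 8.4.0"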


    Creating the specs file


    Modify the specs file so that programs built with the newly compiled GCC 8 pick up the matching shared runtime library (libstdc++.so) via an rpath.
    $ /home/xxx/gcc/8.4/bin/gcc -dumpspecs > specs
    $ vi specs
    # before
    *link_libgcc:
    %D
    # after
    *link_libgcc:
    %{!static:%{!static-libgcc:-rpath /home/xxx/gcc/8.4/lib64/}} %D
    $ mv specs /home/xxx/gcc/8.4/lib/gcc/x86_64-redhat-linux/8/
    $ diff -up specs /home/xxx/gcc/8.4/lib/gcc/x86_64-redhat-linux/8/specs
    --- specs 2020-03-22 15:27:41.626467627 +0900
    +++ /home/xxx/gcc/8.4/lib/gcc/x86_64-redhat-linux/8/specs 2020-03-22 15:25:53.875378926 +0900
    @@ -107,7 +107,7 @@ collect2
    *link_libgcc:
    -%D
    +%{!static:%{!static-libgcc:-rpath /home/xxx/gcc/8.4/lib64/}} %D
    *md_exec_prefix:
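
    To confirm the patched specs take effect, build a trivial program with the new GCC and look for the rpath in its dynamic section (a sketch; paths follow the setup above):

    # Compile a minimal C program and check that an RPATH/RUNPATH entry was embedded.
    import subprocess, tempfile, os

    src = tempfile.NamedTemporaryFile(suffix=".c", delete=False)
    src.write(b"int main(void) { return 0; }\n")
    src.close()
    exe = src.name + ".out"
    subprocess.check_call(["/home/xxx/gcc/8.4/bin/gcc", src.name, "-o", exe])
    dyn = subprocess.check_output(["readelf", "-d", exe]).decode()
    print([line for line in dyn.splitlines() if "PATH" in line])  # expect /home/xxx/gcc/8.4/lib64/
    os.remove(src.name)
    os.remove(exe)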


    Configuring Environment Modules


    Make GCC 8 switchable with Environment Modules by creating a gcc8x file under /etc/modulefiles; once it exists, "module load gcc8x" puts the new toolchain first in PATH.
    (My machine also has source-built GCC 5 and GCC 7, hence the conflict line below.)
    #%Module 1.0
    #
    # gcc-8.X module for use with 'environment-modules' package:
    #
    conflict gcc5x gcc7x
    prepend-path PATH /home/xxx/gcc/8.4/bin/


    Building Bazel


    Build Bazel from source (the version in the distribution repositories does not always match the version TensorFlow expects). TensorFlow 2.2-rc1 needs Bazel 2.0.0, so I built that from source.

    See the official instructions for building Bazel from source; I simply followed them, so the details are omitted.


    Installing CUDA and cuDNN


    Install CUDA 10.2 and cuDNN 7.6.5. CUDA is installed by following the RPM Fusion Howto/CUDA guide; cuDNN is downloaded from NVIDIA's download site and installed manually.
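
    To double-check which CUDA runtime is picked up, the runtime library can be queried directly (a sketch; it assumes /usr/local/cuda/lib64 is on the loader path, e.g. via ld.so.conf or LD_LIBRARY_PATH):

    # Ask libcudart for its version; 10020 corresponds to CUDA 10.2.
    import ctypes

    cudart = ctypes.CDLL("libcudart.so.10.2")  # library name/path is an assumption for this setup
    version = ctypes.c_int()
    cudart.cudaRuntimeGetVersion(ctypes.byref(version))
    print(version.value)  # expect 10020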


    Building TensorFlow


    Now for the main topic: building TensorFlow 2.2-rc1.


    Setting up virtualenv


    First, create a Python virtual environment for TensorFlow with virtualenv (virtualenvwrapper) and install the required modules.
    $ mkvirtualenv -p python3 tf2.2-rc1
    $ pip install pip six numpy wheel setuptools mock 'future>=0.17.1'
    $ pip install keras_applications --no-deps
    $ pip install keras_preprocessing --no-deps


    Build


    Fetch the source from GitHub, run the configure script, and build. Points to note:

    • Enable CUDA support.
    • Specify the path to the GCC 8 gcc as the host compiler.
    • Add "--config=v2" and "--config=nonccl" to the build options (the NCCL library was installed, but the build failed with it enabled, so it is disabled here).
    • In a Japanese locale, the fetch of the local_config_cuda repository fails with a UnicodeDecodeError, so set "LANG=C" before building to avoid it; a short illustration of the failure mode follows this list.
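
    The locale problem comes down to TensorFlow's check_cuda_lib helper decoding objdump output as ASCII (see the traceback below); any localized, multibyte text in that output then breaks it. A minimal illustration of the failure mode, not TensorFlow's actual script:

    # Decoding localized (e.g. Japanese) objdump output as ASCII fails just like the log below.
    "ファイル形式".encode("utf-8").decode("ascii")  # raises UnicodeDecodeError: byte 0xe3 is not ASCII
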
    $ wget https://github.com/tensorflow/tensorflow/archive/v2.2.0-rc2.tar.gz
    $ tar xf v2.2.0-rc2.tar.gz
    $ cd tensorflow-2.2.0-rc2/
    $ ./configure
    You have bazel 2.0.0- (@non-git) installed.
    Please specify the location of python. [Default is /home/xxx/.virtualenvs/tf2.2-rc1/bin/python]:
    Found possible Python library paths:
    /home/xxx/.virtualenvs/tf2.2-rc1/lib/python3.7/site-packages
    Please input the desired Python library path to use. Default is [/home/xxx/.virtualenvs/tf2.2-rc1/lib/python3.7/site-packages]
    Do you wish to build TensorFlow with OpenCL SYCL support? [y/N]:
    No OpenCL SYCL support will be enabled for TensorFlow.
    Do you wish to build TensorFlow with ROCm support? [y/N]:
    No ROCm support will be enabled for TensorFlow.
    Do you wish to build TensorFlow with CUDA support? [y/N]: Y
    CUDA support will be enabled for TensorFlow.
    Do you wish to build TensorFlow with TensorRT support? [y/N]:
    No TensorRT support will be enabled for TensorFlow.
    Found CUDA 10.2 in:
    /usr/local/cuda/lib64
    /usr/local/cuda/include
    Found cuDNN 7 in:
    /usr/local/cuda/lib64
    /usr/local/cuda/include
    Please specify a list of comma-separated CUDA compute capabilities you want to build with.
    You can find the compute capability of your device at: https://developer.nvidia.com/cuda-gpus.
    Please note that each additional compute capability significantly increases your build time and binary size, and that TensorFlow only supports compute capabilities >= 3.5 [Default is: 6.1]:
    Do you want to use clang as CUDA compiler? [y/N]:
    nvcc will be used as CUDA compiler.
    Please specify which gcc should be used by nvcc as the host compiler. [Default is /usr/bin/gcc]: /home/xxx/gcc/8.4/bin/gcc
    Please specify optimization flags to use during compilation when bazel option "--config=opt" is specified [Default is -march=native -Wno-sign-compare]:
    Would you like to interactively configure ./WORKSPACE for Android builds? [y/N]:
    Not configuring the WORKSPACE for Android builds.
    Preconfigured Bazel build configs. You can use any of the below by adding "--config=<>" to your build command. See .bazelrc for more details.
    --config=mkl # Build with MKL support.
    --config=monolithic # Config for mostly static monolithic build.
    --config=ngraph # Build with Intel nGraph support.
    --config=numa # Build with NUMA support.
    --config=dynamic_kernels # (Experimental) Build kernels into separate shared objects.
    --config=v2 # Build TensorFlow 2.x instead of 1.x.
    Preconfigured Bazel build configs to DISABLE default on features:
    --config=noaws # Disable AWS S3 filesystem support.
    --config=nogcp # Disable GCP support.
    --config=nohdfs # Disable HDFS support.
    --config=nonccl # Disable NVIDIA NCCL support.
    Configuration finished
    $ LANG=C
    $ bazel build \
    --config=opt \
    --config=v2 \
    --cxxopt="-D_GLIBCXX_USE_CXX11_ABI=0" \
    --config=cuda \
    --config=nonccl \
    --verbose_failures \
    //tensorflow/tools/pip_package:build_pip_package
    $ ./bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg
    $ pip install /tmp/tensorflow_pkg/tensorflow-2.2.0rc1-cp37-cp37m-linux_x86_64.whl

    For reference, the error seen without the LANG=C workaround was the following.

    ERROR: An error occurred during the fetch of repository 'local_config_cuda':
    Traceback (most recent call last):
    File "/home/nobuo/tensorflow-2.2.0-rc2/third_party/gpus/cuda_configure.bzl", line 1210
    _create_local_cuda_repository(<1 more arguments>)
    File "/home/nobuo/tensorflow-2.2.0-rc2/third_party/gpus/cuda_configure.bzl", line 934, in _create_local_cuda_repository
    _find_libs(repository_ctx, <2 more arguments>)
    File "/home/nobuo/tensorflow-2.2.0-rc2/third_party/gpus/cuda_configure.bzl", line 577, in _find_libs
    _check_cuda_libs(repository_ctx, <2 more arguments>)
    File "/home/nobuo/tensorflow-2.2.0-rc2/third_party/gpus/cuda_configure.bzl", line 479, in _check_cuda_libs
    execute(repository_ctx, <1 more arguments>)
    File "/home/nobuo/tensorflow-2.2.0-rc2/third_party/remote_config/common.bzl", line 208, in execute
    fail(<1 more arguments>)
    Repository command failed
    Traceback (most recent call last):
    File "script.py", line 88, in <module>
    main()
    File "script.py", line 77, in main
    check_cuda_lib(path, check_soname=args[i + 1] == "True")
    File "script.py", line 62, in check_cuda_lib
    output = subprocess.check_output([objdump, "-p", path]).decode("ascii")
    UnicodeDecodeError: 'ascii' codec can't decode byte 0xe3 in position 46: ordinal not in range(128)
    ERROR: Skipping '//tensorflow/tools/pip_package:build_pip_package': no such package '@local_config_cuda//cuda': Traceback (most recent call last):
    File "/home/nobuo/tensorflow-2.2.0-rc2/third_party/gpus/cuda_configure.bzl", line 1210
    _create_local_cuda_repository(<1 more arguments>)
    File "/home/nobuo/tensorflow-2.2.0-rc2/third_party/gpus/cuda_configure.bzl", line 934, in _create_local_cuda_repository
    _find_libs(repository_ctx, <2 more arguments>)
    File "/home/nobuo/tensorflow-2.2.0-rc2/third_party/gpus/cuda_configure.bzl", line 577, in _find_libs
    _check_cuda_libs(repository_ctx, <2 more arguments>)
    File "/home/nobuo/tensorflow-2.2.0-rc2/third_party/gpus/cuda_configure.bzl", line 479, in _check_cuda_libs
    execute(repository_ctx, <1 more arguments>)
    File "/home/nobuo/tensorflow-2.2.0-rc2/third_party/remote_config/common.bzl", line 208, in execute
    fail(<1 more arguments>)
    Repository command failed
    Traceback (most recent call last):
    File "script.py", line 88, in <module>
    main()
    File "script.py", line 77, in main
    check_cuda_lib(path, check_soname=args[i + 1] == "True")
    File "script.py", line 62, in check_cuda_lib
    output = subprocess.check_output([objdump, "-p", path]).decode("ascii")
    UnicodeDecodeError: 'ascii' codec can't decode byte 0xe3 in position 46: ordinal not in range(128)
    WARNING: Target pattern parsing failed.
    ERROR: no such package '@local_config_cuda//cuda': Traceback (most recent call last):
    File "/home/nobuo/tensorflow-2.2.0-rc2/third_party/gpus/cuda_configure.bzl", line 1210
    _create_local_cuda_repository(<1 more arguments>)
    File "/home/nobuo/tensorflow-2.2.0-rc2/third_party/gpus/cuda_configure.bzl", line 934, in _create_local_cuda_repository
    _find_libs(repository_ctx, <2 more arguments>)
    File "/home/nobuo/tensorflow-2.2.0-rc2/third_party/gpus/cuda_configure.bzl", line 577, in _find_libs
    _check_cuda_libs(repository_ctx, <2 more arguments>)
    File "/home/nobuo/tensorflow-2.2.0-rc2/third_party/gpus/cuda_configure.bzl", line 479, in _check_cuda_libs
    execute(repository_ctx, <1 more arguments>)
    File "/home/nobuo/tensorflow-2.2.0-rc2/third_party/remote_config/common.bzl", line 208, in execute
    fail(<1 more arguments>)
    Repository command failed
    Traceback (most recent call last):
    File "script.py", line 88, in <module>
    main()
    File "script.py", line 77, in main
    check_cuda_lib(path, check_soname=args[i + 1] == "True")
    File "script.py", line 62, in check_cuda_lib
    output = subprocess.check_output([objdump, "-p", path]).decode("ascii")
    UnicodeDecodeError: 'ascii' codec can't decode byte 0xe3 in position 46: ordinal not in range(128)
    INFO: Elapsed time: 17.516s
    INFO: 0 processes.
    FAILED: Build did NOT complete successfully (0 packages loaded)
    currently loading: tensorflow/tools/pip_package


    Verifying the installation


    • tf.__version__ reports 2.2.0-rc1.
    • The GPU device is recognized (a quick check follows; the full interactive session is shown after it).
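
    Besides device_lib, TF 2.x's own API gives a terser check (a sketch):

    # Quick TF 2.x sanity check: version string and visible GPUs.
    import tensorflow as tf
    print(tf.__version__)                          # expect '2.2.0-rc1'
    print(tf.config.list_physical_devices("GPU"))  # expect one /physical_device:GPU:0 entry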

    $ python
    Python 3.7.6 (default, Jan 30 2020, 09:44:41)
    [GCC 9.2.1 20190827 (Red Hat 9.2.1-1)] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import tensorflow as tf
    2020-03-28 20:33:39.876383: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.2
    >>> tf.__version__
    '2.2.0-rc1'
    >>> from tensorflow.python.client import device_lib
    >>> device_lib.list_local_devices()
    2020-03-28 20:33:53.907523: I tensorflow/core/platform/profile_utils/cpu_utils.cc:102] CPU Frequency: 2993890000 Hz
    2020-03-28 20:33:53.909094: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x7f4490000b60 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
    2020-03-28 20:33:53.909175: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
    2020-03-28 20:33:53.913388: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
    2020-03-28 20:33:54.328519: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
    2020-03-28 20:33:54.329111: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x5636ff009c50 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
    2020-03-28 20:33:54.329200: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): GeForce GTX 1070, Compute Capability 6.1
    2020-03-28 20:33:54.330516: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
    2020-03-28 20:33:54.331671: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1561] Found device 0 with properties:
    pciBusID: 0000:09:00.0 name: GeForce GTX 1070 computeCapability: 6.1
    coreClock: 1.7085GHz coreCount: 15 deviceMemorySize: 7.93GiB deviceMemoryBandwidth: 238.66GiB/s
    2020-03-28 20:33:54.331729: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.2
    2020-03-28 20:33:54.383662: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
    2020-03-28 20:33:54.414185: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
    2020-03-28 20:33:54.421131: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
    2020-03-28 20:33:54.487881: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
    2020-03-28 20:33:54.495069: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10
    2020-03-28 20:33:54.600519: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
    2020-03-28 20:33:54.600790: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
    2020-03-28 20:33:54.602014: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
    2020-03-28 20:33:54.602748: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1703] Adding visible gpu devices: 0
    2020-03-28 20:33:54.603373: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.2
    2020-03-28 20:33:55.428570: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1102] Device interconnect StreamExecutor with strength 1 edge matrix:
    2020-03-28 20:33:55.428613: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1108] 0
    2020-03-28 20:33:55.428620: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1121] 0: N
    2020-03-28 20:33:55.429484: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
    2020-03-28 20:33:55.430019: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
    2020-03-28 20:33:55.430531: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1247] Created TensorFlow device (/device:GPU:0 with 7019 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1070, pci bus id: 0000:09:00.0, compute capability: 6.1)
    [name: "/device:CPU:0"
    device_type: "CPU"
    memory_limit: 268435456
    locality {
    }
    incarnation: 1662223253576067142
    , name: "/device:XLA_CPU:0"
    device_type: "XLA_CPU"
    memory_limit: 17179869184
    locality {
    }
    incarnation: 987987008933275735
    physical_device_desc: "device: XLA_CPU device"
    , name: "/device:XLA_GPU:0"
    device_type: "XLA_GPU"
    memory_limit: 17179869184
    locality {
    }
    incarnation: 6863376769852385113
    physical_device_desc: "device: XLA_GPU device"
    , name: "/device:GPU:0"
    device_type: "GPU"
    memory_limit: 7360981248
    locality {
    bus_id: 1
    links {
    }
    }
    incarnation: 11906225243831981649
    physical_device_desc: "device: 0, name: GeForce GTX 1070, pci bus id: 0000:09:00.0, compute capability: 6.1"
    ]
    >>>

    OK!