TensorFlow : Install Docker Image (GPU) (2022/08/10)
Install TensorFlow, the machine learning library.
In this example, install the official TensorFlow Docker image with GPU support and run it in containers.
[1] It is necessary to install the NVIDIA graphics driver and the NVIDIA Container Toolkit on the host beforehand.
[2] Install and use the TensorFlow Docker image (GPU) as the root user account. If you'd like to run it as a common user, refer to section [4].
# Pull TensorFlow GPU image
[root@dlp ~]# podman pull docker.io/tensorflow/tensorflow:latest-gpu
[root@dlp ~]# podman images
REPOSITORY                        TAG         IMAGE ID      CREATED       SIZE
docker.io/tensorflow/tensorflow   latest-gpu  c8d4e2940044  36 hours ago  6 GB
docker.io/tensorflow/tensorflow   latest      976c17ec6daa  36 hours ago  1.48 GB

# verify to run [nvidia-smi]
[root@dlp ~]# podman run -e NVIDIA_VISIBLE_DEVICES=all --rm docker.io/tensorflow/tensorflow:latest-gpu nvidia-smi
Thu Sep  8 08:08:15 2022
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 515.65.01    Driver Version: 515.65.01    CUDA Version: 11.7     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:05:00.0 Off |                  N/A |
|  0%   53C    P5    15W / 120W |      0MiB /  6144MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

# verify to run TensorFlow
[root@dlp ~]# podman run -e NVIDIA_VISIBLE_DEVICES=all --rm docker.io/tensorflow/tensorflow:latest-gpu \
python -c "import tensorflow as tf; print(tf.reduce_sum(tf.random.normal([1000, 1000])))"
2022-09-08 08:09:00.910339: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-09-08 08:09:01.070418: E tensorflow/stream_executor/cuda/cuda_blas.cc:2981] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2022-09-08 08:09:02.828431: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:980] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
.....
.....
2022-09-08 08:09:03.512262: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1616] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 5381 MB memory: -> device: 0, name: NVIDIA GeForce GTX 1060 6GB, pci bus id: 0000:05:00.0, compute capability: 6.1
tf.Tensor(-194.77834, shape=(), dtype=float32)
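As an additional check (not part of the original output above), you can also list the GPU devices TensorFlow itself detects inside the container; [tf.config.list_physical_devices] is a standard TensorFlow 2.x API, and the command below is only a sketch reusing the same podman invocation.

# (additional check) list the GPU devices TensorFlow detects inside the container
[root@dlp ~]# podman run -e NVIDIA_VISIBLE_DEVICES=all --rm docker.io/tensorflow/tensorflow:latest-gpu \
python -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"

If the GPU is usable, the printed list contains at least one [/physical_device:GPU:0] entry; an empty list [] means the container only sees the CPU.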
[3] If SELinux is enabled, change the policy as follows.
[root@dlp ~]# vi my-python.te
# create new
module my-python 1.0;

require {
        type container_t;
        type xserver_misc_device_t;
        type device_t;
        class chr_file { getattr ioctl map open read write };
}

#============= container_t ==============
allow container_t device_t:chr_file map;
allow container_t device_t:chr_file { getattr ioctl open read write };
allow container_t xserver_misc_device_t:chr_file map;

[root@dlp ~]# checkmodule -m -M -o my-python.mod my-python.te
[root@dlp ~]# semodule_package --outfile my-python.pp --module my-python.mod
[root@dlp ~]# semodule -i my-python.pp
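If the module above does not cover every denial on your system, an alternative (not in the original steps) is to generate the module directly from the audit log with [audit2allow], assuming the audit2allow tool from policycoreutils is installed; [my-python] below just reuses the same module name.

# (alternative) build the module from the actual AVC denials recorded in the audit log
[root@dlp ~]# grep container_t /var/log/audit/audit.log | audit2allow -M my-python
[root@dlp ~]# semodule -i my-python.pp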
[4] To run CUDA and TensorFlow containers as a common user, it is necessary to change the settings below.
[root@dlp ~]# vi /etc/nvidia-container-runtime/config.toml
disable-require = false
#swarm-resource = "DOCKER_RESOURCE_GPU"

[nvidia-container-cli]
#root = "/run/nvidia/driver"
#path = "/usr/bin/nvidia-container-cli"
environment = []
#debug = "/var/log/nvidia-container-toolkit.log"
#ldcache = "/etc/ld.so.cache"
load-kmods = true
# uncomment and change to [true]
no-cgroups = true
#user = "root:video"
ldconfig = "@/sbin/ldconfig"
#alpha-merge-visible-devices-envvars = false

[nvidia-container-runtime]
#debug = "/var/log/nvidia-container-runtime.log"
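The only functional change in the file above is enabling [no-cgroups]. If you prefer to apply it non-interactively, a sed one-liner like the sketch below works, assuming the stock config still contains the commented default line [#no-cgroups = false].

# (example) enable [no-cgroups] non-interactively; assumes the default line is [#no-cgroups = false]
[root@dlp ~]# sed -i 's/^#no-cgroups = false/no-cgroups = true/' /etc/nvidia-container-runtime/config.toml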
# verify : login as a common user and run containers
[cent@dlp ~]$ podman pull docker.io/tensorflow/tensorflow:latest-gpu
[cent@dlp ~]$ podman images
REPOSITORY                        TAG             IMAGE ID      CREATED       SIZE
docker.io/tensorflow/tensorflow   latest-gpu      c8d4e2940044  36 hours ago  6 GB
docker.io/tensorflow/tensorflow   latest-jupyter  c94342dbd1e8  36 hours ago  1.72 GB
docker.io/tensorflow/tensorflow   latest          976c17ec6daa  36 hours ago  1.48 GB

# verify to run [nvidia-smi]
[cent@dlp ~]$ podman run --rm --security-opt=label=disable \
--hooks-dir=/usr/share/containers/oci/hooks.d/ \
docker.io/tensorflow/tensorflow:latest-gpu /usr/bin/nvidia-smi
Thu Sep  8 08:20:51 2022
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 515.65.01    Driver Version: 515.65.01    CUDA Version: 11.7     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:05:00.0 Off |                  N/A |
|  0%   53C    P5    15W / 120W |      0MiB /  6144MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

# verify to run Hello World test script on container
[cent@dlp ~]$ podman run -e NVIDIA_VISIBLE_DEVICES=all --rm --security-opt=label=disable \
--hooks-dir=/usr/share/containers/oci/hooks.d/ \
docker.io/tensorflow/tensorflow:latest-gpu \
python -c "import tensorflow as tf; hello = tf.constant('Hello, TensorFlow World'); tf.print(hello)"
2022-09-08 08:21:43.417819: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-09-08 08:21:43.580492: E tensorflow/stream_executor/cuda/cuda_blas.cc:2981] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2022-09-08 08:21:45.287472: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:980] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
.....
.....
2022-09-08 08:21:45.892873: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1616] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 5381 MB memory: -> device: 0, name: NVIDIA GeForce GTX 1060 6GB, pci bus id: 0000:05:00.0, compute capability: 6.1
Hello, TensorFlow World
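Beyond the one-liner test above, a common next step is to mount a working directory from the host and run your own script on the GPU as the same common user. The sketch below is only an example of that pattern; [train.py] is a hypothetical file name and [/workspace] is an arbitrary mount point.

# (example) mount the current directory and run a local script on the GPU ([train.py] is hypothetical)
[cent@dlp ~]$ podman run -e NVIDIA_VISIBLE_DEVICES=all --rm --security-opt=label=disable \
--hooks-dir=/usr/share/containers/oci/hooks.d/ \
-v $(pwd):/workspace -w /workspace \
docker.io/tensorflow/tensorflow:latest-gpu \
python ./train.py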