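This is a question about processing speed when running machine learning on a GPU.
Previously, I ran a simple TensorFlow program on a GTX 1060,
and it took about 1 minute to execute.
I am now using an RTX 2080, but the same program
takes about 7 minutes.
I had assumed the newer GPU was the more powerful of the two,
so I expected the run to be considerably faster.
Is this a problem with the GPU configuration?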
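To narrow down where the time goes, one option is to time the stages of the script separately, for example setup versus the first and later training steps. The following is only a minimal sketch: `timed`, `build_graph_and_session`, and `run_one_step` are hypothetical names, and the two stage functions are placeholders that just sleep. In the real script they would be replaced by the actual TensorFlow calls.

```python
import time


def timed(label, fn, *args, **kwargs):
    """Call fn, print how long it took, and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    elapsed = time.perf_counter() - start
    print(f"{label}: {elapsed:.3f} s")
    return result, elapsed


# Placeholder stages (assumptions for illustration only).
# Replace the bodies with the real graph/session setup and training step.
def build_graph_and_session():
    time.sleep(0.01)  # stands in for graph construction and session creation
    return "session"


def run_one_step(session):
    time.sleep(0.01)  # stands in for one training step
    return None


session, t_setup = timed("setup", build_graph_and_session)
_, t_first = timed("first step", run_one_step, session)
_, t_later = timed("second step", run_one_step, session)
```

If most of the 7 minutes shows up in setup or in the first step only, the slowdown is a one-off startup cost. If every step is slow, the per-step computation itself is the issue.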
The GPU models, driver versions, and other details are listed below.
GTX1060
01:00.0 VGA compatible controller: NVIDIA Corporation Device 1c3 (rev a1)
01:00.1 Audio device: NVIDIA Corporation Device 10f1 (rev a1)
NVRM version: NVIDIA UNIX x86_64 Kernel Module 384.130 Wed Mar 21 03:37:26 PDT 2018
GCC version: gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.11)

memory:6072MiB
RTX2080
01:00.0 VGA compatible controller: NVIDIA Corporation GV102 (rev a1)
01:00.1 Audio device: NVIDIA Corporation Device 10f7 (rev a1)
01:00.2 USB controller: NVIDIA Corporation Device 1ad6 (rev a1)
01:00.3 Serial bus controller [0c80]: NVIDIA Corporation Device 1ad7 (rev a1)

NVRM version: NVIDIA UNIX x86_64 Kernel Module 410.104 Tue Feb 5 22:58:30 CST 2019
GCC version: gcc version 7.4.0 (Ubuntu 7.4.0-1ubuntu1~18.04.1)

memory:10989MiB
The logs from when TensorFlow detected each GPU are shown below.
GTX1060
>>> device_lib.list_local_devices()
2019-09-19 13:11:36.656233: I tensorflow/core/platform/cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2019-09-19 13:11:36.799838: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:898] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-09-19 13:11:36.800313: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1356] Found device 0 with properties:
name: GeForce GTX 1060 6GB major: 6 minor: 1 memoryClockRate(GHz): 1.7085
pciBusID: 0000:01:00.0
totalMemory: 5.93GiB freeMemory: 5.86GiB
2019-09-19 13:11:36.800345: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1435] Adding visible gpu devices: 0
2019-09-19 13:11:40.913458: I tensorflow/core/common_runtime/gpu/gpu_device.cc:923] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-09-19 13:11:40.913573: I tensorflow/core/common_runtime/gpu/gpu_device.cc:929] 0
2019-09-19 13:11:40.913613: I tensorflow/core/common_runtime/gpu/gpu_device.cc:942] 0: N
2019-09-19 13:11:40.969819: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1053] Created TensorFlow device (/device:GPU:0 with 5641 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1060 6GB, pci bus id: 0000:01:00.0, compute capability: 6.1)
[name: "/device:CPU:0"
device_type: "CPU"
memory_limit: 268435456
locality {
}
incarnation: 2725849591917179920
, name: "/device:GPU:0"
device_type: "GPU"
memory_limit: 5915803648
locality {
  bus_id: 1
  links {
  }
}
incarnation: 17210757529656703054
physical_device_desc: "device: 0, name: GeForce GTX 1060 6GB, pci bus id: 0000:01:00.0, compute capability: 6.1"
]
RTX2080
>>> device_lib.list_local_devices()
2019-09-19 13:08:29.200675: I tensorflow/core/platform/cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2019-09-19 13:08:29.843699: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:898] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-09-19 13:08:29.844129: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1356] Found device 0 with properties:
name: GeForce RTX 2080 Ti major: 7 minor: 5 memoryClockRate(GHz): 1.545
pciBusID: 0000:01:00.0
totalMemory: 10.73GiB freeMemory: 10.53GiB
2019-09-19 13:08:29.844144: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1435] Adding visible gpu devices: 0
2019-09-19 13:08:30.028718: I tensorflow/core/common_runtime/gpu/gpu_device.cc:923] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-09-19 13:08:30.028754: I tensorflow/core/common_runtime/gpu/gpu_device.cc:929] 0
2019-09-19 13:08:30.028762: I tensorflow/core/common_runtime/gpu/gpu_device.cc:942] 0: N
2019-09-19 13:08:30.029023: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1053] Created TensorFlow device (/device:GPU:0 with 10177 MB memory) -> physical GPU (device: 0, name: GeForce RTX 2080 Ti, pci bus id: 0000:01:00.0, compute capability: 7.5)
[name: "/device:CPU:0"
device_type: "CPU"
memory_limit: 268435456
locality {
}
incarnation: 14902398693009919416
, name: "/device:GPU:0"
device_type: "GPU"
memory_limit: 10672347546
locality {
  bus_id: 1
  links {
  }
}
incarnation: 1321506289675005500
physical_device_desc: "device: 0, name: GeForce RTX 2080 Ti, pci bus id: 0000:01:00.0, compute capability: 7.5"
]
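As a side check, the timestamps in the two logs above can be compared directly. The sketch below measures the gap between the "Adding visible gpu devices" line and the "Device interconnect" line in each log. The helper names `log_time` and `gap_seconds` are made up for illustration, and the parsing assumes every log line starts with a `YYYY-MM-DD HH:MM:SS.ffffff: ` stamp, as the lines above do.

```python
from datetime import datetime


def log_time(line):
    """Parse the leading 'YYYY-MM-DD HH:MM:SS.ffffff' stamp of a TensorFlow log line."""
    stamp = line.split(": ", 1)[0]
    return datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S.%f")


def gap_seconds(earlier_line, later_line):
    """Seconds elapsed between two log lines."""
    return (log_time(later_line) - log_time(earlier_line)).total_seconds()


# Lines copied from the GTX 1060 log above
gtx_gap = gap_seconds(
    "2019-09-19 13:11:36.800345: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1435] Adding visible gpu devices: 0",
    "2019-09-19 13:11:40.913458: I tensorflow/core/common_runtime/gpu/gpu_device.cc:923] Device interconnect StreamExecutor with strength 1 edge matrix:",
)

# Lines copied from the RTX log above
rtx_gap = gap_seconds(
    "2019-09-19 13:08:29.844144: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1435] Adding visible gpu devices: 0",
    "2019-09-19 13:08:30.028718: I tensorflow/core/common_runtime/gpu/gpu_device.cc:923] Device interconnect StreamExecutor with strength 1 edge matrix:",
)

print(f"GTX 1060: {gtx_gap:.3f} s")  # 4.113 s
print(f"RTX:      {rtx_gap:.3f} s")  # 0.185 s
```

By this measure, device setup in `device_lib.list_local_devices()` was not slower on the RTX machine, so these logs alone do not show where the extra time in the actual program is spent.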
Environment:
TensorFlow version: 1.8.0
CUDA version: 10.0
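If anyone knows what might be going on, I would appreciate an answer.
Note: I may ask follow-up questions about the answers I receive,
so I would be grateful if you could reply to them.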