CentOS 7.6 recognizes all of the V100 32GB GPUs in my SYS-9029GP-TNVRT, but when I run "./deviceQuery", the utility reports that the system is not yet initialized.
# cd /root/NVIDIA_CUDA-10.1_Samples/1_Utilities/deviceQuery
# ./deviceQuery
./deviceQuery Starting...
CUDA Device Query (Runtime API) version (CUDART static linking)
cudaGetDeviceCount returned 802
-> system not yet initialized
Result = FAIL
Below are the installed software versions.
NVIDIA-SMI 418.87.00
Driver Version: 418.87.00
CUDA Version: 10.1
Are there any other prerequisites for bringing up the system?
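Please make sure NVIDIA Fabric Manager is installed and running. The SYS-9029GP-TNVRT is an NVSwitch-based system, and on such systems CUDA initialization fails with error 802 (cudaErrorSystemNotReady) until Fabric Manager is active.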
1. Install NVIDIA DCGM.
2. Terminate any running nv-hostengine instance first so that Fabric Manager can be enabled.
# sudo nv-hostengine -t
3. Enable Fabric Manager:
# service nvidia-fabricmanager start
Reference documentation:
https://docs.nvidia.com/datacenter/dcgm/latest/dcgm-user-guide/overview.html
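For convenience, the steps above can be wrapped in a small shell sketch. This is only an illustration, not an official NVIDIA procedure: the function names is_error_802 and start_fabric_manager are hypothetical, and the commands inside them are the ones quoted in the steps above. It assumes DCGM is already installed and that the functions are run as root.

```shell
#!/bin/sh
# Sketch only: the function names are hypothetical helpers; the commands
# they run are the ones listed in the steps above.

# Return 0 if the given deviceQuery output shows CUDA error 802.
is_error_802() {
    printf '%s\n' "$1" | grep -q 'cudaGetDeviceCount returned 802'
}

# Stop the DCGM host engine, then start Fabric Manager (run as root).
start_fabric_manager() {
    # Continue even if this command returns non-zero.
    nv-hostengine -t || true
    service nvidia-fabricmanager start
}

# Usage (as root), from the deviceQuery sample directory:
#   out=$(./deviceQuery)
#   if is_error_802 "$out"; then
#       start_fabric_manager && ./deviceQuery
#   fi
```

Nothing runs when the file is sourced; the functions only take effect when called as shown in the usage comment, so the check can be tried on saved deviceQuery output before touching any services.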