We have a new Supermicro AS-4124GS-TNR server equipped with eight NVIDIA RTX A6000 GPUs. The OS is Ubuntu 20.04.2, the NVIDIA driver version is 460.73.01 (the Nouveau driver is not in use), and the CUDA version is 11.2.
We ran a few tests on the GPUs and the system was stable. However, after the GPUs had been idle for a while, the system froze, and this has now happened twice.
GpuPowerMizerMode is set to 1, so this is not the cause of the freeze.
Is there something else we have to set / configure to prevent a freeze during idling? Or is another reason likely?