I have received a second-hand Dell R720. The machine has a PERC H710p RAID controller and originally came with 6 x 600GB 6G SAS 10k 2.5″ drives, 192GB RAM and 2x Xeon E5-2696v2 12-core CPUs.
I am trying to install a host OS on it, but whatever I try to install, it always fails with an I/O-related error. Similar issues described online all point to a hardware failure.
I have tried the following RAID configurations to narrow down where the issue might be:
- 1 disk RAID 0 (/dev/sda), 5 disks in RAID 5 (/dev/sdb): errors when attempting install on /dev/sda and /dev/sdb
- 3x RAID 1 (/dev/sda, /dev/sdb and /dev/sdc): errors when attempting install on /dev/sda, /dev/sdb and /dev/sdc
- removed 3 drives, tried 1x RAID 5 on the remaining 3 drives: errors
The operating systems / hosts I tried installing so far:
- Alpine 3.13: reported I/O error, installer exits to ash, nothing happens
- Ubuntu 16.04 LTS: reported I/O error
- Ubuntu 18.04 LTS: I/O errors reported, then udevadm settle retried multiple times; installer crashes and restarts at region selection
- Ubuntu 20.04 LTS: same as Ubuntu 18.04
- CentOS 7: Anaconda (Python) error reported when trying to write to disk, before I was even able to enter the root password; installer hangs, machine requires hard reboot
- XenServer 7.0: installer stopped at 68%, machine requires hard reboot
With every one of these, regardless of which disk group I use for the OS, as soon as the installer attempts to write the partition table, all disks in the selected disk group start blinking amber. With the Ubuntu 18.04 / 20.04 installers, this consistently happens at the step where the user name, server name and password are entered. After a reboot, the disks blink green again. In the RAID configuration utility (CTRL+R), all disks are online and the VD state is reported as Optimal. I have SATA AHCI set in boot properties in BIOS.
I ran the lifecycle manager tests on the server and everything passed. No errors were reported, except for the missing PERC battery, as the server does not have one physically installed. I understand why I would need this battery for data consistency on power loss, but it should not prevent me from installing the OS. I suspect that the RAID controller is faulty, but I am not an expert.
How can I diagnose what is going on?
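In case the exact kernel messages help, this is roughly how I could summarise the I/O errors per device from a saved kernel log (for example, dmesg output captured from the installer shell). It is only a minimal sketch: it assumes Python 3 is available in the live environment and that the errors use the common kernel wording "I/O error, dev sdX" or "I/O error on dev sdX". The function name count_io_errors is my own, and I have not confirmed that my logs match this pattern.

```python
import re
from collections import Counter

# Assumed message formats (common kernel block-layer wording), e.g.
#   "blk_update_request: I/O error, dev sda, sector 2048"
#   "Buffer I/O error on dev sdb1, logical block 0, async page read"
IO_ERROR = re.compile(r"I/O error(?:,| on) dev (\w+)")


def count_io_errors(log_text):
    """Return a Counter mapping device name to the number of I/O error lines."""
    counts = Counter()
    for line in log_text.splitlines():
        match = IO_ERROR.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

Passing the text of the saved log to count_io_errors would show whether the errors hit every virtual disk equally (which would point more towards the controller, backplane or cabling) or only specific ones.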