Sysadmin Documentation/GPU servers

From UGCS

Haru and Mako are the UGCS GPU compute servers. They have some CUDA GPUs and go really fast, like sanic.

Stats

  • Haru - Custom, i7-????, ??GB RAM, 1x ?TB HDD, 3x GTX 570
  • Mako - Custom, i7-4790K, 32GB RAM, 1x 3TB HDD, 2x GTX 570 (third still to be installed somehow)

Roles

  • GPU compute servers

Setup

These servers run Ubuntu 16.04, the worst distro of Linux ever, since NVIDIA is run by terrorists and doesn't officially support Debian. The morally right thing to do would be to run these machines on a less evil OS like CentOS (still pretty evil), but that would cause difficulties because it differs from Debian much more than Ubuntu does. Users also like Ubuntu, even though they shouldn't.

drive setup

We only have 1 HDD in each machine. We really should do RAID, but high availability is pretty much a non-issue here. RAM is plentiful, so there is no swap. We also won't be mounting homedirs, since these machines may have third-party sudoers and our NFS mounts are full of secrets.

Our partitioning strategy is then:

  • /boot - 2GB
  • / - 10GB on LVM
  • /var - 10GB on LVM
  • /tmp - 10GB on LVM
  • /home - 1TB on LVM
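
As a rough sanity check on the numbers above, the sketch below adds up the plan and compares it with Mako's 3TB drive. It is only an illustration: the names `PLAN_GB` and `summarize` are made up here, sizes are treated as decimal gigabytes (1TB = 1000GB), and Haru is left out because its drive size isn't recorded.

```python
# Partition plan from the list above, in GB (assuming 1TB = 1000GB).
# All names here are hypothetical and exist only for this illustration.
PLAN_GB = {
    "/boot": 2,      # plain partition, not on LVM
    "/": 10,
    "/var": 10,
    "/tmp": 10,
    "/home": 1000,
}


def summarize(plan_gb, disk_gb):
    """Return (lvm_total, grand_total, unallocated), all in GB."""
    # Everything except /boot lives on LVM in this plan.
    lvm_total = sum(size for mount, size in plan_gb.items() if mount != "/boot")
    grand_total = lvm_total + plan_gb["/boot"]
    return lvm_total, grand_total, disk_gb - grand_total


if __name__ == "__main__":
    lvm, total, free = summarize(PLAN_GB, disk_gb=3000)  # Mako: 1x 3TB HDD
    print(f"LVM volumes: {lvm}GB, total: {total}GB, unallocated: {free}GB")
```

Under those assumptions the plan uses only about a third of Mako's drive, so roughly 2TB stays unallocated and can be handed out later if /home fills up.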

networking

Ubuntu enables systemd-networkd, which means editing resolv.conf by hand is a no-go. We'll figure out how to get this crap working later, but it doesn't matter much as long as the machine runs.
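
Before poking at any of this, it can help to see whether /etc/resolv.conf is a generated file at all. A minimal sketch, assuming a generated resolv.conf shows up as a symlink into some runtime directory (common, but not confirmed for these machines); `describe_resolv_conf` is a name invented for this example.

```python
import os


def describe_resolv_conf(path="/etc/resolv.conf"):
    """Report whether path is a symlink and, if so, where it resolves to."""
    if os.path.islink(path):
        # realpath follows the whole symlink chain to the final file.
        return {"symlink": True, "target": os.path.realpath(path)}
    return {"symlink": False, "target": None}


if __name__ == "__main__":
    print(describe_resolv_conf())
```

If the result says it is a symlink, whatever owns the target is probably the thing to configure instead of the file itself.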

utils and stuff

Mostly we just need compilers and whatnot:

  • build-essential

cuda installation

Just do the usual and follow the instructions. NVIDIA recommends the deb installation method, so that's what we use. Yes, this is absolute terrorism and violates the UGCS rules. We're gonna do it anyways lol.

http://docs.nvidia.com/cuda/cuda-installation-guide-linux/#ubuntu-installation
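
Once build-essential and CUDA are on the box, a quick way to confirm the toolchain is visible is to look for the binaries on PATH. This is just a sketch: `find_tools` is a made-up helper, and the tool list is an assumption about what we care about (gcc, g++ and make from build-essential, nvcc and nvidia-smi from the NVIDIA packages).

```python
import shutil


def find_tools(names=("gcc", "g++", "make", "nvcc", "nvidia-smi")):
    """Map each tool name to its full path on PATH, or None if it is missing."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in find_tools().items():
        print(f"{name}: {path or 'NOT FOUND'}")
```

A missing nvcc doesn't necessarily mean the install failed. NVIDIA's guide has a post-installation step for adding the CUDA bin directory to PATH, so check that first.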