I am currently managing two machines in my department. This page describes how to use them.
Machines
corsa
IP: 10.10.5.55 (you need to be in FCUL’s VPN)
CPU: 2x AMD EPYC 7443 24-Core Processor
RAM: 130 GB
GPUs: 2x NVIDIA A30
Disk: 900 GB NVMe
machine-learning
IP: 10.101.85.133 (you need to be in FCUL’s VPN)
CPU: 2x Intel® Xeon® Silver 4216 CPU @ 2.10GHz
RAM: 65 GB
GPUs: 2x NVIDIA Tesla T4
Disk: 500 GB
Software
Both machines run Ubuntu 22.04 LTS. I recommend you use a Docker image to contain all your software.
The main restriction is that you cannot run GPU or heavy CPU programs directly: unless you have special authorization, you must submit them as jobs via SLURM.
To do so, you need to create a bash script with the following contents:
To submit this job, run sbatch submit_job.sh, and you can query its status by running squeue.
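As a rough illustration, here is a minimal sketch of what submit_job.sh might look like for a single-GPU job. The job name, log file name, resource amounts, and time limit are hypothetical placeholders, and the --gres=gpu:1 line assumes the GPUs are exposed to SLURM as a generic resource named "gpu"; adapt all of them to your case.

```shell
#!/bin/bash
#SBATCH --job-name=my_job          # hypothetical job name
#SBATCH --output=my_job_%j.log     # %j is replaced by the job ID
#SBATCH --gres=gpu:1               # one GPU (assumes a GRES named "gpu")
#SBATCH --cpus-per-task=4          # placeholder CPU count
#SBATCH --mem=16G                  # placeholder memory request
#SBATCH --time=01:00:00            # placeholder wall-clock limit (HH:MM:SS)

# SLURM sets SLURM_JOB_ID inside a job; fall back to "unknown" otherwise
job_id="${SLURM_JOB_ID:-unknown}"
echo "Job ${job_id} running on $(hostname)"

# Show the allocated GPU, if the NVIDIA tools are available on the node
if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi
fi

# Replace this comment with your actual workload, for example a
# command that starts your Docker container or training script.
```

The lines starting with #SBATCH are read by sbatch as options and ignored by bash as comments. If you only want to see your own jobs, squeue -u "$USER" filters the queue by user, and scancel followed by the job ID cancels a job.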