ALICE partition system update 2023-01
The cpu nodes have been in very high demand in recent months. While this is of course a great sign for ALICE, it has also increased the pending time of all kinds of cpu jobs in the queue, in particular in the cpu-short and cpu-long partitions. We very much understand that long waiting times can be frustrating, so we have been looking into ways to improve the situation.
We have started the process of expanding the cpu nodes of ALICE, but purchasing new nodes takes time and we are also dependent on when the new datacenter becomes available to us. Therefore, we expect to be able to add new cpu nodes to the cluster in early Q3 2023 at the earliest.
In order to improve the throughput of jobs in the cpu partitions until then, we have analyzed the CPU utilization of the gpu and high-memory nodes of ALICE. We found that in most cases gpu and high-memory jobs do not make use of all CPUs and all memory of those nodes. Therefore, we decided to make most of the gpu nodes and the high-memory node available to the cpu-short partition. In total, this gives the cpu-short partition shared access to 216 additional cores.
Because of the demand on the GPUs, we have taken steps to ensure that the impact on the scheduling of gpu jobs remains minimal. For one, node851 and node852 will remain exclusive to the gpu-short/gpu-medium partitions and the gpu-long partition, respectively, to ensure that at least one node can be immediately available to users. In addition, the gpu and mem partitions have a higher priority than the cpu partitions, so jobs submitted to the gpu and mem partitions are scheduled ahead of cpu jobs waiting for the same nodes.
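If you want to inspect these settings yourself, the standard Slurm command-line tools show the configuration and state of each partition (a sketch; the partition name is taken from this announcement and the exact fields shown depend on the Slurm version):

```shell
# Show the full configuration of a partition, including its priority tier
scontrol show partition cpu-short

# Compact per-partition overview of node availability
sinfo --summarize
```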
If your jobs in the cpu-short partition do not require Infiniband, no changes to your batch script are necessary. If you do make use of Infiniband (e.g., for running MPI jobs), you need to add

#SBATCH --constraint=ib

to your batch script. This makes sure that Slurm schedules your job on one of the cpu nodes, because the gpu and high-memory nodes currently do not have Infiniband.
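As an illustration, a minimal batch script for an MPI job that must stay on the Infiniband-equipped cpu nodes could look like the following sketch (the task count, time limit, module name, and executable are placeholders; adapt them to your own workflow):

```shell
#!/bin/bash
#SBATCH --partition=cpu-short
#SBATCH --ntasks=32            # placeholder: number of MPI tasks
#SBATCH --time=01:00:00        # placeholder: wall-time limit
#SBATCH --constraint=ib        # restrict the job to cpu nodes with Infiniband

# Load your MPI environment (module name is a placeholder; check `module avail`)
module load OpenMPI

# Launch the MPI program (placeholder executable)
srun ./my_mpi_program
```

Without the `--constraint=ib` line, the same job could be placed on one of the gpu or high-memory nodes now shared with cpu-short, where MPI communication over Infiniband is not available.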
We will monitor the impact of these changes closely to see if everything is working as expected, and we will review the situation after a few weeks. The wiki page on the ALICE partition system has been updated accordingly: Partitions on ALICE.
If you have any questions or feedback, just let us know through the ALICE Helpdesk. We are also happy to assist with improving workflows on ALICE.