HPCC users,

 

Please see the latest cluster news below.

 

Datacenter cooling outage

 

Most of the HPC cluster underwent an emergency shutdown last night after a cooling failure in the KEC datacenter caused temperatures to reach critical levels.  Unfortunately, many jobs and interactive sessions were terminated as a result.  Most of the cooling has been restored, and HPC resources are gradually being brought back online, up to the level the available cooling can support.

 

New HPC portal apps

 

New interactive apps have been added to the HPC portal:

 

            MATLAB
            Mathematica
            RStudio
            Ansys Workbench (for approved Ansys users only)

 

If you use any of these applications, please try them out and let me know if you have any trouble using them. The HPC portal is available here:

 

https://ondemand.hpc.engr.oregonstate.edu

 

DGX queue change reminder

 

This is a reminder that the DGX partitions have been redefined as follows:

 

If you need 4 GPUs or fewer, please use the “dgx” partition.

 

If you need 4 GPUs or more, please use the “dgx2” partition.
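As a rough illustration of the partition rule above, here is a minimal Python sketch that picks a partition from a GPU count and assembles an sbatch command line. The function names and the script name train.sh are hypothetical, and the sketch assumes GPUs are requested with the standard Slurm --gres=gpu:N option. Exactly 4 GPUs is listed under both partitions, so the sketch arbitrarily picks "dgx" in that case. It only builds the command and does not submit anything.

```python
def choose_dgx_partition(num_gpus):
    """Return the partition suggested by the rule above for a GPU count."""
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    # Exactly 4 GPUs appears under both partitions; this sketch picks "dgx".
    return "dgx" if num_gpus <= 4 else "dgx2"


def build_sbatch_command(num_gpus, script_path):
    """Build (but do not run) an sbatch command as a list of arguments."""
    partition = choose_dgx_partition(num_gpus)
    return [
        "sbatch",
        f"--partition={partition}",
        f"--gres=gpu:{num_gpus}",  # assumes GPUs are requested via --gres
        script_path,
    ]


if __name__ == "__main__":
    # "train.sh" is a placeholder batch script name.
    print(" ".join(build_sbatch_command(2, "train.sh")))
    print(" ".join(build_sbatch_command(8, "train.sh")))
```

The same two options can of course be written directly as #SBATCH lines in a batch script; the helper just makes the choice explicit.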

 

If jobs you submit to the dgx or dgx2 partition are pending with the reason “QOSMinGRES” or “QOSMaxGRES”, or are rejected for either of those reasons, you need to change the partition as noted above.
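To spot affected jobs quickly, the following Python sketch scans job listings for the two reasons mentioned above. It assumes the text was produced by a command such as squeue -u $USER -h -o "%i %P %r" (job ID, partition, reason); the function name and the sample data are hypothetical and not taken from the cluster.

```python
GRES_LIMIT_REASONS = {"QOSMinGRES", "QOSMaxGRES"}
DGX_PARTITIONS = {"dgx", "dgx2"}


def jobs_needing_partition_change(squeue_text):
    """Return (job_id, partition, reason) for DGX jobs held by a GRES limit."""
    flagged = []
    for line in squeue_text.splitlines():
        # Split into at most three fields: job ID, partition, reason.
        fields = line.split(None, 2)
        if len(fields) < 3:
            continue
        job_id, partition, reason = fields[0], fields[1], fields[2].strip()
        if partition in DGX_PARTITIONS and reason in GRES_LIMIT_REASONS:
            flagged.append((job_id, partition, reason))
    return flagged


if __name__ == "__main__":
    # Made-up sample output for illustration only.
    sample = "1001 dgx QOSMaxGRES\n1002 dgx2 Priority\n1003 dgx2 QOSMinGRES\n"
    for job in jobs_needing_partition_change(sample):
        print(job)
```

Any job this reports should be resubmitted to the other DGX partition, per the rule above.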

 

For the latest cluster news and status updates, check out the link below:

 

https://it.engineering.oregonstate.edu/hpc/hpc-cluster-status-and-news

 

Cheers,

 

Rob Yelle

HPC Manager