
Cluster users,

It appears that the problem with the HPC share storage has been fixed. While it is difficult to confirm, the problem appears to have been related to the abrupt power outage that brought down the cluster last week. Anyway, all HPC nodes are being rebooted to roll out the fix. Cluster operation is currently being restored, and resources should become available shortly.

Note that submit-c is currently offline. Please use submit, submit-a, or submit-b instead.

If anyone has any problems with the HPC storage or the cluster in general, let me know.

Rob

From: Yelle, Robert Brian <robert.yelle@oregonstate.edu>
Date: Monday, July 22, 2024 at 3:51 PM
To: cluster-users@engr.orst.edu <cluster-users@engr.orst.edu>
Subject: Cluster offline due to storage

HPC users,

The cluster is offline again, this time due to a failure of the HPC share storage. A number of users noticed issues with their jobs or Python environments, and with accessing their files or directories.

I have opened a ticket with the vendor on this issue and have been working with them over the course of the day, but there is no resolution yet. It is not clear what caused the issue or when it will be resolved. I’ll update the list when I have more answers.

Rob Yelle
HPC Manager