
HPCC users,

Sometime after 8pm last night, a systems control failure caused the chillers and pumps to stop working. This led to a rapid increase in the server room temperature, which reached unsafe levels for the compute servers. The entire cluster had to be shut down to protect these servers.

Facilities was notified, and they were able to fix the problem and restore cooling to the server room. They will continue to evaluate the cooling situation. In the meantime, the cluster will be brought online later this morning, but at reduced capacity over the weekend and through next week to lessen the heat load during the upcoming heat wave.

I know that some of you have deadlines fast approaching, and we will do what we can to accommodate you.

For up-to-date status on the cluster, check the link below:

https://it.engineering.oregonstate.edu/hpc/hpc-cluster-status-and-news

Rob Yelle
HPC Manager