Thanks Rick,

 

Have any of our endpoints changed?

 

--Tom

 

Thomas G. Dietterich, Distinguished Professor Voice: 541-737-5559

School of Electrical Engineering              FAX: 541-737-1300

  and Computer Science                        URL: eecs.oregonstate.edu/~tgd

US Mail: 1148 Kelley Engineering Center

Office: 2067 Kelley Engineering Center

Oregon State Univ., Corvallis, OR 97331-5501

 

From: Rainqc-jobman <rainqc-jobman-bounces@engr.oregonstate.edu> On Behalf Of Rick Hagenaars - CITG
Sent: Wednesday, March 15, 2023 11:15
To: RainQC Job Manager <rainqc-jobman@engr.oregonstate.edu>
Subject: Re: [Rainqc-jobman] RainQC Job Manager daily report 2023-03-11

 

[This email originated from outside of OSU. Use caution with links and attachments.]


Hi all,

 

After spending 50 hours trying to migrate RabbitMQ to the new "Messages for RabbitMQ" service on IBM Cloud, we decided to use an externally managed RabbitMQ cluster instead, and everything worked within 1 hour. Systems should now be operational again, and the regular QA/QC is catching up on the data from the last couple of days.

 

I expect everything to be running normally again from tomorrow on (at least until we need to start migrating the CF applications at the end of May).

 

 

Kind regards,

 

Rick Hagenaars

TAHMO Project, Faculty CiTG, TU Delft, Stevinweg 1, 2628CN, Delft The Netherlands 

T (M) +31(0)645833496


From: Slater, Michael <slater@oregonstate.edu>
Sent: Tuesday, March 14, 2023 3:50:04 AM
To: Rick Hagenaars - CITG
Subject: RE: [Rainqc-jobman] RainQC Job Manager daily report 2023-03-11

 

Hi Rick,

 

Thanks for letting me know about the outages. At worst, we’ll just re-add some scoring jobs and run them again. The last few days have been rather puzzling to me: my own data completeness estimates showed that station data was in the system, but RainQC seemed to have trouble reading any data after a certain date. That might be the difference between reading raw data and reading controlled data, though.

 

Best wishes for the migrations and other updates!

 

Cheers,

Michael

 

From: Rainqc-jobman <rainqc-jobman-bounces@engr.oregonstate.edu> On Behalf Of Rick Hagenaars - CITG
Sent: Sunday, March 12, 2023 1:03 AM
To: RainQC Job Manager <rainqc-jobman@engr.oregonstate.edu>
Subject: Re: [Rainqc-jobman] RainQC Job Manager daily report 2023-03-11

 

[This email originated from outside of OSU. Use caution with links and attachments.]


Hi Michael,

 

There have indeed been some substantial outages: last week due to METER, and this week because we needed to migrate the database. The Compose services on IBM Cloud will no longer be operational after next Wednesday. For sensordx we already changed the Postgres last November, so it shouldn’t be affected.

 

There might be some more downtime on Monday, since the RabbitMQ migration wasn’t successful last week.

 

Kind regards,

 

Rick Hagenaars

 

On 11 Mar 2023, at 10:28, Slater, Michael <slater@oregonstate.edu> wrote:



I haven’t made the RainQC call in a while, but what appear to be large-scale data layer failures seem to be popping up more than usual.

 

When I pulled the daily station data to compute the data completeness estimate, we had >60% of the models at 100% data completeness. For some unknown reason, a few minutes afterwards, the RainQC server could not pull sufficient data to compute the scores for even a single model, but it DID compute the scores for 11 models from before. So there was some sort of lapse in the availability of yesterday’s data (for today’s jobs).

 

--ms

 

From: slater@oregonstate.edu <slater@oregonstate.edu>
Sent: Saturday, March 11, 2023 1:06 AM
To: rainqc-jobman@ENGR.ORST.EDU
Cc: Slater, Michael <slater@oregonstate.edu>
Subject: RainQC Job Manager daily report 2023-03-11

 

Current UTC date: 2023-03-11 -> scoring models for previous day: 2023-03-10
---------------------------------------------------------------------------------
Daily Model Data Completeness Check:
data completeness    50% | complete models: 180 of 273 (65.93%)
data completeness    60% | complete models: 179 of 273 (65.57%)
data completeness    70% | complete models: 179 of 273 (65.57%)
data completeness    75% | complete models: 179 of 273 (65.57%)
data completeness    80% | complete models: 179 of 273 (65.57%)
data completeness    85% | complete models: 178 of 273 (65.20%)
data completeness    90% | complete models: 178 of 273 (65.20%)
data completeness    95% | complete models: 178 of 273 (65.20%)
data completeness   100% | complete models: 168 of 273 (61.54%)
----------------------------
station status | total: 313, delayed: 71, offline 24h: 48, offline week: 42
 | battery, min: 0, max: 100, mean: 63.81, std dev: 31.58
 | battery, common values: [(100, 221), (0, 50), (85, 2), (72, 2), (83, 2)]
 | battery <= mean, common countries: [('GH', 18), ('UG', 13), ('KE', 7), ('ZM', 4), ('ML', 4)]
----------------------------
63 LOW DATA (< 0.9) and 55 NO DATA weather stations impacted 95 RainQC models
LOW/NO data station impact on models: [('TA00076', 6), ('TA00231', 4), ('TA00530', 4), ('TA00700', 4), ('TA00011', 3), ('TA00173', 3), ('TA00032', 3), ('TA00035', 3), ('TA00036', 3), ('TA00267', 3), ('TA00066', 3), ('TA00126', 3), ('TA00217', 3), ('TA00482', 3), ('TA00542', 3), ('TA00308', 2), ('TA00043', 2), ('TA00050', 2), ('TA00102', 2), ('TA00165', 2), ('TA00487', 2), ('TA00684', 2), ('TA00210', 2), ('TA00223', 2), ('TA00232', 2), ('TA00262', 2), ('TA00271', 2), ('TA00314', 2), ('TA00691', 2), ('TA00339', 2), ('TA00373', 2), ('TA00398', 2), ('TA00451', 2), ('TA00462', 2), ('TA00471', 2), ('TA00278', 1), ('TA00014', 1), ('TA00031', 1), ('TA00044', 1), ('TA00095', 1), ('TA00157', 1), ('TA00201', 1), ('TA00212', 1), ('TA00219', 1), ('TA00229', 1), ('TA00237', 1), ('TA00260', 1), ('TA00276', 1), ('TA00287', 1), ('TA00290', 1), ('TA00336', 1), ('TA00350', 1), ('TA00362', 1), ('TA00369', 1), ('TA00422', 1), ('TA00432', 1), ('TA00493', 1), ('TA00524', 1), ('TA00533', 1), ('TA00652', 1), ('TA00655', 1), ('TA00677', 1), ('TA00702', 1)]
-----------------------------------------------------------
Processed daily jobs for UTC date: 2023-03-10
Start time: 2023-03-11T08:22:49+00:00
End time  : 2023-03-11T09:05:45+00:00
Elapsed time HH:MM:SS: 0:42:56
---------------------
Before job processing job table stats:
Total 'success' count: 122
Total 'failure' count: 137
Total record count: 1093
Job history table record count: 45128
Scoring job record table record count: 182
---------------------
After job processing job table stats:
Total 'success' count:               11 (flag=2 count:   0) (flag 2->1 downgrades:   0)
 | 'success' count for 2023-03-10:    0 (flag=2 count:   0)
 | 'success' count for 2023-03-09:   11 (flag=2 count:   0)
 | 'success' count for 2023-03-08:    0 (flag=2 count:   0)
 | 'success' count for 2023-03-07:    0 (flag=2 count:   0)
 | 'success' count for 2023-03-06:    0 (flag=2 count:   0)
 | 'success' count for 2023-03-05:    0 (flag=2 count:   0)
 | 'success' count for 2023-03-04:    0 (flag=2 count:   0)
Anomalies (flag=2):
--------
Total 'failure' count: 134
Total record count: 1107
Job history table record count: 45387
Scoring job record table record count: 183
-----------------------------------------------------------

--
Rainqc-jobman mailing list
Rainqc-jobman@ENGR.ORST.EDU
https://it.engineering.oregonstate.edu/mailman/listinfo/rainqc-jobman