[This email originated from outside of OSU. Use caution with links and attachments.]

The extremely low count at 100% data completeness (35 of 273 models) is similar to something we saw in the past. Namely, a number of stations are missing a few measurements, likely because some database updates had not yet finished at the time of the data completeness check. The time required to ingest the weather data and update the system we query seems to creep upward at times.
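
To illustrate why a handful of missing measurements collapses only the 100% row, here is a minimal Python sketch of a threshold-style completeness check. It is not the RainQC implementation: the function names, the model names, and the figure of 288 expected measurements per day are all assumptions made for the example.

```python
from typing import Dict, List

# Thresholds as they appear in the daily report.
THRESHOLDS: List[int] = [50, 60, 70, 75, 80, 85, 90, 95, 100]


def completeness_pct(received: int, expected: int) -> float:
    """Percentage of expected measurements that actually arrived."""
    if expected <= 0:
        raise ValueError("expected must be positive")
    return 100.0 * received / expected


def completeness_table(model_pct: Dict[str, float],
                       thresholds: List[int] = THRESHOLDS) -> Dict[int, int]:
    """For each threshold, count the models at or above that completeness."""
    return {t: sum(1 for pct in model_pct.values() if pct >= t)
            for t in thresholds}


# Hypothetical models; 288 expected measurements per day is an assumption.
models = {
    "model_a": completeness_pct(288, 288),  # nothing missing
    "model_b": completeness_pct(285, 288),  # three late measurements
    "model_c": completeness_pct(140, 288),  # station mostly offline
}
table = completeness_table(models)
for threshold, count in table.items():
    print(f"data completeness {threshold:>4}% | complete models: {count} of {len(models)}")
```

In this toy case model_b still counts as complete at 95% but drops out at 100%, which is the pattern described above.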

My thought is that I need to add another hour or two of delay before we score the models for the previous day. We are currently at +5hrs beyond UTC midnight. An extra delay would take us to a +6hr or +7hr offset from UTC midnight. Originally we started with a +4hr offset and this worked reasonably well for some years, e.g., 45-50% data completeness IIRC. Now, even the +5hr delay appears to be barely sufficient. We do have more stations, so there may be some issue with throughput.
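
For reference, the offset arithmetic can be sketched in Python as follows. This is only an illustration of what "+5hr from UTC midnight" means for a given day; `scoring_start` is a hypothetical helper, not a function from the RainQC code.

```python
from datetime import date, datetime, time, timedelta, timezone


def scoring_start(day_to_score: date, offset_hours: int) -> datetime:
    """Earliest UTC time at which scoring for day_to_score should begin.

    The day being scored ends at the following UTC midnight; the offset
    is the extra wait that gives data ingestion time to catch up.
    """
    end_of_day = datetime.combine(
        day_to_score + timedelta(days=1), time(0), tzinfo=timezone.utc
    )
    return end_of_day + timedelta(hours=offset_hours)


# Compare the current offset with the two proposed ones for 2024-08-16.
for hours in (5, 6, 7):
    print(f"+{hours}hr -> {scoring_start(date(2024, 8, 16), hours).isoformat()}")
```

With a +5hr offset the day 2024-08-16 becomes eligible for scoring at 2024-08-17T05:00:00+00:00; the proposed offsets move that to 06:00 or 07:00 UTC.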

The RainQC success count for the day indicates that we are still scoring stations (as we don't rely on 100% data completeness for scoring). 
 | 'success' count for 2024-08-16:   79 (flag=2 count:   9)
We are still leaving some models unscored, and I think this is due to a mix of flag=4 data and a lack of appropriate retries in the RainQC code.
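
One possible shape for such retries, as a hedged Python sketch: select failed jobs from the last few days that have not yet used up an attempt budget. The `Job` record, the 7-day lookback, and the attempt limit are assumptions for illustration and do not describe the existing RainQC job tables.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List


@dataclass
class Job:
    model_id: str   # hypothetical model identifier
    day: date       # UTC day the job scores
    status: str     # 'success' or 'failure'
    attempts: int   # scoring attempts made so far


def select_retries(jobs: List[Job], current_utc_date: date,
                   lookback_days: int = 7, max_attempts: int = 5) -> List[Job]:
    """Return failed jobs that are recent enough and still worth retrying."""
    oldest = current_utc_date - timedelta(days=lookback_days)
    return [
        job for job in jobs
        if job.status == "failure"
        and oldest <= job.day < current_utc_date
        and job.attempts < max_attempts
    ]


jobs = [
    Job("model_a", date(2024, 8, 16), "failure", 1),  # recent failure: retry
    Job("model_b", date(2024, 8, 15), "success", 1),  # already scored
    Job("model_c", date(2024, 8, 9), "failure", 1),   # outside the lookback window
    Job("model_d", date(2024, 8, 12), "failure", 5),  # attempt budget exhausted
]
retries = select_retries(jobs, date(2024, 8, 17))
print([job.model_id for job in retries])
```

Capping attempts keeps a permanently offline station from being retried indefinitely, while the lookback window lets late-arriving data be picked up on a later run.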

One caveat: I don't know how flag=3 is handled. Let's say a station was flagged a month ago as flag=3 due to the satellite QC process. Is new data for that station set with flag=3 until the station is marked 'repaired' (or the equivalent)? Or is flag=3 purely retroactive?

Michael



On Sat, Aug 17, 2024 at 2:24 PM Dietterich, Thomas via Rainqc-jobman <rainqc-jobman@engr.oregonstate.edu> wrote:

These are alarmingly low numbers.

 

Thomas G. Dietterich, Distinguished Professor (Emeritus)
School of EECS, Oregon State University
US Mail: 1148 Kelley Engineering Center, Corvallis, OR 97331-5501 USA
Office: 2063 Kelley Engineering Center
Voice: 541-737-5559; FAX: 541-737-1300
https://web.engr.oregonstate.edu/~tgd/

 

From: RainQC Job Manager <noreply@tahmo.org>
Sent: Friday, August 16, 2024 23:27
To: rainqc-jobman@ENGR.ORST.EDU
Cc: Michael Slater <science@baidala.com>; Dietterich, Thomas <tgd@oregonstate.edu>
Subject: RainQC Job Manager daily report 2024-08-17

 

Current UTC date: 2024-08-17 -> scoring models for previous day: 2024-08-16
---------------------------------------------------------------------------------
Daily Model Data Completeness Check:
data completeness    50% | complete models: 126 of 273 (46.15%)
data completeness    60% | complete models: 124 of 273 (45.42%)
data completeness    70% | complete models: 116 of 273 (42.49%)
data completeness    75% | complete models: 113 of 273 (41.39%)
data completeness    80% | complete models: 109 of 273 (39.93%)
data completeness    85% | complete models: 109 of 273 (39.93%)
data completeness    90% | complete models: 109 of 273 (39.93%)
data completeness    95% | complete models: 103 of 273 (37.73%)
data completeness   100% | complete models:  35 of 273 (12.82%)
----------------------------
station status | total: 313, delayed: 126, offline 24h: 77, offline week: 71
 | battery, min: 0, max: 100, mean: 56.92, std dev: 27.4
 | battery, common values: [(100, 163), (0, 81), (91, 3), (74, 3), (89, 3)]
 | battery <= mean, common countries: [('KE', 24), ('GH', 23), ('UG', 11), ('ML', 10), ('ZM', 6)]
----------------------------
96 LOW DATA (< 0.9) and 80 NO DATA weather stations impacted 164 RainQC models
LOW/NO data station impact on models: [('TA00651', 9), ('TA00568', 8), ('TA00185', 7), ('TA00199', 7), ('TA00016', 6), ('TA00198', 6), ('TA00327', 6), ('TA00320', 5), ('TA00045', 5), ('TA00587', 5), ('TA00166', 5), ('TA00243', 5), ('TA00231', 4), ('TA00379', 4), ('TA00301', 4), ('TA00530', 4), ('TA00565', 4), ('TA00543', 4), ('TA00700', 4), ('TA00636', 4), ('TA00035', 3), ('TA00222', 3), ('TA00041', 3), ('TA00178', 3), ('TA00116', 3), ('TA00126', 3), ('TA00141', 3), ('TA00274', 3), ('TA00360', 3), ('TA00217', 3), ('TA00482', 3), ('TA00385', 3), ('TA00289', 3), ('TA00436', 3), ('TA00430', 3), ('TA00574', 3), ('TA00542', 3), ('TA00013', 2), ('TA00020', 2), ('TA00308', 2), ('TA00256', 2), ('TA00118', 2), ('TA00165', 2), ('TA00148', 2), ('TA00487', 2), ('TA00210', 2), ('TA00223', 2), ('TA00232', 2), ('TA00224', 2), ('TA00271', 2), ('TA00399', 2), ('TA00309', 2), ('TA00314', 2), ('TA00691', 2), ('TA00339', 2), ('TA00364', 2), ('TA00373', 2), ('TA00397', 2), ('TA00451', 2), ('TA00462', 2), ('TA00471', 2), ('TA00592', 2), ('TA00014', 1), ('TA00031', 1), ('TA00062', 1), ('TA00091', 1), ('TA00095', 1), ('TA00123', 1), ('TA00212', 1), ('TA00219', 1), ('TA00237', 1), ('TA00251', 1), ('TA00268', 1), ('TA00269', 1), ('TA00286', 1), ('TA00290', 1), ('TA00316', 1), ('TA00336', 1), ('TA00344', 1), ('TA00356', 1), ('TA00362', 1), ('TA00389', 1), ('TA00396', 1), ('TA00422', 1), ('TA00432', 1), ('TA00433', 1), ('TA00493', 1), ('TA00524', 1), ('TA00528', 1), ('TA00529', 1), ('TA00533', 1), ('TA00535', 1), ('TA00652', 1), ('TA00655', 1), ('TA00677', 1), ('TA00702', 1)]
-----------------------------------------------------------
Processed daily jobs for UTC date: 2024-08-16
Start time: 2024-08-17T05:25:54+00:00
End time  : 2024-08-17T06:26:30+00:00
Elapsed time HH:MM:SS: 1:00:37
---------------------
Before job processing job table stats:
Total 'success' count: 82
Total 'failure' count: 188
Total record count: 1411
Job history table record count: 188135
Scoring job record table record count: 708
---------------------
After job processing job table stats:
Total 'success' count:               87 (flag=2 count:  10) (flag 2->1 downgrades:   3)
 | 'success' count for 2024-08-16:   79 (flag=2 count:   9)
 | 'success' count for 2024-08-15:    8 (flag=2 count:   1)
 | 'success' count for 2024-08-14:    0 (flag=2 count:   0)
 | 'success' count for 2024-08-13:    0 (flag=2 count:   0)
 | 'success' count for 2024-08-12:    0 (flag=2 count:   0)
 | 'success' count for 2024-08-11:    0 (flag=2 count:   0)
 | 'success' count for 2024-08-10:    0 (flag=2 count:   0)
Anomalies (flag=2):
 TA00073 2024-08-16 | score:   326.787 (thresh:   62.936) -- 'pr' t:  0.000 mm n: (0.0 mm, 37 km), (0.0 mm, 57 km), (45.303 mm, 58 km), (0.068 mm, 61 km)
 TA00140 2024-08-15 | score:   128.983 (thresh:   94.060) -- 'pr' t:  0.034 mm n: (17.071 mm, 35 km), (8.983 mm, 39 km), (4.663 mm, 68 km)
 TA00168 2024-08-16 | score:    73.889 (thresh:   72.283) -- 'pr' t:  8.250 mm n: (14.465 mm, 68 km)
 TA00171 2024-08-16 | score:   307.551 (thresh:  137.760) -- 'pr' t:  0.017 mm n: (24.565 mm, 26 km)
 TA00174 2024-08-16 | score:    65.320 (thresh:   11.504) -- 'pr' t: 21.649 mm n: (2.312 mm, 60 km)
 TA00334 2024-08-16 | score:   243.420 (thresh:  217.250) -- 'pr' t:  0.000 mm n: (9.86 mm, 18 km)
 TA00374 2024-08-16 | score:   185.168 (thresh:   67.401) -- 'pr' t:  0.085 mm n: (24.565 mm, 19 km)
 TA00683 2024-08-16 | score:  6346.011 (thresh:   67.770) -- 'pr' t: 10.203 mm n: (95.981 mm, 8 km)
 TA00686 2024-08-16 | score:  1932.579 (thresh:   86.597) -- 'pr' t: 19.204 mm n: (10.203 mm, 24 km), (95.981 mm, 31 km)
 TA00687 2024-08-16 | score:  2949.693 (thresh:  140.819) -- 'pr' t: 35.549 mm n: (95.981 mm, 25 km), (10.203 mm, 33 km)
--------
Total 'failure' count: 187
Total record count: 1414
Job history table record count: 188405
Scoring job record table record count: 709
-----------------------------------------------------------
RainQC JobMan LR | version 1.2