[This email originated from outside of OSU. Use caution with links and attachments.]

Hi all,

I think the low completeness was caused by tests we've been running with the data adapter for a new version of the METER API. About 15 stations can't retrieve data through the older API, but METER's new API has issues with even more stations than that.

A previous test run with the new API also gave low completeness, so I'm worried that swapping between the two could cause issues if they don't share the same record identifiers (which we use to keep track of the latest measurement we retrieved per station).
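To make the concern concrete, here is a minimal sketch of a per-station cursor that only compares record identifiers when they come from the same API version and otherwise falls back to the measurement timestamp. The names (`Cursor`, `new_records`, the `"v1"`/`"v2"` labels, the `id` and `measured_at` fields) are hypothetical, and it assumes that identifiers increase over time within one API version and that both versions report a comparable timestamp. Neither assumption has been checked against METER.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Cursor:
    """Latest record retrieved for one station (hypothetical structure)."""
    api_version: str       # hypothetical label, e.g. "v1" or "v2"
    record_id: int         # identifier of the latest record retrieved
    measured_at: datetime  # timestamp of that record


def new_records(records, cursor, api_version):
    """Return the records that are newer than the cursor.

    Record ids are compared only if the cursor was written by the same
    API version; across versions the ids are assumed not to be comparable,
    so the measurement timestamp is used instead.
    """
    if cursor is None:
        return list(records)
    if cursor.api_version == api_version:
        return [r for r in records if r["id"] > cursor.record_id]
    return [r for r in records if r["measured_at"] > cursor.measured_at]
```

With a cursor written by the old API (id 100, measured at 05:00), a new-API record with id 8 measured at 06:00 is still picked up, whereas a plain id comparison would silently skip it.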

To add to the previously discussed topic: "If the ticket is closed with an indication that this was a false alarm, the end date is updated to equal the start date, which effectively removes the quality object." I think it was noted when we discussed this at the last meeting that removing the flags is not being done in practice. However, false flags from satellite QC don't occur that often, so I don't think this will have a major impact.

Kind regards,

Rick

From: Rainqc-jobman <rainqc-jobman-bounces@engr.oregonstate.edu> on behalf of Dietterich, Thomas via Rainqc-jobman <rainqc-jobman@engr.oregonstate.edu>
Sent: Sunday, August 18, 2024 12:31 AM
To: Michael Slater <slater@baidala.com>
Cc: Dietterich, Thomas <tgd@oregonstate.edu>; RainQC Job Manager <rainqc-jobman@engr.oregonstate.edu>
Subject: Re: [Rainqc-jobman] RainQC Job Manager daily report 2024-08-17
 

Yes, and if the target station itself is already flagged, we should note that too.

 

Thomas G. Dietterich, Distinguished Professor (Emeritus)

School of EECS, Oregon State University

US Mail: 1148 Kelley Engineering Center, Corvallis, OR 97331-5501 USA

Office: 2063 Kelley Engineering Center

Voice: 541-737-5559; FAX: 541-737-1300

https://web.engr.oregonstate.edu/~tgd/

 

From: Michael Slater <slater@baidala.com>
Sent: Saturday, August 17, 2024 15:00
To: Dietterich, Thomas <tgd@oregonstate.edu>
Cc: RainQC Job Manager <rainqc-jobman@engr.oregonstate.edu>
Subject: Re: [Rainqc-jobman] RainQC Job Manager daily report 2024-08-17

 

Thanks. If this is the case, we are also leaving some models unscored due to flag=3 being automatically set on new measurements from a station. At some point in the future, it might be helpful to surface a more specific failure code in the RainQC API, e.g., "Could not score model due to QC flag(s) on data for station(s)" with specifics on the flags and stations.
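A rough sketch of what such a failure code could carry, including whether the target station itself is flagged. Everything here is hypothetical: the `ScoringFailure` type, the `QC_FLAGGED_INPUT` code, the `qc_flag_failure` helper, and the choice of flags 3 and 4 as the ones that block scoring are assumptions for illustration, not the current RainQC API.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Optional, Set

MESSAGE = "Could not score model due to QC flag(s) on data for station(s)"


@dataclass
class ScoringFailure:
    """Hypothetical failure record with per-station flag details."""
    code: str
    message: str
    flagged_stations: Dict[str, List[int]] = field(default_factory=dict)
    target_flagged: bool = False


def qc_flag_failure(
    target: str,
    station_flags: Dict[str, Set[int]],
    blocking_flags: Iterable[int] = (3, 4),  # assumption: these block scoring
) -> Optional[ScoringFailure]:
    """Return a ScoringFailure if any input station carries a blocking flag."""
    blocking = set(blocking_flags)
    flagged: Dict[str, List[int]] = {}
    for station, flags in station_flags.items():
        hits = sorted(blocking & set(flags))
        if hits:
            flagged[station] = hits
    if not flagged:
        return None
    return ScoringFailure("QC_FLAGGED_INPUT", MESSAGE, flagged, target in flagged)
```

Returning `None` when nothing is flagged keeps the existing failure paths (missing data, retries) separate from this one.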

 

On Sat, Aug 17, 2024 at 2:44 PM Dietterich, Thomas <tgd@oregonstate.edu> wrote:

My understanding is that when Gilbert or Victor adds a flag 3, they create a quality object without an end date, so that flag 3 is automatically applied to all subsequent data. When the ticket confirming the repair is closed, the end date is updated. If the ticket is closed with an indication that this was a false alarm, the end date is updated to equal the start date, which effectively removes the quality object.
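The behavior described above can be sketched as follows. This is an illustration of the description only, not the actual schema: the `QualityObject` class and its field names are hypothetical, and it assumes the flagged interval is half-open, [start, end), which is what makes "end date equals start date" cover no measurements at all.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class QualityObject:
    """Hypothetical model of a QC flag applied to a station over a time range."""
    station: str
    flag: int
    start: datetime
    end: Optional[datetime] = None  # None means open-ended: applies to all later data

    def applies_to(self, ts: datetime) -> bool:
        """True if a measurement at `ts` falls inside [start, end)."""
        if ts < self.start:
            return False
        return self.end is None or ts < self.end
```

Under this reading, an open-ended object flags every new measurement, setting `end` to the repair date stops the flag from that point on, and setting `end` equal to `start` leaves an empty interval.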

 

--Tom

 


From: Michael Slater <slater@baidala.com>
Sent: Saturday, August 17, 2024 14:39
To: RainQC Job Manager <rainqc-jobman@engr.oregonstate.edu>
Cc: Dietterich, Thomas <tgd@oregonstate.edu>; Michael Slater <slater@baidala.com>
Subject: Re: [Rainqc-jobman] RainQC Job Manager daily report 2024-08-17

 

The extremely low count of complete models at the 100% data completeness threshold is similar to something we saw in the past. Namely, a number of stations are missing a few measurements, likely because some remaining database updates hadn't happened at the time of the data completeness check. The time required to ingest the weather data and update the system we query seems to creep upward at times.
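For reference, a small sketch of how a per-threshold count like the one in the daily report could be computed. The report's exact definition of a model's completeness is not stated in this thread, so this assumes it is the share of expected measurements actually received for the model's stations. The function name and the input shape are hypothetical.

```python
def complete_model_counts(model_measurements,
                          thresholds=(50, 60, 70, 75, 80, 85, 90, 95, 100)):
    """Count models whose data completeness meets each threshold.

    model_measurements maps a model id to (received, expected) measurement
    counts. Integer arithmetic is used so that a model missing a single
    measurement can never round up to 100%.
    Returns a list of (threshold_pct, complete_models, total_models).
    """
    total = len(model_measurements)
    rows = []
    for pct in thresholds:
        complete = sum(
            1
            for received, expected in model_measurements.values()
            if received * 100 >= pct * expected
        )
        rows.append((pct, complete, total))
    return rows
```

A model that has received 286 of 288 expected measurements counts as complete at every threshold up to 95% but not at 100%, which matches the pattern of a sharp drop only in the last row.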

 

My thought is that I need to add another hour or two of delay before we score the models for the previous day. We are currently at +5hrs beyond UTC midnight. An extra delay would take us to a +6hr or +7hr offset from UTC midnight. Originally we started with a +4hr offset and this worked reasonably well for some years, e.g., 45-50% data completeness IIRC. Now, even the +5hr delay appears to be close to the edge of insufficiency. We do have more stations, so there may be some issue with throughput.
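As a sketch of the offset arithmetic, the earliest scoring time for a given day is the UTC midnight that ends that day plus the delay. The `scoring_start` helper is hypothetical and is not taken from the job manager code.

```python
from datetime import date, datetime, time, timedelta, timezone


def scoring_start(scored_day: date, offset_hours: int) -> datetime:
    """Earliest UTC time at which `scored_day` should be scored.

    The reference point is the UTC midnight that ends `scored_day`;
    `offset_hours` is the extra delay that gives ingestion time to catch up.
    """
    midnight_after = datetime.combine(
        scored_day + timedelta(days=1), time.min, tzinfo=timezone.utc
    )
    return midnight_after + timedelta(hours=offset_hours)
```

With the current +5hr offset, data for 2024-08-16 is scored no earlier than 2024-08-17T05:00:00+00:00, in line with the 05:25 start time in the daily report. Moving to +7hr would shift that to 07:00 UTC.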

 

The RainQC success count for the day indicates that we are still scoring stations (as we don't rely on 100% data completeness for scoring). 

 | 'success' count for 2024-08-16:   79 (flag=2 count:   9)

We are still leaving some models unscored, and I think this is due to a mix of flag=4 data and a lack of appropriate retries in the RainQC code.

One caveat: I don't know how flag=3 is handled. Let's say a station was flagged a month ago as flag=3 due to the satellite QC process. Is new data for that station set with flag=3 until the station is marked 'repaired' (or the equivalent)? Or is flag=3 purely retroactive?

Michael
 

 

On Sat, Aug 17, 2024 at 2:24 PM Dietterich, Thomas via Rainqc-jobman <rainqc-jobman@engr.oregonstate.edu> wrote:

These are alarmingly low numbers.

 


From: RainQC Job Manager <noreply@tahmo.org>
Sent: Friday, August 16, 2024 23:27
To: rainqc-jobman@ENGR.ORST.EDU
Cc: Michael Slater <science@baidala.com>; Dietterich, Thomas <tgd@oregonstate.edu>
Subject: RainQC Job Manager daily report 2024-08-17

 

Current UTC date: 2024-08-17 -> scoring models for previous day: 2024-08-16
---------------------------------------------------------------------------------
Daily Model Data Completeness Check:
data completeness    50% | complete models: 126 of 273 (46.15%)
data completeness    60% | complete models: 124 of 273 (45.42%)
data completeness    70% | complete models: 116 of 273 (42.49%)
data completeness    75% | complete models: 113 of 273 (41.39%)
data completeness    80% | complete models: 109 of 273 (39.93%)
data completeness    85% | complete models: 109 of 273 (39.93%)
data completeness    90% | complete models: 109 of 273 (39.93%)
data completeness    95% | complete models: 103 of 273 (37.73%)
data completeness   100% | complete models:  35 of 273 (12.82%)
----------------------------
station status | total: 313, delayed: 126, offline 24h: 77, offline week: 71
 | battery, min: 0, max: 100, mean: 56.92, std dev: 27.4
 | battery, common values: [(100, 163), (0, 81), (91, 3), (74, 3), (89, 3)]
 | battery <= mean, common countries: [('KE', 24), ('GH', 23), ('UG', 11), ('ML', 10), ('ZM', 6)]
----------------------------
96 LOW DATA (< 0.9) and 80 NO DATA weather stations impacted 164 RainQC models
LOW/NO data station impact on models: [('TA00651', 9), ('TA00568', 8), ('TA00185', 7), ('TA00199', 7), ('TA00016', 6), ('TA00198', 6), ('TA00327', 6), ('TA00320', 5), ('TA00045', 5), ('TA00587', 5), ('TA00166', 5), ('TA00243', 5), ('TA00231', 4), ('TA00379', 4), ('TA00301', 4), ('TA00530', 4), ('TA00565', 4), ('TA00543', 4), ('TA00700', 4), ('TA00636', 4), ('TA00035', 3), ('TA00222', 3), ('TA00041', 3), ('TA00178', 3), ('TA00116', 3), ('TA00126', 3), ('TA00141', 3), ('TA00274', 3), ('TA00360', 3), ('TA00217', 3), ('TA00482', 3), ('TA00385', 3), ('TA00289', 3), ('TA00436', 3), ('TA00430', 3), ('TA00574', 3), ('TA00542', 3), ('TA00013', 2), ('TA00020', 2), ('TA00308', 2), ('TA00256', 2), ('TA00118', 2), ('TA00165', 2), ('TA00148', 2), ('TA00487', 2), ('TA00210', 2), ('TA00223', 2), ('TA00232', 2), ('TA00224', 2), ('TA00271', 2), ('TA00399', 2), ('TA00309', 2), ('TA00314', 2), ('TA00691', 2), ('TA00339', 2), ('TA00364', 2), ('TA00373', 2), ('TA00397', 2), ('TA00451', 2), ('TA00462', 2), ('TA00471', 2), ('TA00592', 2), ('TA00014', 1), ('TA00031', 1), ('TA00062', 1), ('TA00091', 1), ('TA00095', 1), ('TA00123', 1), ('TA00212', 1), ('TA00219', 1), ('TA00237', 1), ('TA00251', 1), ('TA00268', 1), ('TA00269', 1), ('TA00286', 1), ('TA00290', 1), ('TA00316', 1), ('TA00336', 1), ('TA00344', 1), ('TA00356', 1), ('TA00362', 1), ('TA00389', 1), ('TA00396', 1), ('TA00422', 1), ('TA00432', 1), ('TA00433', 1), ('TA00493', 1), ('TA00524', 1), ('TA00528', 1), ('TA00529', 1), ('TA00533', 1), ('TA00535', 1), ('TA00652', 1), ('TA00655', 1), ('TA00677', 1), ('TA00702', 1)]
-----------------------------------------------------------
Processed daily jobs for UTC date: 2024-08-16
Start time: 2024-08-17T05:25:54+00:00
End time  : 2024-08-17T06:26:30+00:00
Elapsed time HH:MM:SS: 1:00:37
---------------------
Before job processing job table stats:
Total 'success' count: 82
Total 'failure' count: 188
Total record count: 1411
Job history table record count: 188135
Scoring job record table record count: 708
---------------------
After job processing job table stats:
Total 'success' count:               87 (flag=2 count:  10) (flag 2->1 downgrades:   3)
 | 'success' count for 2024-08-16:   79 (flag=2 count:   9)
 | 'success' count for 2024-08-15:    8 (flag=2 count:   1)
 | 'success' count for 2024-08-14:    0 (flag=2 count:   0)
 | 'success' count for 2024-08-13:    0 (flag=2 count:   0)
 | 'success' count for 2024-08-12:    0 (flag=2 count:   0)
 | 'success' count for 2024-08-11:    0 (flag=2 count:   0)
 | 'success' count for 2024-08-10:    0 (flag=2 count:   0)
Anomalies (flag=2):
 TA00073 2024-08-16 | score:   326.787 (thresh:   62.936) -- 'pr' t:  0.000 mm n: (0.0 mm, 37 km), (0.0 mm, 57 km), (45.303 mm, 58 km), (0.068 mm, 61 km)
 TA00140 2024-08-15 | score:   128.983 (thresh:   94.060) -- 'pr' t:  0.034 mm n: (17.071 mm, 35 km), (8.983 mm, 39 km), (4.663 mm, 68 km)
 TA00168 2024-08-16 | score:    73.889 (thresh:   72.283) -- 'pr' t:  8.250 mm n: (14.465 mm, 68 km)
 TA00171 2024-08-16 | score:   307.551 (thresh:  137.760) -- 'pr' t:  0.017 mm n: (24.565 mm, 26 km)
 TA00174 2024-08-16 | score:    65.320 (thresh:   11.504) -- 'pr' t: 21.649 mm n: (2.312 mm, 60 km)
 TA00334 2024-08-16 | score:   243.420 (thresh:  217.250) -- 'pr' t:  0.000 mm n: (9.86 mm, 18 km)
 TA00374 2024-08-16 | score:   185.168 (thresh:   67.401) -- 'pr' t:  0.085 mm n: (24.565 mm, 19 km)
 TA00683 2024-08-16 | score:  6346.011 (thresh:   67.770) -- 'pr' t: 10.203 mm n: (95.981 mm, 8 km)
 TA00686 2024-08-16 | score:  1932.579 (thresh:   86.597) -- 'pr' t: 19.204 mm n: (10.203 mm, 24 km), (95.981 mm, 31 km)
 TA00687 2024-08-16 | score:  2949.693 (thresh:  140.819) -- 'pr' t: 35.549 mm n: (95.981 mm, 25 km), (10.203 mm, 33 km)
--------
Total 'failure' count: 187
Total record count: 1414
Job history table record count: 188405
Scoring job record table record count: 709
-----------------------------------------------------------
RainQC JobMan LR | version 1.2

--
Rainqc-jobman mailing list
Rainqc-jobman@engr.oregonstate.edu
https://it.engineering.oregonstate.edu/mailman/listinfo/rainqc-jobman