ROVCTD O2 Reprocessing Example
revised 21-Apr-2021
1) Preliminaries
Why might we need to back out and reprocess data?
- The O2 sensors on the CTD have slope/offset factors provided by Ed Peltzer, determined from Winkler titrations. When Ed revises those factors, we need to manually force the system to back out and reprocess the CTD data in the Expd database over a specified datetime range.
- One or more sensors failed at sea on a long cruise and the calibration was not updated before more data was logged.
- There was a mistake in the calibration config file.
- The RovctdConfigFileMonitorService failed to pick up a change to the calibration file. Though rare, this has happened when Erich or the pilots have upgraded, repaired, or modified the computer.
This document covers an example of the most complicated use case, where the sensors have already been replaced AND Ed determines that the slope/offsets of some of the dives logged with an 'older' calibration need changing.
Once this use case is understood, the simpler case of just a slope/offset change within normal day-to-day operations should become clear.
In either case, everything starts with Erich modifying the CTDCoeff.ini file on the shipboard rc-rctd or wf-rctd Windows computers where the LabVIEW GUI is run.
The first several steps below check the workflow that happens at sea to get the LabVIEW config file from the CTD tech's computer into the at-sea database:
- Erich edits the file using Ed’s new slope/offsets and saves it.
- The RovCtdCfgFileMonitorService running on the same computer should detect the change and will update the shipboard database on Glacierwind or Alaskanwind (typically takes < 30 seconds).
- Then, if swapping sensors out, Erich will need to update the CTDCoeff.ini file a second time with his new sensors.
The RovCtdCfgFileMonitorService is documented HERE
Warning
Your goal is to establish the proper chronological series of calibration entries in both the on-board and on-shore databases using the automated processes. This will save you a lot of by-hand corrections and confusion in the downstream processing.
2) Check the CTDCoeff.ini File
Note
I encourage you to periodically check the date stamp of the CTDCoeff.ini on each system and compare it to the datestamps in the calibration tables in Expd. If you RemoteDesktop into rc-rctd or wf-rctd, ask first, as you will bump whoever is logged on!
Check the file in use on the ship by walking over to the boat or by using RemoteDesktop.
Below is a sample screengrab of the directory from the WesternFlyer's wf-rctd. I will use the data shown there in the discussion that follows as a 'use case' where we have to do the two sets of updates (new slope/offsets so we can back-correct old data, followed by a sensor swap and processing of new sensor data).
Figure 1: Screengrab of CTD Coefficient files on WesternFlyer
Examine the results (see Figure 1):
- Note the timestamp of the current CTDCoeff.ini (8/6/2020 10:25); this should be the file with the ‘new’ sensors.
- Also note that in this case Erich has carefully made himself a safety copy of the prior file.
In our 'complex example' use case, that prior file should be the one with the new slope/offsets from Ed. Hopefully Erich left it named exactly “CTDCoeff.ini” for long enough (~30 seconds) that it got picked up by the file monitor service.
3) Verify Cfg Loaded to Shipboard Database
Next we need to verify that both cfgs came through to the shipboard database. On alaskanwind.wf.mbari.org (remotely via Aquadata Studio), run the following query against the DOCR_CTD_Config table of the DOCR_CTD_Config database. (Sorry, they probably should have been given different names....)
(On the Rachel Carson it would be server = Glacierwind and the VNTA_CTD_Config database.)
SELECT top 10 * FROM [dbo].[DOCR_CTD_Config] ORDER BY eventtime desc
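If you need the same check on the Rachel Carson side, a sketch of the equivalent query is below; it assumes the Ventana config table mirrors its database name (VNTA_CTD_Config), the same way the DocRicketts one does:
-- run on Glacierwind against the VNTA_CTD_Config database (table name assumed to mirror the database name)
SELECT top 10 * FROM [dbo].[VNTA_CTD_Config] ORDER BY eventtime desc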
Figure 2a: The on-ship database
Here are the query results.
Figure 2b: Query run against on-ship database
Examine the results in Figure 2b above.
- Note the ‘localtime’ column shows two updates. The most recent one matches the most recent file in Figure 1. The second-most-recent, while not an exact match, is consistent with Erich having modified the slope/offsets, letting it post, then renaming (saving) it as a backup.
- Note the ‘id’ column shows the database-assigned key for the entries. Jot those down; we will use those values next.
4) Verify Cfg Loaded to On-shore Database
Next we will verify the workflow that gets the config changes from the shipboard databases to the on-shore database.
We need to verify that both of those cfgs came through to the on-shore database. Using Aquadata Studio, connect to the expd database on perseus (as user=expddba).
- First check that they made it from Alaskanwind to the staging table in the EXPD database on perseus:
SELECT top 10 * FROM [dbo].[RovCtdCfg_DocRicketts] ORDER BY eventtime desc
Figure 3: Checking whether the cfgs made it from the ship to the staging table in Expd on perseus
Examine the results (see Figure 3):
- Make sure all events have come across, using the ‘id’ keys obtained from Figure 2b (a spot-check query follows this list).
- If not, open Task Scheduler on draco, run the DocRickettsCtdCfgToPerseus job (or Ventana’s equivalent), and then re-run the above.
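For a direct spot check of the ids you jotted down from Figure 2b (344 and 345 in this example; substitute your own values), something like:
-- confirm the specific shipboard ids arrived in the staging table
SELECT * FROM [dbo].[RovCtdCfg_DocRicketts] WHERE id IN (344, 345)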
5) Verify Cfg Cascades to RovCtdCfg
Next check that the calibration cascades up to the parsed Expd RovCtdCfg table; this is the workflow we want to verify. On perseus (as expddba):
SELECT top 10 * FROM [dbo].[RovCtdCfg] ORDER BY startdtg desc
Figure 4: Query run against the parsed RovCtdCfg table on perseus
Examine the results (see Figure 4):
- You’ll notice combined data for both vehicles is in this table.
- Don’t expect the start/end dtgs to match the CTDCoeff.ini file timestamps; they aren’t supposed to. (startdtg and enddtg define the range of dives over which a calibration is to be applied.)
- What is important is that the cfgfileid_fk fields match the ‘id’ fields of what came off of the ships.
- You will notice that, compared to Figure 3 above, the most recent configs from the DocRicketts haven’t cascaded up yet.
- In this case I happened to be working on this the same day Erich made his updates. Since the cascade is set to run in the evening, they haven't synced up yet (the task EXPD_NightlyRovCtdLoadToPerseus had not run yet).
- Also note how the old, last-known docr config (cfgfileid_fk = 342) is valid beginning with a startdtg in October 2019 and no enddtg specified (this is the sign to the CTD processing that it is the ‘most current’ calibration).
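Since this table mixes both vehicles, it can help to filter to just the DocRicketts while checking; for example:
-- same check, restricted to the DocRicketts rows
SELECT top 10 * FROM [dbo].[RovCtdCfg] WHERE vehicle = 'docr' ORDER BY startdtg desc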
6) Force a Cascade If Needed
So, based on the results above, we need to go to draco and force the cascade to happen.
Using RemoteDesktop, log in to draco as SHORE\DB_EXPD_TASKEXEC
Open/Run TaskScheduler
Execute job: EXPD_NightlyRovCtdLoadToPerseus_early (it includes a step that runs the calibration cascade, specifically the script updateRovCtdCfg_perseus.pl)
Now re-run the check above on perseus (as expddba):
SELECT top 10 * FROM [dbo].[RovCtdCfg] ORDER BY startdtg desc
Figure 5: RovCtdCfg after the forced cascade
Examine the results again carefully (see Figure 5):
- Now there are entries for the two new cfgfileid_fk’s.
- The cascade update program also closed out (set the enddtg of) the old October 2019 config (cfgfileid_fk=342).
- Note also that there is an earlier cfgfileid_fk=343, but its start/end span is prior to 342’s. This is not a problem, as fk’s are not always sequential; what is important is the time-span.
- The cascade update also closed out cfgfileid_fk = 344 (because it also found a newer cfgfileid_fk = 345).
- 344 is the one with the slope/offsets we want to use to reprocess Ed’s dive series.
- 345 is the one with the slope/offsets we want to use going forward with the new sensors. Note that its enddtg is null, indicating to future processing that it is the ‘most current’ calibration.
The above still isn’t quite what we need. Next we will have to manually update the start and end times of cfgfileid_fk’s 342 and 344 to accommodate the time-span of the dives Ed has specified as needing reprocessing with the new slope and offset. That is, we need to set the endtime of 342 and the starttime of 344 to match the start of the first dive in the series.
7) Adjust Calibration Start/End Datetimes
Ed said: “New correction factors to use beginning with docr-1218 and running through docr-1264 and beyond ...”
On perseus (as expddba)
SELECT * FROM [dbo].[Dive] WHERE RovName = 'docr' AND DiveNumber = 1218
Figure 6: Dive table record for docr dive 1218
Examine the results (see Figure 6): there is the starttime we need. NOTE: IT IS GMT.
Using the starttime from dive 1218 above (Figure 6), use the query below and Aquadata Studio’s editing feature to edit the enddtg of cfgfileid_fk 342 and the startdtg of cfgfileid_fk 344.
SELECT top 10 * FROM [dbo].[RovCtdCfg] WHERE vehicle = 'docr' ORDER BY startdtg desc
Figure 7: RovCtdCfg rows for docr after editing the start/end times
Examine the results (see Figure 7):
Warning
The text and figures below contain an error. The DiveStartDtg from the Dive table is in GMT, but the startdtg and enddtg fields of the RovCtdCfg table are LOCAL time. So I should have converted GMT to local before making that entry. Sorry!
Note
There is a trigger on the RovCtdCfg table startdtg and enddtg fields that converts them back into the GMT fields further into the table, so make your changes to the local-time fields.
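For reference, the grid edits amount to something like the T-SQL sketch below. It is illustrative only: the Pacific time zone is an assumption about our local time, AT TIME ZONE requires SQL Server 2016+, and if in doubt, do the conversion by hand and use Aquadata Studio's editor as described.
-- look up dive 1218's GMT start and convert it to local time (time zone assumed to be Pacific)
DECLARE @gmtStart datetime = (SELECT DiveStartDtg FROM [dbo].[Dive] WHERE RovName = 'docr' AND DiveNumber = 1218);
DECLARE @localStart datetime = CONVERT(datetime, @gmtStart AT TIME ZONE 'UTC' AT TIME ZONE 'Pacific Standard Time');
-- close out the old calibration one second before dive 1218 started...
UPDATE [dbo].[RovCtdCfg] SET enddtg = DATEADD(second, -1, @localStart) WHERE vehicle = 'docr' AND cfgfileid_fk = 342;
-- ...and start the reprocessing calibration when dive 1218 started
UPDATE [dbo].[RovCtdCfg] SET startdtg = @localStart WHERE vehicle = 'docr' AND cfgfileid_fk = 344;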
- Note how the enddtg for cfgfileid_fk=342 has been adjusted to close one second before dive 1218 started, and cfgfileid_fk=344’s startdtg begins when dive 1218 started.
- Note the startdtg and enddtg time progression now makes logical sense:
  - Dives prior to 1218 will use id 342.
  - Dives starting with 1218 up thru 8/6/20 10:25:05 will now use id 344.
  - Any dives beyond 8/6/20 10:25 will use id 345.
- Also, please notice that the ‘id’ for cfgfileid_fk 342 is 1138; that is what has been used up to this point to process the data. We will see that id again in a moment.
- Next we will back out data from dives 1218 thru current (or whatever range Ed specified).
8) Prepare for the Reprocessing (Back Out Old Data)
Now that the calibrations have been updated, we need to back out the processed data so that the new calibrations are applied. Following from above, we will back out data from DocRicketts dives 1218 thru current. Since data loads are done by “day-of-year”, we need to find that for dive 1218:
SELECT divenumber, divestartdtg, datepart(yy, divestartdtg) as yr, datepart(dy, divestartdtg) as doy
FROM [dbo].[DiveSummary]
where rovname = 'docr'
AND divenumber = 1218
The above returns yr=2019, doy=319.
Note
Likewise, if Ed had specified an ending dive, we would retrieve its ‘diveenddtg’ values, as sketched below.
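A sketch of that lookup, treating docr-1264 from Ed's note as a hypothetical ending dive:
-- same lookup, but on the dive's end datetime
SELECT divenumber, diveenddtg, datepart(yy, diveenddtg) as yr, datepart(dy, diveenddtg) as doy
FROM [dbo].[DiveSummary]
where rovname = 'docr'
AND divenumber = 1264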
Next we will use Aquadata Studio’s editing feature to set isProcessed to 0 any place it’s a 1 for a dive in the range (this causes the processing to cascade-delete all dependent rows).
Warning
Cascade delete can’t span years, so if your reprocessing work spans year boundaries (like our example does), we do it in two chunks.
Below we look up the loaded data beginning with day 319 of 2019:
SELECT *
FROM [dbo].[RawCtdLoad] where deviceid = 1672 and yr = 2019
and yrday >= 319
order by yrday desc
Warning
The condition needs to be >= 319 (not > 319) because we want to include day 319!
Figure 8: Looking up loaded RawCtdLoad data beginning with day 319 of 2019
We want to set isProcessed to 0 any place it’s a 1, and let the processing cascade-delete (a SQL sketch of the equivalent bulk update follows this list):
- In Aquadata Studio, use execute-edit on the above.
- Scroll to the right and notice the ‘actionDescription’ column status is ‘Processing completed thru binavg’.
- Set several of the isProcessed bits to 0, then ‘save and refresh’ (or use the ‘cmd-s’ shortcut on a Mac).
- After several seconds, the ‘actionDescription’ status should change to ‘xxxx processed data rows unloaded’.
- Repeat until all yrdays in the range are unloaded.
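If you prefer to do a whole year's range in one statement, the grid edits amount to something like the sketch below; it assumes the unload cascade fires on a direct UPDATE just as it does on a grid edit (if unsure, stick with the grid editor):
-- flag every still-processed day in the 2019 portion of the range for unload
UPDATE [dbo].[RawCtdLoad]
SET isProcessed = 0
where deviceid = 1672 and yr = 2019
and yrday >= 319
and isProcessed = 1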
Then do the next year if needed. Just modify the same query as needed (see below) and repeat setting the isProcessed bit.
SELECT *
FROM [dbo].[RawCtdLoad] where deviceid = 1672 and yr = 2020
and yrday > 0
order by yrday desc
So now all the ‘old’ processed data is gone (both the 1Hz and the 15-sec bin-averaged).
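A quick sanity check (a sketch spanning both years) should now return zero rows:
-- anything returned here is still marked processed and needs another pass
SELECT yr, yrday, isProcessed
FROM [dbo].[RawCtdLoad]
where deviceid = 1672
and ((yr = 2019 and yrday >= 319) or yr = 2020)
and isProcessed = 1
order by yr, yrday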
All that’s left is for draco to reprocess the raw data using the new calibrations. It can be done now by force, or left to the nightly processing to pick it up. Once that is done, you will likely need to go back in and set the QC flags to what Ed recommends.
9) Manually Start a Processing Run
Here is how to manually kick off reprocessing on draco if you don’t feel like waiting for the normal middle-of-the-night run.
- Using RemoteDesktop, log in to draco as SHORE\DB_EXPD_TASKEXEC.
- Open/Run Windows TaskScheduler.
- Execute job: EXPD_NightlyRovCtdLoadToPerseus_early
Note it can take from several minutes up to several hours depending on how much data you are reprocessing. You can get some comfort that things are proceeding either by looking at the log file on draco, or by re-running the RawCtdLoad queries and watching the isProcessed flags being reset to ‘1’ and the actionDescriptions updating.
Or this query adds a helpful filter on actionDescription:
SELECT *
FROM [dbo].[RawCtdLoad]
where deviceid = 1672
and yr = 2019
and yrday >= 319
and actionDescription not like '%Load successful, 0 rows loaded%'
order by yrday desc
Figure 9 - Note how isProcessed flags are starting to show complete processing
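If you prefer a single progress number, counting days by isProcessed state works too (a sketch using the same filters as above):
-- days remaining (isProcessed = 0) vs. days done (isProcessed = 1)
SELECT isProcessed, count(*) as ndays
FROM [dbo].[RawCtdLoad]
where deviceid = 1672
and yr = 2019
and yrday >= 319
GROUP BY isProcessed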
10) Manually Force a DiveSummary Update
Next, if you are doing a forced reprocessing: after all processing has completed, you should also force an update of the DiveSummary table.
Using RemoteDesktop, log in to draco as SHORE\DB_EXPD_TASKEXEC
Open/Run TaskScheduler
Execute job: EXPD_UpdateDirtyDiveSummaries
11) Update QC Flags
You will need to re-apply QC flags to any dives that Ed has instructed to be flagged.
That process is fully described HERE: https://bitbucket.org/mbari/se-ie-doc/src/master/docs/systems/expedition-database/o2-qc-management.md