Ship and ROV Core Data Workflow
Introduction
Data is logged on the ships 24/7 into yearday files in:
- WesternFlyer: /coredata/data
- RachelCarson (pre-4/2021 upgrade): /coredata/data
- RachelCarson (starting 4/2021): /home/ops/corelogging/rc/data
Transfer to shore and processing are handled exactly the same for both ships; the only significant difference is the platform names involved.
Due to the recent datamanager/navproc replacement effort (4/2021) on the Rachel Carson, the root directory paths and some of the scripts have diverged slightly (e.g. some additional cmd line arguments are passed in to check platform names, etc.).
- ship designators are wfly and rcsn
- rov designators are docr, tibr, and vnta
I will describe the Carson processing path in detail below, but the same workflow applies to the Flyer.
Data Workflow Diagram
The workflow diagram (a Mermaid flowchart) maps each Draco TaskScheduler job to the scripts it invokes:
- DocRickettsDivelogToPerseus: cpyDivelog2Perseus.exe (a XOJO app built to an executable)
- EXPD_ExpeditionDataLoads: ExpeditionDataLoads_perseus.bat -> popExpeditionData_perseus.pl, popEDwithDBqueries_perseus.pl
- EXPD_ExpireAndDeleteWaypoints_0: EXPD_ExpireAndDeleteWaypoints_0.sql -> calls sp_expireWaypointsMsg
- EXPD_ExpireAndDeleteWaypoints_7: EXPD_ExpireAndDeleteWaypoints_7.sql -> calls sp_expireWaypointsMsg
- EXPD_NightlyCamlogLoadToPerseus_Late: loadNightlyCamlog_perseus.bat -> LoadCamlogdata_perseus.pl
- EXPD_NightlyMinirovNavLoadToPerseus: load_minirov_nav.bat -> UpdateRovNavdata_perseus.pl, LoadRovNavdata_perseus.pl
- EXPD_NightlyNavLoadToPerseus_Early: loadALLNightlyNavdata_perseus.bat -> UpdateRovNavdata_perseus.pl, LoadRovNavdata_perseus.pl
- EXPD_NightlyNavLoadToPerseus_Late: same as EXPD_NightlyNavLoadToPerseus_Early
- EXPD_NightlyNavUpdateToPerseus: updateALLNightlyNavdata_perseus.bat -> UpdateRovNavdata_perseus.pl, LoadRovNavdata_perseus.pl
- EXPD_NightlyProcessMiniRovctdToPerseus: processRawMinirovCtd.bat -> processMinirovCtdData_perseus.pl
- EXPD_NightlyRovctdLoadToPerseus_Early: loadNightlyRovctd_perseus.bat -> updateRovCtdCfg_perseus.pl, loadRawRovctdLogrFiles_perseus.pl, processRawRovctdLogrData_perseus.pl
- EXPD_NightlyRovctdLoadToPerseus_Late: same as EXPD_NightlyRovctdLoadToPerseus_Early
- EXPD_UpdateDirtyDiveSummaries: EXPD_UpdateDirtyDiveSummaries.sql -> calls AdminUpdateDiveSummaryByDiveID
- VentanaCtdCfgToPerseus: source table -> cpyRovCtdCfg2Perseus.exe (a XOJO app built to an executable) -> RovCtdCfg_Ventana
- VentanaDivelogToPerseus: dive source table -> cpyDivelog2Perseus.exe (a XOJO app built to an executable) -> VentanaPilotsDive
Script sources: https://bitbucket.org/mbari/cpyrovctdcfg2perseus/src/master/, https://bitbucket.org/mbari/cpydivelog2perseus/src/master/, and https://bitbucket.org/mbari/expd_data_loads/src/master/
TODOS
- Find the source code for makegeo3DReplayFiles_perseus.pl and get it into a repository
- Find the source code for makeKmlFiles_perseus.pl and get it into a repository
- In the diagram above, link the Draco processes to the source log files they pull from
Step 1 - Ship To Shore FTP
Ship to shore nightly transfers are invoked by cron on the on-board 'navproc' computer. The cron cmds for the RachelCarson and WesternFlyer are shown below:
59 23 * * * /home/ops/corelogging/rc/scripts/rcAnnoFtpTransfer.perl rc > /home/ops/corelogging/rc/logs/out_nightly_trnsfr.log 2>&1
0 0 * * * /coredata/bin/wflyAnnoFtpTransfer.perl wfnavproc1 > /coredata/logs/.out_late_trnsfr 2>&1
Note that the Rachel Carson script requires the argument "rc" and the Flyer requires "wfnavproc1". Also, the machine hostname, login, and the directory paths to executables and datafiles differ on the two systems. The newer Carson script uses the argument to perform some additional checks before setting the target ftp and local data source directories, etc.
The diagram below shows the ship-to-shore workflow. An identical workflow occurs for the Western Flyer.
- Various loggers create daily log files. Only the three 'core' loggers are shown in the diagram for brevity. There are many more that are handled by the nightly transfer process (winfrog, lodestar, m3rs, dataprobe, etc.)
- cron on the navproc/corenav machine invokes the xxAnnoFtpTransfer.perl script at around midnight.
- the perl script gzips the prior day's files and pushes them to the on-shore MBARI anonymous ftp server if it is 'reachable' over the network. The target destination is the public incoming/rcnavproc1 or incoming/wfnavproc1 directory.
- if successful, the script moves the file to a 'transfercomplete' sub-directory. If the transfer is not successful, it leaves the file in place and tries again the next night. (A sketch of this logic follows Figure 1 below.)
Warning
Nothing in the logr filename designates the ship. Be careful when configuring the target ftp directories not to mix up files of the same name from different platforms.
FIGURE 1.
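The following is a minimal sketch of the push logic described above, not the actual xxAnnoFtpTransfer.perl; the ftp host, login, and local paths are placeholders.

```perl
#!/usr/bin/perl
# Sketch of the nightly push: gzip the prior day's files, push them to
# the shore ftp server, and stage successes in 'transfercomplete'.
use strict;
use warnings;
use Net::FTP;
use File::Copy qw(move);

my $datadir = '/home/ops/corelogging/rc/data';   # placeholder data dir
my @files   = grep { -f } glob("$datadir/*");    # prior day's log files

# If the shore server is unreachable, die; the files stay put and the
# next night's cron run retries them.
my $ftp = Net::FTP->new('ftp.shore.example.org', Timeout => 120)
    or die "shore ftp not reachable: $@\n";
$ftp->login('anonymous', 'ops@ship') or die 'login failed: ' . $ftp->message;
$ftp->binary;
$ftp->cwd('incoming/rcnavproc1')     or die 'cwd failed: '   . $ftp->message;

for my $file (@files) {
    system('gzip', '-f', $file) == 0 or next;    # compress the day file
    my $gz = "$file.gz";
    if ($ftp->put($gz)) {
        # success: move aside so tomorrow's run does not push it again
        move($gz, "$datadir/transfercomplete/") or warn "move $gz: $!\n";
    }                                            # failure: leave in place, retry
}
$ftp->quit;
```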
Shore-side FTP to local 'DataTransfer' directory
A script run hourly by cron on machine coredata (as user coredata) transfers the files from the anonymous ftp server to a local staging 'datatransfer' directory (either /u/coredata/shipandrov/datatransfer/rcsn or /u/coredata/shipandrov/datatransfer/wfly).
An additional 'kill script' runs shortly after to abort any ftp transfer that may have hung.
These scripts and code are in bitbucket at https://bitbucket.org/mbari/navprocessing/src/master/
# Scripts below check frequently (hourly) for files in anonymous ftp dirs and move them to the preprocessing location
# on coredata ~/shipandrov/datatransfer
0 * * * * /u/coredata/shipandrov/navprocessing/bin/wflyAnnoFtpToShore.sh > /u/coredata/.log_wflyAnnoFtpToShore 2>&1
1 * * * * /u/coredata/shipandrov/navprocessing/bin/rcsnAnnoFtpToShore.sh > /u/coredata/.log_rcsnAnnoFtpToShore 2>&1
59 * * * * /u/coredata/shipandrov/navprocessing/bin/yourkillingme.sh > /u/coredata/.log_yourkillingme 2>&1
FIGURE 2.
Data files will remain on the ftp server until they 'age' to the point that IS automatically purges them (6 days by IS policy?). As a result, the same files will be swept from 'ftp' to 'datatransfer' many times. This is normal, and the repeated files are ignored in follow-on processing steps.
Processing from datatransfer to Archives and RovNavEdit
Scripts below run the nightly processing of logr files in several steps.
20 22 * * * /u/coredata/shipandrov/navprocessing/bin/NightlyRC > /u/coredata/.log_LateNightlyRC 2>&1
25 22 * * * /u/coredata/shipandrov/navprocessing/bin/NightlyWF > /u/coredata/.log_LateNightlyWF 2>&1
NightlyRC and NightlyWF are shell scripts that each call two perl programs as described below.
- NightlyRC calls moveLogrFiles.perl, which:
  - moves files from 'datatransfer' to permanent, write-protected archives in /mbari/ShipData/logger/
  - removes any identical files from 'datatransfer' that are already in the archive
  - removes unneeded m3rs files
  - if it is a 'shipnavlogr' file (ship & rov nav combined), places a copy into the ~/shipandrov/navprocessing/$ship/todo directory to stage it for the next step, which does nav processing
Note
The next diagram shows the first step of the NightlyRC script workflow: the call to moveLogrFiles.perl. A parallel workflow exists for the NightlyWF.
FIGURE 3.
It is important to note that the datafile transfer pipeline occasionally fails, and we end up with a few zero-length or corrupt files. This is usually due to a network issue, a problem with the ftp transfer, or a transfer that was initiated at sea before midnight GMT. moveLogrFiles.perl is very careful to check that any 'duplicate' file it detects coming in through 'datatransfer' is an exact match to the file already in the archive (a sketch of this check follows the list below). If it is not, it sends a panic email saying you must resolve the conflict manually. You should:
- Investigate the files stuck in ~/shipandrov/navprocessing/$ship/todo.
- Compare the file length to that of the same file in the ShipData archive.
- You can also compare to the file stored on the ship (usually found in the 'transfercomplete' directory).
- It is usually obvious which file should be kept - it is nearly always the bigger one.
- Once you have determined which file(s) to retain:
  - If the 'good' file(s) are already in the archive, simply delete the 'bad' file(s) from both the 'datatransfer' and the FTP pub incoming directories.
  - If the 'bad' file(s) are in the archive, you will need to ask IS to delete them (archive files are intentionally write/delete protected). After the file(s) are gone from the archive, moveLogrFiles.perl should pick up the 'good' files from 'todo' the next time it runs.
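For reference, here is a hedged sketch of the kind of exact-match test moveLogrFiles.perl performs; the file names are hypothetical and the real script's details may differ.

```perl
use strict;
use warnings;
use Digest::MD5;

# md5 of a file's contents
sub md5_of {
    my ($path) = @_;
    open my $fh, '<:raw', $path or die "open $path: $!";
    return Digest::MD5->new->addfile($fh)->hexdigest;
}

my $incoming = 'datatransfer/rcsn/nav2021062rcsn.gz';             # hypothetical
my $archived = '/mbari/ShipData/logger/2021/rcsn/nav2021062rcsn.gz';

if (-e $archived) {
    if (-s $incoming == -s $archived
        && md5_of($incoming) eq md5_of($archived)) {
        unlink $incoming;   # exact duplicate: safe to discard quietly
    }
    else {
        # mismatch: the real script sends a panic email at this point
        warn "CONFLICT: $incoming differs from archived copy\n";
    }
}
```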
Note
The next diagram shows the second step of the NightlyRC script workflow: the call to processNav.perl. A parallel workflow exists for the NightlyWF.
FIGURE 4.
A few things to note about ~/shipandrov/navprocessing/bin/processNav.perl:
- rawpath (~/shipandrov/navprocessing/$ship/todo) - where it looks for incoming files
- prcpath (~/shipandrov/navprocessing/$ship/tmp) - a scratch directory for temp files
- archpath (/mbari/RovNavEdit/) - the root path for results
- it creates tmp ship and rov nav files in the format required for editing by both MB-System and the Expd database load scripts
- it performs very light cleaning, etc. of the tmp ship and rov nav files
- it ignores data when the ship is at the MBARI dock
- it copies the shipdata .txt file from tmp to RovNavEdit/YYYY/shipname
- it copies the rovnavdata .txt file from tmp to RovNavEdit/YYYY/rovname
- it deletes the tmp .txt files
- it deletes the gzip file in 'todo'
Warning
The processNav.perl script contains hard-coded values for the datamanager item names expected in the logr files. If any logged item names change within the logr files, these hard-coded names must be updated to match.
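For illustration only, this is the kind of hard-coded name mapping the warning refers to; the item names below are made up, not the actual datamanager names.

```perl
# Hypothetical example of hard-coded datamanager item names that must
# track the logr files; if the ship renames an item, edit here to match.
my %expected_items = (
    ship_lat     => 'SHIP.NAV.LATITUDE',     # made-up names
    ship_lon     => 'SHIP.NAV.LONGITUDE',
    ship_heading => 'SHIP.NAV.HEADING',
    rov_lat      => 'ROV.NAV.LATITUDE',
    rov_lon      => 'ROV.NAV.LONGITUDE',
);
```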
Nav Editing
The .txt files created above in the /mbari/RovNavEdit directories are the files that will be loaded into the Expd database 'raw nav' data tables.
- Rov nav .txt files will also be hand-edited by VideoLab personnel using MB-System or other editing tools.
- In the past the year-day organized .txt files were hand-edited, but this produced artifacts at the year-day boundaries.
- This step will produce a file of the same format with 'edited' appended to the filename.
- Ship nav files are not currently edited.
Note
This completes the processing workflow that occurs on machine coredata8. Everything is now considered archived and staged for database loading. The loads are performed by Windows TaskScheduler jobs run on the Windows server "Draco".
RovCtd Configuration and Calibration Data
The LabVIEW CTD GUI running on the rc_rctd and wf_rctd computers is where the CTD calibration info is maintained by the CTD Tech. There is an entire workflow that gets that information to shore and into the Expd database. It is described HERE
Database Loads on Draco
Preliminaries
- Load scripts and direct SQL update queries are run via the Windows TaskScheduler on Draco
- User login is SHORE\DB_EXPD_TASKEXEC
- All scripts and queries reside on drive D: in directory EXPD_Data_Loads
- They are all in the bitbucket repository at: Expd_Data_Loads
Tasks on the scheduler typically either:
- Execute a .bat file that then calls perl or python programs with cmd line arguments that perform a load
- Directly execute a windows executable file (.exe) to perform some task.
- Directly execute some SQL command on the SqlServer via SQLCMD.EXE
All jobs are scheduled to run at least nightly. They can also be run at any time by right-click on the job name and selecting 'Run'.

Expd Data Loads
The common, repeated pattern we use to manage all of the various coredata streams from the ships and ROVs is a 'load table' specific to each data stream.
- Coredata are logged in 'yearday' data files. Data files are organized in the archives by year and platform name subdirectories
- Individual data file loads are managed and tracked by a "load table"
- Scripts are typically passed platform and year cmd line arguments.
- Scripts create a list of all yearday files for that platform/year.
- Each filename is checked against the 'load table' to see if an entry exists and whether it has been previously loaded.
- If 'isLoaded' or 'isBlocked' is set, the script just skips to the next file.
- If the file is not loaded yet, the script makes an entry in the load table and then attempts the load.
- If the load fails, the 'isLoaded' flag is left at 0; otherwise it is set to 1.
- If more than one year's worth of data needs to be checked for load/reload, the script simply needs to be called again with a different year value. (A sketch of this pattern follows below.)
There may be other fields and flags in a data stream's 'load table' that are specific to the type of data. These may be flags for other downstream processing tasks that also utilize the 'load table' to manage the workflow.
- isLoaded flags typically have a trigger associated which can be used to back-out dependent rows in the database.
- The isProcessed flag in some tables, such as rovctdload, has a trigger to back out dependent 'processed' rows in the database.
- isBlocked flags can be manually set by the dba to prevent the back-out or reload of data. This flag is rarely used - e.g. when abnormal steps have been taken to manually correct/modify data in the database base tables (such as substituting CTD pressure for Digiquartz pressure when the sensor failed).
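A minimal sketch of the load-table pattern as a perl/DBI loop; the load-table name, columns, archive path, and connection details are illustrative, not the exact Expd schema.

```perl
use strict;
use warnings;
use DBI;

my ($platform, $year) = @ARGV;    # e.g. 'vnta' 2021

my $dbh = DBI->connect('dbi:ODBC:EXPD', 'loaduser', 'secret',
                       { RaiseError => 1, AutoCommit => 1 });

# stub: parse one yearday file and bulk-insert its rows (omitted here)
sub load_one_file { my ($dbh, $file) = @_; return 1; }

for my $file (sort glob "/mbari/ShipData/logger/$year/$platform/*.gz") {
    my ($loaded, $blocked) = $dbh->selectrow_array(
        q{SELECT isLoaded, isBlocked FROM SomeStreamLoad WHERE filename = ?},
        undef, $file);

    next if $loaded or $blocked;             # already loaded or blocked: skip

    # no entry yet: take one, then attempt the load
    $dbh->do(q{INSERT INTO SomeStreamLoad (filename, isLoaded) VALUES (?, 0)},
             undef, $file) unless defined $loaded;

    my $ok = load_one_file($dbh, $file);
    $dbh->do(q{UPDATE SomeStreamLoad SET isLoaded = ? WHERE filename = ?},
             undef, ($ok ? 1 : 0), $file);   # a failed load leaves isLoaded = 0
}
```

Re-running the script for a different year is just a matter of changing the cmd line arguments, which is why one-script-per-stream scales across the whole archive.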
Note
The sections below describe the load for each major coredata stream. You will notice the above pattern repeated with slight variations.
Shipnav Ingest to Expd
After the extracted navYYYYDDDSSSS.txt files appear in the Atlas RovNavEdit/YYYY/rcsn or wfly directories, they will be loaded into the Expd database. Unlike ROV nav, there is no manual editing step performed on ship nav data. The files are loaded by the next TaskScheduler run of the job EXPD_NightlyNavLoadToPerseus_Early or '_Late' - we run twice because some people want the nav data soon after the ship returns. (See also Figure 4 above for how the files get placed here.)
The Draco TaskScheduler job runs the large loadAllNightlyNavdata_perseus.bat script, which does many things - including invoking the perl program D:\EXPD_Data_Loads\Navdata\LoadShipNavdata_perseus.pl for each ship, by year/yearday.
This perl program scans the Atlas share RovNavEdit/YYYY/rcsn or wfly directory for ship nav files that need loading.
It uses the EXPD database RawShipNavLoad table to determine this.
- If there is no entry for the file it attempts a load and makes an entry
- If there is an entry and the isLoaded flag is false (and isBlocked is not set), it attempts a load.
Note
The perl program has several hard-coded lat/lon locations that MBARI vessels have commonly docked at. This is to prevent loading unnecessary data when the ship is in port. In addition to Moss Landing dock, there are locations for ports of call in Hawaii, Gulf of Mexico, Eureka, Astoria etc.
- Should a data reload be needed, using a tool like Aquadata Studio to set the isLoaded flag to 0 will cause a trigger to fire and back out all dependent data rows.
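A hedged sketch of the kind of in-port check the note above describes; the coordinates and threshold here are placeholders, not the values actually hard-coded in LoadShipNavdata_perseus.pl.

```perl
use strict;
use warnings;

# Placeholder dock locations; the real script hard-codes Moss Landing
# plus ports of call (Hawaii, Gulf of Mexico, Eureka, Astoria, etc.)
my @docks = (
    { name => 'Moss Landing', lat => 36.80, lon => -121.79 },
);

# true if a fix is near a known dock; a crude box test is adequate at
# dock scales (0.01 deg of latitude is roughly 1.1 km)
sub in_port {
    my ($lat, $lon) = @_;
    for my $d (@docks) {
        return 1 if abs($lat - $d->{lat}) < 0.01
                 && abs($lon - $d->{lon}) < 0.01;
    }
    return 0;
}

# in the load loop: skip records logged while tied up at the dock
# next if in_port($lat, $lon);
```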
Rovnav Ingest to Expd
ROV nav data are handled much the same way as the ship nav data described above. Raw rov nav files are loaded automatically from the RovNavEdit share on Atlas. After VideoLab personnel perform raw nav edits, an edited file is created in the same directory as the raw file (eg nav2021062vntaedited.txt). This procedure is now deprecated in favor of editing by_dive data files as described in expd_mapping_workflow.
The two 'load tables' for rov are RawRovNavLoad and CleanRovNavLoad. They operate in the same way as the Shipnav Ingest described above.
During the Draco TaskScheduler pass that runs the loadAllNightlyNavdata_perseus.bat script, several perl programs are invoked to load rov nav.
- First it calls perl D:\EXPD_Data_Loads\Navdata\UpdateRovNavdata_perseus.pl for each ROV. This step examines timestamps on the edited rov nav files to detect any nav that has been re-edited. If found, that entry's isLoaded flag is set to 0, causing the previously loaded data to be deleted.
Warning
The .bat script only contains calls to UpdateRovNavdata_perseus.pl for the current year and several years in the past. Every new year it needs to be edited to add entries for the year just ended. See New Year Rollover.
- The .bat script then calls D:\EXPD_Data_Loads\Navdata\LoadRovNavdata_perseus.pl with vehicle, year, and raw|clean arguments (the vehicle is vnta or docr).
- The .bat script lastly invokes the perl program D:\EXPD_Data_Loads\Navdata\MergeNavdata_perseus.pl, which adds ship lat/lon/heading back into the raw rov nav tables so they are basically 'pre-joined' for more convenient query writing.
- Should a data reload be needed, using a tool like Aquadata Studio to set the isLoaded flag to 0 will cause a trigger to fire and back out all dependent data rows.
The loadAllNightlyNavdata_perseus.bat script looks back only a few years for newly edited nav files. Another script, updateAllNightlyNavdata_perseus.bat, loops through all years of ROV data and is scheduled to run nightly by the task scheduler.
In order for downstream databases to know when re-edited ROV nav data has been reloaded, the start_datetime and end_datetime columns in the CleanRovNavLoad table can be used for the time bounds of the data in the reloaded nav file.
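For example, a downstream consumer could use those columns roughly like this; the filename column and the connection details are assumptions, only the table and datetime columns come from the description above.

```perl
use strict;
use warnings;
use DBI;

my $dbh = DBI->connect('dbi:ODBC:EXPD', 'reader', 'secret',
                       { RaiseError => 1 });

# which edited-nav loads cover which time ranges?
my $rows = $dbh->selectall_arrayref(q{
    SELECT filename, start_datetime, end_datetime
    FROM   CleanRovNavLoad
    WHERE  isLoaded = 1
}, { Slice => {} });

for my $r (@$rows) {
    printf "%s : %s .. %s\n",
        $r->{filename}, $r->{start_datetime}, $r->{end_datetime};
}
```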
Camera Ingest to Expd
ROV camera log data are handled much the same way as the ship nav data described above.
Raw camlog files are loaded automatically. These data are loaded directly from the raw logr files found in the ShipData/logger share on Atlas. No intermediate file is needed.
The Draco TaskScheduler job runs the large loadNightlyCamlog_perseus.bat script, which invokes the perl program D:\EXPD_Data_Loads\Camlog\LoadCamlogdata_perseus.pl for each rov, by year/yearday.
This perl program scans the Atlas share ShipData/logger/YYYY/vnta or docr directory for camlog files that need loading.
The load table for camlog is CamlogLoad. It operates in the same way as the Shipnav Ingest described above.
Prior to early 2018, a good tape timecode was required for a camlog record to be inserted; camlog data was critical for matching tape timecode to GMT clock time. When we no longer recorded to tape, the perl script was modified to use the ROV.CTD.INWATER flag to load data only while the sub was in use, as data such as zoom, focus, and iris are still important.
Note
A database load of main camera Pan and Tilt has never been needed. Starting in April 2021, pan and tilt are now logged in the camlog file for Ventana. For DocRicketts and earlier Ventana, Pan and Tilt were logged in a separate file - but only for the most recent few years.
RovCtd Ingest to Expd
ROVCTD log data are currently handled much the same way as the ship nav data described above.
Raw files are loaded automatically by a TaskScheduler job on Draco. These data are loaded directly from the raw logr gzip files found in the ShipData/logger share on Atlas. No intermediate file is needed.
The current load table for managing rovctd loads AND PROCESSING is RawCtdLoad. From late 1999 thru mid-2008 the load table is RovCtdLoad. Loads and processing before then were handled quite differently (see Some Important History below).
The rovctd load sequence is more complex than for the nav and camlog loads due to the need to apply calibrations, filter and despike the data and store both raw and bin averaged data.
The Draco TaskScheduler job runs the large loadNightlyRovctd_perseus.bat script, which invokes a sequence of perl programs.
- Processing is run directly against the raw data in the database using calibration data in the RovCtdCfg table. The .bat script first invokes the perl program D:\EXPD_Data_Loads\RovCtd\updateRovCtdCfg_perseus.pl to make sure the latest/greatest calibrations are in the system. (See: Calibration Workflow.)
- The perl program D:\EXPD_Data_Loads\RovCtd\loadRawRovctdLogrFiles_perseus.pl is run for each rov, by year/yearday. It loads raw data (P, T, C and analog data) into 'raw tables' partitioned by vehicle and year (eg tables VentanaRawCtdData_YYYY or DocRickettsRawCtdData_YYYY). This perl program scans the Atlas share ShipData/logger/YYYY/vnta or docr directory for rovctdlogr files that need loading.
- After running the raw loads for the current year for both vehicles, the .bat script runs the raw load for past years that may have been unloaded.
Warning
The .bat script only invokes loads for the current year and several years in the past. Every new year it needs to be edited to add entries for the year just ended. See New Year Rollover.
- The .bat script then invokes D:\EXPD_Data_Loads\RovCtd\processRawRovctdLogrData_perseus.pl several times. This is the program that:
  - retrieves raw Seabird CTD data from the database
  - retrieves the correct calibration from the RovCtdCfg table
  - despikes the data, applies the calibration, and computes derived variables like salinity and O2
  - uploads the results to the processed high-frequency data tables
  - sends the sql cmds necessary to bin the data into the 15-sec bin data tables on the server (sketched below)
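The binning step in the last bullet amounts to SQL of roughly this shape; the table and column names in this sketch are illustrative, not the actual Expd schema.

```perl
use strict;
use warnings;
use DBI;

my $dbh = DBI->connect('dbi:ODBC:EXPD', 'loaduser', 'secret',
                       { RaiseError => 1 });

# Group processed high-frequency rows into 15-second bins and store the
# averages (illustrative T-SQL; the real tables/columns differ).
$dbh->do(q{
    INSERT INTO VentanaCtdBin15sec_2021 (bintime, temperature, salinity)
    SELECT DATEADD(second,
                   (DATEDIFF(second, '2021-01-01', rectime) / 15) * 15,
                   '2021-01-01'),
           AVG(temperature),
           AVG(salinity)
    FROM   VentanaProcessedCtdData_2021
    GROUP  BY DATEDIFF(second, '2021-01-01', rectime) / 15
});
```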
Note
isLoaded is used to signal the raw data load and trigger the unload if needed. isProcessed is used to signal the processing script or trigger the processed data unload. Raw data very seldom needs to be unloaded. isProcessed is often used to cause a reprocessing when calibrations change, etc.
Warning
Before triggering an unload of raw data via the isLoaded flag, first trigger the unload of the processed records via the isProcessed flag.
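In other words, the unload order looks roughly like this; RawCtdLoad is the load table named above, while the filename value and connection details are placeholders.

```perl
use strict;
use warnings;
use DBI;

my $dbh  = DBI->connect('dbi:ODBC:EXPD', 'dba', 'secret', { RaiseError => 1 });
my $file = 'rovctd2021062vnta.gz';    # placeholder filename

# 1) first back out the processed rows (trigger fires on isProcessed)
$dbh->do(q{UPDATE RawCtdLoad SET isProcessed = 0 WHERE filename = ?},
         undef, $file);

# 2) only then back out the raw rows (trigger fires on isLoaded)
$dbh->do(q{UPDATE RawCtdLoad SET isLoaded = 0 WHERE filename = ?},
         undef, $file);
```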
Dives and DiveSummary table
Dives and Expeditions are created through the Expd web pages. Scientists and pilots often think of sets of data in terms of dive numbers with start and end times, and this is most often the way the database is queried. But the core data loads described above have no knowledge of the concept of a dive with a start and end. Until a Dive is created (usually through the post-cruise reports), the data can't be shown or queried 'by dive'.
There is an important table called DiveSummary which contains certain important statistics related to the dive. It gets a new record inserted whenever a dive is created (insert trigger on table Dive). It also gets updates via Dive table triggers when start/end times are changed. It has a 'dirtybit' that is set whenever these triggers fire, indicating that the dive needs to be re-summarized.
The TaskScheduler job (EXPD_UpdateDirtyDiveSummaries) runs nightly and executes the sql contained in the file D:\EXPD_Data_Loads\SqlExecScripts\EXPD_UpdateDirtyDiveSummaries.sql.
That query checks the dirtybit for any dives needing attention, computes new statistics, then clears the dirtybit.
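The sql boils down to a loop of roughly this shape, sketched here in perl/DBI for clarity; the column names are assumptions, and only the stored procedure name comes from the workflow diagram above.

```perl
use strict;
use warnings;
use DBI;

my $dbh = DBI->connect('dbi:ODBC:EXPD', 'taskexec', 'secret',
                       { RaiseError => 1 });

# find dives flagged dirty by the Dive-table triggers
my $dirty = $dbh->selectcol_arrayref(
    q{SELECT diveid FROM DiveSummary WHERE dirtybit = 1});

for my $diveid (@$dirty) {
    # recompute the dive's summary statistics, then clear the flag
    $dbh->do(q{EXEC AdminUpdateDiveSummaryByDiveID ?}, undef, $diveid);
    $dbh->do(q{UPDATE DiveSummary SET dirtybit = 0 WHERE diveid = ?},
             undef, $diveid);
}
```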
Some Important History
Rovctd data handling has changed numerous times since we started in 1988.
At that time we only had Ventana. Data were collected on low-capacity floppy disks using Seabird CTD PC software. (The earliest CTD data may have been recorded by the MOS system and later moved to Seabird PC software - I can't quite recall 30 years later!)
Because of the storage limitations of the hard drives and floppies of the time, the data were bin averaged to 15 seconds.
Floppy disks were hand-carried from the ship to Pacific Grove by the scientists, along with nav data on magnetic tape cartridges from the old 'Meridian Ocean Systems' (MOS) system, and VCR tapes.
The Sybase database "MODB" was implemented to hold the data, and 'load paths' were written in 'C' to process and load the data. CTD data were loaded into MODB on the HP850 after being despiked and bin-averaged.
In 1999 we left Sybase for MSSql server and the current Expd database. Legacy data were bulk-copied from MODB to Expd. New programs were implemented in linux to log and process the data. This was due to the need to handle the sensors chosen for rov Tiburon, which Seabird did not support (Falmouth sensors, and later the Aanderaa O2 Optode).
During the "Tiburon years" (1997-2007/8) the load paths and processing for the two vehicles also diverged due to differences in the shipboard computing systems. The Western Flyer ran the datamanager/navproc architecture that was developed for the Tiburon control system. The Point Lobos continued to run the old MOS for nav. Later (~8/1999) the Pt Lobos was upgraded to the datamanager/navproc architecture for core datastreams (although not for the subsea control system, as with the datamanager/navproc on Tiburon).
The older data, 1988 thru approximately 2009, have a very different archive layout and processing history. The source raw and processed files are archived on the Atlas ShipData share in the subdirectory "rovctd". The directory layout is rovname/year/yrday. The critical files for the database load scripts of the time were the "der" (derived) files (eg 2008316vntaCtdDer.txt). These could be 15-second averages, later 2-second averages, and eventually 1-second averages (and the bin averaging was eventually done directly in the database via SQL cmds, as described above in RovCtd Ingest).
Warning
From 1988 thru the end of 1997 we only archived 15-second bin data. Raw data starts in 1998 in Expd. We continue to compute 15-sec bin data because users want to be able to query or compute statistics across the entire history. Raw high-frequency (1 or 2 sec) data started to be loaded directly into base tables in 1998; prior to that we only stored processed data as 15-second bin averages.
NO_ITEM, NO_PROV, NO_PUB, NO_SUB In Log Files and Usage In Processing
When the log files are generated on the ship, there is logic to decide whether a special flag should be written in place of the data itself for certain situations. There are four possible flags: NO_ITEM, NO_PROV, NO_PUB, and NO_SUB. The meaning of each is as follows:
- NO_ITEM: This means that in the configuration for the logger on the ship, the item name for this column was not available in the data. This can happen when there just isn't any data available in the LCM message coming from the publisher.
- NO_PROV: There are subtle differences depending on the instrument providing the data, but mainly this flag indicates that data timed out, was not received correctly, or was not parsed correctly.
- NO_PUB: The logger checks whether the timestamp of the last successfully parsed data is older than a pre-defined 'maxAge' in the logger configuration file; if so, this flag gets set.
- NO_SUB: This indicates that while an item was specified in the configuration file for the logger, no data subscriber was actually instantiated and usually indicates a logger mis-configuration.
To understand how these flags are used in data processing, the table below describes if and how each flag is handled by the processing scripts.
| Status | processNav.pl | loadRawRovctdLogrFiles_perseus.pl | LoadCamlogdata_perseus.pl | LoadLegacyCamlogdata_perseus.pl | LoadShipNavdata_perseus.pl |
|---|---|---|---|---|---|
| NO_ITEM | Equivalent to NO_PROV; indicates no data available for the field. Usually replaced with a missing value | Equivalent to NO_PROV; indicates no data available for the field. Usually replaced with a missing value | Equivalent to NO_PROV; indicates no data available for the field. Usually replaced with a missing value | Equivalent to NO_PROV; indicates no data available for the field. Usually replaced with a missing value | Equivalent to NO_PROV; indicates no data available for the field. Usually replaced with a missing value, but also used to check for valid GPS data (returns 1 or 0 accordingly) |
| NO_PROV | Equivalent to NO_ITEM; indicates no data available for the field. Usually replaced with a missing value | Equivalent to NO_ITEM; indicates no data available for the field. Usually replaced with a missing value | Equivalent to NO_ITEM; indicates no data available for the field. Usually replaced with a missing value | Equivalent to NO_ITEM; indicates no data available for the field. Usually replaced with a missing value | Equivalent to NO_ITEM; indicates no data available for the field. Usually replaced with a missing value, but also used to check for valid GPS data (returns 1 or 0 accordingly) |
| NO_PUB | Not used | Not used | Not used | Not used | Not used |
| NO_SUB | Not used | Not used | Not used | Not used | Not used |
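As the table shows, the load scripts treat NO_ITEM and NO_PROV as 'no data' and substitute a missing value. A minimal sketch of that substitution; the missing value and the comma-delimited record format are assumptions.

```perl
use strict;
use warnings;

my $MISSING = -999.99;    # assumed missing-value convention

while (my $line = <STDIN>) {
    chomp $line;
    my @fields = split /,/, $line;
    for my $f (@fields) {
        # NO_ITEM and NO_PROV both mean "no data for this field"
        $f = $MISSING if $f eq 'NO_ITEM' || $f eq 'NO_PROV';
        # NO_PUB and NO_SUB are not used by the load scripts
    }
    print join(',', @fields), "\n";
}
```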