MiniROV Data Processing

This page documents the software developed to integrate the data coming off the MiniROV into the Expedition (Expd) database.

VideoLab personnel (Lonny) are responsible for creating the DiveNumber/Ship/Scientist entries using the MiniROV Divelog Application. This is a web application that connects to the MINIROV_DiveLog database on the MSSQL server on perseus and allows the user to enter and edit dives in that database. Currently only Lonny has the password, in order to reduce entry errors. Tasks running periodically on the machine coredata(8) pull new dive entries over into the Expd database. The MiniROV Divelog application is described in detail in the DiveLog section of the se-ie-doc site.

MINIROV_DiveLog Database to Expd Database Ingest

Note

Divelog ingest into Expd is completely asynchronous and decoupled from the routine raw minirov data file processing. Dive entries are convenient for downstream uses such as the Expd web app or 'by dive' data searches, but they are completely unnecessary for successful raw data ingest and processing.

Divelog data flow is shown below

graph TD
    divelog-minirov -- user adds dive --> MINIROVDiveLogDive
    click divelog-minirov "https://coredata8.shore.mbari.org/divelog-minirov/" "Click to go to web app"
    MINIROVDiveLogDive --> minirovdivelog2expd.py
    minirovdivelog2expd.py --> sp_insertDive
    sp_insertDive --> EXPDDiveTable[Dive]
    subgraph Coredata8 as user coredata
        divelog-minirov[MiniROV Divelog Web App]
        cron --> runMinirovDivelogUploader
        runMinirovDivelogUploader --> minirovdivelog2expd.py
    end
    subgraph "Perseus MSSQL"
        subgraph MINIROV_DiveLog Database
            MINIROVDiveLogDive[Dive]
        end
        subgraph EXPD Database
            EXPDDiveTable[Dive]
            sp_insertDive
        end
    end

An entry in the crontab (shown below) runs the runMinirovDivelogUploader shell script.

#merge from minirovdivlog database to Expd.Dive table
00 * * * * /u/coredata/minirov-cos8/scripts/runMinirovDivelogUploader  >> /u/coredata/minirov/runlogs/minirovDivelogUploader.log 2>&1

The script itself consists of the following

#! /bin/sh
echo activating the python environment
source /u/coredata/minirov-cos8/venv-minirov/bin/activate
python /u/coredata/minirov-cos8/scripts/minirovdivelog2expd.py
exit 0

The shell script activates a python virtual environment and invokes the script minirovdivelog2expd.py that does all the work. minirovdivelog2expd.py queries the MINIROV_DiveLog Database for new Dive entries, and reformats them for entry into the EXPD database Dive table. The Dive insert is done using the sp_insertDive() stored procedure to ensure expd database referential integrity.
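The reformat-and-insert step can be sketched as follows. The column names, parameter names, and platform value below are hypothetical placeholders; the real interface is defined by minirovdivelog2expd.py and the sp_insertDive stored procedure.

```python
# Illustrative sketch only -- column and parameter names are invented;
# consult minirovdivelog2expd.py and sp_insertDive for the real interface.

def build_insert_call(divelog_row):
    """Map a MINIROV_DiveLog Dive row onto an sp_insertDive() invocation."""
    sql = ("EXEC sp_insertDive @RovName=%s, @DiveNumber=%s, "
           "@ShipName=%s, @ChiefScientist=%s")
    params = ("mini",                    # hypothetical platform name
              divelog_row["DiveNumber"],
              divelog_row["Ship"],
              divelog_row["Scientist"])
    return sql, params

sql, params = build_insert_call(
    {"DiveNumber": 42, "Ship": "Paragon", "Scientist": "J. Doe"})
# With pymssql this would then be run as: cursor.execute(sql, params)
```

Routing the insert through the stored procedure, rather than a direct INSERT, is what preserves the expd referential-integrity guarantees mentioned above.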

SEE ALSO: Create an Expedition and links to MiniROV dives for non-MBARI ship Expeditions

Mini ROV Data Processing

The Pilot 'Contract'

  • The MiniROV pilot is responsible for logging data at sea and transferring it to shore. The landing root directory path is: //atlas.shore.mbari.org/ProjectLibrary/901123.ROV1k/Data/Logs
  • Files will be of ‘type’ CTD, NAV, DVL, ROV
  • Files are to be organized there into subdirs by ‘type’
  • Note that ‘ROV_xxx’ files land in a subdir called 'Vehicle' instead of 'ROV'
  • All files are CSV with header tags describing column names
  • All of the data are timestamped with GMT seconds since 1-1-1970 (utcsecs)
  • Logging starts/stops when the vehicle enters/leaves the water.
  • Data files should start/end within a few seconds of each other.
  • CTD and NAV are required.
  • DVL (for altimeter) and ROV (for heading) are optional
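The contract above implies each log file can be read with a plain CSV parser, with the 'utcsecs' column converted directly to UTC datetimes. A minimal sketch (the sample rows and column names other than utcsecs are fabricated for illustration):

```python
# Parse a contract-style CSV: header tags on the first row, utcsecs column
# holding GMT seconds since 1970-01-01. Sample data is fabricated.
import csv
import io
from datetime import datetime, timezone

sample = io.StringIO(
    "utcsecs,depth,temperature\n"
    "1609459200.0,10.5,12.1\n"
    "1609459201.0,10.6,12.0\n")

rows = list(csv.DictReader(sample))
first = datetime.fromtimestamp(float(rows[0]["utcsecs"]), tz=timezone.utc)
print(first.isoformat())  # 2021-01-01T00:00:00+00:00
```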

Step 1. Archive, inspect and register original data files

graph TD
    subgraph "Perseus MSSQL"
        subgraph EXPD Database Tables
            MinirovLogfiles
            MinirovRawCtdLoad
            MinirovRawNavLoad
            MinirovRawDvlLoad
            MinirovRawRovLoad
        end
    end
    subgraph ShipData Atlas Share
        minirov-->ShipDataLogs[logs]
    end
    subgraph Coredata8-via cron as usercoredata
        runMinirovFileArchiver-->minirovfilearchiver.py
    end
    subgraph "ProjectLibrary 901123.ROV1k Share"
        /Data/Logs
    end
    /Data/Logs-->minirovfilearchiver.py
    minirovfilearchiver.py-->ShipDataLogs
    minirovfilearchiver.py-->MinirovLogfiles
    minirovfilearchiver.py-->MinirovRawCtdLoad
    minirovfilearchiver.py-->MinirovRawNavLoad
    minirovfilearchiver.py-->MinirovRawDvlLoad
    minirovfilearchiver.py-->MinirovRawRovLoad

The schematic above shows the pieces involved in the initial archive and registration of minirov raw data files.

The sequence diagram below shows the main actions between the pieces. Cron runs the runMinirovFileArchiver shell script. The shell script activates a python virtual environment and invokes the script minirovfilearchiver.py that does all the work. It is invoked four times, once for each file type.

sequenceDiagram
    participant Pilot
    participant ROV1kShare
    participant cron
    participant MinirovFileArchiver.sh
    participant minirovfilearchiver.py
    participant Expd
    participant Atlas/ShipData/minirov/logs
    Pilot->> ROV1kShare: Deposit files
    cron->> MinirovFileArchiver.sh: runs
    MinirovFileArchiver.sh->> minirovfilearchiver.py: runs w/CTD
    minirovfilearchiver.py-->>ROV1kShare: looks for new CTD file
    minirovfilearchiver.py->> Atlas/ShipData/minirov/logs: Copy to Atlas/ShipData/minirov/CTD archive
    minirovfilearchiver.py->> Expd: Register in MinirovLogfiles
    minirovfilearchiver.py->> Expd: Inspect & Register in MinirovRawCtdLoad
    MinirovFileArchiver.sh->> minirovfilearchiver.py: runs w/NAV
    minirovfilearchiver.py-->>ROV1kShare: looks for new NAV file
    minirovfilearchiver.py->> Atlas/ShipData/minirov/logs: Copy to Atlas/ShipData/minirov/NAV archive
    minirovfilearchiver.py->> Expd: Register in MinirovLogfiles
    minirovfilearchiver.py->> Expd: Inspect & Register in MinirovRawNavLoad
    MinirovFileArchiver.sh->> minirovfilearchiver.py: runs w/ROV
    minirovfilearchiver.py-->>ROV1kShare: looks for new ROV file
    minirovfilearchiver.py->> Atlas/ShipData/minirov/logs: Copy to Atlas/ShipData/minirov/ROV archive
    minirovfilearchiver.py->> Expd: Register in MinirovLogfiles
    minirovfilearchiver.py->> Expd: Inspect & Register in MinirovRawRovLoad
    MinirovFileArchiver.sh->> minirovfilearchiver.py: runs w/DVL
    minirovfilearchiver.py-->>ROV1kShare: looks for new DVL file
    minirovfilearchiver.py->> Atlas/ShipData/minirov/logs: Copy to Atlas/ShipData/minirov/DVL archive
    minirovfilearchiver.py->> Expd: Register in MinirovLogfiles
    minirovfilearchiver.py->> Expd: Inspect & Register in MinirovRawDvlLoad

The crontab entry on coredata8 to run the runMinirovFileArchiver shell script

#sweep files from Dales directories to atlas archive and register with database load table
30 17 * * * /u/coredata/minirov-cos8/scripts/runMinirovFileArchiver  >> /u/coredata/minirov/runlogs/minirovFileArchiver.log 2>&1

The contents of the runMinirovFileArchiver shell script.

#! /bin/sh
echo activating the python environment
source /u/coredata/minirov-cos8/venv-minirov/bin/activate
python /u/coredata/minirov-cos8/scripts/minirovfilearchiver.py NAV
python /u/coredata/minirov-cos8/scripts/minirovfilearchiver.py CTD
python /u/coredata/minirov-cos8/scripts/minirovfilearchiver.py DVL
python /u/coredata/minirov-cos8/scripts/minirovfilearchiver.py ROV
exit 0

minirovfilearchiver.py

  • Crawls the //Atlas/ProjectLibrary/901123.ROV1k/Data/Logs/[NAV|CTD|DVL|ROV] directories looking for data files that have not been registered in the MinirovLogfiles database table.
  • Copies them to the //Atlas/ShipData/minirov/logs/[NAV|CTD|DVL|ROV]/YYYY directory as the official permanent archive.
  • NOTE: files on atlas in ShipData/minirov are automatically write-protected (an IS admin is needed to modify or delete them)
  • Files are also scanned to determine start/end times and the number of records, and a crude test is done to see if columns contain ‘realistic’ data.
  • Registers ALL files (even empty ones) in a database table on Expd (Table: MinirovLogfiles), along with the start/end times etc. from above.
  • On successful archive and registration, the isArchived bit is set in the MinirovLogfiles table and the isBlocked flag is cleared. The isBlocked and isProcessed flags are currently read in the query but not used – they are meant to be backward compatible with how other core ship/rov data loads are handled
  • If 'valid' data are found within the file, the archiver also registers the file along with its time and number of records metadata in the type-specific load tracking table e.g. MinirovRawCtdLoad, MinirovRawNavLoad, MinirovRawRovLoad or MinirovRawDvlLoad (this is where device-specific processing is controlled and tracked by subsequent steps in the minirov workflow)
  • The type-specific tables have flags initialized that ARE used by subsequent steps in the minirov workflow (isLoaded, isBlocked, isProcessed etc).
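The discovery step in the list above can be sketched as follows. In the real script the registered-file set would come from a query against MinirovLogfiles; the helper name, the `.csv` glob, and the in-memory set here are illustrative.

```python
# Sketch of minirovfilearchiver.py's discovery step: crawl one type
# directory and keep only files not yet registered. The directory layout
# follows the pilot contract; everything else is illustrative.
import tempfile
from pathlib import Path

def find_unregistered(logs_root, file_type, registered_names):
    """Return files under <logs_root>/<file_type> not yet registered."""
    type_dir = Path(logs_root) / file_type
    return sorted(p for p in type_dir.glob("*.csv")
                  if p.name not in registered_names)

# Demonstrate against a throwaway directory tree.
tmp = Path(tempfile.mkdtemp())
(tmp / "CTD").mkdir()
(tmp / "CTD" / "old.csv").touch()
(tmp / "CTD" / "new.csv").touch()
new_files = find_unregistered(tmp, "CTD", {"old.csv"})
print([p.name for p in new_files])  # ['new.csv']
```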

Step 2. Minirov CTD Load

Nightly CTD processing on coredata(8)

graph TD
    subgraph "Perseus MSSQL"
        subgraph EXPD Database Tables
            MinirovRawCtdLoad
            MinirovRawCtdData_YYYY
        end
    end
    subgraph ShipData Atlas Share
        minirov-->ShipDataLogs[logs/CTD/YYYY]
    end
    subgraph Coredata8-via cron as user coredata
        runMinirovCtdUploader-->ctdfileloader.py
    end
    subgraph "ProjectLibrary 901123.ROV1k Share"
        /Data/Logs
    end
    /Data/Logs-->ctdfileloader.py
    ctdfileloader.py -->ShipDataLogs
    ctdfileloader.py -->MinirovRawCtdLoad
    ctdfileloader.py -->MinirovRawCtdData_YYYY

The schematic above shows the pieces involved in the ingest of raw CTD data into the expd database.

Cron runs the runMinirovCtdUploader shell script.

00 20 * * * /u/coredata/minirov-cos8/scripts/runMinirovCtdUploader  >> /u/coredata/minirov/runlogs/minirovCtdUploader.log 2>&1
#! /bin/sh
echo activating the python environment
source /u/coredata/minirov-cos8/venv-minirov/bin/activate
python /u/coredata/minirov-cos8/scripts/ctdfileloader.py
exit 0

The shell script activates a python virtual environment and invokes the script ctdfileloader.py that does all the work.

  • This script queries the MinirovRawCtdLoad table to see if it can find any files that have not been loaded.
  • If it finds a CTD file that has not been processed, it parses the file and writes the data into rows in the proper MinirovRawCtdData_YYYY table.
  • It then updates the entry in the MinirovRawCtdLoad table to keep track of the files that have been successfully loaded (or failed).
  • Note: Clearing the isLoaded field of the load table will cascade delete dependent rows in the MinirovRawCtdData_YYYY table if data needs to be backed out.
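Since the target table is year-partitioned, the loader must derive a table name from each file's start time. A sketch of that derivation; the helper itself is illustrative, and only the MinirovRawCtdData_YYYY naming pattern comes from the schematic above.

```python
# Derive the year-partitioned target table name from a utcsecs start time.
# Illustrative helper -- ctdfileloader.py's actual logic may differ.
from datetime import datetime, timezone

def ctd_table_for(utcsecs):
    """Return the MinirovRawCtdData_YYYY table name for a given timestamp."""
    year = datetime.fromtimestamp(utcsecs, tz=timezone.utc).year
    return "MinirovRawCtdData_%d" % year

print(ctd_table_for(1609459200.0))  # MinirovRawCtdData_2021
```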

Nightly CTD processing on draco

graph TD
    subgraph "Perseus MSSQL"
        subgraph EXPD Database Tables
            MinirovRawCtdLoad
            MinirovRawCtdData_YYYY
            RovctdData_YYYY
            RovctdBinData_YYYY
        end
    end
    subgraph "Draco user=DB_TASK_EXEC"
        TaskScheduler-->|runs|processRawMinirovCtd.bat -->|runs|processMinirovCtdData.pl
    end
    processMinirovCtdData.pl -->|maintain history|MinirovRawCtdLoad
    processMinirovCtdData.pl -->|queries dataset|MinirovRawCtdData_YYYY
    processMinirovCtdData.pl -->|inserts rows|RovctdData_YYYY
    processMinirovCtdData.pl -->|inserts rows|RovctdBinData_YYYY

Final processing and merging into the expd database ctd tables for all ROVs is done by the Microsoft Task Scheduler on draco.

draco d:\EXPD_Data_Loads\RovCtd\processRawMinirovCtd.bat and processMinirovCtdData_perseus.pl

REM -- Run the perl script to process data from the raw ctd tables
perl D:\EXPD_Data_Loads\RovCtd\processMinirovCtdData_perseus.pl D:\EXPD_Data_Loads\Log\EXPD_processminirovctd.txt 
exit /b 0
  • Every night, ‘Task Scheduler’ on Draco (a windows machine) runs a script that looks for files in the MinirovRawCtdLoad table that are loaded but have isProcessed = 0
  • On success, the core ctd raw 1-second and 15-second binned tables are loaded and the isProcessed field of the load table is set.
  • Note: Clearing the isProcessed field of the load table will cascade delete dependent rows in the processed ctd data tables if data needs to be backed out.
  • NOTE: We compute salinity using standard mbari algorithms
  • NOTE: O2 QC flags are currently set to suspect.
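For illustration, 15-second mean binning of 1-second samples can be sketched in a few lines. This is not the actual algorithm in processMinirovCtdData_perseus.pl, just a minimal demonstration of the idea of producing the binned table from the 1-second data.

```python
# Illustrative 15-second mean binning of (utcsecs, value) samples.
from collections import defaultdict

def bin_15s(samples):
    """Average samples into 15-second bins keyed by bin start time."""
    bins = defaultdict(list)
    for utcsecs, value in samples:
        bins[int(utcsecs) // 15 * 15].append(value)
    return {start: sum(v) / len(v) for start, v in sorted(bins.items())}

out = bin_15s([(0, 1.0), (7, 3.0), (15, 10.0)])
print(out)  # {0: 2.0, 15: 10.0}
```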

Step 3. Minirov NAV load (plus ROV,DVL)

Nightly Nav processing on coredata(8)

graph TD
    subgraph "Perseus MSSQL"
        subgraph EXPD Database Tables
            MinirovRawNavLoad
            MinirovRawCtdLoad
            MinirovRawRovLoad
            MinirovRawDvlLoad
        end
    end
    subgraph Coredata8-via cron as user coredata
        runMinirovNavUploader-->minirov2mbsystem.py-->|spawns|mbminirovnav["#47;usr#47;local#47;bin#47;mbminirovnav"]
    end
    subgraph "RovNavEdit Atlas Share"
        RovNavEditLogs[RovNavEdit/YYYY/minirov/filename.mb165]
    end
    subgraph "RovNavEdited Atlas Share"
        RovCleanNavLogs[RovNavEdit/YYYY/minirov/filename_edited.txt]
    end
    subgraph ShipData Atlas Share
        ShipDataNavLogs[minirov/logs/NAV/YYYY]-.-ShipDataCtdLogs[minirov/logs/CTD/YYYY]-.-ShipDataRovLogs[minirov/logs/ROV/YYYY]-.-|dvl optional|ShipDataDvlLogs[minirov/logs/DVL/YYYY]
    end
    mbminirovnav -->|writes| RovNavEditLogs
    mbminirovnav -->|reads| ShipDataNavLogs
    minirov2mbsystem.py --> MinirovRawNavLoad
    minirov2mbsystem.py --> MinirovRawCtdLoad
    minirov2mbsystem.py --> MinirovRawRovLoad
    minirov2mbsystem.py --> MinirovRawDvlLoad
    RovNavEditLogs --> |Video Lab Manual Editing|RovCleanNavLogs

The schematic above shows the pieces involved in the ingest of raw minirov NAV data into the MBARI RovNavEdit processing stream.

Cron runs the runMinirovNavUploader shell script.

30 20 * * * /u/coredata/minirov-cos8/scripts/runMinirovNavUploader  >> /u/coredata/minirov/runlogs/minirovNavUploader.log
#! /bin/sh
echo activating the python environment
source /u/coredata/minirov-cos8/venv-minirov/bin/activate
python /u/coredata/minirov-cos8/scripts/minirov2mbsystem.py
exit 0

The shell script activates a python virtual environment and invokes the script minirov2mbsystem.py that does most of the work. It uses the MinirovRawNavLoad, MinirovRawCtdLoad, MinirovRawRovLoad and MinirovRawDvlLoad tables to find datasets that need to be processed into the MBARI RovNav editing pipeline. If it finds datasets that meet the criteria, it spawns the MBSystem program mbminirovnav to produce an 'mb165' format file and stages it in the RovNavEdit share on Atlas for video lab personnel to clean.

Note

The main criterion is that all the files must begin within 60 seconds of each other to be considered a Dataset.
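That criterion reads naturally as a small predicate. A sketch (function and dictionary names are illustrative), combining the 60-second window with the CTD/NAV-required, DVL/ROV-optional rule from the pilot contract:

```python
# Illustrative predicate for the Dataset criterion: all file start times
# within 60 seconds, with CTD and NAV required (DVL and ROV optional).
def is_dataset(start_times, window=60.0):
    """start_times: dict mapping file type -> utcsecs start time."""
    if not {"CTD", "NAV"} <= set(start_times):
        return False
    times = start_times.values()
    return max(times) - min(times) <= window

print(is_dataset({"CTD": 1000.0, "NAV": 1030.0}))  # True
print(is_dataset({"CTD": 1000.0, "NAV": 1100.0}))  # False (70 s apart)
print(is_dataset({"NAV": 1000.0}))                 # False (no CTD)
```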

The actual Expd database load is performed by the Windows Task Scheduler on Draco. See D:\EXPD_Data_Loads\Navdata\load_minirovnav.bat

Installation Steps

First clone the repo from bitbucket into user coredata's home directory on coredata8

cd ~
git clone https://youruserid@bitbucket.org/mbari/minirovdatamanagement.git

You should now have a folder: ~/minirovdatamanagement

Next we set up a python2.7 virtual environment. The virtualenv tool for python 2.7 should already be installed in /bin by IS. Let's check it:

which virtualenv
  /bin/virtualenv
virtualenv --version
  virtualenv 20.0.20 from /usr/lib/python2.7/site-packages/virtualenv/__init__.pyc

So now we can create the virtual environment venv-minirov

cd ~
mkdir minirov-cos8
cd minirov-cos8
virtualenv venv-minirov

And test that we can activate it.

Note

We need to be in a bash shell, not csh (csh does not work; see https://github.com/conda/conda/issues/3176).

And while we are there, we can go ahead and install the pymssql package

cd ~/minirov-cos8/
bash
source ./venv-minirov/bin/activate
pip install pymssql
deactivate
exit

Next we make the scripts and runlogs directories and copy over the scripts from the repo source.

cd ~/minirov-cos8
mkdir scripts
mkdir runlogs
cp ~/minirovdatamanagement/scripts/* ./scripts

Next we patch the ‘official’ netrc.py that comes with python to comment out the lines at approximately 106-110 regarding file permissions being too permissive. We are currently using NTFS directories on atlas that do not expose unix-style file permissions. We will use the file netrc_2.7.patch from our scripts directory. The patch result will be placed in our local virtualenv python site-packages directory.

Note

This hack also requires that any python scripts that connect to the mssql server include a step that modifies the search path so that site-packages is the first location python looks in, where it finds our modified netrc.py.
cd ~/minirov-cos8/scripts
cp /usr/lib64/python2.7/netrc.py ../venv-minirov/lib/python2.7/site-packages/netrc.py
patch -u -b ../venv-minirov/lib/python2.7/site-packages/netrc.py -i ./netrc_2.7.patch
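The search-path step described in the note above can be sketched as follows; the path shown matches this install's virtualenv layout and should be adjusted for your environment.

```python
# Prepend the virtualenv's site-packages so 'import netrc' resolves to the
# patched copy rather than the stock library module. The path matches this
# install's layout; adjust as needed.
import sys

SITE_PACKAGES = "/u/coredata/minirov-cos8/venv-minirov/lib/python2.7/site-packages"
if SITE_PACKAGES not in sys.path:
    sys.path.insert(0, SITE_PACKAGES)

import netrc  # searched in site-packages first, stock module as fallback
```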

Check for the correct perseus expd database login in the /u/coredata/.netrc file. I do not document that setup here for obvious security reasons. Talk to other developers in the group about how to configure that file.

Now we can check that the database connections work using the dbtest.py script.

cd ~/minirov-cos8/scripts
./runDatabaseConnectTester.sh
activating the python environment
(3, u'Schramm', u'Rich', u'rich@mbari.org', u'MBARI', True, True, UUID('b0331cc1-ac18-4ab3-9309-23fe51930f66'))
did it3
ID=3, LastName=Schramm
done

You will also need the MBSystem program mbminirovnav installed (See Karen Salamy)

Lastly make sure cron jobs are correctly set.

######    minirov processing   #####
00 * * * * /u/coredata/minirov-cos8/scripts/runMinirovDivelogUploader  >> /u/coredata/minirov-cos8/runlogs/minirovDivelogUploader.log 2>&1
30 17 * * * /u/coredata/minirov-cos8/scripts/runMinirovFileArchiver  >> /u/coredata/minirov-cos8/runlogs/minirovFileArchiver.log 2>&1
00 20 * * * /u/coredata/minirov-cos8/scripts/runMinirovCtdUploader  >> /u/coredata/minirov-cos8/runlogs/minirovCtdUploader.log 2>&1
30 20 * * * /u/coredata/minirov-cos8/scripts/runMinirovNavUploader  >> /u/coredata/minirov-cos8/runlogs/minirovNavUploader.log 2>&1

Current issues

  1. No Flmr in PlatformLookup table in EXPD - it's there as of 6/1/21 -rs
  2. No data in 901123.ROV1k Share that I can see. - yes, see Project share on Atlas 901123ROV1k/Data/Logs -rs
  3. Question: There is a MinirovRawNavLoad table, but no MinirovRawNavData_YYYY tables? Correct, data are loaded through the RovNavEdit workflow directly into the same tables as Ventana and Doc Ricketts. -rs