Below we describe the theory and basic operations involved in processing data from inertial measurement units (IMUs). The primary sensors discussed are 3-axis accelerometers, 3-axis magnetometers and a pressure sensor for measuring depth, but our scripts also allow integration of video, audio, gyroscope, light and temperature data. Step-by-step instructions including additional figures, video tutorials, updated code and frequently asked questions are available through the webpage of a workshop offered in December 2020 (https://catsworkshop.sites.stanford.edu/). The workshop’s home page contains direct links to the MATLAB code repository (https://GitHub.com/wgough/CATS-Methods-Materials), which contains a step-by-step wiki, as well as to a Dryad repository (https://datadryad.org/stash/share/KFi8G5QC7DFPYXynQeSotxtXqANZL70LFUGEiiDTSMU) with example data for a user to practice with. We discuss tag processing in four parts:
I. Downloading, viewing and importing tag data.
II. Bench calibrations for individual tags.
III. Calculating orientation, motion and position (the “prh.mat” file).
IV. Applications (see “Results” section).
Part 0—platform requirements and setup
The described tools have been tested on a Windows system running MATLAB versions from 2014 to 2021a. With some third-party exceptions, such as Trackplot, most of the described packages can likely be run on other systems, such as Macintosh computers, but have not been tested there; some known compatibility issues are listed in the Discussion. All MATLAB tools described herein are stored at the above GitHub link, allowing for a living, open-source set of tools, where version history can be tracked and updates from collaborators are encouraged [23, 24]. To install the code, we recommend that users download the GitHub desktop client (https://desktop.github.com), which keeps the local code synchronized with the current online version. The MATLAB environment then needs to be pointed to the GitHub folder, either by adding the folder directly to the path upon opening MATLAB or by creating or editing a startup.m file in the default MATLAB directory that includes a line pointing to the tools folder, e.g., “addpath(genpath(‘C:\Users\Dave\Documents\GitHub\CATS-Methods-Materials\CATSMatlabTools’));”. Example data used in the tutorial can be accessed through the above-linked Dryad repository.
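As a minimal sketch, a startup.m file in the default MATLAB directory might contain only the line given above (the drive letter and user name below are simply those from the example path):

```matlab
% startup.m -- runs automatically each time MATLAB opens.
% Add the CATS tools folder (and all subfolders) to the MATLAB path.
addpath(genpath('C:\Users\Dave\Documents\GitHub\CATS-Methods-Materials\CATSMatlabTools'));
```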
Though the scripts are designed to be folder-structure independent, they generally run more smoothly and efficiently, with fewer required user inputs, if scripts and data are organized according to the file structure outlined on the tag wiki. A compressed “.rar” file containing a template folder structure and a template TAG GUIDE for storing metadata is available in the “templates” folder in the CATSMatlabTools.
Part I—downloading, viewing and importing tag data
With the miniaturization of storage chips and batteries, the amount of data that can be collected at high resolution has increased rapidly (Fig. 1). For example, daily diaries from Wildlife Computers can record 32 Hz accelerometer and pressure data for weeks at a time [25, 26], suction-cup attached CATS deployments have remained on animals for upwards of 96 h in Arctic and Antarctic waters, and DTAGs attached to seals have recorded 240 kHz acoustics and 200 Hz IMU data for 21 days [27]. Each tag manufacturer copes with this data abundance in a proprietary way, usually by compressing data to maximize storage capacity and minimize download time in a way that also minimizes errors during data write.

Fig. 1. Bio-logging data typically involves trade-offs between sampling resolution and sampling duration. Recent advances have allowed sampling at high resolution over longer time scales. This study provides tools for analyzing data from these high-resolution devices. Figure modified with permission from Fig. 1a in Hays [28] under Wiley publishing license number 5030590588688
Because the format of raw data varies across tag types, a critical step is to import data into a common format so that downstream processing can use the same tools. Our import scripts consolidate data into two formats: (1) an Adata matrix and corresponding Atime vector with the accelerometer data at its original sampling rate—often much higher than that of other sensors to facilitate detection of low-frequency vocalizations [29] or to estimate speed from the amplitude of tag vibrations [18]—and (2) a data table format with common header names (Fig. 2). We provide import scripts for CATS, Acousondes [30], Little Leonardos [31], Wildlife Computers’ TDR-10s and SPLASH tags [25], and Loggerhead Instruments’ openTags [32]; other tags can be processed using subsequent scripts if the data is organized with the variable names described above. The CATS data we describe in detail offloads from the tag as a series of CSV (comma-separated value) files, and the script importCATSdata.m automatically combines these files into a single data table. Early versions of CATS tags offloaded data as a single CSV file, which needed to be broken into smaller files before import (e.g., using CSV splitter: https://www.erdconcepts.com/dbtoolbox.html) due to memory constraints. The import script also generates a variable (Hzs) that reports the original sample rate of each sensor, allowing for downstream matching of sampling rates using the dec_dc.m and interp2length.m scripts from http://www.animaltags.org/.

Fig. 2. MATLAB variables created from importCATSdata.m. These are the raw outputs from the tag imported into a data table that can be used for downstream tag processing across tag types. Adata and Atime are the accelerometer data and timestamps, respectively, maintaining the original sample rate of the accelerometer, whereas data are sampled at the highest non-accelerometer sample rate (or a user-defined sample rate). Hzs is a structure containing the sensor sample rates, and tagon is a user selected logical (Boolean) index of whether the tag is on the animal at any given data point (Fig. 3B). data.Date and data.Time are whole and fractional days since January 0, 0000 (MATLAB date number format)
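For tag types without a dedicated import script, the same downstream tools can be used if the data are first arranged into these variables. The sketch below illustrates the idea under stated assumptions: only Adata, Atime, Hzs, tagon, data.Date and data.Time are named in the text and Fig. 2; the remaining column names, structure fields and sample rates here are placeholders, and the full set of expected names should be checked against the wiki.

```matlab
% Minimal sketch: arranging third-party tag data into the common format.
% Adata/Atime keep the accelerometer's native rate; "data" is a table at the
% highest non-accelerometer rate. Column names beyond Date/Time/Acc1-3 are placeholders.
accHz   = 400;  otherHz = 10;                              % example sample rates (Hz)
t0      = datenum(2020,3,12);                              % example start time
Atime   = t0 + (0:1/accHz:60)'/86400;                      % MATLAB date numbers
Adata   = randn(numel(Atime),3);                           % triaxial acceleration (g), dummy values
t       = t0 + (0:1/otherHz:60)'/86400;
data    = table(floor(t), t - floor(t), ...
                interp1(Atime,Adata(:,1),t), interp1(Atime,Adata(:,2),t), ...
                interp1(Atime,Adata(:,3),t), zeros(size(t)), ...
                'VariableNames',{'Date','Time','Acc1','Acc2','Acc3','Pressure'});
Hzs     = struct('accHz',accHz,'pHz',otherHz);             % field names illustrative
tagon   = true(height(data),1);                            % logical index: tag on animal
```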
Our best practice recommendation is to run the import script (e.g., importCATSdata.m) immediately upon download of data from a recovered tag. Embedded in the script are three tools that can aid the researcher in the field: (1) a plot of depth vs time that can allow for a first run estimation of animal behavior (Fig. 3A); (2) a plot of other critical sensors (specifically accelerometer and magnetometer data) to gauge immediately whether there may be errors in the deployment data or whether the tag is okay to deploy again in the field (Fig. 3B); and (3) a “tag on time” tool that allows for precise determination of the tag on and tag off time, which can be useful to help researchers adjust deployment procedures (Fig. 3C).

Fig. 3. Plots from importCATSdata.m. A Accelerometer, magnetometer and pressure data are plotted so that a user can determine at a glance whether the deployed tag collected critical data as expected. This can inform future deployments. B Using the tag pressure sensor and accelerometer, a precise tag on and tag off time can be determined. C The user can use graphical controls (right clicking in this case) to zoom in on the plot for fine-scale determination of tag on and off times
Part II—bench calibrations (performed once for each individual tag)
A critical aspect of bio-logging is comparing data across deployments and individuals. While some derived data sets will always be tag-placement dependent—e.g., Overall Dynamic Body Acceleration (ODBA) [33, 34] and Minimum Specific Acceleration (MSA) [21]—many data streams are comparable across deployments provided that units are consistent and accurate, which requires calibration. Before deploying a tag, we recommend applying a series of bench tests to (a) determine the tag-specific axis conventions of each device (Fig. 4); (b) convert the raw sensor outputs into consistent scientific/engineering units (typically SI, though we express acceleration relative to the acceleration due to gravity); (c) provide a baseline calibration that increases the accuracy of deployment-specific calibrations made with in situ data; (d) test the flotation of tags before deployment to ensure that recovery antennae are maximally extended above the surface; and (e) test the recovery methods (e.g., ARGOS or VHF). Although calibration steps a–c can also be completed after a tag has been deployed in the field, the chance of tag loss or malfunction after a first deployment, but before calibrations can be done, is substantial. Some tag manufacturers provide bench calibrations and information about axis conventions with each tag purchased. In these cases, though it may still be useful to test the tag to confirm the calibrations, it may be sufficient to construct calibration matrices without running additional tests. Examples of constructing rotation matrices that convert a manufacturer’s conventions to the conventions used in downstream processing (described below) are given below and on the wiki.

Fig. 4. Axis conventions. A MainCATSprhTool.m analyzes tags with a right-hand orientation such that rotation around each axis is counterclockwise (when viewed from the positive direction of the rotation axes), and heading and pitch have intuitive orientations (+ pitch is up, + heading is like a compass). In standard position (e.g., a whale at the surface), [x y z] = [0 0 − 1] g, where g is acceleration due to gravity. B Standard DTAG processing, as utilized by the scripts available at http://www.animaltags.org/, uses a left-hand orientation with heading and pitch oriented intuitively and roll assigned arbitrarily to be clockwise (to the animal’s left). In standard position [x y z] = [0 0 1] g. To convert between CATS conventions and DTAG conventions, multiply z-axis values and roll by − 1. C Live view display of a tag flat on a table as in panel A, whose axis conventions align with the processing conventions. D If instead the display in C is for a tag oriented as in the image, it implies that the third axis is actually displaying the −y orientation, so the axAo variable in axisconventions.m would need to be adjusted as in Eq. 1. In this example, the first two positions are left blank, because they have not yet been tested. Illustrations by Jessica Bender
Axis conventions are mathematically arbitrary, though for convenience in processing it is useful to have the same conventions across deployments. In downstream processing, our scripts assume a north-east-down (NED) orientation (Fig. 4), such that the first sensor axis (data.Acc1 or Adata(:,1)) is the x-axis, reading positive values when the front of the tag is facing up (opposite the direction of the force of gravity), the second sensor axis (y) faces to the tag’s right (positive when the right side of the tag faces up), and the third axis (z) points down (positive when the tag is on its back with the bottom side facing up). Although the assignment of axis conventions is arbitrary, the principles that inspired our choice of a right-handed, NED orientation were: (a) all rotations should be in the same direction around an axis—we chose counter-clockwise to match conventions from trigonometry; (b) we wanted pitch to be positive when an animal is ascending to the surface (rostrum facing towards the surface); and (c) we wanted animal heading to match conventional compass bearings. Roll in this scenario is determined by the above constraints and ends up being positive when rolling to the animal’s right. Other commonly used axis conventions do not meet all of these criteria. For instance, a north-east-up (NEU) convention—the baseline in DTAGs and some other tag types—with the same (b) and (c) restrictions forces pitch and heading to have directionally opposite rotations (counter-clockwise and clockwise, respectively, Fig. 4b), and roll is arbitrarily defined in DTAG nomenclature to be positive to the animal’s left (a clockwise rotation). To convert between an NED and an NEU reference frame (e.g., when using DTAG scripts with CATS data), switch the sign of the z-axis in all sensors, and note that roll calculated with our tools will have the opposite sign of roll calculated with DTAG tools. Note, however, that tools that use a netCDF structure format (described below) usually have axis convention information embedded in the structure for each sensor, so users should check whether any adjustments are necessary.
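As a concrete illustration of that last conversion (variable names here are illustrative, not from the toolkit):

```matlab
% Minimal sketch: convert CATS-convention (NED) sensor data to DTAG-style (NEU)
% conventions by flipping the sign of the third (z) column; roll also changes sign.
A_neu = A_ned;  A_neu(:,3) = -A_neu(:,3);   % accelerometer (n-by-3)
M_neu = M_ned;  M_neu(:,3) = -M_neu(:,3);   % magnetometer (n-by-3)
roll_neu = -roll_ned;                       % roll has opposite sign between conventions
```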
An individual tag’s internal axis conventions depend on the orientation of the sensor package within the tag, however, so different tag versions may arrive to the user with different axis conventions. That is, though our analysis scripts assume an NED orientation, the raw data exported from the tag could have the x-values in any of the three data columns, for example, and those values could have the opposite sign from our assumptions. The first step we outline on the wiki site, then, is to determine the axis conventions used by the tag by maneuvering the device through a series of static positions (for the accelerometer and magnetometer) and motions (for the gyroscope) to reveal how the sensor package is arranged within the tag. The user is then prompted to edit the script axisconventions.m to account for any deviations from the convention (in which the first axis reads x, the second y and the third z). As an example, if an uncalibrated tag displays positive values in the first column of the accelerometer matrix when the tag is upside down (and zero in the other two axes), positive values in the second column when the tag has its anterior side facing the sky, and negative values in the third column when the tag is on its left side, a user would edit axisconventions.m to define the original accelerometer axis conventions (axAo) as
$$axAo = \left[ \text{z} \quad \text{x} \quad -\text{y} \right]$$
(1)
and the script would automatically calculate a rotation matrix (axA) that is right-multiplied by the raw tag data in downstream processing:
$$axA = \begin{bmatrix} 0 & 0 & 1 \\ 1 & 0 & 0 \\ 0 & -1 & 0 \end{bmatrix}.$$
(2)
Then, once these values are stored in an individual tag’s “<tagID>cal.mat” file, downstream processing that imports that calibration file will automatically correct the tag’s internal alignment to be consistent with all processed data, ensuring that axis conventions do not have to be considered in future processing steps.
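In practice this correction is a single matrix multiplication; the sketch below assumes the axA matrix has been saved in the tag’s cal file (the file name is illustrative):

```matlab
% Minimal sketch: load an individual tag's cal file and right-multiply the raw
% n-by-3 accelerometer data by axA (Eq. 2) so the columns follow the NED convention.
load('CATS58cal.mat','axA');     % hypothetical "<tagID>cal.mat" file
Adata = Adata * axA;             % columns now ordered x, y, z with conventional signs
```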
Other calibration steps are detailed in the tag wiki. They involve using the earth’s gravitational and magnetic fields, which have known values at given locations, to convert raw sensor data into engineering units. The main calibrateCATS.m script guides users through the application of calibration steps to collected data, relying on the spherical calibration scripts from http://www.animaltags.org/ to create the base calibration procedures.
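As a rough illustration of the underlying idea (not the calibrateCATS.m interface itself), the spherical calibration routine from http://www.animaltags.org/ fits raw triaxial data to a sphere of known field strength; the call below assumes that toolbox is on the path and that the interface matches the published tagtools.

```matlab
% Minimal sketch, assuming the animaltags.org tagtools are installed:
% fit raw bench-test accelerometer data (n-by-3) to a sphere of radius 1 g.
[Acal, Gcal] = spherical_cal(Araw, 1);   % Acal = calibrated data, Gcal = calibration info
```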
Calibration files can also contain other information, such as pressure and temperature factors and offsets, which are specific to individual tags. For example, in our cal files, the user can set pcal (a pressure factor) and pconst (a pressure offset). Many tag types, including CATS tags, have pressure and temperature factors preprogrammed into the default outputs by the manufacturer, but for other tags these may need to be written into the cal files or determined through bench calibrations. Float tests are additionally recommended in the water conditions of each environment in which tags will be deployed, as even small differences in water density can affect the vertical tilt and flotation of the tag (Fig. 5).
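If a tag’s pressure channel does need a bench calibration, the cal-file factor and offset are applied linearly; the form and column name below are assumptions for illustration only.

```matlab
% Minimal sketch (assumed linear form, illustrative column name): convert raw
% pressure readings to depth using the cal-file factor (pcal) and offset (pconst).
depth = data.Pressure * pcal + pconst;
```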

Fig. 5. Examples from calibrateCATS.m. A Magnetometer calibrations involve rotating the tag around its three axes of rotation in line with magnetic north. The bottom graph plots the triaxial magnetometer data after the calibration is applied, such that the overall magnitude of the three axes (the vector sum, |M|) is constant. B Gyroscope calibrations involve spinning the tag in six different positions (positive and negative for each axis) at two speeds. The actual speed is calculated from the peaks in the magnetometer data as two of the axes rotate through magnetic north and south. C Checking the flotation of new tags is critical. Bottom image: occasionally a small amount of ballast (in this case two US quarters) may need to be added to tags that were designed for warm water but are deployed in colder, denser water, to ensure that the tags float upright (but still float)
Part III—calculating orientation, motion and position (the prh file) for each deployment
The MainCATSprhTool.m script is divided into sections (termed “cells” in MATLAB parlance), each of which performs a set of tasks that lead to the “prh” (pitch-roll-heading) file with the filename “<deploymentID> <samplerate>prh.mat”. For instance, the example data results in a file name of “mn200312-58 10Hzprh.mat”, where mn200312-58 is the deployment ID consisting of a species ID (‘mn’ for Megaptera novaeangliae), a date in YYMMDD format and a tag ID number (58) of the specific device deployed on that animal, and the sample rate of the resulting file is 10 Hz. The prh file concept was introduced for DTAG processing by Mark Johnson [20, 7], and the output files from our scripts are designed to be compatible with DTAG prh files (though see the note above and Fig. 4 regarding axis conventions). The prh file contains additional variables that are the basis for biological studies, such as speed, depth, time, accelerometer and magnetometer readings in the animal’s frame of reference, and northing and easting distance from the start position. A full list of the variables created in this process is defined in the file “CATSVarNames.txt” in the GitHub repository.
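A quick way to orient oneself to a finished deployment is to load the prh file and plot a variable; the variable names used below (fs for sample rate, p for depth, following DTAG prh conventions) should be checked against CATSVarNames.txt.

```matlab
% Minimal sketch: load a finished prh file and plot the dive profile.
% Variable names follow DTAG prh conventions; see CATSVarNames.txt for the full list.
load('mn200312-58 10Hzprh.mat');              % example deployment from the text
t = (0:length(p)-1)'/fs/3600;                 % hours since the start of the record
plot(t, -p); xlabel('Hours'); ylabel('Depth (m)');
```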
Each cell of MainCATSprhTool.m performs a discrete set of tasks that build on each other. The process can be paused at any point and progress is automatically saved in an “INFO.mat” file that can be used to return to previously completed steps of the process and make edits. The beginning of each cell contains some parameters that can be adjusted depending on the deployment type. For instance, deployments without video can set the variable nocam to true in cell 4, which then triggers a simpler version of many processes, and creates empty variables when video-specific values are called for.
IIIa—video processing
We highlight the camera variable example because a specific focus of these tools is the integration of video data with inertial sensor data. From a theoretical perspective, video data should be relatively straightforward to work with, despite the basic constraints of increased storage and battery needs. If the start time of a video and the video’s sample rate in frames per second (fps) are known, the video can nominally be lined up with other data streams easily. In practice, however, video data is particularly prone to several sources of error, the most challenging of which are time offsets from when the video is signaled to “turn on” to when it starts recording, as well as skipped frames when the processor is overloaded—a common occurrence in visually complex pelagic environments with light conditions that change rapidly as an animal changes orientation in three dimensions (Additional file 1: Video S1). Video recorded on commercial consumer devices is typically “finalized”, meaning that metadata, such as file duration, is stored in the file; finalized video can be read by a wide variety of media players and is typically free from errors. Early versions of CATS tags utilized off-the-shelf products from GoPro, Oregon Scientific and others that finalize videos before writing to a memory card. The problem with finalized videos is that if there is a write error in any part of the video, e.g., from a sudden reduction in power supply, the file cannot be finalized and the whole video, which could be 30 min of data, is unreadable. To avoid this problem, videos from modern CATS tags are left unfinalized in the raw format that is downloaded from the tag, which means data is more likely to be available, but also more likely to have errors, and the files can only be read by certain media players, such as VLC (https://www.videolan.org/vlc/) or MPC-HC (https://mpc-hc.org/), until they are processed further.
Cell 1 of the MainCATSprhTool.m script serves a similar function to the importCATSdata.m script in that it can read the variety of video types and resolutions that have been included in various versions of CATS tags. This cell can be skipped if there is no video or audio in the utilized tag type. The basic functionality has two phases: (1) read in audio data from both audio and video files, storing “.wav” files and raw data in an “audioData” folder and (2) read the timestamps off each video frame for purposes of synchronization. Phase 2 is driven by the mmread.m script (https://www.mathworks.com/matlabcentral/fileexchange/8028-mmread), which reads the encoded timestamp of each frame, and the workhorse function makeMovieTimes.m also provides the option of using an optical character reader to directly read the embedded timestamp in the corner of the video (Fig. 6A). The final metadata about each frame is stored in a “movieTimes.mat” file that is read in later.
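Where the mmread.m-based workflow is unavailable, per-frame times can also be recovered with MATLAB’s built-in VideoReader; the sketch below is a generic alternative, not the makeMovieTimes.m implementation, and the file name is illustrative.

```matlab
% Minimal sketch using the built-in VideoReader to list per-frame times (s)
% relative to the start of a video file.
v = VideoReader('vid0001.mp4');            % illustrative file name
frameTimes = [];
while hasFrame(v)
    frameTimes(end+1,1) = v.CurrentTime;   % time of the frame about to be read
    readFrame(v);
end
```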

Fig. 6. Still from tag video from an Antarctic minke whale (deployment ID: bb190309-52) stitched together with tag data. The stitched video and data could be considered the final output of these tools (see Additional file 2: Video S2, Additional file 3: Video S3). Red dashed line outlines the original video frame—the original resolution is maintained and processed sensor data is written onto the outer edge of the video. Ten minutes of data are displayed at a time, and a vertical line indicates the current time step. On import, cell 1 of MainCATSprhTool.m can use optical character recognition to read the embedded timestamp off each frame of the video (lower right corner). For this video, a processing delay in the camera firmware was later discovered. The corrected time is indicated in the data box on the left side. For most deployments the camera is set to stop recording when light levels drop below a threshold (around 100 m depth for this deployment)
At the conclusion of the prh process, a StitchDataonVideo.m script allows processed sensor data to be written on top of the video frame, expanding the frame size of the resulting movie to maintain the original video resolution (Additional file 2: Video S2, Fig. 6B). This process facilitates biological interpretation of the video as well as video auditing, as the orientation, motion and depth data can easily indicate where points of interest are (e.g., feeding events that have characteristic signatures). This process requires significant computer processing time, working in small chunks (typically 10–15 s) of video and creating a folder full of these short partial videos. On a typical personal computer with 32 GB RAM, it can take 10–20 min to process 1 min of video, and it requires a screen width at least 21.5% greater than the video frame width (or two monitors), as well as ~ 100 GB of hard drive space for every hour of raw video. The final step, joining the partial videos using Adobe Media Encoder or similar video stitching software, reduces the videos back to standard sizes (~ 3–4 GB/h of video). The resulting videos are finalized in “.mp4” format and can be read on any standard media player. The intermediate videos can then be deleted.
IIIb—data processing
After cell 1, which could be skipped if there is no video or audio data, the remainder of the cells in MainCATSprhTool.m should be run regardless of the specific data being processed. The MainCATSprhTool.m script guides the user step by step through the data analysis process in the following order (with additional details of critical steps below):

- Cell 2 loads data.
- Cell 3 loads calibration data and trims non-biological data from the raw data.
- Cell 4 synchs the video and the data; for data that does not have embedded timestamps, there is an option to use another synchronization method (for whales this can be the times of surfacings observed in the video).
- Cell 5 locates the precise beginning and end of the deployment in the data.
- Cell 6 begins making the animal-frame variables and performs an in situ pressure calibration (using fix_pressure.m from http://www.animaltags.org/).
- Cell 7 performs in situ calibrations of the accelerometer and magnetometer data using the spherical_cal.m scripts from http://www.animaltags.org/.
- Cell 8 calculates the orientation of the tag on the animal and looks for places where the orientation may have changed (tag slip).
- Cell 9 is currently inactive, but remains as a placeholder for users who wish to more finely calibrate gyroscope data.
- Cell 10 imports metrics of turbulence—acoustic flow noise [35, 19] and accelerometer jiggle [18]—that can be used as proxies for forward speed.
- Cell 11 regresses those proxies against orientation-corrected depth rate (OCDR) [36].
- Cell 12 saves a simple version of the prh file that has comparable variables to DTAG prh files (though see notes on axis conventions above).
- Cell 13 adds any tag-collected GPS data or other known animal locations into the prh file, then creates a geo-referenced pseudotrack of animal position [37].
- Cell 14 summarizes the processed deployment information into a visual “quicklook.jpg” that allows a researcher to quickly scroll through deployments to see the critical data from each.
Specific guidance for implementing each step is available on the GitHub wiki, and we provide additional descriptions of some of the unique processing points below. Synching data with video in cell 4 produces two main variables: vidDN, which records the start time of each video (where DN stands for Date Number, a MATLAB date-time format equivalent to days since the start of year 0), and vidDurs, the duration of each video in seconds. Occasionally, videos recorded before deployment will be discarded. In this case, where, for example, video number 3 is the first video of the deployment, there would be a value of NaN (the MATLAB not-a-number signifier) in the first two entries of the vidDN and vidDurs vectors.
For most cetacean deployments, where the orientation of the tag on the target animal cannot be finely controlled due to the deployment method on free-swimming animals, the tag axes must be mathematically rotated so that they align with the animal axes in NED orientation. Johnson and Tyack [7] refer to this process as rotating the tag’s frame of reference (tag frame) to the whale’s frame of reference (whale frame). The procedure we use (based on [20, 7]) is similar in theory to the currently available prhpredictor.m tool from http://www.animaltags.org/, though our estimateprh.m script also directly includes the ability to detect and modify tag slips, a greater ability to zoom in and out of data regions, and more thorough descriptions of the user controls displayed directly on the plots. Mathematically, the procedure involves calculating a rotation matrix, W, that is the product of a rotation matrix that accounts for the pitch and roll of the tag (Wpr) and a rotation matrix (Wy) that accounts for the yaw of the tag in relation to the whale’s axes (Fig. 7). W can then be applied to the tag sensor data for the accelerometer (At), magnetometer (Mt) and gyroscope (Gt) to create the whale frame (or animal frame) matrices Aw, Mw, and Gw. The estimateprh.m script in step 8b of MainCATSprhTool.m calculates these matrices automatically by asking a user to identify periods of time when the animal is thought to be in a “typical” orientation—that is, when its body is aligned with the earth’s frame of reference (which often occurs for whales while they are breathing or just between breaths)—and constructing a rotation matrix that rotates the tag data during that time such that the z-axis reads − 1 g (Fig. 7A). This does not resolve the orientation of the x- and y-axes, however, so an additional period of time, often as the animal finishes an ascent to the surface or starts a descent from the surface, where the animal can be assumed to be rotating nearly exclusively around the y-axis (a change in pitch), is identified and used to adjust the rotation matrix in the yaw direction until rotation around the y-axis is isolated during the identified period (Fig. 7B, C). This procedure involves some amount of user selection of the defined period and an understanding of “typical” cetacean behavior. To limit the amount of trial and error, which may be especially difficult for deployments with a lot of tag motion (i.e., tag slips resulting in substantial changes in tag frame relative to whale frame) over time, our script allows for iterative changes to the user selections and immediate feedback on the resulting calculated Euler angles (pitch, roll and heading, Fig. 8). If the tag moves at all during a deployment, as is common in tags attached with suction cups, the rotation matrix must be calculated for each distinct period of tag orientation. Our script allows tag slips to be identified using either the accelerometer data or video data, and the prh estimator allows those identified tag slips to be adjusted. If a tag slip takes place over a period of time, the prh estimator calculates a unique rotation matrix for each time step between the start and end of the slip, using the calculated rotation matrices as the start and end points (Fig. 8B).
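In code, once the two rotation matrices have been estimated, the whale-frame matrices follow from a single right-multiplication; the sketch below mirrors the description above (tagframe2whaleframe.m handles this automatically), and the multiplication order shown is an assumption based on the order in which W is described.

```matlab
% Minimal sketch of the tag-frame to whale-frame rotation described above;
% Wpr and Wy are assumed to come from estimateprh.m for one stable tag-orientation period.
W  = Wpr * Wy;      % combined rotation matrix (order assumed)
Aw = At * W;        % accelerometer in whale frame
Mw = Mt * W;        % magnetometer in whale frame
Gw = Gt * W;        % gyroscope in whale frame
```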

Fig. 7. Orienting tag frame to animal frame. For cetacean tagging, the orientation of the tag on the animal cannot always be finely controlled (Fig. 8A). The reorientation procedures we describe can similarly be used for tags on other animal species where the tag axes cannot be affixed to align with the animal axes. A At data displayed for a tag that is deployed on an animal with Euler rotations of 150° in the yaw direction, − 60° in the pitch direction and − 10° in the roll direction relative to animal frame. Orange boxes highlight surfacing periods, where the animal is relatively stable and averages an orientation commensurate with the navigational frame of reference (note that the animal does not have to be as still as in this example for this procedure to work). Blue box highlights a period at the start of a dive, where the whale should be rotating around the y-axis (i.e., the y-axis should remain stable in whale frame with the x- and z-axes changing as their measurement of gravity changes). B Rotation matrix Wpr is constructed to mathematically rotate tag frame to the top of the whale, with z-axis ≈ − 1 g during surfacing periods. C Rotation matrix Wy is constructed to rotate the tag x- and y-axes to align with the whale frame such that the y-axis has minimal change during the diving maneuver, as its relation to gravity should be stable. MainCATSprhTool.m accounts for all of these rotations automatically in the sub-function tagframe2whaleframe.m that is run as part of cell 8. Illustrations by Jessica Bender

Fig. 8. Tag orientation correction user interface. A For this example, a friendly minke whale (bb190309-52, also see Figs. 6, 12) approached the tagging boat directly, resulting in a tag on the whale in reverse orientation from the whale’s natural axes (see Fig. 4). B Step one is to identify the approximate locations of tag slips. Exact times can sometimes be seen on tag videos, or can be inferred from where the tag’s surface accelerometer values change. C Cell 8 of MainCATSprhTool.m facilitates zooming in on tag data to identify likely tag slips, often corresponding to rapid changes in acceleration of the tag (increased jerk, see [38]). D In cell 8b, when tag frame is rotated to whale frame (Fig. 7), the calculated pitch, roll or heading can also be used to indicate probable tag slip locations, as a discontinuity is often a sign of a tag slip. E User-selected surfacings and dives (Fig. 7A, B) give immediate feedback to the user on the final rotated frame of reference (Aw), as well as the calculated animal pitch and roll. In this example, the x-accelerometer is rotated from backwards to forwards (aligned with the whale’s frame of reference), with very few changes to the y- or z-axes. Pitch, roll and Aw are not yet calculated for the red highlighted period after the first tag slip
Animal speed through the water can be difficult to measure directly (though see sensors described in [39, 35, 40, 41, 42, 43]), particularly for deployments where tag orientation on the animal is unpredictable or the flow over the sensor structure varies from laboratory conditions [43]. If a speed sensor is included on a bio-logging device, our process allows for easy inclusion in the prh file as a speed variable—a table with various columns representing different speed metrics and the associated prediction errors. If a speed sensor is not included, as in typical CATS tags, cell 10 steps the user through analysis of two metrics of turbulent noise that have been shown to increase commensurate with animal speed: flow noise over a hydrophone [44, 45, 35, 9] from acoustic files and the vibrations of the tag as measured by high sample rate (preferably ≥ 50 Hz) accelerometers [18]. Cell 11 then regresses those metrics against periods of steep ascent or descent, where speed can be estimated from changes in depth (as in [19, 36]). The speedfromRMS.m script provides more flexibility to adjust the pitch and depth restrictions for OCDR calculations (Fig. 9) than the earlier version published in Cade et al. [18]. During the processing of acoustic files for flow noise and the alignment of acoustic files—which in CATS tags may have temporal gaps between the files—with the sensor data, the stitchaudio.m script combines all the audio into a single file, a useful tool for acoustic auditing.
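For orientation, a bare-bones version of the OCDR speed estimate that cells 10–11 regress against is sketched below; the sample rate is illustrative, and the default pitch and depth restrictions are those shown in Fig. 9A.

```matlab
% Minimal sketch of an orientation-corrected depth rate (OCDR) speed proxy,
% valid only during steep ascents/descents. p = depth (m), pitch in radians.
fs    = 10;                                        % sample rate (Hz), illustrative
dpdt  = [diff(p); 0] * fs;                         % vertical speed (m/s)
ocdr  = abs(dpdt ./ sin(pitch));                   % along-body speed estimate (m/s)
ocdr(abs(pitch) < deg2rad(40) | p < 5) = NaN;      % default restrictions (|pitch| > 40 deg, depth > 5 m)
```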

Fig. 9. Speed calibration curves for deployment mn200312-58. Plots result from cell 11 in MainCATSprhTool.m. “Speed” in all cases is the estimated speed from orientation-corrected depth rate (OCDR). Steep descents or ascents are necessary for accurate estimations of speed using this method. A OCDR vs amplitude of tag vibrations as measured by the accelerometer (tag jiggle), colored by animal pitch and animal depth using the default restrictions (|pitch| > 40°, depth > 5 m). B The user interface allows for clicking on the color bar to tighten the restrictions and exclude points where OCDR is less accurate as a metric. In this panel, restrictions were updated to |pitch| > 60°, 5 m < depth < 251 m. The lower panel shows the separation of the data into two distinct calibration sections that result from different orientations of the tag on the whale (and thus different turbulent flow regimes causing different relationships with speed). C Final check that plots speed derived from a regression with tag jiggle, as well as speed derived from a regression with flow noise, against individual OCDR-derived speed estimates as a time series. The bottom panel shows the regression and correlation coefficients for the regression on just this section’s data (pink line and blue dots) as well as if all data from the deployment are used (green line and dots)
After a basic prh file is created in cell 12, cell 13a adds in any surface position data available from surface observations (e.g., [46, 47]) or on-animal GPS locations (usually from a fast-acquisition system, e.g., FastLoc: [48, 49]). Smoothing the speed, pitch and heading data (by first low-pass filtering the accelerometer and magnetometer matrices using a finite impulse response filter available at http://www.animaltags.org/, and then recalculating orientation) allows a track of the animal to be estimated from motion data using the http://www.animaltags.org/ script ptrack.m. Using the known surface positions, the error accumulated from integrating the motion data is smoothed between known positions (as described in [37]) to generate an estimate of position using the provided script gtrack.m as part of cell 13b. The resulting geoPtrack provides x (Eastings), y (Northings) and z (depth) values in meters from the start of the track. Given a known start position, our scripts provide code at the end of cell 13b that can convert this track into a GPS position at each time step. It should be noted that without sufficient surface positions, this process can diverge from the true position quickly due to the repeated integration of small errors. However, with sufficient anchor points, this process can create a robust estimate of position (Fig. 10). The number of points that is “sufficient” will depend on the accuracy of the speed metric and of the pitch and heading determination, as well as on the presence of any subsurface currents (as integrated inertial sensors will, even if perfect, only give position through the water). A user can test the expected accuracy of the track between known positions by comparing the calculated pseudotrack to the calculated georeferenced pseudotrack to determine how quickly the track diverges from known positions. This process also allows some flexibility, depending on the research question, to iteratively adjust the track to account for obvious errors (such as going over land). For instance, in an environment adjacent to a complex shoreline, an animal’s movement in a track that parallels the contours of the shoreline may be used as an approximate anchor point for the track (e.g., [50]).
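As a sketch of that last conversion step (not the code in cell 13b itself), a flat-earth approximation can turn the metric geoPtrack offsets into coordinates given a known start position; lat0 and lon0 are illustrative variable names.

```matlab
% Minimal sketch (flat-earth approximation): convert geoPtrack offsets in meters
% (column 1 = eastings, column 2 = northings) to latitude/longitude given the
% known start position (lat0, lon0) in decimal degrees.
mPerDegLat = 111320;                                    % approximate meters per degree of latitude
lat = lat0 + geoPtrack(:,2) / mPerDegLat;
lon = lon0 + geoPtrack(:,1) ./ (mPerDegLat * cosd(lat0));
```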

Fig. 10. Creating animal tracks from inertial sensor and GPS data. A GPS points received on deployment mn200312-58 by a fast-acquisition GPS system used by CATS for taking snapshots of satellite positions during animal surfacings. Depending on the threshold used for removal of erroneous GPS points, some erroneous points (red circles) may need to be manually removed. For display, only the fine-scale UTM coordinates are listed. To locate this plot in space, add 2840 km to the northings and 560 km to the eastings in UTM zone 20D. B The geo-referenced pseudotrack (geoPtrack) diverges from the pseudotrack created from the inertial data alone. C MainCATSprhTool.m leads the user through creation of a “.kml” file for easy processing of spatial data (here displayed using Google Earth)