Processing configuration

Processing raw datasets often requires adjusting parameters, such as outlier detection settings, corrections, or metadata.

If you process your data with an individual analysis script, you can pass these configuration parameters directly to the processing method, such as AQGRawData.process().

from gravitools.aqg import read_aqg_raw_dataset

rawdata = read_aqg_raw_dataset("20240620_163341.zip")
rawdata.process(
    operator="John Doe",
    comment="This is a comment.",
    dg_syst="123 nm/s²",
)

However, we generally recommend putting configuration parameters in a configuration file, because this provides the following advantages:

  • Analysis scripts remain general and applicable to other datasets
  • It serves as documentation and is human-readable
  • It can be archived with the data

It doesn't matter what format the configuration file is in, as long as it maps to a Python dictionary. However, the recommended format is YAML, because it is easy to read and allows for comments.
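
As an illustration of "any format that maps to a Python dictionary", the following minimal sketch loads a configuration from JSON with the standard-library json module. The JSON content is made up for this example and only mirrors the block naming used in this document. Note that JSON does not allow comments, which is one reason YAML is preferred.

```python
import json

# Hypothetical JSON equivalent of a small configuration file.
json_text = """
{
  "default_orientation": 0,
  "meter AQG-B02": {"dg_syst": "14 nm/s²"},
  "dataset 20240620_163341": {"operator": "John Doe", "point": "CA"}
}
"""

# json.loads returns a nested dictionary, just like a parsed YAML file.
config = json.loads(json_text)
print(config["meter AQG-B02"]["dg_syst"])
```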

Recommended: Although it is not required, it is strongly recommended to specify units for all applicable parameters in your config file. If a parameter is provided without units, standard units are assumed (and any conversion is done accordingly). See here for the standard units.

Here is an example configuration file:

configuration.yml
# Sensor head setup orientation to assume, if not otherwise specified
default_orientation: 0
syst_uncertainty: 102.4 nm/s²
detect_outliers:
  sigma_threshold: 5
  neighbors: 15
  min_gap: '30s'

# Configuration for an individual instrument
meter AQG-B02:
  syst_uncertainty: 102.4 nm/s²
  dg_syst: 14 nm/s²
  tilt_offset:
    - -2.4269e-06 rad
    - -5.526e-07 rad
  2023-12-01:
    dg_syst: 120 nm/s²

# Configuration for a specific measurement location
point CA:
  pressure_admittance: -3 nm/s²/hPa
  vgg: -3008 nm/s²/m

# Configuration for a specific dataset
dataset 20240620_163341:
  operator: John Doe
  orientation: 0
  point: CA
  comment:
    This is a comment text.
  outlier_ranges:
    2024-01-01 13:00:00 .. 2024-01-01 13:15:00:
      comment: Earthquake
  change_tilt_offset:
    - -7.9e-06 rad
    - -6.89e-05 rad

You can use PyYAML to read the file into a Python dictionary.

import yaml

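with open("configuration.yml", encoding="utf-8") as f:
    # safe_load only constructs plain Python objects and does not depend on
    # the optional LibYAML bindings that yaml.CLoader requires
    config = yaml.safe_load(f)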

Use combine_dataset_config() to select the parameters that are relevant for a specific processing workflow.

from gravitools.config import combine_dataset_config

dataset_config = combine_dataset_config(
    config,
    dataset="20240620_163341",
    meter="AQG-B02",
    point="CA",
    date="2024-01-01"
)

print(dataset_config)

Output:

{
    "comment": "This is a comment text.",
    "default_orientation": 0,
    "dg_syst": 120 nm/s²,
    "operator": "John Doe",
    "orientation": 0,
    "point": "CA",
    "pressure_admittance": -3 nm/s²/hPa,
    "vgg": -3008 nm/s²/m,
    "syst_uncertainty": 102.4 nm/s²,
}

This parameter selection can then be passed to the processing method.

from gravitools.aqg import read_aqg_raw_dataset

rawdata = read_aqg_raw_dataset("20240620_163341.zip")
rawdata.process(**dataset_config)

Configuration parameters

The list of accepted configuration parameters is defined by the processing method.

AQGRawData.process()

Metadata

These become part of the dataset metadata.

orientation
Sensor head setup orientation.
tilt_offset
Tilt offset angle used during measurement. This parameter is currently not contained in the metadata recorded with the AQG dataset.
dg_*
Any parameter starting with dg_ is interpreted as a constant gravity correction.
dg_syst
Instrumental systematic bias correction.
syst_uncertainty
Instrumental systematic uncertainty.
site_name
Renamed from the measurement_site_name parameter in the metadata (.info file) recorded with the AQG dataset.
point
Identifier of a survey site. site_name is often used not only for the point identifier, but also to note additional information, such as the verbose site name or sensor orientation. point is only the identifier.
operator
Name of the organization or people who operated the gravimeter and recorded the data.
comment
A single- or multi-line comment on this dataset and its processing.
vgg
Vertical gravity gradient.
pressure_admittance
The admittance factor used for atmospheric pressure correction.
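
The dg_ prefix rule described above can be pictured with a short sketch. This is not the library's implementation; it only shows how parameters might be selected by prefix. The name dg_example is hypothetical and used purely for illustration.

```python
# Example parameter set; "dg_example" is a made-up correction name.
params = {
    "dg_syst": "14 nm/s²",
    "dg_example": "-5 nm/s²",
    "operator": "John Doe",
}

# Select every parameter whose name starts with "dg_".
corrections = {
    name: value for name, value in params.items() if name.startswith("dg_")
}
print(sorted(corrections))
```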

Processing procedure

These are used only during processing and are not copied to the dataset metadata. However, processing steps are recorded in the log; see AQGRawData.log().

default_orientation
Sensor head setup orientation to assume if no orientation is explicitly specified.
outlier_ranges
Dictionary of time ranges to be marked as outliers. The dictionary keys should have the format YYYY-mm-dd HH:MM:SS .. YYYY-mm-dd HH:MM:SS. Each value can be another dictionary containing metadata, such as a comment explaining why the range is marked as an outlier. This information is currently not processed further.
station_height_difference
Offset to apply to the measurement reference height.
change_tilt_offset
A two-tuple of new tilt offset values to recalculate the tilt angles.
time_period
A two-tuple of a start and an end date to which the data will be cut.
recalculate_dg_pressure
Set to True to recalculate the atmospheric pressure correction. (Default: true)
recalculate_dg_polar
Set to True to recalculate the polar motion correction. (Default: true)
detect_outliers
A dictionary of parameters for outlier detection; see detect_outliers(). Leave unspecified (or None) to use default values. Pass False to deactivate outlier detection. (Default: None)
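
To check outlier_ranges keys before processing, the key format described above can be parsed with the standard library. The sketch below is an assumption about how such a check could look, not how the library parses the keys internally; parse_time_range is a hypothetical helper.

```python
from datetime import datetime

TIME_FORMAT = "%Y-%m-%d %H:%M:%S"


def parse_time_range(key):
    """Split 'START .. END' into two datetime objects (hypothetical helper)."""
    # Unpacking fails with ValueError unless the key has exactly two parts.
    start_text, end_text = (part.strip() for part in key.split(".."))
    start = datetime.strptime(start_text, TIME_FORMAT)
    end = datetime.strptime(end_text, TIME_FORMAT)
    if end <= start:
        raise ValueError(f"Range end must be after start: {key!r}")
    return start, end


outlier_ranges = {
    "2024-01-01 13:00:00 .. 2024-01-01 13:15:00": {"comment": "Earthquake"},
}

for key, info in outlier_ranges.items():
    start, end = parse_time_range(key)
    # The value may be empty in the YAML file, so guard against None.
    print(start, end, (info or {}).get("comment"))
```

A malformed key raises ValueError, which makes typos in the configuration file visible early.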

Configuration structure

Parameters at the top level of the configuration structure apply globally to all datasets.

configuration.yml
# This is a global configuration
default_orientation: 0

Parameters inside a dataset block apply to a specific dataset. When combining parameters, dataset parameters take the highest priority.

configuration.yml
dataset [IDENTIFIER]:
  orientation: 0

Parameters that apply to all datasets at a specific location can be placed inside a point block.

configuration.yml
point [IDENTIFIER]:
  # Atmospheric pressure correction admittance factor
  pressure_admittance: -3 nm/s²/hPa

Similarly, parameters inside a meter block apply to all datasets taken with this instrument.

configuration.yml
meter [IDENTIFIER]:
  # Systematic bias correction
  dg_syst: 123 nm/s²
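
The way these blocks combine can be pictured as a layered dictionary merge. In the library this is handled by combine_dataset_config(); the sketch below is not its implementation. It only encodes what this page states (global parameters apply everywhere, dataset parameters win) and assumes, for illustration, that meter parameters are applied before point parameters. merge_by_priority is a hypothetical helper.

```python
def merge_by_priority(global_params, meter_params, point_params, dataset_params):
    """Merge parameter dictionaries; later arguments override earlier ones."""
    merged = {}
    # Assumed order: global < meter < point < dataset.
    for params in (global_params, meter_params, point_params, dataset_params):
        merged.update(params)
    return merged


merged = merge_by_priority(
    {"default_orientation": 0, "syst_uncertainty": "102.4 nm/s²"},
    {"dg_syst": "14 nm/s²"},
    {"pressure_admittance": "-3 nm/s²/hPa"},
    {"operator": "John Doe", "dg_syst": "20 nm/s²"},
)
# The dataset value overrides the meter value for dg_syst.
print(merged["dg_syst"])
```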

Parameters that change over time, such as instrument characterization, can be placed in a date block.

configuration.yml
meter [IDENTIFIER]:
  dg_syst: 14 nm/s²
  2023-12-01:
    # Systematic bias correction after 2023-12-01
    dg_syst: 120 nm/s²
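
The effect of a date block can be illustrated with a small sketch. PyYAML loads an unquoted key such as 2023-12-01 as a datetime.date object, so date blocks can be told apart from ordinary parameters by key type. The helper below is hypothetical and not the library's implementation; in particular, treating the block's date itself as already covered (on or after) is an assumption.

```python
from datetime import date


def resolve_date_blocks(block, on_date):
    """Flatten a block, applying date sub-blocks dated on or before on_date."""
    # Start with the undated parameters.
    resolved = {k: v for k, v in block.items() if not isinstance(k, date)}
    # Apply date blocks in chronological order so that later ones win.
    for start in sorted(k for k in block if isinstance(k, date)):
        if start <= on_date:
            resolved.update(block[start])
    return resolved


meter_block = {
    "dg_syst": "14 nm/s²",
    date(2023, 12, 1): {"dg_syst": "120 nm/s²"},
}

print(resolve_date_blocks(meter_block, date(2023, 6, 1))["dg_syst"])
print(resolve_date_blocks(meter_block, date(2024, 1, 1))["dg_syst"])
```

With a dataset date of 2024-01-01, the later value is selected, which matches the dg_syst value shown in the combine_dataset_config() output earlier on this page.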