Processing configuration
Processing raw datasets often requires adjusting parameters, such as outlier detection, corrections or metadata.
If you process your data with an individual analysis script, you can pass these
configuration parameters directly to the processing method, such as
AQGRawData.process().
from gravitools.aqg import read_aqg_raw_dataset
rawdata = read_aqg_raw_dataset("20240620_163341.zip")
rawdata.process(
    operator="John Doe",
    comment="This is a comment.",
    dg_syst="123 nm/s²",
)
However, we generally recommend putting configuration parameters in a configuration file, because this has the following advantages:
- Analysis scripts remain general and applicable to other datasets
- It serves as documentation and is human-readable
- It can be archived with the data
It doesn't matter what format the configuration file is in, as long as it maps to a Python dictionary. However, the recommended format is YAML, because it is easy to read and allows for comments.
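As a minimal sketch of the point that the file format is interchangeable, the following loads the same kind of parameters from JSON using only the standard library. The JSON text is a made-up excerpt that reuses the parameter names from the example above; it is not a file shipped with gravitools.

```python
import json

# Hypothetical JSON configuration; parameter names taken from the example above
json_text = """
{
    "operator": "John Doe",
    "comment": "This is a comment.",
    "dg_syst": "123 nm/s²"
}
"""

# json.loads returns a plain Python dictionary, just like a parsed YAML mapping
config = json.loads(json_text)

print(type(config).__name__)
print(config["dg_syst"])
```

Unlike YAML, JSON does not allow comments, which is one reason YAML is the more convenient choice here.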
Recommended: Although it is not required, it is strongly recommended to specify units for all applicable parameters in your configuration file. If a parameter is provided without units, standard units are assumed, and values are converted to them where needed. See here for the standard units.
Here is an example configuration file:
# Sensor head setup orientation to assume, if not otherwise specified
default_orientation: 0
syst_uncertainty: 102.4 nm/s²
detect_outliers:
  sigma_threshold: 5
  neighbors: 15
  min_gap: '30s'
# Configuration for an individual instrument
meter AQG-B02:
  syst_uncertainty: 102.4 nm/s²
  dg_syst: 14 nm/s²
  tilt_offset:
    - -2.4269e-06 rad
    - -5.526e-07 rad
  2023-12-01:
    dg_syst: 120 nm/s²
# Configuration for a specific measurement location
point CA:
  pressure_admittance: -3 nm/s²/hPa
  vgg: -3008 nm/s²/m
# Configuration for a specific dataset
dataset 20240620_163341:
  operator: John Doe
  orientation: 0
  point: CA
  comment:
    This is a comment text.
  outlier_ranges:
    2024-01-01 13:00:00 .. 2024-01-01 13:15:00:
      comment: Earthquake
  change_tilt_offsets:
    - -7.9e-06 rad
    - -6.89e-05 rad
You can use PyYAML to read the file to a Python dictionary.
import yaml
with open("configuration.yml", encoding="utf-8") as f:
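    # yaml.CLoader is only available if PyYAML was built with LibYAML;
    # yaml.Loader is the pure-Python equivalent
    config = yaml.load(f.read(), Loader=yaml.CLoader)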
Use combine_dataset_config() to
select the parameters that are relevant for a specific processing workflow.
from gravitools.config import combine_dataset_config
dataset_config = combine_dataset_config(
    config,
    dataset="20240620_163341",
    meter="AQG-B02",
    point="CA",
    date="2024-01-01",
)
print(dataset_config)
Output:
{
    "comment": "This is a comment text.",
    "default_orientation": 0,
    "dg_syst": 120 nm/s²,
    "operator": "John Doe",
    "orientation": 0,
    "point": "CA",
    "pressure_admittance": -3 nm/s²/hPa,
    "vgg": -3008 nm/s²/m,
    "syst_uncertainty": 102.4 nm/s²,
}
This parameter selection can then be passed to the processing method.
from gravitools.aqg import read_aqg_raw_dataset
rawdata = read_aqg_raw_dataset("20240620_163341.zip")
rawdata.process(**dataset_config)
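The `**dataset_config` syntax unpacks the dictionary into keyword arguments, so each key must be a parameter name that the processing method accepts. The sketch below demonstrates this with `process_stub`, a hypothetical stand-in function that is not part of gravitools.

```python
def process_stub(operator=None, comment=None, **other):
    """Hypothetical stand-in for a processing method; not part of gravitools."""
    return {"operator": operator, "comment": comment, "other": other}


dataset_config = {
    "operator": "John Doe",
    "comment": "This is a comment text.",
    "orientation": 0,
}

# Unpacking the dictionary ...
result_unpacked = process_stub(**dataset_config)

# ... is equivalent to writing out every keyword argument by hand
result_explicit = process_stub(
    operator="John Doe",
    comment="This is a comment text.",
    orientation=0,
)

print(result_unpacked == result_explicit)
```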
Configuration parameters
The list of accepted configuration parameters is defined by the processing method.
AQGRawData.process()
Metadata
These become part of the dataset metadata.
- orientation: Sensor head setup orientation.
- tilt_offset: Tilt offset angle used during measurement. This parameter is currently not contained in the metadata recorded with the AQG dataset.
- dg_*: Any parameter starting with dg_ is interpreted as a constant gravity correction.
- dg_syst: Instrumental systematic bias correction.
- syst_uncertainty: Instrumental systematic uncertainty.
- site_name: Renamed parameter measurement_site_name from the metadata (.info file) recorded with the AQG dataset.
- point: Identifier of a survey site. site_name is often used not only for the point identifier, but also to note additional information, such as the verbose site name or sensor orientation. point is only the identifier.
- operator: Name of organization or people who operated the gravimeter and recorded the data.
- comment: A single- or multi-line comment on this dataset and its processing.
- vgg: Vertical gravity gradient.
- pressure_admittance: The admittance factor used for atmospheric pressure correction.
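The `dg_` prefix convention can be illustrated with a short sketch that picks the constant gravity corrections out of a parameter dictionary by name. This only shows prefix matching on plain dictionary keys and is not how gravitools handles these parameters internally; `dg_custom` is a hypothetical parameter name invented for the example.

```python
# Example parameter selection; "dg_custom" is a hypothetical correction name
dataset_config = {
    "operator": "John Doe",
    "dg_syst": "120 nm/s²",
    "dg_custom": "5 nm/s²",
    "vgg": "-3008 nm/s²/m",
}

# Every key that starts with "dg_" is treated as a constant gravity correction
corrections = {
    name: value
    for name, value in dataset_config.items()
    if name.startswith("dg_")
}

print(sorted(corrections))
```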
Processing procedure
These are only used once during the processing procedure and not copied to the dataset metadata. However, processing steps are recorded in the log, see AQGRawData.log().
- default_orientation: Sensor head setup orientation to assume if no orientation is explicitly specified.
- outlier_ranges: Dictionary of time ranges to be marked as outliers. The dictionary keys should have the format YYYY-mm-dd HH:MM:SS .. YYYY-mm-dd HH:MM:SS. Dictionary values can be another dictionary containing metadata, such as a comment explaining why the range is marked as an outlier; however, this information is currently not processed further.
- station_height_difference: Offset to apply to the measurement reference height.
- change_tilt_offset: A two-tuple of new tilt offset values used to recalculate the tilt angles.
- time_period: A two-tuple of a start and an end date to which the data will be cut.
- recalculate_dg_pressure: True to recalculate the atmospheric pressure correction. (Default: true)
- recalculate_dg_polar: True to recalculate the polar motion correction. (Default: true)
- detect_outliers: A dictionary of parameters for outlier detection, see detect_outliers(). Leave unspecified (or None) to use default values. Pass False to deactivate outlier detection. (Default: None)
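To check that an outlier_ranges key follows the expected format before processing, it can be split and parsed with the standard library. The following is a minimal sketch; `parse_time_range` is a hypothetical helper written for this illustration and not a gravitools function.

```python
from datetime import datetime

TIME_FORMAT = "%Y-%m-%d %H:%M:%S"


def parse_time_range(key):
    """Split a 'START .. END' key into two datetime objects (hypothetical helper)."""
    start_text, end_text = (part.strip() for part in key.split(".."))
    start = datetime.strptime(start_text, TIME_FORMAT)
    end = datetime.strptime(end_text, TIME_FORMAT)
    if end < start:
        raise ValueError(f"End time lies before start time: {key!r}")
    return start, end


start, end = parse_time_range("2024-01-01 13:00:00 .. 2024-01-01 13:15:00")
print(start, end)
```

A malformed key raises a ValueError, which surfaces typos in the configuration file early.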
Configuration structure
Parameters at the top-level of the configuration structure will apply globally to all datasets.
# This is a global configuration
default_orientation: 0
Parameters inside a dataset block apply to a specific dataset. When combining
parameters, dataset parameters take highest priority.
dataset [IDENTIFIER]:
  orientation: 0
Parameters that apply to all datasets at a specific location can be placed
inside a point block.
point [IDENTIFIER]:
  # Atmospheric pressure correction admittance factor
  pressure_admittance: -3 nm/s²/hPa
Similarly, parameters inside a meter block apply to all datasets taken with
this instrument.
meter [IDENTIFIER]:
  # Systematic bias correction
  dg_syst: 123 nm/s²
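The layering of global, meter, point and dataset blocks can be pictured as successive dictionary updates, where later layers override earlier ones. The sketch below is an illustration only and not the implementation of combine_dataset_config(); `merge_layers` is a hypothetical function. The text only states that dataset parameters take highest priority, so the relative order of meter and point used here is an assumption.

```python
BLOCK_PREFIXES = ("dataset ", "meter ", "point ")


def merge_layers(config, dataset, meter, point):
    """Hypothetical sketch of layered precedence; not the gravitools implementation."""
    # Global layer: every top-level entry that is not a named block
    merged = {
        key: value
        for key, value in config.items()
        if not str(key).startswith(BLOCK_PREFIXES)
    }
    # Assumed order: meter, then point, then dataset (highest priority, applied last)
    for block in (f"meter {meter}", f"point {point}", f"dataset {dataset}"):
        merged.update(config.get(block, {}))
    return merged


config = {
    "default_orientation": 0,
    "operator": "Unknown",
    "meter AQG-B02": {"dg_syst": "123 nm/s²"},
    "point CA": {"pressure_admittance": "-3 nm/s²/hPa"},
    "dataset 20240620_163341": {"operator": "John Doe", "orientation": 0},
}

merged = merge_layers(config, dataset="20240620_163341", meter="AQG-B02", point="CA")
print(merged["operator"])
```

Here the dataset block overrides the global operator value, while parameters that appear in only one layer are passed through unchanged.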
Parameters that change over time, such as instrument characterization, can be placed in a date block.
meter [IDENTIFIER]:
  dg_syst: 14 nm/s²
  2023-12-01:
    # Systematic bias correction after 2023-12-01
    dg_syst: 120 nm/s²
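How a date block might be resolved for a given measurement date is sketched below. This is an illustration under stated assumptions, not the gravitools implementation: `resolve_date_blocks` and its helpers are hypothetical, and treating the start date itself as already covered by the block is an assumption. Note that PyYAML's standard loaders parse an unquoted key such as 2023-12-01 as a `datetime.date`, so the sketch accepts both date objects and ISO date strings.

```python
from datetime import date


def as_date(value):
    """Accept a datetime.date or an ISO 'YYYY-MM-DD' string."""
    return value if isinstance(value, date) else date.fromisoformat(str(value))


def is_date_key(key):
    """Return True if the key can be interpreted as a calendar date."""
    try:
        as_date(key)
    except ValueError:
        return False
    return True


def resolve_date_blocks(block, when):
    """Apply, in chronological order, every date block starting on or before `when`."""
    when = as_date(when)
    resolved = {k: v for k, v in block.items() if not is_date_key(k)}
    dated = sorted(
        ((as_date(k), v) for k, v in block.items() if is_date_key(k)),
        key=lambda item: item[0],
    )
    for start, overrides in dated:
        if start <= when:  # assumption: the start date itself is included
            resolved.update(overrides)
    return resolved


meter_block = {
    "dg_syst": "14 nm/s²",
    date(2023, 12, 1): {"dg_syst": "120 nm/s²"},
}

print(resolve_date_blocks(meter_block, "2023-06-01")["dg_syst"])
print(resolve_date_blocks(meter_block, "2024-01-01")["dg_syst"])
```

With this block, a dataset recorded before 2023-12-01 keeps the base value, while a later dataset picks up the value from the date block.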