rainfallqc.checks.pypwsqc_filters
Quality control checks translated from the pyPWSQC framework (https://pypwsqc.readthedocs.io/en/latest/).
The PWSQC framework includes filters originally developed for automated PWS within the COST Action OPENSENSE.
Functions prefixed with ‘run_’ and ‘check_’ correspond to algorithms from pyPWSQC.
Functions are ordered alphabetically.
- rainfallqc.checks.pypwsqc_filters.check_faulty_zeros(neighbour_data, neighbour_metadata, neighbouring_gauge_ids, neighbour_metadata_gauge_id_col, time_res, projection, nint, n_stat, max_distance_for_neighbours=10000.0, time_units='seconds since 1970-01-01 00:00:00', rainfall_attributes={'coverage_contant_type': 'physicalMeasurement', 'long_name': 'rainfall amount per time unit', 'name': 'rainfall', 'units': 'mm'}, lat_lon_attributes={'unit': 'degrees in WGS84 projection'}, global_attributes=None)
Will flag faulty zeros based on neighbours …
- Parameters:
neighbour_data (DataFrame) – Rainfall data of neighbouring gauges with time col
neighbour_metadata (DataFrame) – Metadata for the rainfall data with ‘latitude’ and ‘longitude’
neighbour_metadata_gauge_id_col (str) – Column with the gauge ID
target_gauge_col – Target gauge column
neighbouring_gauge_ids (List[str]) – List of ids with neighbouring gauges
time_res (str) – Time resolution of data
projection (str) – cartesian/metric coordinate system
nint (int) – Number of intervals
n_stat (int) – Number of stations
max_distance_for_neighbours (int|float) – Maximum distance to consider for neighbours
time_units (str) – Units and encoding of the ‘time’ column
rainfall_attributes (dict) – Attributes for rainfall in the xarray Dataset
lat_lon_attributes (dict) – Attributes for lat and lon in the xarray Dataset
global_attributes (dict) – Global attributes for xarray Dataset
- Return type:
Dataset
- Returns:
neighbour_data_ds_filtered – Data with flags for faulty zeros
Examples
available at: https://pypwsqc.readthedocs.io/en/latest/notebooks/merged_filters.html
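To make the idea behind a faulty-zero check concrete, here is a minimal standard-library sketch. It is not the rainfallqc or pyPWSQC implementation: the function name `flag_faulty_zeros`, the flag values (1, 0, -1), and the reading of `nint` as a minimum run of consecutive zero intervals and `n_stat` as a minimum number of reporting neighbours are all assumptions made for illustration.

```python
from statistics import median


def flag_faulty_zeros(station, neighbours, nint, n_stat):
    """Hypothetical helper: one flag per time step.

    1 = suspected faulty zero, 0 = no flag, -1 = not evaluable.
    """
    flags = []
    zero_run = 0  # consecutive steps with station == 0 while neighbours report rain
    for t, value in enumerate(station):
        # Neighbour values available at this step (None marks a missing value)
        valid = [series[t] for series in neighbours if series[t] is not None]
        if value is None or len(valid) < n_stat:
            flags.append(-1)
            zero_run = 0
            continue
        if value == 0 and median(valid) > 0:
            zero_run += 1
        else:
            zero_run = 0
        # Assumption: flag only once the suspicious run is at least nint long
        flags.append(1 if zero_run >= nint else 0)
    return flags


station = [0, 0, 0, 0, 1.2, 0]
neighbours = [
    [0.5, 0.4, 0.6, 0.2, 1.0, 0],
    [0.3, 0.2, 0.4, 0.1, 0.8, 0],
    [0.0, 0.6, 0.5, 0.3, 1.1, 0],
]
flags = flag_faulty_zeros(station, neighbours, nint=3, n_stat=2)
print(flags)  # [0, 0, 1, 1, 0, 0]
```

The last time step is not flagged because the neighbours also report zero there, so a dry reading is plausible.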
- rainfallqc.checks.pypwsqc_filters.check_high_influx_filter(neighbour_data)
High influx filter.
- Parameters:
neighbour_data (DataFrame) – Rainfall data of neighbouring gauges with time col
- Return type:
None
- Returns:
neighbour_data – todo
- rainfallqc.checks.pypwsqc_filters.check_station_outlier(neighbour_data, neighbour_metadata, neighbouring_gauge_ids, neighbour_metadata_gauge_id_col, time_res, projection, evaluation_period, mmatch, gamma, n_stat, max_distance_for_neighbours=10000.0, time_units='seconds since 1970-01-01 00:00:00', rainfall_attributes={'coverage_contant_type': 'physicalMeasurement', 'long_name': 'rainfall amount per time unit', 'name': 'rainfall', 'units': 'mm'}, lat_lon_attributes={'unit': 'degrees in WGS84 projection'}, global_attributes=None)
Station outlier.
- Parameters:
neighbour_data (DataFrame) – Rainfall data of neighbouring gauges with time col
neighbour_metadata (DataFrame) – Metadata for the rainfall data with ‘latitude’ and ‘longitude’
neighbour_metadata_gauge_id_col (str) – Column with the gauge ID
target_gauge_col – Target gauge column
neighbouring_gauge_ids (List[str]) – List of ids with neighbouring gauges
time_res (str) – Time resolution of data
projection (str) – cartesian/metric coordinate system
evaluation_period (int) – length of (rolling) window for correlation calculation
mmatch (int) – threshold for number of matching rainy intervals in evaluation period
gamma (float) – threshold for rolling median pearson correlation
n_stat (int) – Number of stations
max_distance_for_neighbours (int|float) – Maximum distance to consider for neighbours
time_units (str) – Units and encoding of the ‘time’ column
rainfall_attributes (dict) – Attributes for rainfall in the xarray Dataset
lat_lon_attributes (dict) – Attributes for lat and lon in the xarray Dataset
global_attributes (dict) – Global attributes for xarray Dataset
- Return type:
Dataset
- Returns:
neighbour_data_ds_filtered – Data with flags for station outliers
Examples
available at: https://pypwsqc.readthedocs.io/en/latest/notebooks/merged_filters.html
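The roles of `mmatch`, `gamma` and `n_stat` can be illustrated with a simplified standard-library sketch that evaluates a single window rather than a rolling one. It is not the rainfallqc or pyPWSQC implementation: the helper names `pearson` and `flag_station_outlier`, the flag values, and the exact decision rule are assumptions chosen for illustration.

```python
from math import sqrt
from statistics import median


def pearson(x, y):
    """Pearson correlation of two equal-length sequences; None if either is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return None
    return cov / sqrt(var_x * var_y)


def flag_station_outlier(station, neighbours, mmatch, gamma, n_stat):
    """Hypothetical helper for one evaluation window.

    1 = outlier, 0 = no flag, -1 = not enough evidence.
    """
    correlations = []
    for series in neighbours:
        # Intervals in which both gauges recorded rain
        matches = sum(1 for a, b in zip(station, series) if a > 0 and b > 0)
        if matches < mmatch:
            continue
        r = pearson(station, series)
        if r is not None:
            correlations.append(r)
    # Assumption: at least n_stat usable neighbours are required
    if len(correlations) < n_stat:
        return -1
    return 1 if median(correlations) < gamma else 0


neighbours = [
    [0, 1.1, 2.2, 0, 2.9, 0.9],
    [0, 0.9, 1.8, 0.1, 3.2, 1.1],
    [0, 1.0, 2.1, 0, 3.0, 1.0],
]
flag_ok = flag_station_outlier([0, 1, 2, 0, 3, 1], neighbours, mmatch=3, gamma=0.5, n_stat=2)
flag_outlier = flag_station_outlier([3, 0.1, 0.1, 2, 0.1, 3], neighbours, mmatch=3, gamma=0.5, n_stat=2)
print(flag_ok, flag_outlier)  # 0 1
```

A rolling version would repeat this evaluation for each window of length `evaluation_period` along the time axis.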
- rainfallqc.checks.pypwsqc_filters.compute_distance_matrix(neighbour_data_ds)
Compute a distance matrix.
- Parameters:
neighbour_data_ds (Dataset) – xarray dataset of neighbour data
- Return type:
Dataset
- Returns:
distance_matrix – A distance matrix of all neighbouring gauges
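As a rough illustration of what a distance matrix between gauges contains, the sketch below computes pairwise Euclidean distances from projected (metric) coordinates using only the standard library. The function name `distance_matrix`, the gauge IDs and the coordinates are hypothetical, and the library function itself operates on an xarray Dataset rather than a plain dict.

```python
from math import hypot


def distance_matrix(coords):
    """Hypothetical helper: coords maps gauge id -> (x, y) in metres.

    Returns a nested dict with the Euclidean distance between every pair.
    """
    return {
        a: {b: hypot(xa - xb, ya - yb) for b, (xb, yb) in coords.items()}
        for a, (xa, ya) in coords.items()
    }


# Made-up gauge positions in a metric projection
coords = {"A": (0.0, 0.0), "B": (3000.0, 4000.0), "C": (0.0, 12000.0)}
dist = distance_matrix(coords)
print(dist["A"]["B"])  # 5000.0
```

Euclidean distance is only meaningful once latitude and longitude have been projected to a metric coordinate system, which is why a projection is part of the conversion step.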
- rainfallqc.checks.pypwsqc_filters.convert_neighbour_data_to_xarray(neighbour_data, neighbour_metadata, projection, time_units='seconds since 1970-01-01 00:00:00', rainfall_attributes={'coverage_contant_type': 'physicalMeasurement', 'long_name': 'rainfall amount per time unit', 'name': 'rainfall', 'units': 'mm'}, lat_lon_attributes={'unit': 'degrees in WGS84 projection'}, global_attributes=None)
Convert neighbour data in polars format to xarray dataset.
- Parameters:
neighbour_data (DataFrame) – Rainfall data of neighbouring gauges with time col
neighbour_metadata (DataFrame) – Metadata for the rainfall data with ‘latitude’ and ‘longitude’
projection (str) – cartesian/metric coordinate system
time_units (str) – Units and encoding of the ‘time’ column
rainfall_attributes (dict) – Attributes for rainfall in the xarray Dataset
lat_lon_attributes (dict) – Attributes for lat and lon in the xarray Dataset
global_attributes (dict) – Global attributes for xarray Dataset
- Return type:
Dataset
- Returns:
neighbour_data_ds – xarray dataset with assigned attributes
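The default `time_units` string follows the familiar "unit since reference" pattern. The standard-library sketch below shows how a timestamp maps to a number under that encoding. The helper names `parse_time_units` and `encode_time` are hypothetical, the reference time is assumed to be UTC, and only the "seconds" unit is handled; how the library performs the encoding internally is not shown here.

```python
from datetime import datetime, timezone


def parse_time_units(time_units):
    """Hypothetical helper: split '<unit> since <reference>' into its parts."""
    unit, _, reference = time_units.partition(" since ")
    # Assumption: the reference timestamp is interpreted as UTC
    ref = datetime.strptime(reference, "%Y-%m-%d %H:%M:%S").replace(tzinfo=timezone.utc)
    return unit, ref


def encode_time(timestamp, time_units="seconds since 1970-01-01 00:00:00"):
    """Hypothetical helper: encode a timezone-aware datetime as an offset from the reference."""
    unit, ref = parse_time_units(time_units)
    if unit != "seconds":
        raise ValueError(f"unsupported unit in this sketch: {unit!r}")
    return (timestamp - ref).total_seconds()


encoded = encode_time(datetime(2020, 1, 1, tzinfo=timezone.utc))
print(encoded)  # 1577836800.0
```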
- rainfallqc.checks.pypwsqc_filters.run_bias_correction(neighbour_data)
Bias correction.
- Parameters:
neighbour_data (DataFrame) – Rainfall data of neighbouring gauges with time col
- Return type:
None
- Returns:
neighbour_data – todo
- rainfallqc.checks.pypwsqc_filters.run_event_based_filter(neighbour_data)
Event based filter (EBF).
- Parameters:
neighbour_data (DataFrame) – Rainfall data of neighbouring gauges with time col
- Return type:
None
- Returns:
neighbour_data – todo
- rainfallqc.checks.pypwsqc_filters.run_indicator_correlation(neighbour_data)
Run indicator correlation.
- Parameters:
neighbour_data (DataFrame) – Rainfall data of neighbouring gauges with time col
- Return type:
None
- Returns:
neighbour_data – todo
- rainfallqc.checks.pypwsqc_filters.run_peak_removal(neighbour_data)
Peak removal.
- Parameters:
neighbour_data (DataFrame) – Rainfall data of neighbouring gauges with time col
- Return type:
None
- Returns:
neighbour_data – todo
- rainfallqc.checks.pypwsqc_filters.subset_distance_matrix(neighbour_data_ds, distance_matrix, max_distance_for_neighbours)
Subset a distance matrix by maximum neighbour distance.
- Parameters:
neighbour_data_ds (Dataset) – xarray dataset of neighbour data
distance_matrix (Dataset) – A distance matrix of all neighbouring gauges
max_distance_for_neighbours (int|float) – Maximum distance to consider for neighbours
- Return type:
Dataset
- Returns:
neighbour_data_ds – A distance matrix of all neighbouring gauges
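Restricting a distance matrix to nearby gauges amounts to keeping only the pairs whose separation does not exceed `max_distance_for_neighbours`. The standard-library sketch below shows that selection on a plain nested dict. The function name `neighbours_within`, the gauge IDs and the distances are hypothetical, and the library function works on xarray objects instead.

```python
def neighbours_within(distances, max_distance):
    """Hypothetical helper: for each gauge, list the other gauges within max_distance."""
    return {
        a: sorted(b for b, d in row.items() if b != a and d <= max_distance)
        for a, row in distances.items()
    }


# Made-up pairwise distances in metres
distances = {
    "A": {"A": 0.0, "B": 5000.0, "C": 12000.0},
    "B": {"A": 5000.0, "B": 0.0, "C": 8544.0},
    "C": {"A": 12000.0, "B": 8544.0, "C": 0.0},
}
nearby = neighbours_within(distances, max_distance=10000.0)
print(nearby)  # {'A': ['B'], 'B': ['A', 'C'], 'C': ['B']}
```

With the default threshold of 10000.0 (assumed here to be metres, matching a metric projection), gauges A and C are not treated as neighbours of each other.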
Functions
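| check_faulty_zeros | Will flag faulty zeros based on neighbours ... |
| check_high_influx_filter | High influx filter. |
| check_station_outlier | Station outlier. |
| compute_distance_matrix | Compute a distance matrix. |
| convert_neighbour_data_to_xarray | Convert neighbour data in polars format to xarray dataset. |
| run_bias_correction | Bias correction. |
| run_event_based_filter | Event based filter (EBF). |
| run_indicator_correlation | Run indicator correlation. |
| run_peak_removal | Peak removal. |
| subset_distance_matrix | Subset a distance matrix by maximum neighbour distance. |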