rainfallqc.utils.neighbourhood_utils
All neighbourhood and nearby related operations.
- rainfallqc.utils.neighbourhood_utils.compute_km_distances_from_target_id(gauge_network_metadata, target_id, station_id_col)
Compute kilometre distances between gauges in the network and the target gauge.
- Parameters:
  - gauge_network_metadata (DataFrame) – Metadata for gauge network. Each gauge must have ‘longitude’ and ‘latitude’.
  - target_id (str) – Target gauge to compare against.
  - station_id_col (str) – Column name for station ID in gauge_network_metadata.
- Return type: DataFrame
- Returns:
  - neighbour_distances_df – Data of distances to a target gauge in kilometres.
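The idea of "distance from every gauge to the target" can be pictured with a great-circle calculation. The following is a minimal standalone sketch, not the library's implementation: the haversine formula, the 6371 km Earth radius, the helper names `haversine_km` and `distances_from_target`, and the list-of-dicts metadata are all assumptions made for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def distances_from_target(gauges, target_id, station_id_col="station_id"):
    """Map each station ID to its distance (km) from the target gauge."""
    target = next(g for g in gauges if g[station_id_col] == target_id)
    return {
        g[station_id_col]: haversine_km(
            target["latitude"], target["longitude"], g["latitude"], g["longitude"]
        )
        for g in gauges
    }


# Hypothetical gauge metadata with the required 'latitude' and 'longitude' fields
gauges = [
    {"station_id": "A", "latitude": 51.5, "longitude": -0.1},
    {"station_id": "B", "latitude": 52.5, "longitude": -0.1},
    {"station_id": "C", "latitude": 51.5, "longitude": 1.0},
]

distances = distances_from_target(gauges, "A")
```

In this sketch the target gauge appears in the result with a distance of 0, which is why a later selection step needs to exclude zero-distance entries.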
- rainfallqc.utils.neighbourhood_utils.compute_temporal_overlap_days(start_1, end_1, start_2, end_2)
Compute temporal overlap in days.
Note: assumes that the data is contiguous.
- Parameters:
  - start_1 (datetime) – Start time of timestamp 1
  - end_1 (datetime) – End time of timestamp 1
  - start_2 (datetime) – Start time of timestamp 2
  - end_2 (datetime) – End time of timestamp 2
- Return type: int
- Returns:
  - overlap_days – Days that overlap between the two timestamps
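The overlap of two contiguous periods runs from the later of the two starts to the earlier of the two ends. A minimal sketch of that logic follows; the helper name `overlap_days` is hypothetical, and clamping to zero and counting only whole elapsed days (end day not inclusive) are assumptions, since the page does not say how the library handles those cases.

```python
from datetime import datetime


def overlap_days(start_1, end_1, start_2, end_2):
    """Whole days shared by two contiguous periods; 0 if they do not overlap."""
    latest_start = max(start_1, start_2)
    earliest_end = min(end_1, end_2)
    # A negative difference means the periods are disjoint, so clamp to 0
    return max((earliest_end - latest_start).days, 0)


# Period 1 covers the year 2000; period 2 starts mid-2000
shared = overlap_days(
    datetime(2000, 1, 1), datetime(2000, 12, 31),
    datetime(2000, 7, 1), datetime(2001, 6, 30),
)

# Two periods that never overlap
disjoint = overlap_days(
    datetime(1990, 1, 1), datetime(1990, 12, 31),
    datetime(2000, 1, 1), datetime(2000, 12, 31),
)
```

Because only the start and end of each record are compared, gaps inside a record are invisible to this calculation, which is what the contiguity note above warns about.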
- rainfallqc.utils.neighbourhood_utils.compute_temporal_overlap_days_from_target_id(gauge_network_metadata, target_id, station_id_col, start_datetime_col, end_datetime_col)
Compute overlap in days between the target gauge and its neighbours.
Note: assumes that the data is contiguous.
- Parameters:
  - gauge_network_metadata (DataFrame) – Metadata for gauge network. Each gauge must have ‘longitude’ and ‘latitude’.
  - target_id (str) – Target gauge to compare against.
  - station_id_col (str) – Column name for station ID in gauge_network_metadata
  - start_datetime_col (str) – Column name for start datetime in gauge_network_metadata
  - end_datetime_col (str) – Column name for end datetime in gauge_network_metadata
- Return type: DataFrame
- Returns:
  - neighbour_overlap_days_df – Neighbouring gauges with overlap days to target gauge.
- rainfallqc.utils.neighbourhood_utils.get_ids_of_n_nearest_overlapping_neighbouring_gauges(gauge_network_metadata, target_id, distance_threshold, n_closest, min_overlap_days, station_id_col='station_id', start_datetime_col='start_datetime', end_datetime_col='end_datetime')
Get gauge IDs of nearest n time-overlapping neighbouring gauges.
- Parameters:
  - gauge_network_metadata (DataFrame) – Metadata for gauge network. Each gauge must have ‘longitude’ and ‘latitude’.
  - target_id (str) – Target gauge to compare against.
  - distance_threshold (int | float) – Threshold for maximum distance considered
  - n_closest (int) – Number of closest neighbours.
  - min_overlap_days (int) – Minimum overlap between target and neighbouring gauges
  - station_id_col (str) – Column name for station ID in gauge_network_metadata (default ‘station_id’)
  - start_datetime_col (str) – Column name for start datetime in gauge_network_metadata (default ‘start_datetime’)
  - end_datetime_col (str) – Column name for end datetime in gauge_network_metadata (default ‘end_datetime’)
- Return type: list
- Returns:
  - neighbouring_gauge_id – IDs of neighbouring gauges within a given distance to target and min overlapping days
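Conceptually, this function combines a distance filter, an overlap filter, and a "take the nearest n" step. The sketch below shows one way these could fit together using plain dictionaries. It is not the library's code: the helper name `nearest_overlapping_ids`, the order in which the filters are applied, the inclusive thresholds, and the omission of tie handling are all assumptions.

```python
def nearest_overlapping_ids(distances_km, overlap_days, distance_threshold,
                            n_closest, min_overlap_days):
    """Return IDs of the closest gauges that are near enough and overlap long enough."""
    eligible = [
        (station_id, dist)
        for station_id, dist in distances_km.items()
        # Skip the target itself (distance 0) and anything beyond the threshold
        if 0 < dist <= distance_threshold
        # Assumed: gauges missing from overlap_days count as zero overlap
        and overlap_days.get(station_id, 0) >= min_overlap_days
    ]
    eligible.sort(key=lambda item: item[1])  # nearest first
    return [station_id for station_id, _ in eligible[:n_closest]]


# Hypothetical inputs: "A" is the target gauge
distances_km = {"A": 0.0, "B": 5.0, "C": 12.0, "D": 30.0, "E": 80.0}
overlap_days = {"A": 3650, "B": 200, "C": 1500, "D": 2000, "E": 4000}

neighbour_ids = nearest_overlapping_ids(
    distances_km, overlap_days,
    distance_threshold=50, n_closest=2, min_overlap_days=365,
)
```

Here gauge B is the closest but is dropped for insufficient overlap, and gauge E overlaps well but lies beyond the distance threshold, so the sketch returns C and D.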
- rainfallqc.utils.neighbourhood_utils.get_n_closest_neighbours(neighbour_distances_df, distance_threshold, n_closest)
Get closest neighbours from neighbour distances data.
Will return more than n_closest neighbours if multiple values are equal at that index. Will not return values that are at a distance of 0.
- Parameters:
  - neighbour_distances_df (DataFrame) – Data of distances to a target gauge
  - distance_threshold (int | float) – Threshold for maximum distance considered
  - n_closest (int) – Number of closest neighbours.
- Return type: DataFrame
- Returns:
  - n_closest_neighbour_df – Data of n_closest neighbours
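The tie behaviour described above (more than n_closest results when distances are equal at the cut-off, and no zero-distance entries) can be sketched in plain Python. The helper name `n_closest_with_ties`, the dictionary input, and the inclusive distance threshold are assumptions for illustration, not the library's implementation.

```python
def n_closest_with_ties(distances, distance_threshold, n_closest):
    """Return (station_id, distance) pairs for the n closest neighbours, keeping ties."""
    if n_closest <= 0:
        return []
    # Drop zero-distance entries (the target itself) and anything beyond the threshold
    candidates = sorted(
        ((sid, d) for sid, d in distances.items() if 0 < d <= distance_threshold),
        key=lambda item: item[1],
    )
    if len(candidates) <= n_closest:
        return candidates
    # Distance of the n-th closest neighbour; anything equal to it is also kept
    cutoff = candidates[n_closest - 1][1]
    return [(sid, d) for sid, d in candidates if d <= cutoff]


# Hypothetical distances in km; "A" is the target gauge, C and D are tied
distances = {"A": 0.0, "B": 5.0, "C": 12.0, "D": 12.0, "E": 40.0, "F": 90.0}

closest = n_closest_with_ties(distances, distance_threshold=50, n_closest=2)
```

With n_closest set to 2, the sketch returns three neighbours because C and D share the second-closest distance.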
- rainfallqc.utils.neighbourhood_utils.get_nearest_non_nan_etccdi_val_to_gauge(etccdi_data, etccdi_name, gauge_lat, gauge_lon, max_distance_km=500)
Get the value at the nearest non-nan ETCCDI grid cell to the gauge coordinates.
- Parameters:
  - etccdi_data (Dataset) – ETCCDI data with given variable to check
  - etccdi_name (str) – ETCCDI variable name to check
  - gauge_lat (int | float) – latitude of the rain gauge
  - gauge_lon (int | float) – longitude of the rain gauge
  - max_distance_km (int | float) – Maximum distance in km to search for a non-nan value (default 500 km)
- Return type: Dataset
- Returns:
  - nearby_etccdi_data – ETCCDI data at the nearest grid cell with non-nan values
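The search for the nearest non-nan grid cell can be illustrated on a tiny grid held in nested lists. This is a minimal sketch under stated assumptions, not the library's method: the real function works on a Dataset, whereas the helper names `haversine_km` and `nearest_non_nan`, the brute-force scan, and the haversine distance used here are illustrative choices.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))


def nearest_non_nan(lats, lons, values, gauge_lat, gauge_lon, max_distance_km=500):
    """Return (distance_km, lat, lon, value) of the nearest non-nan cell, or None."""
    best = None
    for i, lat in enumerate(lats):
        for j, lon in enumerate(lons):
            val = values[i][j]
            if math.isnan(val):
                continue  # skip cells without data
            dist = haversine_km(gauge_lat, gauge_lon, lat, lon)
            if dist <= max_distance_km and (best is None or dist < best[0]):
                best = (dist, lat, lon, val)
    return best


# Hypothetical 3x3 grid; the cell closest to the gauge (51.0, 0.0) is nan
nan = float("nan")
lats = [50.0, 51.0, 52.0]
lons = [-1.0, 0.0, 1.0]
values = [
    [10.0, nan, 12.0],
    [nan, nan, 15.0],
    [16.0, 17.0, nan],
]

result = nearest_non_nan(lats, lons, values, gauge_lat=51.1, gauge_lon=0.1)
too_strict = nearest_non_nan(lats, lons, values, 51.1, 0.1, max_distance_km=10)
```

The sketch falls back to the cell at (51.0, 1.0) because the closest cell has no data, and it returns None when no valid cell lies within the search radius.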
- rainfallqc.utils.neighbourhood_utils.get_neighbours_with_min_overlap_days(neighbour_overlap_days_df, min_overlap_days)
Get neighbours around gauge with at least min_overlap_days of overlapping time steps.
Note: assumes that the data is contiguous.
- Parameters:
  - neighbour_overlap_days_df (DataFrame) – Neighbouring gauges with overlap days to target gauge.
  - min_overlap_days (int) – Minimum overlap between target and neighbouring gauges
- Return type: DataFrame
- Returns:
  - neighbour_overlap_days_df – Neighbouring gauges with at least min_overlap_days overlap days.
- rainfallqc.utils.neighbourhood_utils.get_rain_not_minima_column(data, target_col, other_col)
Get rain not equal to minima column.
Combines two functions: one to get the non-zero minima (e.g. 0.1) and one to make the ‘rain_not_minima’ column.
- Parameters:
  - data (DataFrame) – Rainfall data
  - target_col (str) – Target rainfall column
  - other_col (str) – Other rainfall column
- Return type: DataFrame
- Returns:
  - data_w_minima_col – Rainfall data with a ‘rain_not_minima’ column
- rainfallqc.utils.neighbourhood_utils.get_target_neighbour_non_zero_minima(data, target_col, other_col, default_minima=0.1)
Get minimum non-zero value in rainfall data between target and neighbour.
- Parameters:
  - data (DataFrame) – Rainfall data
  - target_col (str) – Target rainfall column
  - other_col (str) – Other rainfall column
  - default_minima (float) – Default minimum to use for non-zero value
- Return type: float
- Returns:
  - non_zero_minima – Minimum non-zero value.
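Finding the smallest non-zero rainfall value across two columns can be sketched with plain lists. The helper name `non_zero_minima` is hypothetical, and treating `default_minima` as a fallback for when neither column contains a positive value is an assumption; the page does not spell out how the library applies the default.

```python
def non_zero_minima(target_vals, other_vals, default_minima=0.1):
    """Smallest value greater than zero across both series, else the default."""
    positives = [
        v for v in list(target_vals) + list(other_vals)
        # None (missing) is skipped explicitly; nan fails the comparison and is skipped too
        if v is not None and v > 0
    ]
    return min(positives) if positives else default_minima


# Hypothetical rainfall columns for a target gauge and one neighbour
target = [0.0, 0.2, 1.4, 0.0]
neighbour = [0.0, 0.5, None, 3.1]

minima = non_zero_minima(target, neighbour)
fallback = non_zero_minima([0.0, 0.0], [0.0])
```

The smallest positive value usually reflects the gauge's measurement resolution, which is why 0.1 is a natural default for data recorded to a tenth of a unit.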
- rainfallqc.utils.neighbourhood_utils.make_rain_not_minima_column_target_or_neighbour(data, target_col, other_col, data_minima)
Get rain values that are not minima rainfall for target or neighbour.
- Parameters:
  - data (DataFrame) – Rainfall data
  - target_col (str) – Target rainfall column
  - other_col (str) – Other rainfall column
  - data_minima (float) – Data minimum (i.e. lowest non-zero value)
- Return type: DataFrame
- Returns:
  - data – Rainfall data with “rain_not_minima” column
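Functions
- compute_km_distances_from_target_id – Compute kilometre distances between gauges in the network and the target gauge.
- compute_temporal_overlap_days – Compute temporal overlap in days.
- compute_temporal_overlap_days_from_target_id – Compute overlap in days between the target gauge and its neighbours.
- get_ids_of_n_nearest_overlapping_neighbouring_gauges – Get gauge IDs of nearest n time-overlapping neighbouring gauges.
- get_n_closest_neighbours – Get closest neighbours from neighbour distances data.
- get_nearest_non_nan_etccdi_val_to_gauge – Get the value at the nearest non-nan ETCCDI grid cell to the gauge coordinates.
- get_neighbours_with_min_overlap_days – Get neighbours around gauge with at least min_overlap_days of overlapping time steps.
- get_rain_not_minima_column – Get rain not equal to minima column.
- get_target_neighbour_non_zero_minima – Get minimum non-zero value in rainfall data between target and neighbour.
- make_rain_not_minima_column_target_or_neighbour – Get rain values that are not minima rainfall for target or neighbour.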