Quality Control

Build QC your way - flexible checks with a consistent framework.

Why use Time-Stream?

Quality control is essential for any environmental dataset, but QC rules vary between projects, organisations, and sensor types. Time-Stream doesn’t make those decisions for you - instead, it provides a framework for applying common types of QC.

One-liner

QC checks are lightweight, configurable, and explicit:

tf_flagged = tf.qc_check(
    "comparison", "rainfall", compare_to=0, operator="<", into="rainfall_flag"
)

A single call with rich meaning: “I want to flag where my rainfall data is less than 0, with the results saved to a column named rainfall_flag.”

Key benefits

  • You stay in control: choose your own thresholds, operators, and ranges.

  • Reproducible QC: the same logic can be applied across datasets.

  • Traceable results: checks can add explicit boolean columns or flag values for later analysis.

  • Flexible: combine multiple checks, apply them in sequence, or restrict them to intervals.

In more detail

The qc_check() method applies a single QC check to one column. It can return a boolean mask (for filtering) or update the TimeFrame with a new column containing the results of the QC check. Each QC check is configurable through parameters specific to that check - see examples below.

Comparison check

Compare values against a constant or list using operators: <, <=, >, >=, ==, !=, is_in.

Use for value thresholds or lists of error codes.

Example: Temperature greater than or equal to 50:

tf = tf.qc_check(
    "comparison", "temperature", compare_to=50, operator=">=", into=True
)
shape: (10, 5)
┌─────────────────────┬─────────────┬───────────────┬──────────────┬───────────────────────────────┐
│ timestamp           ┆ temperature ┆ precipitation ┆ sensor_codes ┆ __qc__temperature__comparison │
│ ---                 ┆ ---         ┆ ---           ┆ ---          ┆ ---                           │
│ datetime[μs]        ┆ i64         ┆ i64           ┆ i64          ┆ bool                          │
╞═════════════════════╪═════════════╪═══════════════╪══════════════╪═══════════════════════════════╡
│ 2023-01-01 00:00:00 ┆ 24          ┆ -3            ┆ 992          ┆ false                         │
│ 2023-01-01 01:00:00 ┆ 22          ┆ 0             ┆ 1            ┆ false                         │
│ 2023-01-01 02:00:00 ┆ -35         ┆ 5             ┆ 1            ┆ false                         │
│ 2023-01-01 03:00:00 ┆ 26          ┆ 10            ┆ 1            ┆ false                         │
│ 2023-01-01 04:00:00 ┆ 24          ┆ 2             ┆ 1            ┆ false                         │
│ 2023-01-01 05:00:00 ┆ 26          ┆ 0             ┆ 1            ┆ false                         │
│ 2023-01-01 06:00:00 ┆ 28          ┆ 0             ┆ 1            ┆ false                         │
│ 2023-01-01 07:00:00 ┆ 50          ┆ 3             ┆ 991          ┆ true                          │
│ 2023-01-01 08:00:00 ┆ 52          ┆ 1             ┆ 995          ┆ true                          │
│ 2023-01-01 09:00:00 ┆ 29          ┆ 0             ┆ 1            ┆ false                         │
└─────────────────────┴─────────────┴───────────────┴──────────────┴───────────────────────────────┘

Example: Sensor codes within a list:

error_codes = [991, 992, 993, 994, 995]
tf = tf.qc_check(
    "comparison", "sensor_codes", compare_to=error_codes, operator="is_in", into=True
)
shape: (10, 5)
┌─────────────────────┬─────────────┬───────────────┬──────────────┬───────────────────────────────┐
│ timestamp           ┆ temperature ┆ precipitation ┆ sensor_codes ┆ __qc__sensor_codes__compariso │
│ ---                 ┆ ---         ┆ ---           ┆ ---          ┆ n                             │
│ datetime[μs]        ┆ i64         ┆ i64           ┆ i64          ┆ ---                           │
│                     ┆             ┆               ┆              ┆ bool                          │
╞═════════════════════╪═════════════╪═══════════════╪══════════════╪═══════════════════════════════╡
│ 2023-01-01 00:00:00 ┆ 24          ┆ -3            ┆ 992          ┆ true                          │
│ 2023-01-01 01:00:00 ┆ 22          ┆ 0             ┆ 1            ┆ false                         │
│ 2023-01-01 02:00:00 ┆ -35         ┆ 5             ┆ 1            ┆ false                         │
│ 2023-01-01 03:00:00 ┆ 26          ┆ 10            ┆ 1            ┆ false                         │
│ 2023-01-01 04:00:00 ┆ 24          ┆ 2             ┆ 1            ┆ false                         │
│ 2023-01-01 05:00:00 ┆ 26          ┆ 0             ┆ 1            ┆ false                         │
│ 2023-01-01 06:00:00 ┆ 28          ┆ 0             ┆ 1            ┆ false                         │
│ 2023-01-01 07:00:00 ┆ 50          ┆ 3             ┆ 991          ┆ true                          │
│ 2023-01-01 08:00:00 ┆ 52          ┆ 1             ┆ 995          ┆ true                          │
│ 2023-01-01 09:00:00 ┆ 29          ┆ 0             ┆ 1            ┆ false                         │
└─────────────────────┴─────────────┴───────────────┴──────────────┴───────────────────────────────┘
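The operator strings map naturally onto ordinary Python comparisons. The sketch below is a plain-Python illustration of that mapping using the example data above; it is not Time-Stream's implementation, and the `OPERATORS` table and `comparison_mask` helper are hypothetical names used only here.

```python
import operator

# Hypothetical lookup from operator string to a Python callable.
OPERATORS = {
    "<": operator.lt,
    "<=": operator.le,
    ">": operator.gt,
    ">=": operator.ge,
    "==": operator.eq,
    "!=": operator.ne,
    # "is_in" tests membership in a list of values.
    "is_in": lambda value, candidates: value in candidates,
}


def comparison_mask(values, compare_to, op):
    """Return True for each value that satisfies `value <op> compare_to`."""
    compare = OPERATORS[op]
    return [compare(value, compare_to) for value in values]


temperature = [24, 22, -35, 26, 24, 26, 28, 50, 52, 29]
sensor_codes = [992, 1, 1, 1, 1, 1, 1, 991, 995, 1]

hot = comparison_mask(temperature, 50, ">=")
errors = comparison_mask(sensor_codes, [991, 992, 993, 994, 995], "is_in")
print(hot)     # True only for the readings 50 and 52
print(errors)  # True only for the codes 992, 991 and 995
```

Both masks line up with the boolean columns in the two tables above.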

Range check

Check if values lie inside or outside a min-max interval.

Use for physical plausibility bounds (e.g. temperature between -30 and 50 °C).

Example: Temperatures outside of the range -30 to 50 (with closed="none", the bounds -30 and 50 are themselves flagged):

tf = tf.qc_check(
    "range",
    "temperature",
    min_value=-30,
    max_value=50,
    closed="none",  # Range is not inclusive of min and max value
    within=False,  # Flag values outside of this range
    into=True,
)
shape: (10, 5)
┌─────────────────────┬─────────────┬───────────────┬──────────────┬──────────────────────────┐
│ timestamp           ┆ temperature ┆ precipitation ┆ sensor_codes ┆ __qc__temperature__range │
│ ---                 ┆ ---         ┆ ---           ┆ ---          ┆ ---                      │
│ datetime[μs]        ┆ i64         ┆ i64           ┆ i64          ┆ bool                     │
╞═════════════════════╪═════════════╪═══════════════╪══════════════╪══════════════════════════╡
│ 2023-01-01 00:00:00 ┆ 24          ┆ -3            ┆ 992          ┆ false                    │
│ 2023-01-01 01:00:00 ┆ 22          ┆ 0             ┆ 1            ┆ false                    │
│ 2023-01-01 02:00:00 ┆ -35         ┆ 5             ┆ 1            ┆ true                     │
│ 2023-01-01 03:00:00 ┆ 26          ┆ 10            ┆ 1            ┆ false                    │
│ 2023-01-01 04:00:00 ┆ 24          ┆ 2             ┆ 1            ┆ false                    │
│ 2023-01-01 05:00:00 ┆ 26          ┆ 0             ┆ 1            ┆ false                    │
│ 2023-01-01 06:00:00 ┆ 28          ┆ 0             ┆ 1            ┆ false                    │
│ 2023-01-01 07:00:00 ┆ 50          ┆ 3             ┆ 991          ┆ true                     │
│ 2023-01-01 08:00:00 ┆ 52          ┆ 1             ┆ 995          ┆ true                     │
│ 2023-01-01 09:00:00 ┆ 29          ┆ 0             ┆ 1            ┆ false                    │
└─────────────────────┴─────────────┴───────────────┴──────────────┴──────────────────────────┘
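How closed and within interact can be easier to see in plain Python. The sketch below is an illustration only, not the library's code: `range_mask` is a hypothetical helper, and the closed values "both", "left" and "right" are assumptions (only "none" appears in the example above).

```python
def range_mask(values, min_value, max_value, closed="both", within=True):
    """Flag values inside (within=True) or outside (within=False) a range.

    `closed` says which bounds count as inside the range:
    "both", "left", "right" or "none" (all but "none" are assumed names).
    """
    include_min = closed in ("both", "left")
    include_max = closed in ("both", "right")
    flags = []
    for value in values:
        above_min = value >= min_value if include_min else value > min_value
        below_max = value <= max_value if include_max else value < max_value
        inside = above_min and below_max
        flags.append(inside if within else not inside)
    return flags


temperature = [24, 22, -35, 26, 24, 26, 28, 50, 52, 29]

# Open interval: a reading of exactly 50 counts as outside, so it is flagged.
open_flags = range_mask(temperature, -30, 50, closed="none", within=False)

# Closed interval: a reading of exactly 50 counts as inside, so it is not flagged.
closed_flags = range_mask(temperature, -30, 50, closed="both", within=False)
print(open_flags)
print(closed_flags)
```

With closed="none" the result matches the table above: -35, 50 and 52 are flagged. Under the assumed "both" setting, only -35 and 52 would be.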

Time range check

Flag data between specific time ranges.

Use for known bad periods such as sensor outages or calibration times.

Example: Flag precipitation values between the hours of 01:00 and 03:00:

from datetime import time

tf = tf.qc_check(
    "time_range",
    "precipitation",
    min_value=time(1, 0),
    max_value=time(3, 0),
    into=True,
)
shape: (10, 5)
┌─────────────────────┬─────────────┬───────────────┬──────────────┬───────────────────────────────┐
│ timestamp           ┆ temperature ┆ precipitation ┆ sensor_codes ┆ __qc__precipitation__time_ran │
│ ---                 ┆ ---         ┆ ---           ┆ ---          ┆ g…                            │
│ datetime[μs]        ┆ i64         ┆ i64           ┆ i64          ┆ ---                           │
│                     ┆             ┆               ┆              ┆ bool                          │
╞═════════════════════╪═════════════╪═══════════════╪══════════════╪═══════════════════════════════╡
│ 2023-01-01 00:00:00 ┆ 24          ┆ -3            ┆ 992          ┆ false                         │
│ 2023-01-01 01:00:00 ┆ 22          ┆ 0             ┆ 1            ┆ true                          │
│ 2023-01-01 02:00:00 ┆ -35         ┆ 5             ┆ 1            ┆ true                          │
│ 2023-01-01 03:00:00 ┆ 26          ┆ 10            ┆ 1            ┆ true                          │
│ 2023-01-01 04:00:00 ┆ 24          ┆ 2             ┆ 1            ┆ false                         │
│ 2023-01-01 05:00:00 ┆ 26          ┆ 0             ┆ 1            ┆ false                         │
│ 2023-01-01 06:00:00 ┆ 28          ┆ 0             ┆ 1            ┆ false                         │
│ 2023-01-01 07:00:00 ┆ 50          ┆ 3             ┆ 991          ┆ false                         │
│ 2023-01-01 08:00:00 ┆ 52          ┆ 1             ┆ 995          ┆ false                         │
│ 2023-01-01 09:00:00 ┆ 29          ┆ 0             ┆ 1            ┆ false                         │
└─────────────────────┴─────────────┴───────────────┴──────────────┴───────────────────────────────┘

Example: Flag temperature values between 03:30 and 09:30 on 1 January 2023:

from datetime import datetime

tf = tf.qc_check(
    "time_range",
    "temperature",
    min_value=datetime(2023, 1, 1, 3, 30),
    max_value=datetime(2023, 1, 1, 9, 30),
    into=True,
)
shape: (10, 5)
┌─────────────────────┬─────────────┬───────────────┬──────────────┬───────────────────────────────┐
│ timestamp           ┆ temperature ┆ precipitation ┆ sensor_codes ┆ __qc__temperature__time_range │
│ ---                 ┆ ---         ┆ ---           ┆ ---          ┆ ---                           │
│ datetime[μs]        ┆ i64         ┆ i64           ┆ i64          ┆ bool                          │
╞═════════════════════╪═════════════╪═══════════════╪══════════════╪═══════════════════════════════╡
│ 2023-01-01 00:00:00 ┆ 24          ┆ -3            ┆ 992          ┆ false                         │
│ 2023-01-01 01:00:00 ┆ 22          ┆ 0             ┆ 1            ┆ false                         │
│ 2023-01-01 02:00:00 ┆ -35         ┆ 5             ┆ 1            ┆ false                         │
│ 2023-01-01 03:00:00 ┆ 26          ┆ 10            ┆ 1            ┆ false                         │
│ 2023-01-01 04:00:00 ┆ 24          ┆ 2             ┆ 1            ┆ true                          │
│ 2023-01-01 05:00:00 ┆ 26          ┆ 0             ┆ 1            ┆ true                          │
│ 2023-01-01 06:00:00 ┆ 28          ┆ 0             ┆ 1            ┆ true                          │
│ 2023-01-01 07:00:00 ┆ 50          ┆ 3             ┆ 991          ┆ true                          │
│ 2023-01-01 08:00:00 ┆ 52          ┆ 1             ┆ 995          ┆ true                          │
│ 2023-01-01 09:00:00 ┆ 29          ┆ 0             ┆ 1            ┆ true                          │
└─────────────────────┴─────────────┴───────────────┴──────────────┴───────────────────────────────┘
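The two examples differ in the type of bound: a time matches the same hours on any day, while a datetime pins the window to one specific period. The plain-Python sketch below illustrates that distinction. `time_range_mask` is a hypothetical helper, and treating both bounds as inclusive is an assumption inferred from the first table above, where 01:00 and 03:00 are both flagged.

```python
from datetime import datetime, time, timedelta


def time_range_mask(timestamps, min_value, max_value):
    """Flag timestamps between two bounds, inclusive at both ends.

    `time` bounds are compared against the time of day of each timestamp;
    `datetime` bounds are compared against the full timestamp.
    """
    if isinstance(min_value, time):
        return [min_value <= ts.time() <= max_value for ts in timestamps]
    return [min_value <= ts <= max_value for ts in timestamps]


# Ten hourly timestamps starting at 2023-01-01 00:00, as in the tables above.
start = datetime(2023, 1, 1)
timestamps = [start + timedelta(hours=h) for h in range(10)]

by_time_of_day = time_range_mask(timestamps, time(1, 0), time(3, 0))
by_datetime = time_range_mask(
    timestamps, datetime(2023, 1, 1, 3, 30), datetime(2023, 1, 1, 9, 30)
)
print(by_time_of_day)  # True for 01:00, 02:00 and 03:00
print(by_datetime)     # True for 04:00 through 09:00
```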

Spike check

Detect sudden jumps using neighbour differences.

Use for unrealistic single-point spikes.

Example: Spike check on temperature data:

tf = tf.qc_check(
    "spike", "temperature", threshold=10.0, into=True
)
shape: (10, 5)
┌─────────────────────┬─────────────┬───────────────┬──────────────┬──────────────────────────┐
│ timestamp           ┆ temperature ┆ precipitation ┆ sensor_codes ┆ __qc__temperature__spike │
│ ---                 ┆ ---         ┆ ---           ┆ ---          ┆ ---                      │
│ datetime[μs]        ┆ i64         ┆ i64           ┆ i64          ┆ bool                     │
╞═════════════════════╪═════════════╪═══════════════╪══════════════╪══════════════════════════╡
│ 2023-01-01 00:00:00 ┆ 24          ┆ -3            ┆ 992          ┆ null                     │
│ 2023-01-01 01:00:00 ┆ 22          ┆ 0             ┆ 1            ┆ false                    │
│ 2023-01-01 02:00:00 ┆ -35         ┆ 5             ┆ 1            ┆ true                     │
│ 2023-01-01 03:00:00 ┆ 26          ┆ 10            ┆ 1            ┆ false                    │
│ 2023-01-01 04:00:00 ┆ 24          ┆ 2             ┆ 1            ┆ false                    │
│ 2023-01-01 05:00:00 ┆ 26          ┆ 0             ┆ 1            ┆ false                    │
│ 2023-01-01 06:00:00 ┆ 28          ┆ 0             ┆ 1            ┆ false                    │
│ 2023-01-01 07:00:00 ┆ 50          ┆ 3             ┆ 991          ┆ false                    │
│ 2023-01-01 08:00:00 ┆ 52          ┆ 1             ┆ 995          ┆ false                    │
│ 2023-01-01 09:00:00 ┆ 29          ┆ 0             ┆ 1            ┆ null                     │
└─────────────────────┴─────────────┴───────────────┴──────────────┴──────────────────────────┘

Note

The result doesn’t flag the neighbouring high values of 50 and 52. The spike test detects a sudden jump where one value sits between otherwise normal values.

Note

The result returns null for the first and last values; the spike test relies on comparisons with neighbouring values.
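One simple way to express a neighbour-difference test is sketched below in plain Python. It is an assumed formulation that reproduces the table above, not necessarily the calculation Time-Stream performs; `spike_mask` is a hypothetical helper. A value is treated as a spike when the series jumps to it and straight back again, with both jumps larger than the threshold.

```python
def spike_mask(values, threshold):
    """Flag single points that jump away from both neighbours and back.

    The first and last values have only one neighbour, so they stay None.
    """
    flags = [None] * len(values)
    for i in range(1, len(values) - 1):
        step_in = values[i] - values[i - 1]    # change arriving at this point
        step_out = values[i + 1] - values[i]   # change leaving this point
        reverses = step_in * step_out < 0      # up then down, or down then up
        flags[i] = reverses and min(abs(step_in), abs(step_out)) > threshold
    return flags


temperature = [24, 22, -35, 26, 24, 26, 28, 50, 52, 29]
flags = spike_mask(temperature, threshold=10.0)
print(flags)
# [None, False, True, False, False, False, False, False, False, None]
```

Under this formulation, 50 is not flagged because the series keeps rising to 52 (no reversal), and 52 is not flagged because the step into it is only 2. Only -35, with steps of -57 and +61, counts as a spike.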

Flat line check

Detect consecutive repeated (or near-repeated) values.

Use when a sensor stuck at a fixed value should be flagged as suspect.

Example: Flag temperature values stuck at the same reading for 3 or more consecutive timesteps:

tf = tf.qc_check(
    "flat_line", "temperature", min_count=3, into=True
)
shape: (10, 3)
┌─────────────────────┬─────────────┬──────────────────────────────┐
│ timestamp           ┆ temperature ┆ __qc__temperature__flat_line │
│ ---                 ┆ ---         ┆ ---                          │
│ datetime[μs]        ┆ f64         ┆ bool                         │
╞═════════════════════╪═════════════╪══════════════════════════════╡
│ 2023-01-01 00:00:00 ┆ 18.0        ┆ false                        │
│ 2023-01-01 01:00:00 ┆ 20.0        ┆ true                         │
│ 2023-01-01 02:00:00 ┆ 20.0        ┆ true                         │
│ 2023-01-01 03:00:00 ┆ 20.0        ┆ true                         │
│ 2023-01-01 04:00:00 ┆ 20.0        ┆ true                         │
│ 2023-01-01 05:00:00 ┆ 22.0        ┆ false                        │
│ 2023-01-01 06:00:00 ┆ 21.0        ┆ false                        │
│ 2023-01-01 07:00:00 ┆ 0.0         ┆ true                         │
│ 2023-01-01 08:00:00 ┆ 0.0         ┆ true                         │
│ 2023-01-01 09:00:00 ┆ 0.0         ┆ true                         │
└─────────────────────┴─────────────┴──────────────────────────────┘

Example: Using ignore_value - suppress flagging when the repeated value is 0.0:

tf = tf.qc_check(
    "flat_line", "temperature", min_count=3, ignore_value=0.0, into=True
)
shape: (10, 3)
┌─────────────────────┬─────────────┬──────────────────────────────┐
│ timestamp           ┆ temperature ┆ __qc__temperature__flat_line │
│ ---                 ┆ ---         ┆ ---                          │
│ datetime[μs]        ┆ f64         ┆ bool                         │
╞═════════════════════╪═════════════╪══════════════════════════════╡
│ 2023-01-01 00:00:00 ┆ 18.0        ┆ false                        │
│ 2023-01-01 01:00:00 ┆ 20.0        ┆ true                         │
│ 2023-01-01 02:00:00 ┆ 20.0        ┆ true                         │
│ 2023-01-01 03:00:00 ┆ 20.0        ┆ true                         │
│ 2023-01-01 04:00:00 ┆ 20.0        ┆ true                         │
│ 2023-01-01 05:00:00 ┆ 22.0        ┆ false                        │
│ 2023-01-01 06:00:00 ┆ 21.0        ┆ false                        │
│ 2023-01-01 07:00:00 ┆ 0.0         ┆ false                        │
│ 2023-01-01 08:00:00 ┆ 0.0         ┆ false                        │
│ 2023-01-01 09:00:00 ┆ 0.0         ┆ false                        │
└─────────────────────┴─────────────┴──────────────────────────────┘

Note

More than one ignore_value can be specified in a list, e.g. [0.0, 20.0]

Example: Using tolerance - flag values that barely change (within 0.1) for 3 or more consecutive readings:

The data below drifts slightly around 20 °C, jumps to 22 °C, then drifts slightly around 21 °C, with each drift varying by less than 0.1 between readings. The tolerance parameter catches these near-flat runs that exact equality would miss.

tf = tf.qc_check(
    "flat_line", "temperature", min_count=3, tolerance=0.1, into=True
)
shape: (10, 3)
┌─────────────────────┬─────────────┬──────────────────────────────┐
│ timestamp           ┆ temperature ┆ __qc__temperature__flat_line │
│ ---                 ┆ ---         ┆ ---                          │
│ datetime[μs]        ┆ f64         ┆ bool                         │
╞═════════════════════╪═════════════╪══════════════════════════════╡
│ 2023-01-01 00:00:00 ┆ 18.0        ┆ false                        │
│ 2023-01-01 01:00:00 ┆ 20.0        ┆ true                         │
│ 2023-01-01 02:00:00 ┆ 20.005      ┆ true                         │
│ 2023-01-01 03:00:00 ┆ 20.001      ┆ true                         │
│ 2023-01-01 04:00:00 ┆ 19.991      ┆ true                         │
│ 2023-01-01 05:00:00 ┆ 22.0        ┆ false                        │
│ 2023-01-01 06:00:00 ┆ 20.99       ┆ true                         │
│ 2023-01-01 07:00:00 ┆ 21.003      ┆ true                         │
│ 2023-01-01 08:00:00 ┆ 21.009      ┆ true                         │
│ 2023-01-01 09:00:00 ┆ 20.997      ┆ true                         │
└─────────────────────┴─────────────┴──────────────────────────────┘
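The three flat line examples can be reproduced with a short run-detection sketch in plain Python. This is an illustration under stated assumptions, not Time-Stream's implementation: `flat_line_mask` is a hypothetical helper, tolerance is assumed to apply to the difference between consecutive readings, and a run is assumed to be ignored only when every value in it is an ignored value.

```python
def flat_line_mask(values, min_count, tolerance=0.0, ignore_value=None):
    """Flag runs of at least `min_count` consecutive (near-)equal values."""
    ignored = [] if ignore_value is None else ignore_value
    if not isinstance(ignored, (list, tuple, set)):
        ignored = [ignored]  # accept a single value or a list of values

    flags = [False] * len(values)
    run_start = 0
    for i in range(1, len(values) + 1):
        run_continues = (
            i < len(values) and abs(values[i] - values[i - 1]) <= tolerance
        )
        if run_continues:
            continue
        # The run values[run_start:i] has ended; decide whether to flag it.
        run = values[run_start:i]
        if len(run) >= min_count and not all(v in ignored for v in run):
            for j in range(run_start, i):
                flags[j] = True
        run_start = i
    return flags


stuck = [18.0, 20.0, 20.0, 20.0, 20.0, 22.0, 21.0, 0.0, 0.0, 0.0]
drift = [18.0, 20.0, 20.005, 20.001, 19.991, 22.0, 20.99, 21.003, 21.009, 20.997]

exact = flat_line_mask(stuck, min_count=3)
skip_zero = flat_line_mask(stuck, min_count=3, ignore_value=0.0)
near_flat = flat_line_mask(drift, min_count=3, tolerance=0.1)
print(exact)
print(skip_zero)
print(near_flat)
```

With exact equality the drifting data would produce no flags at all, since no two consecutive readings are identical.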

Additional parameters

Observation interval

Specify an observation interval to restrict the QC check to a specific time window. This is useful when:

  • You only want to QC a specific period of observations (e.g. summer 2024).

  • You need to re-run checks on recent data without reprocessing the full archive.

  • You want to exclude known bad periods (e.g. sensor maintenance) from checks.

Into

The into argument controls what you get back:

  • into=False → return a boolean Series (mask of failed rows).

  • into=True → add a new boolean column with an automatic name.

  • into="my_column" → add a new boolean column with a custom name.

Note

If a column name already exists, Time-Stream auto-suffixes it to avoid overwriting.
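The three into modes can be mimicked on a plain dictionary of columns. The sketch below is a rough model only: `store_result` is a hypothetical helper, the automatic name follows the `__qc__<column>__<check>` pattern visible in the tables above, and the auto-suffixing of existing names is not modelled (an existing key would simply be overwritten here).

```python
def store_result(columns, mask, check, column, into):
    """Mimic the three `into` modes on a plain dict of column lists."""
    if into is False:
        return mask  # the caller gets the bare boolean mask back

    # into=True uses an automatic name; a string is used as the name itself.
    name = f"__qc__{column}__{check}" if into is True else into
    updated = dict(columns)  # leave the input untouched
    updated[name] = mask
    return updated


columns = {"temperature": [24, 52]}
mask = [False, True]

as_mask = store_result(columns, mask, "comparison", "temperature", into=False)
auto_named = store_result(columns, mask, "comparison", "temperature", into=True)
custom = store_result(columns, mask, "comparison", "temperature", into="too_hot")
print(as_mask)
print(list(auto_named))
print(list(custom))
```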