Quality Control

Build QC your way - flexible checks with a consistent framework.

Why use Time-Stream?

Quality control is essential for any environmental dataset, but QC rules vary between projects, organisations, and sensor types. Time-Stream doesn’t make those decisions for you - instead, it provides a framework for applying common types of QC.

One-liner

QC checks are lightweight, configurable, and explicit:

tf_flagged = tf.qc_check(
    "comparison", "rainfall", compare_to=0, operator="<", into="rainfall_flag"
)

A single call with rich meaning: “I want to flag rows where my rainfall data is less than 0, and save the results to a column named rainfall_flag.”

Key benefits

  • You stay in control - choose your own thresholds, operators, and ranges.

  • Reproducible QC - the same logic can be applied across datasets.

  • Traceable results - checks can add explicit boolean columns or flag values for later analysis.

  • Flexible - combine multiple checks, apply them in sequence, or restrict them to intervals.

In more detail

The qc_check() method applies a single QC check to one column. It can return a boolean mask (for filtering) or update the TimeFrame with a new column containing the results of the QC check. Each QC check is configurable through parameters specific to that check - see examples below.

Available checks

  • "comparison" - compare values against a constant or list using operators: <, <=, >, >=, ==, !=, is_in

    Use for value thresholds or list of error codes.

  • "range" - check if values lie inside/outside a min–max interval.

    Use for physical plausibility bounds (e.g. temperature between −50 and 50 °C).

  • "time_range" - flag data that falls within a specific time range.

    Use for known bad periods such as sensor outages or calibration times.

  • "spike" - detect sudden jumps using neighbour differences.

    Use for unrealistic single-point spikes.
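To make the semantics of the "comparison" and "range" checks concrete, here is a minimal plain-Python sketch of the logic they describe. It is not Time-Stream's implementation: the helper names (comparison_flags, range_flags) are hypothetical, and the closed values "both", "left" and "right" are assumed by analogy with the "none" option used in the examples.

```python
import operator

# Hypothetical helpers for illustration only; not part of Time-Stream.
OPERATORS = {
    "<": operator.lt,
    "<=": operator.le,
    ">": operator.gt,
    ">=": operator.ge,
    "==": operator.eq,
    "!=": operator.ne,
    "is_in": lambda value, options: value in options,
}


def comparison_flags(values, compare_to, op):
    """Flag each value for which `value <op> compare_to` holds."""
    compare = OPERATORS[op]
    return [compare(v, compare_to) for v in values]


def range_flags(values, min_value, max_value, closed="both", within=True):
    """Flag values inside (within=True) or outside (within=False) an interval.

    `closed` says which bounds count as inside the interval. Only "none"
    appears in the examples; the other option names are assumptions.
    """
    lower_ok = operator.ge if closed in ("both", "left") else operator.gt
    upper_ok = operator.le if closed in ("both", "right") else operator.lt
    inside = [lower_ok(v, min_value) and upper_ok(v, max_value) for v in values]
    return inside if within else [not flag for flag in inside]


# Same sample values as the example tables.
temperature = [24, 22, -35, 26, 24, 26, 28, 50, 52, 29]
sensor_codes = [992, 1, 1, 1, 1, 1, 1, 991, 995, 1]

high = comparison_flags(temperature, 50, ">=")
errors = comparison_flags(sensor_codes, [991, 992, 993, 994, 995], "is_in")
implausible = range_flags(temperature, -10, 50, closed="none", within=False)
```

With an exclusive range (closed="none") and within=False, a value exactly equal to a bound counts as outside the range, which is why 50 is flagged in the range example.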

Examples:

  1. Temperature greater than or equal to 50

tf = tf.qc_check(
    "comparison", "temperature", compare_to=50, operator=">=", into=True
)
shape: (10, 5)
┌─────────────────────┬─────────────┬───────────────┬──────────────┬───────────────────────────────┐
│ timestamp           ┆ temperature ┆ precipitation ┆ sensor_codes ┆ __qc__temperature__comparison │
│ ---                 ┆ ---         ┆ ---           ┆ ---          ┆ ---                           │
│ datetime[μs]        ┆ i64         ┆ i64           ┆ i64          ┆ bool                          │
╞═════════════════════╪═════════════╪═══════════════╪══════════════╪═══════════════════════════════╡
│ 2023-01-01 00:00:00 ┆ 24          ┆ -3            ┆ 992          ┆ false                         │
│ 2023-01-01 01:00:00 ┆ 22          ┆ 0             ┆ 1            ┆ false                         │
│ 2023-01-01 02:00:00 ┆ -35         ┆ 5             ┆ 1            ┆ false                         │
│ 2023-01-01 03:00:00 ┆ 26          ┆ 10            ┆ 1            ┆ false                         │
│ 2023-01-01 04:00:00 ┆ 24          ┆ 2             ┆ 1            ┆ false                         │
│ 2023-01-01 05:00:00 ┆ 26          ┆ 0             ┆ 1            ┆ false                         │
│ 2023-01-01 06:00:00 ┆ 28          ┆ 0             ┆ 1            ┆ false                         │
│ 2023-01-01 07:00:00 ┆ 50          ┆ 3             ┆ 991          ┆ true                          │
│ 2023-01-01 08:00:00 ┆ 52          ┆ 1             ┆ 995          ┆ true                          │
│ 2023-01-01 09:00:00 ┆ 29          ┆ 0             ┆ 1            ┆ false                         │
└─────────────────────┴─────────────┴───────────────┴──────────────┴───────────────────────────────┘
  2. Sensor codes within a list

error_codes = [991, 992, 993, 994, 995]
tf = tf.qc_check(
    "comparison", "sensor_codes", compare_to=error_codes, operator="is_in", into=True
)
shape: (10, 5)
┌─────────────────────┬─────────────┬───────────────┬──────────────┬───────────────────────────────┐
│ timestamp           ┆ temperature ┆ precipitation ┆ sensor_codes ┆ __qc__sensor_codes__compariso │
│ ---                 ┆ ---         ┆ ---           ┆ ---          ┆ n                             │
│ datetime[μs]        ┆ i64         ┆ i64           ┆ i64          ┆ ---                           │
│                     ┆             ┆               ┆              ┆ bool                          │
╞═════════════════════╪═════════════╪═══════════════╪══════════════╪═══════════════════════════════╡
│ 2023-01-01 00:00:00 ┆ 24          ┆ -3            ┆ 992          ┆ true                          │
│ 2023-01-01 01:00:00 ┆ 22          ┆ 0             ┆ 1            ┆ false                         │
│ 2023-01-01 02:00:00 ┆ -35         ┆ 5             ┆ 1            ┆ false                         │
│ 2023-01-01 03:00:00 ┆ 26          ┆ 10            ┆ 1            ┆ false                         │
│ 2023-01-01 04:00:00 ┆ 24          ┆ 2             ┆ 1            ┆ false                         │
│ 2023-01-01 05:00:00 ┆ 26          ┆ 0             ┆ 1            ┆ false                         │
│ 2023-01-01 06:00:00 ┆ 28          ┆ 0             ┆ 1            ┆ false                         │
│ 2023-01-01 07:00:00 ┆ 50          ┆ 3             ┆ 991          ┆ true                          │
│ 2023-01-01 08:00:00 ┆ 52          ┆ 1             ┆ 995          ┆ true                          │
│ 2023-01-01 09:00:00 ┆ 29          ┆ 0             ┆ 1            ┆ false                         │
└─────────────────────┴─────────────┴───────────────┴──────────────┴───────────────────────────────┘
  3. Temperatures outside of min and max range (at or below -10, or at or above 50)

tf = tf.qc_check(
    "range",
    "temperature",
    min_value=-10,
    max_value=50,
    closed="none",  # Range is not inclusive of min and max value
    within=False,  # Flag values outside of this range
    into=True,
)
shape: (10, 5)
┌─────────────────────┬─────────────┬───────────────┬──────────────┬──────────────────────────┐
│ timestamp           ┆ temperature ┆ precipitation ┆ sensor_codes ┆ __qc__temperature__range │
│ ---                 ┆ ---         ┆ ---           ┆ ---          ┆ ---                      │
│ datetime[μs]        ┆ i64         ┆ i64           ┆ i64          ┆ bool                     │
╞═════════════════════╪═════════════╪═══════════════╪══════════════╪══════════════════════════╡
│ 2023-01-01 00:00:00 ┆ 24          ┆ -3            ┆ 992          ┆ false                    │
│ 2023-01-01 01:00:00 ┆ 22          ┆ 0             ┆ 1            ┆ false                    │
│ 2023-01-01 02:00:00 ┆ -35         ┆ 5             ┆ 1            ┆ true                     │
│ 2023-01-01 03:00:00 ┆ 26          ┆ 10            ┆ 1            ┆ false                    │
│ 2023-01-01 04:00:00 ┆ 24          ┆ 2             ┆ 1            ┆ false                    │
│ 2023-01-01 05:00:00 ┆ 26          ┆ 0             ┆ 1            ┆ false                    │
│ 2023-01-01 06:00:00 ┆ 28          ┆ 0             ┆ 1            ┆ false                    │
│ 2023-01-01 07:00:00 ┆ 50          ┆ 3             ┆ 991          ┆ true                     │
│ 2023-01-01 08:00:00 ┆ 52          ┆ 1             ┆ 995          ┆ true                     │
│ 2023-01-01 09:00:00 ┆ 29          ┆ 0             ┆ 1            ┆ false                    │
└─────────────────────┴─────────────┴───────────────┴──────────────┴──────────────────────────┘
  4. Flag precipitation values between the hours of 01:00 and 03:00

from datetime import time

tf = tf.qc_check(
    "time_range",
    "precipitation",
    min_value=time(1, 0),
    max_value=time(3, 0),
    into=True,
)
shape: (10, 5)
┌─────────────────────┬─────────────┬───────────────┬──────────────┬───────────────────────────────┐
│ timestamp           ┆ temperature ┆ precipitation ┆ sensor_codes ┆ __qc__precipitation__time_ran │
│ ---                 ┆ ---         ┆ ---           ┆ ---          ┆ g…                            │
│ datetime[μs]        ┆ i64         ┆ i64           ┆ i64          ┆ ---                           │
│                     ┆             ┆               ┆              ┆ bool                          │
╞═════════════════════╪═════════════╪═══════════════╪══════════════╪═══════════════════════════════╡
│ 2023-01-01 00:00:00 ┆ 24          ┆ -3            ┆ 992          ┆ false                         │
│ 2023-01-01 01:00:00 ┆ 22          ┆ 0             ┆ 1            ┆ true                          │
│ 2023-01-01 02:00:00 ┆ -35         ┆ 5             ┆ 1            ┆ true                          │
│ 2023-01-01 03:00:00 ┆ 26          ┆ 10            ┆ 1            ┆ true                          │
│ 2023-01-01 04:00:00 ┆ 24          ┆ 2             ┆ 1            ┆ false                         │
│ 2023-01-01 05:00:00 ┆ 26          ┆ 0             ┆ 1            ┆ false                         │
│ 2023-01-01 06:00:00 ┆ 28          ┆ 0             ┆ 1            ┆ false                         │
│ 2023-01-01 07:00:00 ┆ 50          ┆ 3             ┆ 991          ┆ false                         │
│ 2023-01-01 08:00:00 ┆ 52          ┆ 1             ┆ 995          ┆ false                         │
│ 2023-01-01 09:00:00 ┆ 29          ┆ 0             ┆ 1            ┆ false                         │
└─────────────────────┴─────────────┴───────────────┴──────────────┴───────────────────────────────┘
  5. Flag temperature values between 03:30 and 09:30 on 1 January 2023

from datetime import datetime

tf = tf.qc_check(
    "time_range",
    "temperature",
    min_value=datetime(2023, 1, 1, 3, 30),
    max_value=datetime(2023, 1, 1, 9, 30),
    into=True,
)
shape: (10, 5)
┌─────────────────────┬─────────────┬───────────────┬──────────────┬───────────────────────────────┐
│ timestamp           ┆ temperature ┆ precipitation ┆ sensor_codes ┆ __qc__temperature__time_range │
│ ---                 ┆ ---         ┆ ---           ┆ ---          ┆ ---                           │
│ datetime[μs]        ┆ i64         ┆ i64           ┆ i64          ┆ bool                          │
╞═════════════════════╪═════════════╪═══════════════╪══════════════╪═══════════════════════════════╡
│ 2023-01-01 00:00:00 ┆ 24          ┆ -3            ┆ 992          ┆ false                         │
│ 2023-01-01 01:00:00 ┆ 22          ┆ 0             ┆ 1            ┆ false                         │
│ 2023-01-01 02:00:00 ┆ -35         ┆ 5             ┆ 1            ┆ false                         │
│ 2023-01-01 03:00:00 ┆ 26          ┆ 10            ┆ 1            ┆ false                         │
│ 2023-01-01 04:00:00 ┆ 24          ┆ 2             ┆ 1            ┆ true                          │
│ 2023-01-01 05:00:00 ┆ 26          ┆ 0             ┆ 1            ┆ true                          │
│ 2023-01-01 06:00:00 ┆ 28          ┆ 0             ┆ 1            ┆ true                          │
│ 2023-01-01 07:00:00 ┆ 50          ┆ 3             ┆ 991          ┆ true                          │
│ 2023-01-01 08:00:00 ┆ 52          ┆ 1             ┆ 995          ┆ true                          │
│ 2023-01-01 09:00:00 ┆ 29          ┆ 0             ┆ 1            ┆ true                          │
└─────────────────────┴─────────────┴───────────────┴──────────────┴───────────────────────────────┘
  6. Spike check on temperature data

tf = tf.qc_check(
    "spike", "temperature", threshold=10.0, into=True
)
shape: (10, 5)
┌─────────────────────┬─────────────┬───────────────┬──────────────┬──────────────────────────┐
│ timestamp           ┆ temperature ┆ precipitation ┆ sensor_codes ┆ __qc__temperature__spike │
│ ---                 ┆ ---         ┆ ---           ┆ ---          ┆ ---                      │
│ datetime[μs]        ┆ i64         ┆ i64           ┆ i64          ┆ bool                     │
╞═════════════════════╪═════════════╪═══════════════╪══════════════╪══════════════════════════╡
│ 2023-01-01 00:00:00 ┆ 24          ┆ -3            ┆ 992          ┆ null                     │
│ 2023-01-01 01:00:00 ┆ 22          ┆ 0             ┆ 1            ┆ false                    │
│ 2023-01-01 02:00:00 ┆ -35         ┆ 5             ┆ 1            ┆ true                     │
│ 2023-01-01 03:00:00 ┆ 26          ┆ 10            ┆ 1            ┆ false                    │
│ 2023-01-01 04:00:00 ┆ 24          ┆ 2             ┆ 1            ┆ false                    │
│ 2023-01-01 05:00:00 ┆ 26          ┆ 0             ┆ 1            ┆ false                    │
│ 2023-01-01 06:00:00 ┆ 28          ┆ 0             ┆ 1            ┆ false                    │
│ 2023-01-01 07:00:00 ┆ 50          ┆ 3             ┆ 991          ┆ false                    │
│ 2023-01-01 08:00:00 ┆ 52          ┆ 1             ┆ 995          ┆ false                    │
│ 2023-01-01 09:00:00 ┆ 29          ┆ 0             ┆ 1            ┆ null                     │
└─────────────────────┴─────────────┴───────────────┴──────────────┴──────────────────────────┘

Note

The result doesn’t flag the neighbouring high values of 50 and 52. The spike test is designed to detect a single value that jumps away from the “normal” values on either side of it.

Note

The result returns null for the first and last values, because the spike test relies on comparisons with neighbouring values.
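The two notes above can be illustrated with a small neighbour-difference test in plain Python. This is a sketch under an assumed rule, not necessarily the formula Time-Stream uses: a point is flagged when the jump into it and the jump out of it go in opposite directions and both exceed the threshold. The function name spike_flags is hypothetical.

```python
def spike_flags(values, threshold):
    """Flag single-point spikes using differences to both neighbours.

    Assumed rule (for illustration): a point is a spike when the change
    from the previous value and the change to the next value have
    opposite signs and are both larger than `threshold` in magnitude.
    The first and last points have only one neighbour, so they get None.
    """
    flags = [None] * len(values)
    for i in range(1, len(values) - 1):
        jump_in = values[i] - values[i - 1]
        jump_out = values[i + 1] - values[i]
        opposite = jump_in * jump_out < 0
        flags[i] = opposite and min(abs(jump_in), abs(jump_out)) > threshold
    return flags


temperature = [24, 22, -35, 26, 24, 26, 28, 50, 52, 29]
flags = spike_flags(temperature, threshold=10.0)
```

Under this rule, -35 is flagged because the series drops by 57 and then rises by 61. The value 50 is not flagged because the series keeps rising to 52, and 52 is not flagged because the jump into it is only 2.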

Observation interval

Specify an observation interval to restrict the QC check to a specific time window. This is useful when:

  • You only want to QC a specific period of observations (e.g. summer 2024).

  • You need to re-run checks on recent data without reprocessing the full archive.

  • You want to exclude known bad periods (e.g. sensor maintenance) from checks.
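The idea of restricting a check to a time window can be sketched in plain Python as follows. This is only an illustration of the concept: the helper name check_within_interval is hypothetical, the inclusive bounds are an assumption, and leaving rows outside the window as None is also an assumption, since this section does not say how Time-Stream represents them or which keyword passes the interval to qc_check().

```python
from datetime import datetime, timedelta


def check_within_interval(timestamps, values, predicate, start, end):
    """Apply `predicate` only to rows whose timestamp lies in [start, end].

    Rows outside the window are left as None (an assumption made for
    this sketch, meaning "not checked").
    """
    return [
        predicate(value) if start <= stamp <= end else None
        for stamp, value in zip(timestamps, values)
    ]


# Hourly timestamps and temperatures matching the example tables.
timestamps = [datetime(2023, 1, 1) + timedelta(hours=h) for h in range(10)]
temperature = [24, 22, -35, 26, 24, 26, 28, 50, 52, 29]

# Only check observations between 06:00 and 09:00.
flags = check_within_interval(
    timestamps,
    temperature,
    lambda value: value >= 50,
    start=datetime(2023, 1, 1, 6),
    end=datetime(2023, 1, 1, 9),
)
```

Here only the last four rows are evaluated, so earlier observations are neither passed nor failed by the check.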

Into

The into argument controls what you get back:

  • into=False → return a boolean Series (mask of failed rows).

  • into=True → add a new boolean column with an automatic name.

  • into="my_column" → add a new boolean column with a custom name.

Note

If a column name already exists, Time-Stream auto-suffixes it to avoid overwriting.