2025 Week 29

Completely missed doing last week's update, so a double update is due this week.

❇️ Lots of improvements in the UI

Scatterplots

Click the chart icon on the right to toggle between line charts and scatterplots.

Cross Network Data Plotting

Live data from FDRI and COSMOS can be compared in the same chart.
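
A minimal sketch of the idea, assuming pandas dataframes with time/value columns (the dataframes, names, and toggle function below are illustrative, not the real app code): both networks' series go on the same Plotly figure as separate traces, and the line/scatter toggle maps to the trace mode.

```python
import pandas as pd
import plotly.graph_objects as go

# Hypothetical dataframes standing in for live FDRI and COSMOS series.
fdri = pd.DataFrame({"time": pd.date_range("2025-07-14", periods=24, freq="h"),
                     "value": range(24)})
cosmos = pd.DataFrame({"time": pd.date_range("2025-07-14", periods=24, freq="h"),
                       "value": [v * 0.9 for v in range(24)]})

def build_chart(scatter: bool = False) -> go.Figure:
    """Overlay both networks on one set of axes; scatter=True is the scatterplot toggle."""
    mode = "markers" if scatter else "lines"
    fig = go.Figure()
    fig.add_trace(go.Scatter(x=fdri["time"], y=fdri["value"], mode=mode, name="FDRI"))
    fig.add_trace(go.Scatter(x=cosmos["time"], y=cosmos["value"], mode=mode, name="COSMOS"))
    return fig

build_chart(scatter=True).show()
```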

Temporary Corrections

Since the FDRI data is coming in unprocessed from the sensors and we aren't ready to plot the processed data, we have added a temporary correction box to the explorer chart table that allows on-the-fly corrections. This lets arbitrary calculations be applied to the existing data.
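
As a sketch of the concept (not the real UI code; the expression syntax and function name are assumptions), the correction box can be thought of as an arbitrary arithmetic expression in terms of the raw value that gets evaluated before plotting:

```python
import pandas as pd

def apply_correction(series: pd.Series, expression: str) -> pd.Series:
    """Evaluate an arbitrary arithmetic expression, written in terms of `x`,
    against the raw values, e.g. "x * 1.8 + 32"."""
    result = pd.eval(expression, local_dict={"x": series})
    return pd.Series(result, index=series.index, name=series.name)

raw = pd.Series([10.0, 10.5, 11.2], name="soil_temp")
print(apply_correction(raw, "x * 1.8 + 32"))  # corrected values, same index as the raw series
```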

Controls to print, zoom in/out etc

We've enabled the Plotly chart controls, which allow downloading the charts as PNGs for reporting and provide zoom in/out controls.
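
For reference, this is roughly the shape of the Plotly config involved (the filename and exact button list below are placeholders, not necessarily what we shipped):

```python
import plotly.graph_objects as go

fig = go.Figure(go.Scatter(y=[1, 3, 2], mode="lines"))
fig.show(config={
    "displayModeBar": True,                 # always show the toolbar
    "toImageButtonOptions": {               # PNG export used for reporting
        "format": "png",
        "filename": "explorer-chart",       # hypothetical filename
        "scale": 2,                         # higher-resolution export
    },
    "modeBarButtonsToRemove": ["lasso2d", "select2d"],  # keep the zoom/pan buttons
})
```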

‼️ Incident resulting in downtime

We had a fun incident last week that was an odd combination of bad luck.

  1. We released a change to our staging environment that was missing some config changes required for k8s, which broke the deployment.
  2. Shortly after, the Flux Personal Access Token expired.
  3. We couldn't roll back the change from step 1 since the credentials had expired.
  4. The SIM cards used for our sensors ran out of data, so data stopped coming in at around the same time (this was completely unrelated).

The solution was to update the Flux personal access token and document the process of doing so for next time. We could then roll back and get things working, then fix and roll out the change 🚀.

➕ Connecting Data to Metadata

Our Ingestion/API/UI apps for FDRI are currently tracking the site via the Campbell Cloud id, which is forced into the MQTT topic by Campbell's software. This isn't ideal, as this id isn't used anywhere else. We are looking at confirming what the site ids should be across FDRI and how to add them to the messages, so that once we have metadata we can start looking it up.
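
As an illustrative sketch only (the topic layout, ids, and mapping below are made up), the lookup could be as simple as attaching the agreed FDRI site id alongside the Campbell Cloud id pulled from the topic:

```python
# Hypothetical mapping from Campbell Cloud ids to agreed FDRI site ids.
CAMPBELL_TO_FDRI_SITE = {
    "abc123-campbell-cloud-id": "FDRI_SITE_001",
}

def enrich_message(topic: str, payload: dict) -> dict:
    """Pull the Campbell Cloud id out of the MQTT topic and add the FDRI site id."""
    campbell_id = topic.split("/")[1]  # assumed topic shape: <prefix>/<campbell_id>/...
    payload["campbell_cloud_id"] = campbell_id
    payload["site_id"] = CAMPBELL_TO_FDRI_SITE.get(campbell_id)  # None until the mapping is confirmed
    return payload
```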

🪣 Should we use a database (maybe not)

Continuing our previous discussions on outputting our processed data into a database instead of Parquet files on S3, the decision has been reversed (again), so it looks like Parquet on S3 might be the better fit for now.

We are going to have a lot of sensor data. If we go with a narrow table design, with all data in the same DB table (time, site_id, variable_id, value), we would have 25,000 rows being added per minute: 500 loggers x 50 variables per logger x 60 minutes x 24 hours x 365 days = >13 billion "data rows" per year, estimated at 100-500 GB per year. Tools like TigerData would help with this, but especially considering that once processed this data is mostly read-only, Parquet might be a better fit.
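
A quick back-of-envelope check of those numbers (the per-row byte costs are assumptions, chosen to show where a 100-500 GB/year range could come from):

```python
loggers = 500
variables_per_logger = 50
rows_per_minute = loggers * variables_per_logger        # 25,000
rows_per_year = rows_per_minute * 60 * 24 * 365         # 13,140,000,000 (>13 billion)

# Assume roughly 8-40 bytes per stored row once types, indexes and overhead are included.
print(rows_per_year * 8 / 1e9, "GB to", rows_per_year * 40 / 1e9, "GB per year")
# 105.12 GB to 525.6 GB per year
```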

Other solutions we discussed were having a wide table for known variables (time, site_id, air_temp, wind_speed, …) and a narrow table for unknown variables, but this adds additional complexity. Another option is having a table per site, but then we would need to manage 500 DB tables.

Next steps: we have decided to continue with Parquet on S3 partitioned by site and day (all variables in the same Parquet file), exactly the same as our unprocessed data, for now. We will continue assessing and learning as we build.
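
A minimal sketch of that layout, assuming hive-style partition directories and placeholder bucket/column names (pandas with the pyarrow engine; not necessarily the exact code in our pipeline):

```python
import pandas as pd

df = pd.DataFrame({
    "time": pd.to_datetime(["2025-07-14T00:00:00Z", "2025-07-14T00:01:00Z"]),
    "site_id": ["SITE_A", "SITE_A"],
    "variable_id": ["air_temp", "air_temp"],
    "value": [18.2, 18.3],
})
df["day"] = df["time"].dt.date.astype(str)

# Writes .../site_id=SITE_A/day=2025-07-14/<part>.parquet under the bucket prefix.
df.to_parquet(
    "s3://example-processed-bucket/fdri/",  # hypothetical bucket
    partition_cols=["site_id", "day"],
    index=False,
)
```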

🌞 Switching to UV

https://docs.astral.sh/uv/

Since our work is heavily in progress we set off without lockfiles, which has caused surprisingly few issues. But it's overdue to start locking our dependencies, and why not switch to modern Python tooling in the process. All our Python code is now using uv.

NRFA data ingestion

We are just in the process of writing out the NRFA data to our level 0 (structured raw data) S3 bucket. Our partitioning structure for this will be different from COSMOS and FDRI, something like:

/nrfa/dataset=daily-flow/batch=ea_api/site=<nrfa_id>/data.parquet

This will mean querying from the API/UI will be different, the key difference being the batch concept.
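
As a sketch of what that lookup might look like from the API/UI side (the bucket name, station id, and helper function are placeholders, not our actual code):

```python
import pandas as pd

def nrfa_key(dataset: str, batch: str, nrfa_id: str) -> str:
    """Build the level 0 object key for one NRFA site; `batch` is the extra
    path segment that the COSMOS and FDRI layouts don't have."""
    return f"nrfa/dataset={dataset}/batch={batch}/site={nrfa_id}/data.parquet"

key = nrfa_key("daily-flow", "ea_api", "39001")  # 39001 is an illustrative NRFA station id
df = pd.read_parquet(f"s3://example-level0-bucket/{key}")  # needs s3fs + credentials
```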