2025 Week 49

Keeping up the rhythm, here’s the next update. Loads happened in just 5 days!

🫥 Scatterplots

Still in development at the moment, but comparison plots should soon be available in the UI. A new tab on the explore page will plot one variable against another, with the option to display the R-squared value and trendline. Variables to compare are selected by checking the relevant boxes in the “Compare” column. Selecting a third variable de-selects the first.
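For anyone curious what sits behind the trendline and R-squared numbers, here is a minimal sketch in Python. It assumes an ordinary least-squares straight line, which may not be what the UI ends up using, and the function names `linear_trend` and `toggle_compare` are made up for illustration. The second function mimics the "third selection drops the first" rule with a two-slot queue.

```python
from collections import deque
from statistics import mean


def linear_trend(xs, ys):
    """Return (slope, intercept, r_squared) for a least-squares straight line."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length series with at least 2 points")
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    if sxx == 0:
        raise ValueError("x values are all identical, so no line can be fitted")
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # R squared = 1 - (residual sum of squares / total sum of squares)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - y_bar) ** 2 for y in ys)
    r_squared = 1.0 if ss_tot == 0 else 1.0 - ss_res / ss_tot
    return slope, intercept, r_squared


def toggle_compare(selected, variable):
    """Tick or untick a variable; `selected` is a deque(maxlen=2) (hypothetical)."""
    if variable in selected:
        selected.remove(variable)
    else:
        # A full deque with maxlen=2 silently drops its oldest entry on append.
        selected.append(variable)
    return selected
```

Calling `toggle_compare` three times with different variables on an empty `deque(maxlen=2)` leaves only the last two selected, which is the behaviour described above.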

🐊 SFTP is live

We now have a live SFTP server set up, which allows data to be uploaded directly to S3. It is there to support high-resolution data that isn’t suitable for MQTT, which currently means the flux / eddy covariance data.

🤯 Metadata outage

We’ve had a couple of metadata service outages this week. We think the first is related to new Service Control Policies being enabled on our AWS account, which is stopping us from pulling the metadata images from Epimorphics’ AWS ECR. The second seemed to be caused by new metadata images requiring more resources than requested. Metadata is core to everything we are building, so when it is down everything stops working. We don’t lose data, though; it just becomes further-real-time data.

🫸 Dev metadata API environment and dynamically published metadata

In between the metadata outages, a dev API environment was set up to allow us to test our own changes via the metadata-ingest tool. We now have our first piece of metadata published via a POST request rather than through a GitHub Action. This marks a big change, allowing us to update our metadata dynamically from other tools in the FDRI infrastructure.

☀️ NMDB (Neutron Monitor Database)

Neutron monitoring data from NMDB (used as a background reference when calculating soil moisture in COSMOS) is now being saved in AWS as its own network and is registered in metadata. Using it within COSMOS processing will demonstrate the cross-network power of our system. Cosmic!

🧟 Dead Letter Queue

We received a bad message on one of our ingestion data queues, which clogged up the pipeline. We are in the process of setting up dead letter queues so that, once a message has been retried a number of times, it gets moved to a different queue, allowing the next message to come through on the “main” queue. The messages on the dead letter queue can then be dealt with independently as and when they occur (hopefully very rarely).
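The retry-then-park idea can be sketched in a few lines of Python. This is an in-memory illustration only and is not tied to whichever queue service we actually use. The retry limit of 3, the `drain` function and the shape of the dead letter records are all assumptions made up for the example.

```python
from collections import deque

MAX_ATTEMPTS = 3  # assumed retry limit, purely illustrative


def drain(main_queue, handler, max_attempts=MAX_ATTEMPTS):
    """Process messages in order, parking any that keep failing.

    Returns (processed_results, dead_letters) so that one bad message
    never blocks the messages queued behind it.
    """
    processed = []
    dead_letters = []
    while main_queue:
        message = main_queue.popleft()
        last_error = None
        for _attempt in range(max_attempts):
            try:
                processed.append(handler(message))
                break  # success, move on to the next message
            except Exception as exc:  # broad on purpose for the sketch
                last_error = exc
        else:
            # Every attempt failed: move the message aside for later inspection.
            dead_letters.append(
                {"message": message, "error": repr(last_error), "attempts": max_attempts}
            )
    return processed, dead_letters
```

With a handler that always fails on one particular message, that message ends up in `dead_letters` while everything queued behind it is still processed.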

🐢 API performance issues on site-variables endpoint

We’ve had a few issues with performance for the data API lately, caused by the changes to combine data from DuckDB and the metadata API for each endpoint. Whilst this worked fine for FDRI, COSMOS had too much data and caused DuckDB to slow down significantly. We’ve now switched to iterating through the S3 bucket structure directly for the required information where possible (e.g. the names of the sites and datasets for a network), which has proven to be significantly faster.
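As a rough illustration of reading names straight off the bucket structure, the sketch below lists the "folders" directly under a prefix. It assumes `s3_client` is a boto3 S3 client and uses its `list_objects_v2` paginator with `Delimiter="/"`, which returns grouped key prefixes under `CommonPrefixes`. The function name and the bucket layout (one top-level prefix per network) are hypothetical and are not our actual code or structure.

```python
def list_child_prefixes(s3_client, bucket, prefix):
    """Return the names of the 'folders' directly under `prefix`.

    `s3_client` is assumed to be a boto3 S3 client (or anything exposing
    the same `get_paginator("list_objects_v2")` interface).
    """
    paginator = s3_client.get_paginator("list_objects_v2")
    names = []
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix, Delimiter="/"):
        # With a delimiter set, S3 groups keys by their next path segment.
        for entry in page.get("CommonPrefixes", []):
            # entry["Prefix"] looks like "<prefix><name>/"; keep just <name>.
            names.append(entry["Prefix"][len(prefix):].rstrip("/"))
    return names
```

Something like `list_child_prefixes(client, "my-bucket", "cosmos/")` would then return the names one level below the `cosmos/` prefix without reading any data files. Both the bucket name and the prefix here are placeholders.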

🧙‍♂️ Windmaster metadata changes

How we incorporate deployment metadata, such as the height of the Windmaster sensor, into our derivations has been an interesting challenge. An approach that treats these configurations as datasets has put the wind in our sails, and we will begin implementing it very soon. This will unlock even more possibilities for deriving new data in a robust and generalised way.
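One way to picture "configuration as a dataset" is a time-stamped series of values, where a derivation asks which value was in effect at a given moment. The Python sketch below is only a guess at the shape of such a lookup. The function name `value_as_of` and the `(effective_from, value)` record layout are assumptions, and the real implementation may look quite different.

```python
from bisect import bisect_right
from datetime import datetime


def value_as_of(config_series, when):
    """Return the configuration value in effect at `when`.

    `config_series` is a list of (effective_from, value) tuples sorted by
    time, e.g. a sensor height that changes when the mast is reconfigured.
    """
    times = [effective_from for effective_from, _ in config_series]
    # Find the last record whose start time is at or before `when`.
    index = bisect_right(times, when) - 1
    if index < 0:
        raise LookupError("no configuration value in effect at that time")
    return config_series[index][1]


# Illustrative values only: the height changes on 1 June 2025.
sensor_height_m = [
    (datetime(2025, 1, 1), 2.0),
    (datetime(2025, 6, 1), 3.5),
]
height_in_march = value_as_of(sensor_height_m, datetime(2025, 3, 1))
```

Treating the height this way means a derivation can join it to observations by timestamp just like any other dataset, rather than hard-coding a single value.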

🍋 AWS Migration

We are currently migrating our AWS account to AWS Landing Zone and reworking some of the core pieces. One of the triggers for this is for us to start thinking more about our staging/production environment split; currently staging is production and production is staging 🙈. Since we aren’t dealing with sensitive data, are only a single development team, and low costs are very important, we are thinking of mushing staging and production together instead of using separate AWS accounts. Work-in-progress diagram above.