2025 Week 23
Covering last week and this week, lots of meetings
π§βπ€βπ§ Together Time
The first FDRI WP2 dev team together time happened last week in lancaster, we had some very productive sessions discussing various parts of FDRI. We did lots of drawing on whiteboards and post-it notes.
- ποΈWe played architecture golf
- π₯οΈ Planned our data API
- π· Made a start on agreeing ways of working
- π Planning for NRFA work
- π Data storage discussion
ποΈ Remote logger control
Jo Walsh joined the team, to continue lewisβs work on remote logger control via MQTT. Code here: https://github.com/NERC-CEH/campbell-mqtt-control
π‘NERC Tech Forum
Photo from the core store tour, huge storage facility storing rock samples from across the UK, shelves supporting up to 600 tonnes!
https://www.bgs.ac.uk/news/nerc-tech-forum-2025/
I attended the NERC tech forum at BGS in keyworth this week, it was great to meet some folks from FDRI work package 1 and 3.
ποΈ Organizations in Attendance
- British Geological Survey (BGS)
- British Antarctic Survey (BAS)
- National Centre for Earth Observation (NCEO)
- Scottish Association for Marine Science (SAMS)
- UK Centre for Ecology & Hydrology (UKCEH)
- Plymouth Marine Laboratory (PML)
- National Oceanography Centre (NOC)
Some unstructured notes from the talks/discussions:
- Robotics is in the medawar zone, sensors are easier and cheaper than ever and innovation is waiting to be discovered.
- FMRI is a new project. Much more money than FDRI
- FDRI has a new site being deployed on 23rd June (this was announced elsewhere too)
- WP1 planning to deploy a raspberry pi based camera solution soon too
- Learnt what QGIS is, software for visualising, recording and editing geospatial data, heavily used across the orgs above.
- In one project BGS were having trouble setting up and using MQTT so they switched to using email and using gmail as a backup. At first glance this sounds a bit crazy but it seemed to be working well for them.
- Catching up with WP1, there was mention that maybe higher frequency measurement could be useful? Perhaps 1 second resolution instead of every minute. To be discussed.
- There was talk about one of the challenges with measuring floods on the ground is that the measuring equipment is often damaged, using drones can help solve this problem.
- Part of FDRI is making historical adcp data available. Often visualised in of custom charts which are not line charts and not time series. For example river cross section river flow fluctuations.
- There was a talk on a Paris air quality sensor network, which has a lot of parallels with our setup, they handle 10GB of data per day. They had success with citizen science on making the first ingestion level of data all files, be that json/binary formats, csv etc.
π Replaying stuck/failed ingestion messages
We found ~20,000 messages had been received but stuck for various reasons since we setup the ingestion. We managed to work through them all fixing various bugs along the way. Now we have a real site deployed and data we care about, the importance of better monitoring is higher, we currently donβt know when messages fail to ingest. So far we have avoided any data loss and always found ways to recover, which is good.
Current incoming MQTT message status:
IN_PROGRESS: 0
VALIDATION_FAILED: 0
PARQUET_WRITE_FAILED: 0
PROCESSED: 4,831,098
Thatβs over 4 million messages ingested across the cosmos/fdri network since we started!
πͺ Gridded updates
Short and snappy this week as I ran out of time to do a proper job on these notes!! Gory details here
- The CHESS-MET dataset has now been chunked up and uploaded to the FDRI object store. Accessible to all using:
import fsspec import xarray as xr mapper = fsspec.get_mapper('s3://chess-met/chessmet_fulloutput_yearly_100km_chunks.zarr', anon=True, endpoint_url="https://fdri-o.s3-ext.jc.rl.ac.uk") ds = xr.open_zarr(mapper, consolidated=True)
- The Hourly GEAR gridded rainfall dataset has been chopped into smaller bits (chunks) to allow easier ingestion by scientists (and the technology they use!). Accessible via:
mapper = fsspec.get_mapper('s3://gearhrly/gearhrly_15day_100km_chunks.zarr', anon=True, endpoint_url="https://fdri-o.s3-ext.jc.rl.ac.uk") ds = xr.open_zarr(mapper, consolidated=True)
- Going forward work will be on developing examples for using these and other datasets, starting with developing material for the UKCEH (nee Hydro-JULES π) summer school next month. First prototype on DataLabs.