cyto_ml.visualisation package

Submodules

cyto_ml.visualisation.app module

Streamlit application to visualise how plankton cluster based on their embeddings from a deep learning model

  • Metadata in an intake catalogue (essentially a dataframe of filenames; later this could include lon/lat, date and depth read from Exif headers)

  • Embeddings in chromadb, linked by filename

cyto_ml.visualisation.app.cached_image(url: str) → PIL.Image

Read an image URL from s3 and return a PIL Image. The result is intended to be cached per image, which should speed up repeat views. We tried streamlit_clickable_images, but it has no TIFF support.
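The per-image caching described above can be illustrated with a minimal sketch. It uses functools.lru_cache from the standard library as a stand-in; the app itself may rely on a different caching mechanism. The names fetch_bytes and cached_image_sketch are hypothetical, as is the example URL, and the sketch returns raw bytes rather than a PIL Image so that it runs without extra dependencies.

```python
from functools import lru_cache

fetch_count = 0  # counts how often the slow path actually runs


def fetch_bytes(url: str) -> bytes:
    """Hypothetical stand-in for reading an object from s3."""
    global fetch_count
    fetch_count += 1
    return f"image data for {url}".encode()


@lru_cache(maxsize=256)
def cached_image_sketch(url: str) -> bytes:
    """Fetch once per distinct URL; repeat calls are served from the cache."""
    return fetch_bytes(url)


# The second call with the same URL does not trigger another fetch.
first = cached_image_sketch("s3://bucket/a.tif")
second = cached_image_sketch("s3://bucket/a.tif")
```

Keying the cache on the URL string means each image is fetched at most once while it remains in the cache, which is the behaviour the docstring hopes for.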

cyto_ml.visualisation.app.closest_grid(size: int | None = 65) → None

Given an image URL, render a grid of the N nearest images by cosine distance between embeddings. The docstring gives a default of 26 for N, which matches closest_n; the signature default for size is 65.

cyto_ml.visualisation.app.closest_n(url: str, n: int | None = 26) → list

Given an image URL, return the N closest images by cosine distance between embeddings.
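As an illustration of ranking by cosine distance, here is a minimal pure-Python sketch. It is not the module's implementation, which queries embeddings held in chromadb. The function names, the in-memory dictionary of embeddings and the toy vectors are all assumptions made for the example, and excluding the query image from its own results is also an assumption.

```python
import math


def cosine_distance(a: list[float], b: list[float]) -> float:
    """Return 1 - cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def closest_n_sketch(
    url: str, embeddings: dict[str, list[float]], n: int = 26
) -> list[str]:
    """Return up to n URLs ranked by cosine distance to the embedding of url."""
    query = embeddings[url]
    # Assumption: the query image itself is left out of the results.
    others = [u for u in embeddings if u != url]
    others.sort(key=lambda u: cosine_distance(query, embeddings[u]))
    return others[:n]


# Hypothetical toy data, not real plankton embeddings.
toy = {
    "a.tif": [1.0, 0.0],
    "b.tif": [0.9, 0.1],
    "c.tif": [0.0, 1.0],
    "d.tif": [-1.0, 0.0],
}
nearest = closest_n_sketch("a.tif", toy, n=2)
```

With these toy vectors, "b.tif" points in almost the same direction as "a.tif" and so ranks first, followed by the orthogonal "c.tif"; the opposite-pointing "d.tif" is furthest and is cut off by n=2.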

cyto_ml.visualisation.app.collections() → List[str]
cyto_ml.visualisation.app.create_figure(df: DataFrame) → Figure

Creates a scatter plot from the supplied data frame. TODO: replace this layout with a) the most basic image grid, switching between clusters b) …

cyto_ml.visualisation.app.image_embeddings() → list
cyto_ml.visualisation.app.image_ids(coll: str) → list

Retrieve image embeddings from the chroma database. TODO: revisit our available metadata.

cyto_ml.visualisation.app.main() → None

Main method that sets up the streamlit app and builds the visualisation.

cyto_ml.visualisation.app.pick_image(image: str) → None
cyto_ml.visualisation.app.random_image() → str
cyto_ml.visualisation.app.show_random_image() → None
cyto_ml.visualisation.app.store(coll: str) → None

Load the vector store with image embeddings.

cyto_ml.visualisation.config module

Module contents