Label Studio
We provide a setup for the Label Studio project (data labelling for machine learning) with an ML backend that suggests, at a minimum, whether an image is plankton or detritus.
There’s an example docker-compose.yml file in this project for building and running Label Studio and its backend.
The full configuration for the running service is in a private project, <https://github.com/ukceh-rse/podman-host>. Please contact a member of the RSE group if you would like access.
ML Backend notes
src/label_studio_cyto_ml/model.py contains our custom model code.
It runs two models:
A ResNet (could be any deep learning model) that extracts embeddings from an image.
A k-means clustering model that assigns each of the resulting embeddings a specific label.
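The two stages above can be sketched in a few lines. This is only an illustration of the idea, not the code in model.py: `embed` is a hypothetical stand-in for the ResNet, and the centroids are made-up values standing in for a fitted k-means model.

```python
import math


def embed(image_pixels):
    """Hypothetical stand-in for the ResNet: map an image to a short feature vector."""
    mean = sum(image_pixels) / len(image_pixels)
    spread = max(image_pixels) - min(image_pixels)
    return [mean, spread]


def nearest_centroid(embedding, centroids):
    """Return the index of the closest centroid (the k-means prediction step)."""
    distances = [math.dist(embedding, centroid) for centroid in centroids]
    return distances.index(min(distances))


# Made-up centroids; a real model would learn these from many embeddings.
centroids = [[10.0, 5.0], [200.0, 100.0]]

# The cluster index is what then gets mapped to a label in the annotation task.
label = nearest_centroid(embed([8, 12, 10, 11]), centroids)
```

The point of splitting the work this way is that the embedding model can be swapped without touching the clustering step, as long as the vectors keep the same length.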
Return format
It took a bit of figuring out to get the return format right. A single prediction needs to be returned as an array of results, like this:
PredictionValue(result=[{"id": int(label), "text": "test", "type": "Choices"}])
The ModelResponse is then an array of these PredictionValue objects
ModelResponse(predictions=predictions)
The prediction needs a type value, which internally is a control tag <https://labelstud.io/tags/choices>. There are many types of these for different media; our checkbox / radio buttons are Choices.
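The nesting described above (one list of results per prediction, one list of predictions per response) can be sketched as follows. The two dataclasses are stand-ins written for this example so that it runs without Label Studio installed; they only mirror the shapes named in this section and are not the real classes. `build_response` is a hypothetical helper.

```python
from dataclasses import dataclass, field


@dataclass
class PredictionValue:
    """Stand-in for the real class: holds a list of result dicts."""
    result: list = field(default_factory=list)


@dataclass
class ModelResponse:
    """Stand-in for the real class: holds a list of PredictionValue objects."""
    predictions: list = field(default_factory=list)


def build_response(labels):
    """Wrap each cluster label in its own PredictionValue, then in one ModelResponse."""
    predictions = [
        PredictionValue(result=[{"id": int(label), "text": "test", "type": "Choices"}])
        for label in labels
    ]
    return ModelResponse(predictions=predictions)


# One prediction per task, even when a prediction carries a single result.
response = build_response([0, 2])
```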
The input to the annotation task looks like this (defined when setting up the project):
{'organism_type': {'type': 'Choices', 'to_name': ['image'], 'inputs': [{'type': 'Image', 'valueType': None, 'value': 'image'}], 'labels': ['Not-plankton', 'Plankton', 'Debris'], 'labels_attrs': {'Not-plankton': {'value': 'Not-plankton'}, 'Plankton': {'value': 'Plankton'}, 'Debris': {'value': 'Debris'}}},
 'morphology': {'type': 'Choices', 'to_name': ['image'], 'inputs': [{'type': 'Image', 'valueType': None, 'value': 'image'}], 'labels': ['Mucilage', 'Flagella', 'Cilia', 'Aerotopes', 'Akinetes', 'Heterocytes', 'Theca/test/exoskeletal structures', 'Eggs', 'Ephippia'], 'labels_attrs': {'Mucilage': {'value': 'Mucilage'}, 'Flagella': {'value': 'Flagella'}, 'Cilia': {'value': 'Cilia'}, 'Aerotopes': {'value': 'Aerotopes'}, 'Akinetes': {'value': 'Akinetes'}, 'Heterocytes': {'value': 'Heterocytes'}, 'Theca/test/exoskeletal structures': {'value': 'Theca/test/exoskeletal structures'}, 'Eggs': {'value': 'Eggs'}, 'Ephippia': {'value': 'Ephippia'}}},
 'life_form': {'type': 'Choices', 'to_name': ['image'], 'inputs': [{'type': 'Image', 'valueType': None, 'value': 'image'}], 'labels': ['Unicellular', 'Colony', 'Filament'], 'labels_attrs': {'Unicellular': {'value': 'Unicellular'}, 'Colony': {'value': 'Colony'}, 'Filament': {'value': 'Filament'}}},
 'shape': {'type': 'Choices', 'to_name': ['image'], 'inputs': [{'type': 'Image', 'valueType': None, 'value': 'image'}], 'labels': ['Spiky', 'Round', 'Rod-like'], 'labels_attrs': {'Spiky': {'value': 'Spiky'}, 'Round': {'value': 'Round'}, 'Rod-like': {'value': 'Rod-like'}}},
 'ta': {'type': 'TextArea', 'to_name': ['image'], 'inputs': [{'type': 'Image', 'valueType': None, 'value': 'image'}], 'labels': [], 'labels_attrs': {}}}
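Since the configuration above is a plain nested dictionary, it is easy to pull out which controls are Choices and which labels each one accepts. A minimal sketch, assuming the dictionary has the shape shown; `choice_labels` is a hypothetical helper and the sample is trimmed to two controls for brevity.

```python
def choice_labels(label_config):
    """Map the name of each Choices control to its list of allowed labels."""
    return {
        name: control["labels"]
        for name, control in label_config.items()
        if control["type"] == "Choices"
    }


# Trimmed copy of the configuration: one Choices control and the TextArea.
label_config = {
    "organism_type": {
        "type": "Choices",
        "to_name": ["image"],
        "labels": ["Not-plankton", "Plankton", "Debris"],
    },
    "ta": {"type": "TextArea", "to_name": ["image"], "labels": []},
}

# The TextArea control is skipped because it has no fixed set of labels.
controls = choice_labels(label_config)
```

This kind of lookup is handy when checking that a predicted label is one the project actually offers.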
See also: Troubleshooting pre-annotations <https://labelstud.io/guide/troubleshooting#Pre-annotations>
Connection to Label Studio
Each Label Studio project needs to be configured to use an ML backend service.
This could be our custom backend or one of a range of off-the-shelf options (such as SAM for segmentation).
Navigate to Project/Settings/Model.
Add the backend URL, referring to the container by its service name as it appears in docker-compose.yml.
For example, our docker-compose.yml has three services, one is named ml-backend, so this is the URL that goes in the project settings:
http://ml-backend:9090/
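The URL is just the compose service name plus the backend port, and the name only resolves from inside the compose network. A small sketch of building it and probing the backend before entering it in the project settings; `backend_url` and `backend_is_up` are hypothetical helpers, and the `health` path is an assumption about the backend that should be checked against the version in use.

```python
import urllib.request


def backend_url(service_name, port=9090):
    """Build the URL Label Studio uses to reach a compose service by name."""
    return f"http://{service_name}:{port}/"


def backend_is_up(url, timeout=5):
    """Return True if a GET on the assumed health path answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url + "health", timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # Covers connection errors, timeouts and HTTP error responses.
        return False


url = backend_url("ml-backend")
```

Running `backend_is_up(url)` from a shell in the Label Studio container is a quick way to tell a networking problem apart from a problem in the model code.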
Label Studio analytics
We've had some issues with Label Studio enabling analytics by default, with page loads then stalling because the analytics service is throttling requests.
At the time of writing, avoiding this needs a build from source as well as configuration options, but it should be fixed once version 1.17.1 becomes the default Docker build (see https://github.com/HumanSignal/label-studio/issues/6430).
git clone https://github.com/HumanSignal/label-studio.git
cd label-studio
podman build -t heartexlabs/label-studio:latest .
Label Studio Account Management
One downside of the free edition is that there is no password reset option; the only way to reset a password is via the command line. When running in podman with a SQLite backend, this involves starting a shell in the container and running the label-studio utility.
Open a shell in the running container:
podman exec -it label-studio bash
Use the label-studio utility to change the password:
label-studio reset_password --username <username> --password <new_password>