5  Hands-On Lab: Data Annotation

Goal

This hands-on lab session is designed to give participants practical experience in data annotation for deep learning. Participants will apply the methods, tools, and best practices discussed in the previous session, working directly with datasets to annotate data effectively.

Key Elements

Use of annotation methods and tools, direct dataset interaction

Choose your own adventure(s)

In this section, we’ll provide some links, basic information, and suggested starter activities for a variety of annotation tools available today. Have a look and get your hands dirty!

Note: You’ll need some images to annotate in each case. Feel free to use any relevant images you might already have, or just do a web search and find something interesting. Of course, when experimenting with the web-based annotation platforms, be sure not to upload anything personal, private, or otherwise sensitive.

Ideally you’ll cover more than one of the adventures below.

5.1 Adventure: Make Sense

Web-based app, no setup or account required

  • MakeSense.ai is a simple, single-user, browser-based image annotation app
  • Supports annotation via bounding boxes, polygons, points, and lines
  • Upload one or more images, apply/edit annotations, then export annotations
  • Offers model-based semi-automated annotation with an accept/reject interface
  • If you prefer, you can also grab the source code and run it locally using npm or Docker

Things to try

5.2 Adventure: Roboflow

Web-based app, requires (free) account signup

  • Roboflow offers a cloud-hosted, web-based platform for computer vision, including tooling for data annotation along with model training and deployment
  • They offer a limited free tier, which provides no privacy (projects and images are automatically public)
  • Nice interface for doing annotations, managing artifacts, and managing teams

Things to try

5.3 Adventure: CVAT

Web-based app, requires (free) account signup

Things to try

5.4 Adventure: Zooniverse

Web-based app, no setup or account required

Zooniverse is a cool community crowdsourcing platform on the web for data annotation and digitization.

Things to try

5.5 Adventure: IRIS (Intelligently Reinforced Image Segmentation)

Requires local installation (Python + JavaScript application).

See the installation instructions for setting IRIS up locally with Python/pip.

IRIS is a tool for doing semi-automated image segmentation of satellite imagery (or images in general), with a goal of accelerating the creation of ML training datasets for Earth Observation. The user interface provides configurable simultaneous views of the same image for multispectral imagery, along with interactive AI-assisted segmentation.

Unlike much of the ML we’ll encounter this week, the backend model in this case is a gradient-boosted decision tree. This works well enough because IRIS is geared toward segmenting multispectral imagery into a small number of classes, training from scratch on each image: once the human-in-the-loop manually segments and labels some pixels, the model can learn the correlation structure between the per-pixel features (the multiple spectral bands) and the labels, and use it to classify the remaining pixels.
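To make this concrete, here is a minimal sketch of the underlying idea (not IRIS’s actual code): train a gradient-boosted classifier from scratch on the band values of a handful of hand-labeled pixels, then predict a class for every other pixel in the same image. The image, coordinates, and class labels below are made-up placeholders.

    # Illustration of IRIS-style per-image training (not IRIS's actual code).
    # A few pixels are labeled by hand; a gradient-boosted tree learns from the
    # per-pixel band values and predicts labels for every remaining pixel.
    import numpy as np
    from sklearn.ensemble import GradientBoostingClassifier

    # Fake 4-band image (e.g. B, G, R, NIR), shape (bands, height, width)
    image = np.random.rand(4, 256, 256)

    # Hypothetical pixel coordinates labeled by the user: 0 = water, 1 = cloud
    rows   = np.array([ 10,  12, 200, 210,  50,  60])
    cols   = np.array([ 15,  18, 220, 230, 100, 120])
    labels = np.array([  0,   0,   1,   1,   0,   1])

    # One training row per labeled pixel, one feature column per band
    X_train = image[:, rows, cols].T
    model = GradientBoostingClassifier().fit(X_train, labels)

    # Predict a class for every pixel in the image from its band values
    X_all = image.reshape(4, -1).T
    segmentation = model.predict(X_all).reshape(256, 256)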

For more information, check out the YouTube video with the main creator, Alistair Francis.

Things to try

5.6 Adventure: Segment-Geospatial (samgeo)

Requires local installation (Python library), or can be run in a hosted notebook environment (JupyterLab, Google Colab, etc.).

See Installation notes.

Segment-geospatial is an open-source tool that you can either install locally or run in JupyterLab (or Google Colab).

First, check out the online Segment Anything Model (SAM) demo. SAM was developed by Meta AI. It is trained as a generalized segmentation model that is able to segment (but not label) arbitrary objects in an image. It is designed as a promptable tool, which means a user can provide initial point(s) or box(es) that roughly localize an object within an image, and SAM will try to fully segment that object. Alternatively, it can automatically segment an entire image, effectively by self-prompting with a complete grid of points and then intelligently merging the corresponding segments.
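If you’re curious what “promptable” looks like in code, below is a minimal sketch using Meta’s segment-anything Python package (a separate install from the online demo); the image path, checkpoint file, and prompt point are placeholders you would replace with your own.

    # Sketch of point-prompted segmentation with Meta's segment-anything package
    # (pip install segment-anything). Paths and the prompt point are placeholders.
    import numpy as np
    import cv2
    from segment_anything import sam_model_registry, SamPredictor

    image = cv2.cvtColor(cv2.imread("my_image.jpg"), cv2.COLOR_BGR2RGB)

    # Load a SAM checkpoint (downloaded separately from the SAM repository)
    sam = sam_model_registry["vit_b"](checkpoint="sam_vit_b_01ec64.pth")
    predictor = SamPredictor(sam)
    predictor.set_image(image)

    # Prompt: a single foreground point (x, y) roughly on the object of interest
    point_coords = np.array([[320, 240]])
    point_labels = np.array([1])  # 1 = foreground, 0 = background

    masks, scores, _ = predictor.predict(
        point_coords=point_coords,
        point_labels=point_labels,
        multimask_output=True,  # return a few candidate masks to choose from
    )
    best_mask = masks[np.argmax(scores)]  # boolean mask, same height/width as the image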

Today, SAM is used by numerous image annotation tools to provide interactive, AI-assisted segmentation capabilities.

One such tool is the segment-geospatial Python package, which provides some base functionality for applying SAM to geospatial data, either programmatically or interactively.
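As a rough sketch of the programmatic route (argument names and defaults may differ between samgeo versions, so check the package documentation; the file names here are placeholders):

    # Rough sketch of automatic segmentation with segment-geospatial
    # (pip install segment-geospatial); file names are placeholders and the
    # exact arguments may vary between samgeo versions -- see the docs.
    from samgeo import SamGeo

    sam = SamGeo(model_type="vit_h")  # fetches the SAM checkpoint if needed

    # Segment everything in a GeoTIFF and write the masks out as a raster
    sam.generate("my_area.tif", output="masks.tif")

    # Convert the mask raster to vector polygons for use in GIS tools
    sam.tiff_to_vector("masks.tif", "masks.gpkg")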

Note that in addition to using segment-geospatial directly from Python in a notebook or other environment, you can also play with SAM-assisted segmentation in QGIS and ArcGIS.

Things to try

5.7 Adventure: Label Studio

Python app must be installed and run locally (unless you pay for an Enterprise cloud account)

See the quick start document for instructions on installing with pip and running locally in a web browser.

5.8 Other things to try