Effortless computer vision annotations in Dash with dash-annotate-cv

Unleashing the power of dash for effortless annotations.

The long road to find better ML tools.

One of my favorite tools lately has been dash, a Python library for writing dashboards in a flash. For one thing, it has all the power of Python. For another, it’s super easy to build fantastic-looking user interfaces, especially when leveraging the great dash-bootstrap-components library. This is a big step up from just using flask. Best of all, it can leverage the fantastic plotly graphing library, which in my opinion has the best-looking plots outside of Python. And we can make not just static plots with it, but rich, fully interactive ones.

I recently came across image annotations in the dash doc pages. Since I work on machine learning, this instantly stood out as a missing piece in the giant puzzle of ML tools. It only took 30 or so re-reads of the doc pages to plan out the vision, and this article is about the result:

GitHub - smrfeld/dash-annotate-cv: Dash components for computer vision annotation tasks

dash-annotate-cv is a dash library for computer vision annotation tasks. The initial support is for image-level tags and bounding box annotations, with more features planned. This is not a production library - this is for your local machine, a quick and easy tool to annotate those missing images for that extra evaluation at the end of your project. This is for your (and my!) hobby projects - the ones outside of work that you worked on for two nights straight and then delayed until next year. The focus is on re-usability and ease-of-use, and unleashing the power of dash for computer vision annotations.

Motivation - annotations for computer vision tasks

Let’s start by rattling off some of the common types of labeled data in computer vision:

Image-level tags - annotate an entire image with one or more tags.
Bounding boxes - annotate bounding boxes of objects in the image, plus a label for each bounding box (e.g. annotate all the cars in the image with a box).
Image segmentation - annotate the image at the pixel level. While the end-result annotations are typically at the pixel level, in practice this involves drawing closed polygons on the image around the border of objects, so it is really a generalization of the bounding box task to arbitrary polygons.
Keypoint annotation - this usually refers to annotating skeletons for person tracking applications.
Video-level annotations - without getting into it too deeply, interesting tasks here include event/action localization (spatial and temporal), object tracking, and more generic question-answer style annotations.
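To make the "polygons generalize bounding boxes" point concrete, here is a minimal sketch. The class and function names (BoundingBox, Polygon, box_to_polygon) are hypothetical and chosen only for illustration; this is not the data format used by dash-annotate-cv or any other tool.

```python
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float]


@dataclass
class BoundingBox:
    """Axis-aligned box with a label (hypothetical structure)."""
    label: str
    x_min: float
    y_min: float
    x_max: float
    y_max: float


@dataclass
class Polygon:
    """Closed polygon with a label; the last point connects back to the first."""
    label: str
    points: List[Point]

    def to_bounding_box(self) -> BoundingBox:
        # The tightest axis-aligned box around the polygon's vertices.
        xs = [x for x, _ in self.points]
        ys = [y for _, y in self.points]
        return BoundingBox(self.label, min(xs), min(ys), max(xs), max(ys))


def box_to_polygon(box: BoundingBox) -> Polygon:
    # A bounding box is just a polygon with four axis-aligned corners.
    corners = [
        (box.x_min, box.y_min),
        (box.x_max, box.y_min),
        (box.x_max, box.y_max),
        (box.x_min, box.y_max),
    ]
    return Polygon(box.label, corners)


if __name__ == "__main__":
    car = Polygon("car", [(10, 20), (50, 15), (60, 40), (20, 45)])
    print(car.to_bounding_box())
```

Going from polygon to box throws away shape information, while going from box to polygon loses nothing, which is why a tool that handles polygons can in principle handle boxes too.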

There are too many! Does each one need its own tool? I just want to annotate some images!

Tools for computer vision annotation

Just like there’s Tesla and then all the other electric cars (or the iPhone and all the other smart phones, or the [major player] and all the other [competitors to the major player]):

For computer vision, there is CVAT (https://www.cvat.ai/) and then everyone else. CVAT started at Intel, and while it is popular in the computer vision field, it frankly had an awfully ugly user interface for a long time. It’s been getting revamped quite a bit now that CVAT is its own company, and it will likely stay the leader in the field.

In addition to a host of competitors, there are also smarter data labelling platforms that combine some amount of machine learning with human-in-the-loop feedback. The big players here include Scale AI, Snorkel and probably an army of other ones that will be insulted at not being mentioned.

Exhaustion - why more?!

If you’re already exhausted then join the club. I just want to annotate some images!

This is what I felt was missing. I don’t want to set up a server for a 100-person annotation team. I don’t want to read doc pages that if printed would rival the 1000-page Red Hat Fedora 2 Unleashed book that’s been collecting dust in my parents’ attic since 2005. I just want to go through some images on my machine with ease. One of my recent personal projects has involved detecting hand-drawn sketches of diagrams. Even just for evaluating the models, I want something easy and quick to measure the performance on a small dataset of my own custom images.

dash-annotate-cv

dash-annotate-cv is a Python library that implements dash All-In-One (AIO) components for simple computer vision annotation tasks. The goal is not production-level tools, but rather ease-of-use on your local box.

What I like best about dash is that it is so easy to build your own UX tools. Dash has recommendations for what it calls “All-in-one” (AIO) components (https://dash.plotly.com/all-in-one-components). These AIO components are intended to contain multiple components (buttons, lists, images, etc.) that work together for one experience, while still being re-usable across the application. By following this AIO standard, you can incorporate the annotation tools into your own, custom-designed website.

Two ways to run the annotation tool are currently supported:

A CLI tool that launches a default web page. This is to just get up and running instantly with two commands: pip install dash-annotate-cv and dacv conf.yml. Both the images and labeling tasks are defined in the YAML file.
A Python library, from which you can import the AIO components into your own dash app and customize some of the appearance.

Initial features

Image-level tags

Annotating image-level tags

dash-annotate-cv supports annotating image-level tags for the whole image.

Bounding boxes

Annotating bounding boxes

dash-annotate-cv supports annotating bounding boxes and labels. Boxes can also be edited and removed after being drawn.

Complete history

Optionally, the complete history of all edits can be stored including all label & bounding box changes with timestamps. This can be useful to analyze labelling throughput.
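As a rough illustration of the throughput idea, the sketch below computes an average edit rate from a list of timestamped events. The record layout (a list of dicts with an ISO-format "timestamp" key) and the function name edits_per_minute are assumptions made for this example; the actual history format stored by dash-annotate-cv may look different.

```python
from datetime import datetime
from typing import Dict, List


def edits_per_minute(history: List[Dict[str, str]]) -> float:
    """Mean rate of edits between the first and last recorded event.

    Assumes each event carries an ISO 8601 "timestamp" string
    (a hypothetical layout, not necessarily the tool's own).
    """
    if len(history) < 2:
        return 0.0
    times = sorted(datetime.fromisoformat(e["timestamp"]) for e in history)
    elapsed_minutes = (times[-1] - times[0]).total_seconds() / 60.0
    if elapsed_minutes == 0:
        return 0.0
    # n events span n - 1 intervals, so count the edits after the first one.
    return (len(times) - 1) / elapsed_minutes


if __name__ == "__main__":
    example_history = [
        {"timestamp": "2024-01-01T10:00:00", "label": "cat"},
        {"timestamp": "2024-01-01T10:00:30", "label": "dog"},
        {"timestamp": "2024-01-01T10:01:00", "label": "cat"},
        {"timestamp": "2024-01-01T10:01:30", "label": "bird"},
    ]
    print(edits_per_minute(example_history))
```

The same timestamps could be grouped per image or per label to see which ones slow you down, but that depends on what each history record actually contains.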

CLI and Python library support

dash-annotate-cv can be used either as a CLI app or as a library of re-usable dash components that can be integrated into your own app. The CLI app is defined by a simple YAML config file that you could write in a minute and start annotating.

What’s missing?

Fancier tasks - don’t expect to annotate skeletons or videos just yet.
All the formats - while JSON is legible and easy to convert to your own needs, out of the box there is not yet any support for COCO, YOLO or other formats.
Docs - until there is a winning LLM tool to automatically write the docs, I’ve been too lazy so far. But since you’re too lazy to read them, it all works out, no?
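On the missing formats: until there is built-in export, converting a box yourself is only a few lines. The sketch below assumes a box is available as corner coordinates in pixels (x_min, y_min, x_max, y_max), which is an assumption about the JSON output, not something taken from the tool. The helper names are hypothetical. The target layouts are the standard ones: COCO stores a box as [x, y, width, height] in pixels, and YOLO label files store one line per box as class id, then center x, center y, width and height, all normalized by the image size.

```python
from typing import List


def xyxy_to_coco(x_min: float, y_min: float, x_max: float, y_max: float) -> List[float]:
    """COCO 'bbox' field: top-left corner plus width and height, in pixels."""
    return [x_min, y_min, x_max - x_min, y_max - y_min]


def xyxy_to_yolo(
    x_min: float,
    y_min: float,
    x_max: float,
    y_max: float,
    img_w: int,
    img_h: int,
    class_id: int,
) -> str:
    """One line of a YOLO label file: class id and normalized center/size."""
    x_center = (x_min + x_max) / 2 / img_w
    y_center = (y_min + y_max) / 2 / img_h
    width = (x_max - x_min) / img_w
    height = (y_max - y_min) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


if __name__ == "__main__":
    # A 100x200 pixel box inside a 200x400 pixel image.
    print(xyxy_to_coco(10, 20, 110, 220))
    print(xyxy_to_yolo(10, 20, 110, 220, img_w=200, img_h=400, class_id=0))
```

A full COCO file also needs image and category records around each box, so treat this as the per-box piece only.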

Getting started

You can check out the library on GitHub or PyPI to get started for your own project: