Ben Chuanlong Du's Blog

It is never too late to learn.

Preparing Data for AI

Things on this page are fragmentary and immature notes/thoughts of the author. Please read with your own judgement!

General Tips

  1. When you label individual images, it is better to use numerical labels (even though text labels are easier to understand), so that you do not have to map back and forth all the time between numbers (used for training) and text labels (used for human understanding).

  2. If you have no labeled data to start with, do NOT rush into labeling. Check the article Label Image Data Quickly Without Crowdsourcing first to see whether any of its tips can reduce the amount of human labeling.
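As a rough illustration of the first tip, here is a minimal sketch of keeping one label map in a single place, so that annotations store only numbers and text labels are recovered only when a human needs to read them. The class names, file names and helper functions (`encode`, `decode`) are hypothetical and only serve the example.

```python
# Hypothetical class names; the position in the list is the numerical label.
CLASS_NAMES = ["cat", "dog", "bird"]

# Text label -> numerical label, derived once from the list order.
LABEL_TO_ID = {name: idx for idx, name in enumerate(CLASS_NAMES)}


def encode(name: str) -> int:
    """Return the numerical label for a text label."""
    return LABEL_TO_ID[name]


def decode(idx: int) -> str:
    """Return the text label for a numerical label."""
    return CLASS_NAMES[idx]


# Annotations are stored with numerical labels (hypothetical file names).
annotations = [("img_001.jpg", 0), ("img_002.jpg", 2)]

# Convert to text labels only when a human needs to inspect the data.
readable = [(path, decode(label)) for path, label in annotations]
```

Keeping the map in one module (or one small file next to the dataset) means the training code never has to deal with text labels at all.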

Free Labeling Tools

  • LabelStudio: a flexible open source data annotation tool. It lets you label data types such as audio, text, images, videos, and time series with a simple and straightforward UI, and it exports to various model formats. The UI supports customization and comes with pre-built labeling templates.

  • LabelMe: an online annotation tool for building image databases for computer vision research.

  • DiffGram: open source, and thus free to use as a self-hosted service.

Commercial Labeling Tools/Platforms

DiffGram

Appen

CrowdFlower

Alegion (https://www.alegion.com/)

Baidu Crowd Outsourcing
