Ben Chuanlong Du's Blog

It is never too late to learn.

Object Detection Using Deep Learning

Things on this page are fragmentary and immature notes/thoughts of the author. Please read with your own judgement!

Concepts

Image Classification

Image Localization

Image Classification: Predict the type or class of an object in an image.
Input: An image with a single object, such as a photograph.
Output: A class label (e.g. one or more integers that are mapped to class labels).

Object Localization: Locate the presence of objects in an image and indicate their location with a bounding box.
Input: An image with one or more objects, such as a photograph.
Output: One or more bounding boxes (e.g. defined by a point, width, and height).

Object Detection: Locate the presence of objects in an image with bounding boxes and predict the type or class of each located object.
Input: An image with one or more objects, such as a photograph.
Output: One or more bounding boxes (e.g. defined by a point, width, and height), and a class label for each bounding box.
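To make the difference between the three outputs concrete, here is a minimal sketch in plain Python. The names `BoundingBox`, `Detection`, and `CLASS_NAMES`, as well as all the numbers, are made up for illustration and do not come from any library. The box is stored as a point plus width and height, as described above; the `to_corners` helper converts it to the corner format (x1, y1, x2, y2), which is the box format torchvision's detection models use.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class BoundingBox:
    """A box defined by its top-left point, width, and height (hypothetical type)."""
    x: float
    y: float
    width: float
    height: float

    def to_corners(self) -> Tuple[float, float, float, float]:
        """Convert (x, y, width, height) to (x1, y1, x2, y2)."""
        return (self.x, self.y, self.x + self.width, self.y + self.height)


@dataclass(frozen=True)
class Detection:
    """One detected object: a bounding box plus an integer class label."""
    box: BoundingBox
    label: int


# Hypothetical mapping from integer labels to class names.
CLASS_NAMES = {0: "background", 1: "dog", 2: "cat"}

# Image classification: a single class label for the whole image.
classification_output: int = 1

# Object localization: one or more boxes, without class labels.
localization_output: List[BoundingBox] = [BoundingBox(10, 20, 100, 50)]

# Object detection: one or more boxes, each with its own class label.
detection_output: List[Detection] = [
    Detection(BoundingBox(10, 20, 100, 50), label=1),
    Detection(BoundingBox(150, 30, 60, 80), label=2),
]

for det in detection_output:
    print(CLASS_NAMES[det.label], det.box.to_corners())
```

Detection is therefore localization plus classification applied per box, which is why its output is a list of (box, label) pairs rather than a single label.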

Image Segmentation
- Semantic Segmentation
- Instance Segmentation

https://en.wikipedia.org/wiki/Image_segmentation

R-CNN

Fast R-CNN

Faster R-CNN

Mask R-CNN

YOLO


Models for Object Detection

Region-Based Convolutional Neural Networks, or R-CNNs, are a family of techniques for object localization and recognition, designed primarily for model performance (detection accuracy). You Only Look Once, or YOLO, is a second family of techniques for object recognition, designed primarily for speed and real-time use.

Fine-tune Object Detection Models in PyTorch

Fine-tuning Faster-RCNN using pytorch

Beagle Detector: Fine-tune Faster-RCNN

TORCHVISION OBJECT DETECTION FINETUNING TUTORIAL

References

Non-maximum Suppression (NMS)

A Gentle Introduction to Object Recognition With Deep Learning

From R-CNN to Mask R-CNN

Mask R-CNN: A Beginner's Guide

TORCHVISION OBJECT DETECTION FINETUNING TUTORIAL
