
Published on December 6, 2024

From classical techniques to convolution-based models: A review of object detection algorithms

Categories: cs.CV, cs.AI, cs.LG

Release Date: December 6, 2024

Authors: Fnu Neha, Deepshikha Bhati¹, Deepak Kumar Shukla², Md Amiruzzaman³

Affiliations: ¹Kent State University; ²Rutgers University; ³West Chester University

arXiv: http://arxiv.org/pdf/2412.05252v1


Model	Strengths	Limitations
R-CNN (2013)	Simple, foundational; applies CNNs for classification.	High computational cost from classifying about 2000 region proposals per image; slow (47 sec/image); no end-to-end training.
SPPNet (2015)	Faster than R-CNN; supports multi-scale input via spatial pyramid pooling.	Does not update the convolutional layers before the SPP layer during fine-tuning.
Fast R-CNN (2015)	Faster than SPPNet; introduces ROI pooling to handle varied input sizes.	Relies on selective search for region proposals, which is not learned during training.
Faster R-CNN (2015)	Uses an RPN for fast region proposals; improves efficiency.	Limited in detecting small objects because it relies on a single feature map.
Mask R-CNN (2017)	Adds instance segmentation, detecting objects and masks simultaneously.	High computational demand; struggles with motion blur at low resolution.
YOLO (2015)	Real-time detection at 45 fps; single forward pass.	Poor detection of small objects; produces coarse features.
SSD (2016)	Handles various resolutions; uses multi-scale feature maps for detection.	Default boxes may not match all shapes; possible overlapping detections.
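
The table credits Fast R-CNN with ROI pooling as the way to handle varied input sizes. A minimal sketch of that idea follows, in plain Python: regions of different shapes are each max-pooled into the same fixed grid, so a later fully connected layer always sees the same number of values. The function name `roi_max_pool`, the end-exclusive `(row0, col0, row1, col1)` box format, and the integer bin-splitting rule are assumptions made for illustration; they are not taken from the paper or from any particular library.

```python
from typing import List, Tuple

FeatureMap = List[List[float]]


def roi_max_pool(
    feature_map: FeatureMap,
    roi: Tuple[int, int, int, int],
    output_size: Tuple[int, int] = (2, 2),
) -> FeatureMap:
    """Max-pool one region of a 2D feature map into a fixed-size grid.

    `roi` is (row0, col0, row1, col1), end-exclusive (a hypothetical format
    chosen for this sketch).
    """
    r0, c0, r1, c1 = roi
    out_h, out_w = output_size
    height, width = r1 - r0, c1 - c0
    if height < out_h or width < out_w:
        raise ValueError("ROI must be at least as large as the output grid")

    pooled: FeatureMap = []
    for i in range(out_h):
        # Integer bin edges: the bins tile the ROI with no gaps or overlap.
        row_start = r0 + (i * height) // out_h
        row_end = r0 + ((i + 1) * height) // out_h
        pooled_row = []
        for j in range(out_w):
            col_start = c0 + (j * width) // out_w
            col_end = c0 + ((j + 1) * width) // out_w
            # Keep only the strongest activation inside this bin.
            pooled_row.append(
                max(
                    feature_map[r][c]
                    for r in range(row_start, row_end)
                    for c in range(col_start, col_end)
                )
            )
        pooled.append(pooled_row)
    return pooled


if __name__ == "__main__":
    # Toy 6x6 feature map whose value at (r, c) is r * 6 + c.
    fm = [[float(r * 6 + c) for c in range(6)] for r in range(6)]
    print(roi_max_pool(fm, (0, 0, 4, 4)))  # 4x4 region -> [[7.0, 9.0], [19.0, 21.0]]
    print(roi_max_pool(fm, (1, 2, 6, 5)))  # 5x3 region -> [[14.0, 16.0], [32.0, 34.0]]
```

Both regions, although 4x4 and 5x3 in size, come out as 2x2 grids. Production detectors apply the same operation per channel on tensors and map image-space boxes to feature-map coordinates first; this sketch omits both steps.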