Published on December 6
From classical techniques to convolution-based models: A review of object detection algorithms
Categories: cs.CV, cs.AI, cs.LG
Released Date: December 6, 2024
Authors: Fnu Neha, Deepshikha Bhati¹, Deepak Kumar Shukla², Md Amiruzzaman³
Affiliations: ¹Kent State University; ²Rutgers University; ³West Chester University

| Model | Strengths | Limitations |
|---|---|---|
| R-CNN (2013) | Simple and foundational; applies CNNs to classify region proposals. | High computational cost from classifying about 2000 regions per image; slow (47 s/image); not trained end to end. |
| SPPNet (2015) | Faster than R-CNN; supports multi-scale input via spatial pyramid pooling (SPP). | Fine-tuning does not update the convolutional layers before the SPP layer. |
| Fast R-CNN (2015) | Faster than SPPNet; introduces RoI pooling to handle varied input sizes. | Relies on selective search for region proposals, which are not learned during training. |
| Faster R-CNN (2015) | Uses a region proposal network (RPN) for fast region proposals; improves efficiency. | Limited at detecting small objects because it relies on a single feature map. |
| Mask R-CNN (2017) | Adds instance segmentation, detecting objects and masks simultaneously. | High computational demand; struggles with motion blur at low resolution. |
| YOLO (2015) | Real-time detection at 45 fps; single forward pass. | Poor detection of small objects; produces coarse features. |
| SSD (2016) | Handles various resolutions; uses multi-scale feature maps for detection. | Default boxes may not match all shapes; possible overlapping detections. |
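
The RoI pooling step listed for Fast R-CNN can be illustrated with a minimal sketch, which shows how regions of different sizes are reduced to the same fixed-size grid by max pooling. This is not the authors' or any library's implementation. The function name `roi_max_pool`, the end-exclusive `(top, left, bottom, right)` convention, the single-channel list-of-lists feature map, and the integer bin boundaries are all assumptions made for illustration. Real implementations work on multi-channel tensors and may quantize bins differently.

```python
from typing import List, Sequence, Tuple


def roi_max_pool(
    feature_map: Sequence[Sequence[float]],
    roi: Tuple[int, int, int, int],
    out_h: int = 2,
    out_w: int = 2,
) -> List[List[float]]:
    """Max-pool one region of a 2D feature map into an out_h x out_w grid.

    roi is (top, left, bottom, right) in feature-map cells, end-exclusive
    (an illustrative convention, not taken from the source).
    """
    top, left, bottom, right = roi
    h, w = bottom - top, right - left
    if h < out_h or w < out_w:
        raise ValueError("RoI is smaller than the requested output grid")

    pooled = []
    for i in range(out_h):
        # Integer bin boundaries along the rows; each bin is non-empty
        # because h >= out_h.
        r0 = top + (i * h) // out_h
        r1 = top + ((i + 1) * h) // out_h
        row = []
        for j in range(out_w):
            c0 = left + (j * w) // out_w
            c1 = left + ((j + 1) * w) // out_w
            # Keep only the strongest activation inside this bin.
            row.append(
                max(feature_map[r][c] for r in range(r0, r1) for c in range(c0, c1))
            )
        pooled.append(row)
    return pooled


# A toy 4x6 feature map whose cell value is row * 6 + col.
fm = [[r * 6 + c for c in range(6)] for r in range(4)]

whole = roi_max_pool(fm, (0, 0, 4, 6))   # 4x6 region
small = roi_max_pool(fm, (1, 1, 4, 4))   # 3x3 region
print(whole)  # [[8, 11], [20, 23]]
print(small)  # [[7, 9], [19, 21]]
```

Both regions, although different in size, produce a 2x2 output, which is what lets a fixed-size classifier head consume arbitrary region proposals.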