Align3R: Aligned Monocular Depth Estimation for Dynamic Videos
cs.CV
Released Date: December 4, 2024
Authors: Jiahao Lu¹, Tianyu Huang², Peng Li¹, Zhiyang Dou³, Cheng Lin, Zhiming Cui⁴, Zhen Dong⁵, Sai-Kit Yeung¹, Wenping Wang⁶, Yuan Liu⁷
Affiliations: ¹HKUST; ²CUHK; ³HKU; ⁴ShanghaiTech; ⁵WHU; ⁶TAMU; ⁷NTU
![[Uncaptioned image]](https://arxiv.org/html/2412.03079v1/x1.png)
Benchmark groups: indoors & outdoors (Hard): Sintel, PointOdyssey val, FlyingThings3D test; indoors (Easy): Bonn 5 scenes, TUM dynamics.

| Category | Method | Sintel Abs Rel ↓ | Sintel δ<1.25 ↑ | PointOdyssey val Abs Rel ↓ | PointOdyssey val δ<1.25 ↑ | FlyingThings3D test Abs Rel ↓ | FlyingThings3D test δ<1.25 ↑ | Bonn 5 scenes Abs Rel ↓ | Bonn 5 scenes δ<1.25 ↑ | TUM dynamics Abs Rel ↓ | TUM dynamics δ<1.25 ↑ |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Single-frame depth | Depth Anything V2 [54] | 0.348 | 0.592 | 0.214 | 0.688 | 0.267 | 0.616 | 0.118 | 0.882 | 0.184 | 0.750 |
| Single-frame depth | Depth Pro [4] | 0.418 | 0.559 | 0.167 | 0.779 | 0.322 | 0.537 | 0.067 | 0.974 | 0.106 | 0.887 |
| Video depth | ChronoDepth [36] | 0.687 | 0.486 | 0.210 | 0.707 | 0.288 | 0.633 | 0.100 | 0.911 | 0.151 | 0.825 |
| Video depth | DepthCrafter [15] | 0.292 | 0.697 | 0.229 | 0.675 | / | / | 0.075 | 0.971 | 0.176 | 0.744 |
| Joint video depth & pose | DUSt3R [44] | 0.422 | 0.542 | 0.184 | 0.743 | 0.140 | 0.817 | 0.154 | 0.839 | 0.202 | 0.775 |
| Joint video depth & pose | MonST3R [60] | 0.335 | 0.586 | 0.089 | 0.909 | 0.132 | 0.836 | 0.082 | 0.953 | 0.140 | 0.841 |
| Joint video depth & pose | Ours (Depth Anything V2) | 0.253 | 0.681 | 0.078 | 0.929 | 0.106 | 0.890 | 0.075 | 0.972 | 0.109 | 0.915 |
| Joint video depth & pose | Ours (Depth Pro) | 0.263 | 0.641 | 0.077 | 0.930 | 0.102 | 0.895 | 0.068 | 0.969 | 0.112 | 0.884 |
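
For readers unfamiliar with the metrics, Abs Rel is conventionally the mean of |predicted − ground truth| / ground truth over valid pixels (lower is better). It is usually reported alongside a threshold accuracy δ < 1.25, the fraction of pixels where max(pred/gt, gt/pred) is below 1.25 (higher is better). The following is a minimal sketch of those two standard definitions, not the authors' evaluation code. The function name `depth_metrics`, the validity filter, and the toy values are illustrative assumptions, and any scale or shift alignment a benchmark protocol may apply before scoring is omitted.

```python
from typing import Sequence, Tuple


def depth_metrics(pred: Sequence[float], gt: Sequence[float],
                  threshold: float = 1.25) -> Tuple[float, float]:
    """Return (abs_rel, delta_accuracy) over pixels with valid depths.

    Hypothetical helper for illustration; inputs are flattened depth maps.
    """
    # Assumption: a pixel is valid only if both depths are positive,
    # which also avoids division by zero in the ratios below.
    pairs = [(p, g) for p, g in zip(pred, gt) if p > 0 and g > 0]
    if not pairs:
        raise ValueError("no valid pixels to evaluate")

    # Absolute relative error: mean of |pred - gt| / gt.
    abs_rel = sum(abs(p - g) / g for p, g in pairs) / len(pairs)

    # Threshold accuracy: share of pixels with max(pred/gt, gt/pred) < threshold.
    inliers = sum(1 for p, g in pairs if max(p / g, g / p) < threshold)
    delta = inliers / len(pairs)
    return abs_rel, delta


# Toy example with four pixels (made-up values, not from any dataset).
pred = [1.0, 2.2, 3.0, 8.0]
gt = [1.0, 2.0, 4.0, 4.0]
abs_rel, delta = depth_metrics(pred, gt)
print(f"Abs Rel = {abs_rel:.4f}, delta<1.25 = {delta:.2f}")
```

In this toy case the per-pixel relative errors are 0, 0.1, 0.25 and 1.0, and only the first two pixels fall within the 1.25 ratio threshold.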