notesum.ai
Published on November 22
Instance-Aware Generalized Referring Expression Segmentation
cs.CV
cs.CL
Released Date: November 22, 2024
Authors: E-Ro Nguyen¹, Hieu Le², Dimitris Samaras¹, Michael Ryoo¹
Affiliations: ¹Stony Brook University; ²EPFL

| Method | Backbone | val cIoU | val gIoU | val N-acc | testA cIoU | testA gIoU | testA N-acc | testB cIoU | testB gIoU | testB N-acc |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| LLM-based Methods | | | | | | | | | | |
| LISA-7B† [13] | SAM-ViT-H | 61.63 | 61.76 | 54.67 | 68.50 | 66.27 | 50.01 | 60.63 | 58.84 | 51.91 |
| GSVA-7B† [37] | SAM-ViT-H | 63.29 | 66.47 | 62.43 | 69.93 | 71.08 | 65.31 | 60.47 | 62.23 | 60.56 |
| SAM4MLLM-7B† [2] | SAM-EffViT-XL1 | 65.66 | 68.37 | 63.71 | 69.62 | 69.05 | 65.96 | 62.35 | 63.71 | 61.25 |
| RES Methods | | | | | | | | | | |
| MattNet [42] | ResNet-101 | 47.51 | 48.24 | 41.15 | 58.66 | 59.30 | 44.04 | 45.33 | 46.14 | 41.32 |
| VLT [6] | DarkNet-53 | 52.61 | 52.00 | 47.17 | 62.19 | 63.20 | 48.74 | 50.52 | 50.88 | 48.46 |
| CRIS [34] | ResNet-101 | 55.34 | 56.27 | - | 63.82 | 63.42 | - | 51.04 | 51.79 | - |
| LAVT [40] | Swin-B | 57.64 | 58.40 | 49.32 | 65.32 | 65.90 | 49.25 | 55.04 | 55.83 | 48.46 |
| GRES Methods | | | | | | | | | | |
| ReLA [17] | Swin-B | 62.42 | 63.60 | 56.37 | 69.26 | 70.03 | 59.02 | 59.88 | 61.02 | 58.40 |
| LQMFormer [29] | Swin-B | 64.98 | 70.94 | 67.47 | - | - | - | - | - | - |
| MABP [15] | Swin-B | 65.72 | 68.86 | 62.18 | 71.59 | 72.81 | - | 62.76 | 64.04 | - |
| HDC [25] | Swin-B | 65.42 | 68.23 | 63.38 | 71.60 | 72.52 | 65.29 | 62.79 | 63.85 | 60.68 |
| InstAlign (Ours) | Swin-B | 68.94 | 74.34 | 79.72 | 73.22 | 74.51 | 75.65 | 63.88 | 65.74 | 70.72 |