Information Extraction from Heterogeneous Documents without Ground Truth Labels using Synthetic Label Generation and Knowledge Distillation
cs.CL
Release Date: November 22, 2024
Authors: Aniket Bhattacharyya¹, Anurag Tripathi¹
Affiliation: ¹Amazon

| Model | Merchant name | Amount | Date | CORD, M=1 | CORD, M=24 | Inference speed | Annual cost |
|---|---|---|---|---|---|---|---|
| Sonnet Zero-Shot (TAIL) | 50% | 70% | 76% | 94% | 93% | 0.3 | $1.25X |
| LLaVA Zero-Shot (TAIL) | 24% | 28% | 58% | 48% | 42% | 1.5 | $0.13X |
| LLaVA-noisy labels | 46% | 63% | 68% | - | - | 1.5 | $0.13X |
| LLaVA-exact labels | - | - | - | 82% | - | 1.5 | $0.13X |
| LayoutLMv3-TAIL | 41% | 58% | - | 42% | - | 0.8 | $0.2X |
| LLaVA-Net | 52% | 70% | 83% | 84% | 69% | 1.5 | $0.2X |