notesum.ai
Published on December 6
Continuous Speech Tokens Makes LLMs Robust Multi-Modality Learners
cs.SD
eess.AS
eess.SP
Released Date: December 6, 2024
Authors: Ze Yuan¹, Yanqing Liu², Shujie Liu, Sheng Zhao
Affiliations: ¹Tsinghua University; ²Microsoft Corporation

| Model | WER |
|---|---|
| Mini-Omni | 10.84 |
| Flow-Omni | 8.81 |