NatureLM-audio: an Audio-Language Foundation Model for Bioacoustics
Categories: cs.SD, cs.AI, cs.LG, eess.AS
Released Date: November 11, 2024
Authors: David Robinson¹, Marius Miron¹, Masato Hagiwara¹, Olivier Pietquin¹
Aff.: ¹Earth Species Project
| Model | esc50 | watkins | cbi | humbugdb | dcase | enabirds | hiceas | rfcx | gibbons |
|---|---|---|---|---|---|---|---|---|---|
| LLM w/o audio | 0.020 | 0.041 | 0.005 | 0.073 | 0.000 | 0.001 | 0.210 | 0.000 | _0.013_ |
| SALMONN | 0.320 | 0.041 | 0.004 | 0.090 | 0.005 | 0.004 | 0.097 | 0.002 | 0.005 |
| Qwen2-audio | 0.307 | 0.041 | 0.004 | 0.070 | 0.005 | 0.004 | 0.097 | 0.002 | 0.005 |
| BioLingual | _0.600_ | _0.257_ | _0.705_ | _0.085_ | _0.036_ | _0.109_ | 0.429 | _0.004_ | 0.018 |
| NatureLM-audio | 0.635 | 0.646 | 0.755 | 0.073 | 0.052 | 0.279 | _0.390_ | 0.039 | 0.003 |

Italics mark the second-best score for each dataset.
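
To make the table easier to compare across models, the scores can be ranked per dataset. The following is a minimal sketch, not part of the original work: the values are transcribed from the table above, the helper names `top_two` and `wins_per_model` are illustrative, and it assumes that a higher score is better (the metric itself is not stated in this excerpt).

```python
# Scores transcribed from the table above (one list entry per dataset column).
DATASETS = ["esc50", "watkins", "cbi", "humbugdb", "dcase",
            "enabirds", "hiceas", "rfcx", "gibbons"]

SCORES = {
    "LLM w/o audio":  [0.020, 0.041, 0.005, 0.073, 0.000, 0.001, 0.210, 0.000, 0.013],
    "SALMONN":        [0.320, 0.041, 0.004, 0.090, 0.005, 0.004, 0.097, 0.002, 0.005],
    "Qwen2-audio":    [0.307, 0.041, 0.004, 0.070, 0.005, 0.004, 0.097, 0.002, 0.005],
    "BioLingual":     [0.600, 0.257, 0.705, 0.085, 0.036, 0.109, 0.429, 0.004, 0.018],
    "NatureLM-audio": [0.635, 0.646, 0.755, 0.073, 0.052, 0.279, 0.390, 0.039, 0.003],
}


def top_two(dataset):
    """Return (best, second-best) model names for one dataset column.

    Assumption: higher scores are better.
    """
    col = DATASETS.index(dataset)
    ranked = sorted(SCORES, key=lambda model: SCORES[model][col], reverse=True)
    return ranked[0], ranked[1]


def wins_per_model():
    """Count on how many datasets each model has the highest score."""
    counts = {model: 0 for model in SCORES}
    for dataset in DATASETS:
        best, _ = top_two(dataset)
        counts[best] += 1
    return counts


if __name__ == "__main__":
    for dataset in DATASETS:
        best, second = top_two(dataset)
        print(f"{dataset:10s} best={best:15s} second={second}")
    print(wins_per_model())
```

Under the higher-is-better assumption, NatureLM-audio has the top score on 6 of the 9 datasets, BioLingual on 2 (hiceas and gibbons), and SALMONN on 1 (humbugdb).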