Enhancing Reverse Engineering: Investigating and Benchmarking Large Language Models for Vulnerability Analysis in Decompiled Binaries
cs.CR
cs.AI
Release Date: November 7, 2024
Authors: Dylan Manuel¹, Nafis Tanveer Islam¹, Joseph Khoury², Ana Nunez¹, Elias Bou-Harb², Peyman Najafirad¹
Affiliations: ¹Secure AI and Autonomy Laboratory, University of Texas at San Antonio; ²Louisiana State University

| Model | Training | Acc. | Pre. | Rec. | F1 | Acc.V | Acc.B |
|---|---|---|---|---|---|---|---|
| CodeLLaMa | - | 0.56 | 0.6 | 0.78 | 0.68 | 0.78 | 0.23 |
| CodeLLaMa | DBVul | 0.85 | 0.89 | 0.86 | 0.87 | 0.86 | 0.84 |
| CodeGen2 | - | 0.59 | 0.65 | 0.83 | 0.73 | 0.83 | 0.13 |
| CodeGen2 | DBVul | 0.91 | 0.93 | 0.94 | 0.94 | 0.94 | 0.86 |
| Mistral | - | 0.48 | 0.71 | 0.42 | 0.53 | 0.42 | 0.61 |
| Mistral | DBVul | 0.89 | 0.95 | 0.88 | 0.91 | 0.88 | 0.9 |
| StarCoder | - | 0.59 | 0.6 | 0.97 | 0.74 | 0.97 | 0.01 |
| StarCoder | DBVul | 0.89 | 0.91 | 0.93 | 0.92 | 0.93 | 0.80 |
| LLaMa 3 | - | 0.57 | 0.7 | 0.68 | 0.69 | 0.68 | 0.34 |
| LLaMa 3 | DBVul | 0.91 | 0.94 | 0.93 | 0.93 | 0.93 | 0.87 |
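
The F1 column in the table above can be sanity-checked against the Pre. and Rec. columns. The sketch below assumes that Pre. and Rec. denote precision and recall and that F1 is the standard harmonic mean of the two; the page does not state this explicitly. The names `f1_score`, `ROWS`, and `max_f1_gap` are illustrative and do not come from the paper. Because the table values are rounded to two decimals, the recomputed F1 is only expected to agree to within about 0.01.

```python
# Minimal sketch: recompute F1 from the precision and recall values in the table.
# Assumption: F1 = 2 * P * R / (P + R), the standard harmonic mean.

def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# (model, training, precision, recall, reported F1), copied from the table.
ROWS = [
    ("CodeLLaMa", "-",     0.60, 0.78, 0.68),
    ("CodeLLaMa", "DBVul", 0.89, 0.86, 0.87),
    ("CodeGen2",  "-",     0.65, 0.83, 0.73),
    ("CodeGen2",  "DBVul", 0.93, 0.94, 0.94),
    ("Mistral",   "-",     0.71, 0.42, 0.53),
    ("Mistral",   "DBVul", 0.95, 0.88, 0.91),
    ("StarCoder", "-",     0.60, 0.97, 0.74),
    ("StarCoder", "DBVul", 0.91, 0.93, 0.92),
    ("LLaMa 3",   "-",     0.70, 0.68, 0.69),
    ("LLaMa 3",   "DBVul", 0.94, 0.93, 0.93),
]

def max_f1_gap(rows) -> float:
    """Largest absolute difference between recomputed and reported F1."""
    return max(abs(f1_score(p, r) - f1) for _, _, p, r, f1 in rows)

if __name__ == "__main__":
    for model, training, p, r, reported in ROWS:
        print(f"{model:10s} {training:6s} recomputed={f1_score(p, r):.3f} reported={reported:.2f}")
    print(f"max gap: {max_f1_gap(ROWS):.4f}")
```

Under these assumptions, every reported F1 value is consistent with its row's precision and recall up to rounding.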