ChocoLlama: Lessons Learned From Teaching Llamas Dutch
cs.CL
Release Date: December 10, 2024
Authors: Matthieu Meeus¹, Anthony Rathé², François Remy³, Pieter Delobelle⁴, Jens-Joris Decorte³, Thomas Demeester³
Affiliations: ¹Imperial College London; ²Cavell; ³Ghent University; ⁴Aleph Alpha

| Model | Total samples | Batch size | Total steps | Gradient acc. steps | Total params | Wall time (days) |
|---|---|---|---|---|---|---|
| ChocoLlama-2-7B | M | | k | | M | |
| ChocoLlama-2-7B-tokentrans | M | | k | | M | |
| Llama-3-ChocoLlama | 3.69M | 512 | 57.6k | 8 | 1072M | |
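
The figures in the Llama-3-ChocoLlama row appear to be mutually consistent under one reading of the columns. The sketch below assumes that "Total steps" counts micro-batch steps, that one optimizer update happens every "Gradient acc. steps" micro-steps, and that "Batch size" is the effective number of samples per optimizer update. The table does not state this interpretation, and the helper name `samples_seen` is hypothetical.

```python
def samples_seen(micro_steps: int, grad_acc_steps: int, batch_size: int) -> int:
    """Samples processed over a run.

    Assumption (not stated in the table): one optimizer update happens every
    `grad_acc_steps` micro-steps, and each update consumes `batch_size` samples.
    """
    optimizer_updates = micro_steps // grad_acc_steps
    return optimizer_updates * batch_size


# Values from the Llama-3-ChocoLlama row; 57.6k is a rounded figure.
total = samples_seen(micro_steps=57_600, grad_acc_steps=8, batch_size=512)
print(f"{total / 1e6:.2f}M samples")
```

Under these assumptions the result rounds to the 3.69M total samples reported in the row. The same check cannot be run for the two ChocoLlama-2-7B rows, whose numeric values are missing here.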