notesum.ai
Published on November 5
DiffLM: Controllable Synthetic Data Generation via Diffusion Language Models
Categories: cs.LG, cs.AI, cs.CL
Released Date: November 5, 2024
Authors: Ying Zhou, Xinyao Wang, Yulei Niu, Yaojie Shen, Lexin Tang, Fan Chen, Ben He, Le Sun, Longyin Wen

| Method | Adult | | Default | | Magic | | Shoppers | | Beijing | |
|---|---|---|---|---|---|---|---|---|---|---|
| | MLE | | MLE | | MLE | | MLE | | MLE | |
| Real | 0.927 | - | 0.770 | - | 0.946 | - | 0.926 | - | 0.423 | - |
| SMOTE | 0.899 | 1.60 | 0.741 | 1.48 | 0.934 | 0.91 | 0.911 | 2.68 | 0.593 | 1.85 |
| CTGAN | 0.886 | 16.84 | 0.696 | 16.83 | 0.855 | 9.810 | 0.875 | 21.15 | 0.902 | 21.39 |
| TVAE | 0.878 | 14.22 | 0.724 | 10.17 | 0.887 | 8.250 | 0.871 | 24.51 | 0.770 | 19.16 |
| GOGGLE | 0.778 | 16.97 | 0.584 | 17.02 | 0.654 | 1.900 | 0.658 | 22.33 | 1.090 | 16.93 |
| CoDi | 0.871 | 21.38 | 0.525 | 15.77 | 0.932 | 11.56 | 0.865 | 31.84 | 0.818 | 16.94 |
| TabSyn | 0.915 | 0.58 | 0.764 | 0.85 | 0.938 | 0.88 | 0.920 | 1.43 | 0.582 | 1.12 |
| GReaT | 0.913 | 12.12 | 0.755 | 19.94 | 0.888 | 16.16 | 0.902 | 14.51 | 0.653 | 8.25 |
| DiffLM | 0.894 | 9.16 | 0.793 | 9.33 | 0.910 | 7.04 | 0.912 | 14.43 | 0.717 | 6.05 |