SFM-Protein: Integrative Co-evolutionary Pre-training for Advanced Protein Sequence Representation
Categories: q-bio.QM, cs.AI, cs.CL, cs.LG
Release Date: October 31, 2024
Authors: Liang He¹, Peiran Jin¹, Yaosen Min¹, Shufang Xie¹, Lijun Wu¹, Tao Qin¹, Xiaozhuan Liang², Kaiyuan Gao³, Yuliang Jiang⁴, Tie-Yan Liu¹
Affiliations: ¹Microsoft Research AI for Science; ²Zhejiang University; ³Huazhong University of Science and Technology; ⁴Tsinghua University

| Model | GO-MF | GO-BP | GO-CC | EC |
| --- | --- | --- | --- | --- |
| GearNet∗ | 0.503 / 0.490 | 0.356 / 0.211 | 0.414 / 0.276 | 0.730 / 0.751 |
| LM-GVP∗ | 0.545 / 0.580 | 0.417 / 0.302 | 0.527 / 0.423 | 0.664 / 0.710 |
| Multiview Contrast∗ | 0.654 / 0.596 | 0.490 / 0.292 | 0.488 / 0.336 | 0.874 / 0.892 |
| ESM2 (650M) | 0.615 / 0.616 | 0.422 / 0.290 | 0.488 / 0.366 | 0.799 / 0.805 |
| ESM2 (3B) | 0.659 / 0.647 | 0.476 / 0.341 | 0.497 / 0.417 | 0.863 / 0.876 |
| SFM-Protein (650M) | 0.649 / 0.645 | 0.471 / 0.330 | 0.516 / 0.395 | 0.855 / 0.884 |
| SFM-Protein (3B) | 0.673 / 0.657 | 0.495 / 0.361 | 0.510 / 0.416 | 0.869 / 0.893 |