notesum.ai

Published on November 26

Linguistic Laws Meet Protein Sequences: A Comparative Analysis of Subword Tokenization Methods

cs.CL
q-bio.QM

Release Date: November 26, 2024

Authors: Burak Suyunu¹, Enes Taylan¹, Arzucan Özgür¹

Affiliation: ¹Boğaziçi University, Istanbul, Türkiye

arXiv: http://arxiv.org/abs/2411.17669v1