In:
Frontiers in Bioinformatics, Frontiers Media SA, Vol. 2 (2022-2-18)
Abstract:
As one of the most important posttranslational modifications (PTMs), protein lysine glycation alters the characteristics of proteins and leads to their dysfunction, which may cause disease. Accurately detecting glycation sites is of great benefit for understanding the biological function and potential mechanism of glycation in the treatment of diseases. However, experimental methods for identifying lysine glycation sites are expensive and time-consuming. Computational methods, with their higher efficiency and lower cost, could be an important supplement to experimental methods. In this study, we propose a novel predictor, BERT-Kgly, for protein lysine glycation site prediction, developed by extracting embedding features of protein segments from pretrained Bidirectional Encoder Representations from Transformers (BERT) models. Three pretrained BERT models were explored to obtain the embeddings with the best representational power, and three downstream deep networks were employed to build our models. Our results show that the model based on embeddings extracted from the BERT model pretrained on 556,603 protein sequences from UniProt outperforms the other models. In addition, an independent test set was used to evaluate our model and compare it with existing methods; on this set, our model outperformed the existing methods.
Type of Medium:
Online Resource
ISSN:
2673-7647
DOI:
10.3389/fbinf.2022.834153
DOI:
10.3389/fbinf.2022.834153.s001
Language:
Unknown
Publisher:
Frontiers Media SA
Publication Date:
2022
ZDB-ID:
3091287-8