Protein Repeats from First Principles
Some natural proteins display recurrent structural patterns. Despite being highly similar at the tertiary structure level, repeating patterns within a single repeat protein can be extremely variable at the sequence level. We use a mathematical definition of a repetition and investigate the occurrenc...
Main authors: | Turjanski, P.; Parra, R.G.; Espada, R.; Becher, V.; Ferreiro, D.U. |
---|---|
Format: | JOUR |
Subjects: | |
Online access: | http://hdl.handle.net/20.500.12110/paper_20452322_v6_n_p_Turjanski |
Contributed by: |
id |
todo:paper_20452322_v6_n_p_Turjanski |
---|---|
record_format |
dspace |
timestamp |
2023-10-03T16:38:17Z |
affiliation |
Turjanski, P.; Becher, V.; Ferreiro, D.U.: Universidad de Buenos Aires. Facultad de Ciencias Exactas y Naturales; Argentina. |
rights |
info:eu-repo/semantics/openAccess; http://creativecommons.org/licenses/by/2.5/ar |
institution |
Universidad de Buenos Aires |
institution_str |
I-28 |
repository_str |
R-134 |
collection |
Biblioteca Digital - Facultad de Ciencias Exactas y Naturales (UBA) |
topic |
quantitative study; algorithm; biology; chemistry; Markov chain; protein database; protein domain; protein folding; protein motif; statistical model; amino acid; protein; Algorithms; Amino Acid Motifs; Amino Acids; Computational Biology; Databases, Protein; Markov Chains; Models, Statistical; Protein Domains; Protein Folding; Proteins |
description |
Some natural proteins display recurrent structural patterns. Despite being highly similar at the tertiary structure level, repeating patterns within a single repeat protein can be extremely variable at the sequence level. We use a mathematical definition of a repetition and investigate the occurrences of these in sequences of different protein families. We found that long stretches of perfect repetitions are infrequent in individual natural proteins, even for those which are known to fold into structures of recurrent structural motifs. We found that natural repeat proteins are indeed repetitive in their families, exhibiting abundant stretches of 6 amino acids or longer that are perfect repetitions in the reference family. We provide a systematic quantification for this repetitiveness. We show that this form of repetitiveness is not exclusive of repeat proteins, but also occurs in globular domains. A by-product of this work is a fast quantification of the likelihood of a protein to belong to a family. |
format |
JOUR |
author |
Turjanski, P.; Parra, R.G.; Espada, R.; Becher, V.; Ferreiro, D.U. |
author_sort |
Turjanski, P. |
title |
Protein Repeats from First Principles |
url |
http://hdl.handle.net/20.500.12110/paper_20452322_v6_n_p_Turjanski |
_version_ |
1782024800959463424 |
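
The description field above speaks of stretches of 6 amino acids or longer that are perfect repetitions in a reference family. As a rough illustration of that idea only, the sketch below finds maximal stretches of a query sequence that occur verbatim in at least one family sequence. It is a brute-force approximation under stated assumptions, not the authors' algorithm: the function names `shared_stretches` and `coverage`, the `min_len` parameter, and the toy sequences are all hypothetical, and the record does not say how the published method is implemented.

```python
def shared_stretches(query, family, min_len=6):
    """Return (start, substring) pairs for maximal stretches of `query`,
    at least `min_len` residues long, that occur verbatim in some
    sequence of `family`. Brute force; intended for small inputs only."""
    hits = []
    prev_end = 0  # end index of the last reported stretch
    for start in range(len(query) - min_len + 1):
        end = start + min_len
        if not any(query[start:end] in seq for seq in family):
            continue
        # Extend to the right while the longer stretch still occurs somewhere.
        while end < len(query) and any(
            query[start:end + 1] in seq for seq in family
        ):
            end += 1
        # Skip matches nested inside a stretch that was already reported.
        if end > prev_end:
            hits.append((start, query[start:end]))
            prev_end = end
    return hits


def coverage(query, family, min_len=6):
    """Fraction of query positions covered by at least one shared stretch
    (a hypothetical summary score, not a quantity defined in the record)."""
    covered = set()
    for start, stretch in shared_stretches(query, family, min_len):
        covered.update(range(start, start + len(stretch)))
    return len(covered) / len(query) if query else 0.0


# Toy, made-up sequences for illustration only.
family = ["MKTAYIAKQRQISFVK", "GGSAYIAKQRAA"]
query = "XXAYIAKQRQIZZZ"
print(shared_stretches(query, family))
print(coverage(query, family))
```

For real family alignments a suffix array or suffix automaton would replace the repeated substring tests, which here cost time roughly quadratic in the query length per family sequence.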