Protein Repeats from First Principles

Some natural proteins display recurrent structural patterns. Despite being highly similar at the tertiary structure level, repeating patterns within a single repeat protein can be extremely variable at the sequence level. We use a mathematical definition of a repetition and investigate the occurrenc...

Bibliographic Details
Main authors: Turjanski, P., Parra, R.G., Espada, R., Becher, V., Ferreiro, D.U.
Format: JOUR
Subjects:
Online access: http://hdl.handle.net/20.500.12110/paper_20452322_v6_n_p_Turjanski
Contributed by:
id todo:paper_20452322_v6_n_p_Turjanski
record_format dspace
spelling todo:paper_20452322_v6_n_p_Turjanski
2023-10-03T16:38:17Z
Protein Repeats from First Principles
Turjanski, P.
Parra, R.G.
Espada, R.
Becher, V.
Ferreiro, D.U.
quantitative study
algorithm
biology
chemistry
Markov chain
protein database
protein domain
protein folding
protein motif
statistical model
amino acid
protein
Algorithms
Amino Acid Motifs
Amino Acids
Computational Biology
Databases, Protein
Markov Chains
Models, Statistical
Protein Domains
Protein Folding
Proteins
Some natural proteins display recurrent structural patterns. Despite being highly similar at the tertiary structure level, repeating patterns within a single repeat protein can be extremely variable at the sequence level. We use a mathematical definition of a repetition and investigate the occurrences of these in sequences of different protein families. We found that long stretches of perfect repetitions are infrequent in individual natural proteins, even for those which are known to fold into structures of recurrent structural motifs. We found that natural repeat proteins are indeed repetitive in their families, exhibiting abundant stretches of 6 amino acids or longer that are perfect repetitions in the reference family. We provide a systematic quantification for this repetitiveness. We show that this form of repetitiveness is not exclusive to repeat proteins, but also occurs in globular domains. A by-product of this work is a fast quantification of the likelihood that a protein belongs to a family.
Fil:Turjanski, P. Universidad de Buenos Aires. Facultad de Ciencias Exactas y Naturales; Argentina.
Fil:Becher, V. Universidad de Buenos Aires. Facultad de Ciencias Exactas y Naturales; Argentina.
Fil:Ferreiro, D.U. Universidad de Buenos Aires. Facultad de Ciencias Exactas y Naturales; Argentina.
JOUR
info:eu-repo/semantics/openAccess
http://creativecommons.org/licenses/by/2.5/ar
http://hdl.handle.net/20.500.12110/paper_20452322_v6_n_p_Turjanski
institution Universidad de Buenos Aires
institution_str I-28
repository_str R-134
collection Biblioteca Digital - Facultad de Ciencias Exactas y Naturales (UBA)
topic quantitative study
algorithm
biology
chemistry
Markov chain
protein database
protein domain
protein folding
protein motif
statistical model
amino acid
protein
Algorithms
Amino Acid Motifs
Amino Acids
Computational Biology
Databases, Protein
Markov Chains
Models, Statistical
Protein Domains
Protein Folding
Proteins
spellingShingle quantitative study
algorithm
biology
chemistry
Markov chain
protein database
protein domain
protein folding
protein motif
statistical model
amino acid
protein
Algorithms
Amino Acid Motifs
Amino Acids
Computational Biology
Databases, Protein
Markov Chains
Models, Statistical
Protein Domains
Protein Folding
Proteins
Turjanski, P.
Parra, R.G.
Espada, R.
Becher, V.
Ferreiro, D.U.
Protein Repeats from First Principles
topic_facet quantitative study
algorithm
biology
chemistry
Markov chain
protein database
protein domain
protein folding
protein motif
statistical model
amino acid
protein
Algorithms
Amino Acid Motifs
Amino Acids
Computational Biology
Databases, Protein
Markov Chains
Models, Statistical
Protein Domains
Protein Folding
Proteins
description Some natural proteins display recurrent structural patterns. Despite being highly similar at the tertiary structure level, repeating patterns within a single repeat protein can be extremely variable at the sequence level. We use a mathematical definition of a repetition and investigate the occurrences of these in sequences of different protein families. We found that long stretches of perfect repetitions are infrequent in individual natural proteins, even for those which are known to fold into structures of recurrent structural motifs. We found that natural repeat proteins are indeed repetitive in their families, exhibiting abundant stretches of 6 amino acids or longer that are perfect repetitions in the reference family. We provide a systematic quantification for this repetitiveness. We show that this form of repetitiveness is not exclusive to repeat proteins, but also occurs in globular domains. A by-product of this work is a fast quantification of the likelihood that a protein belongs to a family.
format JOUR
author Turjanski, P.
Parra, R.G.
Espada, R.
Becher, V.
Ferreiro, D.U.
author_facet Turjanski, P.
Parra, R.G.
Espada, R.
Becher, V.
Ferreiro, D.U.
author_sort Turjanski, P.
title Protein Repeats from First Principles
title_short Protein Repeats from First Principles
title_full Protein Repeats from First Principles
title_fullStr Protein Repeats from First Principles
title_full_unstemmed Protein Repeats from First Principles
title_sort protein repeats from first principles
url http://hdl.handle.net/20.500.12110/paper_20452322_v6_n_p_Turjanski
work_keys_str_mv AT turjanskip proteinrepeatsfromfirstprinciples
AT parrarg proteinrepeatsfromfirstprinciples
AT espadar proteinrepeatsfromfirstprinciples
AT becherv proteinrepeatsfromfirstprinciples
AT ferreirodu proteinrepeatsfromfirstprinciples
_version_ 1782024800959463424