Secondary literature sources for LRRcap
The following references were automatically generated.
- Huntley M, Golding GB
- Evolution of simple sequence in proteins.
- J Mol Evol. 2000; 51: 131-40
The proteins of Saccharomyces cerevisiae contain a high proportion of low-complexity, simple sequences. These are protein segments composed almost exclusively or largely of a single repetitive amino acid polymer and are the most commonly shared feature between proteins. We have examined a survey of other species to determine how widespread this phenomenon might be. This was done by comparing how frequently segments from one protein are present in other proteins. Any recently evolutionarily related proteins were excluded. It was found that the most commonly shared features of eukaryotic proteins were repetitive but that prokaryotes did not contain such shared, extensively redundant repeats. The proportion of eukaryotic proteins that contain a significantly repetitive fraction changes dramatically from species to species. In addition, the individual amino acids present in these repeats change between species. This suggests that the primary sequence of the repeats may not be important for their function. Further tests of the yeast repeats confirmed that these repeats evolve more quickly than the remainder of the protein sequence within which they are embedded. These results show that these rapidly evolving, simple sequence repeats are in fact the most commonly shared pattern between all of the genomic proteins of eukaryotes.
- Wang J, Wang W
- Modeling study on the validity of a possibly simplified representation of proteins.
- Phys Rev E Stat Phys Plasmas Fluids Relat Interdiscip Topics. 2000; 61: 6981-6
The folding characteristics of sequences reduced with a possibly simplified representation of five types of residues are shown to be similar to their original ones with the natural set of residues (20 types or 20 letters). The reduced sequences have a good foldability and fold to the same native structure of their optimized original ones. A large ground state gap for the native structure shows the thermodynamic stability of the reduced sequences. The general validity of such a five-letter reduction is further studied via the correlation between the reduced sequences and the original ones. As a comparison, a reduction with two letters is found not to reproduce the native structure of the original sequences due to its homopolymeric features.
- Devos D, Valencia A
- Practical limits of function prediction.
- Proteins. 2000; 41: 98-107
The widening gap between known protein sequences and their functions has led to the practice of assigning a potential function to a protein on the basis of sequence similarity to proteins whose function has been experimentally investigated. We present here a critical view of the theoretical and practical bases for this approach. The results obtained by analyzing a significant number of true sequence similarities, derived directly from structural alignments, point to the complexity of function prediction. Different aspects of protein function, including (i) enzymatic function classification, (ii) functional annotations in the form of key words, (iii) classes of cellular function, and (iv) conservation of binding sites can only be reliably transferred between similar sequences to a modest degree. The reason for this difficulty is a combination of the unavoidable database inaccuracies and the plasticity of protein function. In addition, analysis of the relationship between sequence and functional descriptions defines an empirical limit for pairwise-based functional annotations, namely, the three first digits of the six numbers used as descriptors of protein folds in the FSSP database can be predicted at an average level as low as 7.5% sequence identity, two of the four EC digits at 15% identity, half of the SWISS-PROT key words related to protein function would require 20% identity, and the prediction of half of the residues in the binding site can be made at the 30% sequence identity level.
- Marino M, Braun L, Cossart P, Ghosh P
- A framework for interpreting the leucine-rich repeats of the Listeria internalins.
- Proc Natl Acad Sci U S A. 2000; 97: 8784-8
The surface protein InlB of the bacterial pathogen Listeria monocytogenes is required for inducing phagocytosis in various nonphagocytic mammalian cell types in vitro. InlB causes tyrosine phosphorylation of host cell adaptor proteins, activation of phosphoinositide 3-kinase, and rearrangements of the actin cytoskeleton. These events lead to phagocytic uptake of the bacterium by the host cell. InlB belongs to the internalin family of Listeria proteins, which also includes InlA, another surface protein involved in host cell invasion. The internalins are the largest class of bacterial proteins containing leucine-rich repeats (LRR), a motif associated with protein-protein interactions. The LRR motif is found in a functionally diverse array of proteins, including those involved in the plant immune system and in the mammalian innate immune response. Structural and functional interpretations of the sequences of internalin family members are presented in light of the recently determined x-ray crystal structure of the InlB LRR domain.
- Shaitan KV, Mukovskii AI, Beliakov AA, Saraikin SS
- [Statistical distribution of dipeptides in protein structures and dynamic characteristics of some protein fragments]
- Biofizika. 2000; 45: 399-406
Statistical distributions of the occurrence of dipeptide fragments in proteins were studied. Various algorithms of ordering of files of frequency distribution were used. A correlation of occurrence of pairs of amino acid residues in various classes of proteins was established. The problem of the dynamic compatibility of amino acid residues in protein structures is discussed. The dynamic properties of frequently and seldom occurring dimers of amino acids are compared.
- Katti MV, Sami-Subbu R, Ranjekar PK, Gupta VS
- Amino acid repeat patterns in protein sequences: their diversity and structural-functional implications.
- Protein Sci. 2000; 9: 1203-9
All the protein sequences from SWISS-PROT database were analyzed for occurrence of single amino acid repeats, tandem oligo-peptide repeats, and periodically conserved amino acids. Single amino acid repeats of glutamine, serine, glutamic acid, glycine, and alanine seem to be tolerated to a considerable extent in many proteins. Tandem oligo-peptide repeats of different types with varying levels of conservation were detected in several proteins and found to be conspicuous, particularly in structural and cell surface proteins. It appears that repeated sequence patterns may be a mechanism that provides regular arrays of spatial and functional groups, useful for structural packing or for one to one interactions with target molecules. To facilitate further explorations, a database of Tandem Repeats in Protein Sequences (TRIPS) has been developed and is available at URL: http://www.ncl-india.org/trips.
- Jung J, Lee B
- Protein structure alignment using environmental profiles.
- Protein Eng. 2000; 13: 535-43
A new protein structure alignment procedure is described. An initial alignment is made by comparing a one-dimensional list of primary, secondary and tertiary structural features (profiles) of two proteins, without explicitly considering the three-dimensional geometry of the structures. The alignment is then iteratively refined in the second step, in which new alignments are found by three-dimensional superposition of the structures based on the current alignment. This new procedure is fast enough to do all-against-all structural comparisons routinely. The procedure sometimes finds an alignment that suggests an evolutionary relationship and which is not normally obtained if only geometry is considered. All pair-wise comparisons were made among 3539 protein structural domains that represent all known protein structures. The resulting 3539 z-scores were used to cluster the proteins. The number of main clusters increased continuously as the z-cutoff was raised, but the number of multiple-member clusters showed a maximum at z-cutoff values of 5.0 and 5.5. When a z-cutoff value of 5.0 was used, the total number of main clusters was 2043, of which only 336 clusters had more than one member.
- Petsko GA
- The grail problem.
- Genome Biol. 2000; 1: 2-2
- Koppensteiner WA, Lackner P, Wiederstein M, Sippl MJ
- Characterization of novel proteins based on known protein structures.
- J Mol Biol. 2000; 296: 1139-52
The genome sciences face the challenge to characterize structure and function of a vast number of novel genes. Sequence search techniques are used to infer functional and structural information from similarities to experimentally characterized genes or proteins. The persistent goal is to refine these techniques and to develop alternative and complementary methods to increase the range of reliable inference. Here, we focus on the structural and functional assignments that can be inferred from the known three-dimensional structures of proteins. The study uses all structures in the Protein Data Bank that were known by the end of 1997. The protein structures released in 1998 were then characterized in terms of functional and structural similarity to the previously known structures, yielding an estimate of the maximum amount of information on novel protein sequences that can be obtained from inference techniques. The 147 globular proteins corresponding to 196 domains released in 1998 have no clear sequence similarity to previously known structures. However, 75% of the domains have extensive structure similarity to previously known folds, and most importantly, in two out of three cases similarity in structure coincides with related function. In view of this analysis, full utilization of existing structure data bases would provide information for many new targets even if the relationship is not accessible from sequence information alone. Currently, the most sophisticated techniques detect of the order of one-third of these relationships.
- Fischer D, Eisenberg D
- Predicting structures for genome proteins.
- Curr Opin Struct Biol. 1999; 9: 208-11
Assigning three-dimensional protein folds to genome sequences is essential to understanding protein function. Although experimental three-dimensional structures are currently available for only a very small fraction of these sequences, computational fold assignment is able to assign folds to 20-30% of the sequences in various genomes. This percentage varies depending on the particular organism under analysis, on the sensitivities of the methods used and on the number of experimental structures available at the time the assignment is carried out. The fraction of assignable sequences is currently increasing at an annual rate of roughly 18%. If this rate is sustained throughout the coming years, three-dimensional computational models for more than half of the genome sequences may be available by the year 2003.
- Yan B, Zhang W, Ding J, Arnold E
- Pivot residue: an analysis of domain motion in proteins.
- J Protein Chem. 1999; 18: 807-11
In this study, we present an approach to identify residues that act as pivot points for the conformational change between the open (unliganded) and closed (liganded) forms of a protein. First, an angle, θ, formed by four consecutive Cα atoms in the polypeptide backbone was introduced. The difference in this angle, Δθ, between equivalent residues in the open and closed forms was used to represent the local torsion changes in the protein structure, and the residue with the maximum Δθ was identified as a pivot residue. We demonstrate the ability of our method by identifying the pivot residues in five proteins: lysozyme mutants, lactoferrin, Lys/Arg/Orn-binding protein, calmodulin and catabolite gene activator protein. These pivot residues are located at the hinges in the proteins; they are hinge points for the domain motion. These examples also show that the pivot residues are useful for distinguishing between shear motion and hinge motion in a protein.
- Groves MR, Barford D
- Topological characteristics of helical repeat proteins.
- Curr Opin Struct Biol. 1999; 9: 383-9
The recent elucidation of protein structures based upon repeating amino acid motifs, including the armadillo motif, the HEAT motif and tetratricopeptide repeats, reveals that they belong to the class of helical repeat proteins. These proteins share the common property of being assembled from tandem repeats of an alpha-helical structural unit, creating extended superhelical structures that are ideally suited to create a protein recognition interface.
- Li W, Liu Z, Lai L
- Protein loops on structurally similar scaffolds: database and conformational analysis.
- Biopolymers. 1999; 49: 481-95
A general problem in comparative modeling and protein design is the conformational evaluation of loops with a certain sequence in specific environmental protein frameworks. Loops of different sequences and structures on similar scaffolds are common in the Protein Data Bank (PDB). In order to explore both their structural and sequential diversity, a data base of loops connecting similar secondary structure fragments is constructed by searching the data base of families of structurally similar proteins and PDB. A total of 84 loop families having 2-13 residues are found among the well-determined structures of resolution better than 2.5 Å. Eight alpha-alpha, 20 alpha-beta, 19 beta-alpha, and 37 beta-beta families are identified. Every family contains more than 5 loop motifs. In each family, no loops share the same sequence and all the frameworks are well superimposed. Forty-three new loop classes are distinguished in the data base. The structural variability of loops in homologous proteins is examined and shown in 44 families. Motif families are characterized with geometric parameters and sequence patterns. The conformations of loops in each family are clustered into subfamilies using the average linkage cluster analysis method. Information such as geometric properties, sequence profile, sequential and structural variability in loop, structural alignment parameters, sequence similarities, and clustering results is provided. Correlations between the conformation of loops and loop sequence, motif sequence, and global sequence of PDB chain are examined in order to find how loop structures depend on their sequences and how they are affected by the local and global environment. Strong correlations (R > 0.75) are only found in 24 families. The best R value is 0.98. The data base is available through the Internet.
- Andersson ME, Nordlund P
- A revised model of the active site of alternative oxidase.
- FEBS Lett. 1999; 449: 17-22
The plant mitochondrial protein alternative oxidase catalyses dioxygen dependent ubiquinol oxidation to yield ubiquinone and water. A structure of this protein has previously been proposed based on an assumed structural homology to the di-iron carboxylate family of proteins. However, these authors suggested the protein has a very different topology than the known structures of di-iron carboxylate proteins. We have re-examined this model and based on comparison of recent sequences and structural data on di-iron carboxylate proteins we present a new model of the alternative oxidase which allows prediction of active site residues and a possible membrane binding motif.
- Blatch GL, Lässle M
- The tetratricopeptide repeat: a structural motif mediating protein-protein interactions.
- Bioessays. 1999; 21: 932-9
The tetratricopeptide repeat (TPR) motif is a protein-protein interaction module found in multiple copies in a number of functionally different proteins that facilitates specific interactions with a partner protein(s). Three-dimensional structural data have shown that a TPR motif contains two antiparallel alpha-helices such that tandem arrays of TPR motifs generate a right-handed helical structure with an amphipathic channel that might accommodate the complementary region of a target protein. Most TPR-containing proteins are associated with multiprotein complexes, and there is extensive evidence indicating that TPR motifs are important to the functioning of chaperone, cell-cycle, transcription, and protein transport complexes. The TPR motif may represent an ancient protein-protein interaction module that has been recruited by different proteins and adapted for specific functions.
- Sivula T et al.
- Evolutionary aspects of inorganic pyrophosphatase.
- FEBS Lett. 1999; 454: 75-80
Based on the primary structure, soluble inorganic pyrophosphatases can be divided into two families which exhibit no sequence similarity to each other. Family I, comprising most of the known pyrophosphatase sequences, can be further divided into prokaryotic, plant and animal/fungal pyrophosphatases. Interestingly, plant pyrophosphatases bear a closer similarity to prokaryotic than to animal/fungal pyrophosphatases. Only 17 residues are conserved in all 37 pyrophosphatases of family I and remarkably, 15 of these residues are located at the active site. Subunit interface residues are conserved in animal/fungal but not in prokaryotic pyrophosphatases.
- Nakamura H
- [Structural genomics: approach of structure biology towards genome analysis]
- Tanpakushitsu Kakusan Koso. 1999; 44: 590-7
- Backofen R, Will S, Bornberg-Bauer E
- Application of constraint programming techniques for structure prediction of lattice proteins with extended alphabets.
- Bioinformatics. 1999; 15: 234-42
MOTIVATION: Predicting the ground state of biopolymers is a notoriously hard problem in biocomputing. Model systems, such as lattice proteins, are simple tools and valuable to test and improve new methods. Best known are models with sequences composed from a binary (hydrophobic and polar) alphabet. The major drawback is the degeneracy, i.e. the number of different ground state conformations. RESULTS: We show how recently developed constraint programming techniques can be used to solve the structure prediction problem efficiently for a higher order alphabet. To our knowledge it is the first report of an exact and computationally feasible solution to model proteins of length up to 36 and without resorting to maximally compact states. We further show that degeneracy is reduced by more than one order of magnitude and that ground state conformations are not necessarily compact. Therefore, more realistic protein simulations become feasible with our model.
- Bateman A, Murzin AG, Teichmann SA
- Structure and distribution of pentapeptide repeats in bacteria.
- Protein Sci. 1998; 7: 1477-80
We report the discovery of a novel family of proteins in which each member contains tandem pentapeptide (five residue) repeats, described by the motif A(D/N)LXX. Members of this family are both membrane bound and cytoplasmic. The function of these repeats is uncertain, but they may have a targeting or structural function rather than enzymatic activity. This family is most common in cyanobacteria, suggesting a function related to cyanobacterial-specific metabolism. Although no experimental information is available for the structure of this family, it is predicted that the tandem pentapeptide repeats will form a right-handed beta-helical structure. A structural model of the pentapeptide repeats is presented.
- Heffron S et al.
- Sequence profile of the parallel beta helix in the pectate lyase superfamily.
- J Struct Biol. 1998; 122: 223-35
The parallel beta helix structure found in the pectate lyase superfamily has been analyzed in detail. A comparative analysis of known structures has revealed a unique sequence profile, with a strong positional preference for specific amino acids oriented toward the interior of the parallel beta helix. Using the unique sequence profile, search patterns have been constructed and applied to the sequence databases to identify a subset of proteins that are likely to fold into the parallel beta helix. Of the 19 families identified, 39% are known to be carbohydrate-binding proteins, and 50% belong to a broad category of proteins with sequences containing leucine-rich repeats (LRRs). The most striking result is the sequence match between the search pattern and four contiguous segments of internalin A, a surface protein from the bacterial pathogen Listeria monocytogenes. A plausible model of the repetitive LRR sequences of internalin A has been constructed and favorable 3D-1D profile scores have been calculated. Moreover, spectroscopic features characteristic of the parallel beta helix topology in the pectate lyases are present in the circular dichroic spectrum of internalin A. Altogether, the data support the hypothesis that sequence search patterns can be used to identify proteins, including a subset of LRR proteins, that are likely to fold into the parallel beta helix.
- Liu YT, Yin HL
- Identification of the binding partners for flightless I, a novel protein bridging the leucine-rich repeat and the gelsolin superfamilies.
- J Biol Chem. 1998; 273: 7920-7
Flightless-I (fliI) is a novel member of the gelsolin family that is important for actin organization during Drosophila embryogenesis and myogenesis. Drosophila fliI and the human homolog FLI both contain the classic gelsolin 6-fold segmental repeats and an amino-terminal extension of 16 tandem leucine-rich repeats (LRR). LRR repeats form amphipathic beta-alpha structural units that mediate protein-protein interactions. Although there are close to 100 known LRR domain-containing proteins, only a few binding pairs have been identified. In this paper, we used biochemical and genetic approaches to identify proteins that interact with human FLI. In vitro synthesized FLI bound to actin-Sepharose and binding was reduced by competition with excess soluble actin. Actin binding was mediated through the gelsolin-like domain and not the LRR domain. Although the FLI LRR module is most closely related to the LRR domains of Ras-interactive proteins, FLI does not associate with Ras, selected Ras effectors, or other Ras-related small GTPases. Two-hybrid screens using FLI LRR as bait identified a novel LRR binding partner. The 0.65-kilobase pair (kb) clone from the screen survived additional rounds of stringent two-hybrid pairwise assays, establishing a specific interaction. Binding to FLI LRR was corroborated by co-immunoprecipitation with FLI LRR. The translated sequence of the FLI LRR associated protein (FLAP) encodes a novel protein not represented in the data base. Northern blot analyses revealed four FLAP messages of approximately 2.7, 2.9, 3.3, and 5.1 kb, which are differentially expressed in the tissues tested. Skeletal and cardiac muscles are particularly rich in the 3.3-kb FLAP message, and the FLI message as well. Full-length FLAP clones were isolated from a mouse skeletal muscle cDNA library. They have an open reading frame which encodes for a protein containing 626 amino acids. Sequence analyses predict that the FLAP protein is rich in alpha-helices and contains stretches of dimeric coiled coil in its middle region and COOH terminus. The identification of actin and FLAP as the binding ligands for the gelsolin-like domain and the LRR domain, respectively, suggests that FLI may link the actin cytoskeleton to other modules implicated in intermolecular recognition and structural organization.
- Hocking AM, Shinomura T, McQuillan DJ
- Leucine-rich repeat glycoproteins of the extracellular matrix.
- Matrix Biol. 1998; 17: 1-19
The extracellular matrix plays an integral role in the pivotal processes of development, tissue repair, and metastasis by regulating cell proliferation, differentiation, adhesion, and migration. This review is focused on a family of related glycoproteins represented by at least one member in all specialized extracellular matrices. This family currently comprises nine members grouped together on the basis of their presence in the extracellular matrix and by virtue of a leucine-rich repeat motif that dominates the structure of the core protein. It is likely that most, if not all the members of this group exist as proteoglycans in some tissues, and thus have been termed the Small Leucine-Rich Proteoglycan family, or SLRPs. The leucine-rich repeat (LRR) is usually present in tandem array and has been described in an increasing number of proteins, giving rise to a LRR-superfamily. The LRR domain of the SLRP family is unique within the superfamily in that it is flanked by cysteine clusters, and the 24 amino acid consensus for SLRP members is x-x-I/V/L-x-x-x-x-F/P/L-x-x-L/P-x-x-L-x-x-L/I-x-L-x-x-N-x-I/L, where x is any amino acid. Enormous progress has been made in describing the membership, structure and localization of this family, and recently new insight has emerged into the putative function of these molecules not just as modulators of matrix assembly but also on their intriguing role in regulating cell growth, adhesion, and migration. Determination of membership, structure and putative function of this fascinating class of molecules is summarized in this review.
- Buchanan SG, Gay NJ
- Structural and functional diversity in the leucine-rich repeat family of proteins.
- Prog Biophys Mol Biol. 1996; 65: 1-44
- Kobe B, Deisenhofer J
- Proteins with leucine-rich repeats.
- Curr Opin Struct Biol. 1995; 5: 409-16
Leucine-rich repeats are short sequence motifs present in over sixty proteins, all of which appear to be involved in protein-protein interactions. The crystal structure of ribonuclease inhibitor demonstrated that the repeats correspond to beta-alpha structural units. The recently determined crystal structure of the ribonuclease A-ribonuclease inhibitor complex suggests the basis for the protein-binding function of leucine-rich repeats.
- Kajava AV, Vassart G, Wodak SJ
- Modeling of the three-dimensional structure of proteins with the typical leucine-rich repeats.
- Structure. 1995; 3: 867-77
BACKGROUND: Leucine-rich repeats (LRRs) are present in proteins with diverse functions. The horseshoe-shaped structure of a ribonuclease inhibitor (RI), with a parallel beta sheet lining the inner circumference of the horseshoe and alpha helices flanking its outer circumference, is the only X-ray structure containing these repeats to be determined. Despite the fact that the lengths and sequences of the RI repeats differ from those of the most commonly occurring LRRs, it was deemed worthwhile to derive a three-dimensional structural framework of these more typical LRR proteins, using the RI structure as a template. RESULTS: Sequence alignments of 569 LRRs from 68 proteins were obtained by a profile search and used in a comparative sequence analysis to distinguish between residues with a probable structural role and those which seemed essential for function. This knowledge, along with the known atomic structure of RI, was used to model the three-dimensional structure of the most common LRR units. These modeled units were then used to build the three-dimensional structure of the extracellular domain of the thyrotropin receptor (TSHR)--a 'typical' LRR protein. CONCLUSIONS: The modeled TSHR structure adopts a non-globular arrangement, similar to that in RI. The beta regions of this typical LRR protein are the same as in the RI structure, whereas the alpha helices are shorter and the conformations of the alpha beta and beta alpha connections are different. As a result of these differences it was not possible to pack together typical LRR units using repeats such as those found in RI. This mutually exclusive relationship is supported by sequence analysis. The predicted structure of the typical LRRs obtained here can be used to build models for any of the known LRR proteins and the approach used for the prediction could be applied to other proteins containing internal repeats.