DOMAIN ACC DEFINITION DESCRIPTION -------------------------------------------------------------------------------- 14_3_3 SM00101 14-3-3 homologues 14-3-3 homologues mediate signal transduction by binding to phosphoserine-containing proteins. They are involved in growth factor signalling and also interact with MEK kinases. 35EXOc SM00474 3'-5' exonuclease 3'-5' exonuclease proofreading domain present in DNA polymerase I, Werner syndrome helicase, RNase D and other enzymes 4.1m SM00294 putative band 4.1 homologues' binding motif 53EXOc SM00475 5'-3' exonuclease A1pp SM00506 Appr-1"-p processing enzyme Function determined by Martzen et al. Extended family detected by reciprocal PSI-BLAST searches (unpublished results, and Pehrson & Fuji). A4_EXTRA SM00006 amyloid A4 amyloid A4 precursor of Alzheimer's disease AAA SM00382 ATPases associated with a variety of cellular activities AAA - ATPases associated with a variety of cellular activities. This profile/alignment only detects a fraction of this vast family. The poorly conserved N-terminal helix is missing from the alignment. AAA_PrkA SM00763 PrkA AAA domain This is a family of PrkA bacterial and archaeal serine kinases approximately 630 residues long. This is the N-terminal AAA domain. AAI SM00499 Plant lipid transfer protein / seed storage protein / trypsin-alpha amylase inhibitor domain family AARP2CN SM00785 AARP2CN (NUC121) domain This domain is the central domain of AARP2. It is weakly similar to the GTP-binding domain of elongation factor TU PUBMED:15112237. ACR SM00608 ADAM Cysteine-Rich Domain ACTIN SM00268 Actin ACTIN subfamily of ACTIN/mreB/sugarkinase/Hsp70 superfamily AD SM00995 Anticodon-binding domain This domain of approximately 100 residues is conserved from plants to humans. It is frequently found in association with Lsm domain-containing proteins.
ADEAMc SM00552 tRNA-specific and double-stranded RNA adenosine deaminase (RNA-specific editase) ADF SM00102 Actin depolymerisation factor/cofilin-like domains Severs actin filaments and binds to actin monomers. ADSL_C SM00998 Adenylosuccinate lyase C-terminus Adenylosuccinate lyase catalyses two steps in the synthesis of purine nucleotides: the conversion of succinylaminoimidazole-carboxamide ribotide into aminoimidazole-carboxamide ribotide (the fifth step of de novo IMP biosynthesis); the formation of adenosine monophosphate (AMP) from adenylosuccinate (the final step in the synthesis of AMP from IMP). This entry represents the C-terminal, seven alpha-helical, domain of adenylosuccinate lyase. AFOR_N SM00790 Aldehyde ferredoxin oxidoreductase, N-terminal domain Enzymes of the aldehyde ferredoxin oxidoreductase (AOR) family PUBMED:9242907 contain a tungsten cofactor and a 4Fe4S cluster and catalyse the interconversion of aldehydes to carboxylates PUBMED:8672295. This family includes AOR, formaldehyde ferredoxin oxidoreductase (FOR), glyceraldehyde-3-phosphate ferredoxin oxidoreductase (GAPOR), all isolated from hyperthermophilic archaea PUBMED:9242907; carboxylic acid reductase found in clostridia PUBMED:2550230; and hydroxycarboxylate viologen oxidoreductase from Proteus vulgaris, the sole member of the AOR family containing molybdenum PUBMED:8026480. GAPOR may be involved in glycolysis PUBMED:7721730, but the functions of the other proteins are not yet clear. AOR has been proposed to be the primary enzyme responsible for oxidising the aldehydes that are produced by the 2-keto acid oxidoreductases PUBMED:9275170. AGTRAP SM00805 Angiotensin II, type I receptor-associated protein This family consists of several angiotensin II, type I receptor-associated protein (AGTRAP) sequences. AGTRAP is known to interact specifically with the C-terminal cytoplasmic region of the angiotensin II type 1 (AT(1)) receptor to regulate different aspects of AT(1) receptor physiology.
The function of this family is unclear. AHS1 SM00796 Allophanate hydrolase subunit 1 This domain represents subunit 1 of allophanate hydrolase (AHS1). AHS2 SM00797 Allophanate hydrolase subunit 2 This domain represents subunit 2 of allophanate hydrolase (AHS2). AICARFT_IMPCHas SM00798 AICARFT/IMPCHase bienzyme This is a family of bifunctional enzymes catalysing the last two steps in de novo purine biosynthesis. The bifunctional enzyme is found in both prokaryotes and eukaryotes. The second-to-last step is catalysed by 5-aminoimidazole-4-carboxamide ribonucleotide formyltransferase (AICARFT), which catalyses the formylation of AICAR with 10-formyl-tetrahydrofolate to yield FAICAR and tetrahydrofolate. The last step is catalysed by IMP (Inosine monophosphate) cyclohydrolase (IMPCHase), cyclizing FAICAR (5-formylaminoimidazole-4-carboxamide ribonucleotide) to IMP. AIP3 SM00806 Actin interacting protein 3 Aip3p/Bud6p is a regulator of cell and cytoskeletal polarity in Saccharomyces cerevisiae that was previously identified as an actin-interacting protein. Actin-interacting protein 3 (Aip3p) localizes at the cell cortex where cytoskeleton assembly must be achieved to execute polarized cell growth, and deletion of AIP3 causes gross defects in cell and cytoskeletal polarity. Aip3p localization is mediated by the secretory pathway; mutations in early- or late-acting components of the secretory apparatus lead to Aip3p mislocalization PUBMED:10679021. AIRC SM01001 AIR carboxylase Members of this family catalyse the decarboxylation of 1-(5-phosphoribosyl)-5-amino-4-imidazole-carboxylate (AIR). This family catalyses the sixth step of de novo purine biosynthesis. Some members of this family contain two copies of this domain. AKAP_110 SM00807 A-kinase anchor protein 110 kDa This family consists of several mammalian protein kinase A anchoring protein 3 (PRKA3) or A-kinase anchor protein 110 kDa (AKAP 110) sequences.
Agents that increase intracellular cAMP are potent stimulators of sperm motility. Anchoring inhibitor peptides, designed to disrupt the interaction of the cAMP-dependent protein kinase A (PKA) with A kinase-anchoring proteins (AKAPs), are potent inhibitors of sperm motility. PKA anchoring is a key biochemical mechanism controlling motility. AKAP110 shares compartments with both RI and RII isoforms of PKA and may function as a regulator of both motility- and head-associated functions such as capacitation and the acrosome reaction PUBMED:10319321. ALAD SM01004 Delta-aminolevulinic acid dehydratase This entry represents porphobilinogen (PBG) synthase (PBGS, or 5-aminolevulinic acid dehydratase, or ALAD), which functions during the second stage of tetrapyrrole biosynthesis. This enzyme catalyses a Knorr-type condensation reaction between two molecules of ALA to generate porphobilinogen, the pyrrolic building block used in later steps PUBMED:17311232. The structure of the enzyme is based on a TIM barrel topology made up of eight identical subunits, where each subunit binds to a metal ion that is essential for activity, usually zinc (in yeast, mammals and certain bacteria) or magnesium (in plants and other bacteria). A lysine has been implicated in the catalytic mechanism PUBMED:3092810. The lack of PBGS enzyme causes a rare porphyric disorder known as ALAD porphyria, which appears to involve conformational changes in the enzyme PUBMED:17236137. ALBUMIN SM00103 serum albumin AMA-1 SM00815 Apical membrane antigen 1 Apical membrane antigen 1 (AMA-1) is a Plasmodium asexual blood-stage antigen. It has been suggested that positive selection operates on the AMA-1 gene in regions coding for antigenic sites.
AMOP SM00723 Adhesion-associated domain present in MUC4 and other proteins AMPKBI SM01010 5'-AMP-activated protein kinase beta subunit, interaction domain This region is found in the beta subunit of the 5'-AMP-activated protein kinase complex, and its yeast homologues Sip1, Sip2 and Gal83, which are found in the SNF1 kinase complex. This region is sufficient for interaction of this subunit with the kinase complex, but is not solely responsible for the interaction, and the interaction partner is not known. The isoamylase N-terminal domain is sometimes found in proteins belonging to this family. AMP_N SM01011 Aminopeptidase P, N-terminal domain This domain is structurally very similar to the creatinase N-terminal domain. However, little or no sequence similarity exists between the two families. ANATO SM00104 Anaphylatoxin homologous domain C3a, C4a and C5a anaphylatoxins are protein fragments generated enzymatically in serum during activation of complement molecules C3, C4, and C5. They induce smooth muscle contraction. These fragments are homologous to a three-fold repeat in fibulins. ANK SM00248 ankyrin repeats Ankyrin repeats are about 33 amino acids long and occur in at least four consecutive copies. They are involved in protein-protein interactions. The core of the repeat seems to be a helix-loop-helix structure. ANTAR SM01012 ANTAR (AmiR and NasR transcription antitermination regulators) is an RNA-binding domain found in bacterial transcription antitermination regulatory proteins. The majority of the domain consists of a coiled-coil. ANX SM00335 Annexin repeats AP2 SM00380 DNA-binding domain in plant proteins such as APETALA2 and EREBPs AP2Ec SM00518 AP endonuclease family 2 These endonucleases play a role in DNA repair.
They cleave phosphodiester bonds at apurinic or apyrimidinic sites. APC2 SM01013 Anaphase promoting complex (APC) subunit 2 The anaphase promoting complex or cyclosome (APC2) is an E3 ubiquitin ligase which is part of the SCF family of ubiquitin ligases. Ubiquitin ligases catalyse the transfer of ubiquitin from the ubiquitin conjugating enzyme (E2), to the substrate protein. APPLE SM00223 APPLE domain Four-fold repeat in plasma kallikrein and coagulation factor XI. Factor XI apple 3 mediates binding to platelets. Factor XI apple 1 binds high-molecular-mass kininogen. Apple 4 in factor XI mediates dimer formation and binds to factor XIIa. Mutations in apple 4 cause factor XI deficiency, an inherited bleeding disorder. ARF SM00177 ARF-like small GTPases; ARF, ADP-ribosylation factor Ras homologues involved in vesicular transport. Activator of phospholipase D isoforms. Unlike Ras proteins they lack cysteine residues at their C-termini and therefore are unlikely to be prenylated. ARFs are N-terminally myristoylated. Contains ATP/GTP-binding motif (P-loop). ARID SM01014 ARID/BRIGHT DNA binding domain Members of the recently discovered ARID (AT-rich interaction domain) family of DNA-binding proteins are found in fungi and invertebrate and vertebrate metazoans. ARID-encoding genes are involved in a variety of biological processes including embryonic development, cell lineage gene regulation and cell cycle control. Although the specific roles of this domain and of ARID-containing proteins in transcriptional regulation are yet to be elucidated, they include both positive and negative transcriptional regulation and a likely involvement in the modification of chromatin structure PUBMED:10838570. The basic structure of the ARID domain appears to be a series of six alpha-helices separated by beta-strands, loops, or turns, but the structured region may extend to an additional helix at either or both ends of the basic six.
Based on primary sequence homology, they can be partitioned into three structural classes: Minimal ARID proteins that consist of a core domain formed by six alpha helices; ARID proteins that supplement the core domain with an N-terminal alpha-helix; and Extended-ARID proteins, which contain the core domain and additional alpha-helices at their N- and C-termini. ARM SM00185 Armadillo/beta-catenin-like repeats Approx. 40 amino acid repeat. Tandem repeats form superhelix of helices that is proposed to mediate interaction of beta-catenin with its ligands. Involved in transducing the Wingless/Wnt signal. In plakoglobin arm repeats bind alpha-catenin and N-cadherin. ASCH SM01022 The ASCH domain adopts a beta-barrel fold similar to that of the PUA domain. It is thought to function as an RNA-binding domain during coactivation, RNA-processing and possibly during prokaryotic translation regulation PUBMED:16322048. AT_hook SM00384 DNA binding domain with preference for A/T rich regions Small DNA-binding motif first described in the high mobility group non-histone chromosomal protein HMG-I(Y). AWS SM00570 associated with SET domains subdomain of PRESET AXH SM00536 domain in Ataxins and HMG containing proteins unknown function A_amylase_inhib SM00783 Alpha amylase inhibitor Alpha amylase inhibitor inhibits mammalian alpha-amylases specifically, by forming a tight stoichiometric 1:1 complex with alpha-amylase. The inhibitor has no action on plant and microbial alpha amylases. Aamy SM00642 Alpha-amylase domain Aamy_C SM00632 Ad_cyc_g-alpha SM00789 Adenylate cyclase G-alpha binding domain This fungal domain is found in adenylate cyclase and interacts with the alpha subunit of heterotrimeric G proteins. Adenylsucc_synt SM00788 Adenylosuccinate synthetase Adenylosuccinate synthetase plays an important role in purine biosynthesis, by catalyzing the GTP-dependent conversion of IMP and aspartic acid to AMP. 
Adenylosuccinate synthetase has been characterized from various sources ranging from Escherichia coli (gene purA) to vertebrate tissues. In vertebrates, two isozymes are present - one involved in purine biosynthesis and the other in the purine nucleotide cycle. The crystal structure of adenylosuccinate synthetase from E. coli reveals that the dominant structural element of each monomer of the homodimer is a central beta-sheet of 10 strands. The first nine strands of the sheet are mutually parallel with right-handed crossover connections between the strands. The 10th strand is antiparallel with respect to the first nine strands. In addition, the enzyme has two antiparallel beta-sheets, composed of two strands and three strands respectively, 11 alpha-helices and two short 3/10-helices. Further, it has been suggested that the similarities in the GTP-binding domains of the synthetase and the p21ras protein are an example of convergent evolution of two distinct families of GTP-binding proteins PUBMED:8244965. Structures of adenylosuccinate synthetase from Triticum aestivum and Arabidopsis thaliana, when compared with the known structures from E. coli, reveal that the overall fold is very similar to that of the E. coli protein PUBMED:10669609. AdoHcyase SM00996 S-adenosyl-L-homocysteine hydrolase AdoHcyase_NAD SM00997 S-adenosyl-L-homocysteine hydrolase, NAD binding domain Aerolysin SM00999 Aerolysin toxin This family represents the pore forming lobe of aerolysin. Agenet SM00743 Tudor-like domain present in plant sequences. Domain in plant sequences with possible chromatin-associated functions. Agglutinin SM00791 Amaranthus caudatus agglutinin or amaranthin is a lectin from the ancient South American crop, amaranth grain. Although its biological function is unknown, it has a high binding specificity for the methyl-glycoside of the T-antigen, found linked to serine or threonine residues of cell surface glycoproteins PUBMED:2271665.
The protein is a homodimer, with each monomer consisting of two beta-trefoil domains PUBMED:9334739. Agouti SM00792 Agouti protein The agouti protein regulates pigmentation in the mouse hair follicle producing a black hair with a subapical yellow band. A highly homologous protein agouti signal protein (ASIP) is present in humans and is expressed at highest levels in adipose tissue where it may play a role in energy homeostasis and possibly human pigmentation PUBMED:11837451, PUBMED:11833005. AgrB SM00793 Accessory gene regulator B The accessory gene regulator (agr) of Staphylococcus aureus is the central regulatory system that controls the gene expression for a large set of virulence factors. The agr locus consists of two transcripts: RNAII and RNAIII. RNAII encodes four genes (agrA, B, C, and D) whose gene products assemble a quorum sensing system. At low cell density, the agr genes are continuously expressed at basal levels. A signal molecule, autoinducing peptide (AIP), produced and secreted by the bacteria, accumulates outside of the cells. When the cell density increases and the AIP concentration reaches a threshold, it activates the agr response, i.e. activation of secreted protein gene expression and subsequent repression of cell wall-associated protein genes. AgrB and AgrD are essential for the production of the autoinducing peptide which functions as a signal for quorum sensing. AgrB is a transmembrane protein PUBMED:11195102. AgrB is involved in the proteolytic processing of AgrD and may act both as a proteolytic enzyme and as a transporter facilitating the export of the processed AgrD peptide PUBMED:12122003. AgrD SM00794 Staphylococcal AgrD protein This family consists of several AgrD proteins from many Staphylococcus species. The agr locus was initially described in Staphylococcus aureus as an element controlling the production of exoproteins implicated in virulence.
Its pattern of action has been shown to be complex, upregulating certain extracellular toxins and enzymes expressed post-exponentially and repressing some exponential-phase surface components. AgrD encodes the precursor of the autoinducing peptide (AIP). The AIP derived from AgrD by the action of AgrB interacts with AgrC in the membrane to activate AgrA, which upregulates transcription both from promoter P2, amplifying the response, and from P3, initiating the production of a novel effector: RNAIII. In S. aureus, delta-hemolysin is the only translation product of RNAIII and is not involved in the regulatory functions of the transcript, which is therefore the primary agent for modulating the expression of other operons controlled by agr PUBMED:11807079. Agro_virD5 SM00795 Agrobacterium VirD5 protein The virD operon in Agrobacterium encodes a site-specific endonuclease, and a number of other poorly characterised products. This family represents the VirD5 protein. Aha1_N SM01000 Activator of Hsp90 ATPase, N-terminal This domain is predominantly found in the protein 'Activator of Hsp90 ATPase'; it adopts a secondary structure consisting of an N-terminal alpha-helix leading into a four-stranded meandering antiparallel beta-sheet, followed by a C-terminal alpha-helix. The two helices are packed together, with the beta-sheet curving around them. They bind to the molecular chaperone HSP82 and stimulate its ATPase activity PUBMED:15039704. AlaDh_PNT_C SM01002 Alanine dehydrogenase/PNT, C-terminal domain Alanine dehydrogenase catalyzes the NAD-dependent reversible reductive amination of pyruvate into alanine. AlaDh_PNT_N SM01003 Alanine dehydrogenase/PNT, N-terminal domain Alanine dehydrogenase catalyzes the NAD-dependent reversible reductive amination of pyruvate into alanine. Ala_racemase_C SM01005 Alanine racemase, C-terminal domain Alanine racemase plays a role in providing the D-alanine required for cell wall biosynthesis by isomerising L-alanine to D-alanine.
This domain is found in both prokaryotic and eukaryotic proteins PUBMED:1676385, PUBMED:7871888. AlcB SM01006 Siderophore biosynthesis protein domain AlcB is the conserved 45 residue region of one of the proteins of a complex which mediates alcaligin biosynthesis in Bordetella and aerobactin biosynthesis in E. coli and other bacteria. The protein appears to catalyse N-acylation of the hydroxylamine group in N-hydroxyputrescine with succinyl CoA - an activated mono-thioester derivative of succinic acid that is an intermediate in the Krebs cycle. Ald_Xan_dh_C SM01008 Aldehyde oxidase and xanthine dehydrogenase, a/b hammerhead domain Aldehyde oxidase catalyses the conversion of an aldehyde in the presence of oxygen and water to an acid and hydrogen peroxide. The enzyme is a homodimer, and requires FAD, molybdenum and two 2Fe-2S clusters as cofactors. Xanthine dehydrogenase catalyses the oxidation of xanthine to urate, and also requires FAD, molybdenum and two 2Fe-2S clusters as cofactors. This activity is often found in a bifunctional enzyme with xanthine oxidase activity too. The enzyme can be converted from the dehydrogenase form to the oxidase form irreversibly by proteolysis or reversibly through oxidation of sulphydryl groups. Aldolase_II SM01007 Class II Aldolase and Adducin N-terminal domain This family includes class II aldolases and adducins which have not been ascribed any enzymatic function. AlkA_N SM01009 AlkA N-terminal domain This domain is found at the N terminus of bacterial AlkA. AlkA (3-methyladenine-DNA glycosylase II) is a base excision repair glycosylase from Escherichia coli. It removes a variety of alkylated bases from DNA, primarily by removing alkylation damage from duplex and single stranded DNA. AlkA flips a 1-azaribose abasic nucleotide out of DNA. This produces a 66 degree bend in the DNA and a marked widening of the minor groove PUBMED:10675345.
Alpha-L-AF_C SM00813 Alpha-L-arabinofuranosidase C-terminus This entry represents the C terminus (approximately 200 residues) of bacterial and eukaryotic alpha-L-arabinofuranosidase. This catalyses the hydrolysis of non-reducing terminal alpha-L-arabinofuranosidic linkages in L-arabinose-containing polysaccharides. Alpha-amyl_C2 SM00810 Alpha-amylase C-terminal beta-sheet domain This entry represents the beta-sheet domain that is found in several alpha-amylases, usually at the C-terminus. This domain is organised as a five-stranded anti-parallel beta-sheet. Alpha-mann_mid SM00872 Alpha mannosidase, middle domain Members of this entry belong to the glycosyl hydrolase family 38. This domain, which is found in the central region, adopts a structure consisting of three alpha helices, in an immunoglobulin/albumin-binding domain-like fold. The domain is predominantly found in the enzyme alpha-mannosidase PUBMED:12634058. Alpha_L_fucos SM00812 Alpha-L-fucosidase O-Glycosyl hydrolases (EC 3.2.1.-) are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site PUBMED:. Because the fold of proteins is better conserved than their sequences, some of the families can be grouped in 'clans'. Family 29 encompasses alpha-L-fucosidase, a lysosomal enzyme responsible for hydrolyzing the alpha-1,6-linked fucose joined to the reducing-end N-acetylglucosamine of the carbohydrate moieties of glycoproteins. Deficiency of alpha-L-fucosidase results in the lysosomal storage disease fucosidosis.
Alpha_TIF SM00814 Alpha trans-inducing protein (Alpha-TIF) Alpha-TIF (VP16) from Herpes Simplex virus is an essential tegument protein involved in the transcriptional activation of viral immediate early (IE) promoters (alpha genes) during the lytic phase of viral infection. VP16 associates with cellular transcription factors to enhance transcription rates, including the general transcription factor TFIIB and the transcriptional coactivator PC4. The N-terminal residues of VP16 confer specificity for the IE genes, while the C-terminal residues are responsible for transcriptional activation. Within the C-terminal region are two activation regions that can independently and cooperatively activate transcription. VP16 forms a transcriptional regulatory complex with two cellular proteins, the POU-domain transcription factor Oct-1 and the cell-proliferation factor HCF-1. VP16 is an alpha/beta protein with an unusual fold. Other transcription factors may have a similar topology. Alpha_adaptinC2 SM00809 Adaptin C-terminal domain Adaptins are components of the adaptor complexes which link clathrin to receptors in coated vesicles. Clathrin-associated protein complexes are believed to interact with the cytoplasmic tails of membrane proteins, leading to their selection and concentration. Gamma-adaptin is a subunit of the golgi adaptor. Alpha adaptin is a heterotetramer that regulates clathrin-bud formation. The carboxyl-terminal appendage of the alpha subunit regulates translocation of endocytic accessory proteins to the bud site. This Ig-fold domain is found in alpha, beta and gamma adaptins and consists of a beta-sandwich containing 7 strands in 2 beta-sheets in a greek-key topology PUBMED:10430869, PUBMED:12176391. The adaptor appendage contains an additional N-terminal strand. Alpha_kinase SM00811 Alpha-kinase family This family is a novel family of eukaryotic protein kinase catalytic domains, which have no detectable similarity to conventional kinases. 
The family contains myosin heavy chain kinases and Elongation Factor-2 kinase and a bifunctional ion channel. This family is known as the alpha-kinase family. The structure of the kinase domain revealed unexpected similarity to eukaryotic protein kinases in the catalytic core as well as to metabolic enzymes with ATP-grasp domains. Amb_V_allergen SM00816 Amb V Allergen Amb V is an Ambrosia sp (ragweed) pollen allergen. Amb t V has been shown to contain a C-terminal helix as the major T cell epitope. Free sulphydryl groups also play a major role in the T cell recognition of cross-reactive T cell epitopes within these related allergens. Amb_all SM00656 Amelin SM00817 Ameloblastin precursor (Amelin) This family consists of several mammalian Ameloblastin precursor (Amelin) proteins. Matrix proteins of tooth enamel consist mainly of amelogenin but also of non-amelogenin proteins, which, although their volumetric percentage is low, have an important role in enamel mineralisation. One of the non-amelogenin proteins is ameloblastin, also known as amelin and sheathlin. Ameloblastin (AMBN) is one of the enamel sheath proteins which is thought to have a role in determining the prismatic structure of growing enamel crystals. Amelogenin SM00818 Amelogenins, cell adhesion proteins, play a role in the biomineralisation of teeth. They seem to regulate formation of crystallites during the secretory stage of tooth enamel development and are thought to play a major role in the structural organisation and mineralisation of developing enamel. The extracellular matrix of the developing enamel comprises two major classes of protein: the hydrophobic amelogenins and the acidic enamelins. Circular dichroism studies of porcine amelogenin have shown that the protein consists of 3 discrete folding units: the N-terminal region appears to contain beta-strand structures, while the C-terminal region displays characteristics of a random coil conformation.
Subsequent studies on the bovine protein have indicated the amelogenin structure to contain a repetitive beta-turn segment and a "beta-spiral" between Gln112 and Leu138, which sequester a (Pro, Leu, Gln) rich region. The beta-spiral offers a probable site for interactions with Ca2+ ions. Mutations in the human amelogenin gene (AMGX) cause X-linked hypoplastic amelogenesis imperfecta, a disease characterised by defective enamel. A 9bp deletion in exon 2 of AMGX results in the loss of codons for Ile5, Leu6, Phe7 and Ala8, and replacement by a new threonine codon, disrupting the 16-residue (Met1-Ala16) amelogenin signal peptide. Ami_2 SM00644 Ami_3 SM00646 AraC_E_bind SM00871 Bacterial transcription activator, effector binding domain This domain is found in the probable effector binding domain of a number of different bacterial transcription activators PUBMED:10802742 and is also present in some DNA gyrase inhibitors. The absence of a HTH motif in the DNA gyrase inhibitors is thought to indicate that these do not bind DNA. ArfGap SM00105 Putative GTP-ase activating proteins for the small GTPase, ARF Putative zinc fingers with GTPase activating proteins (GAPs) towards the small GTPase, Arf. The GAP of ARD1 stimulates GTP hydrolysis for ARD1 but not ARFs. Arfaptin SM01015 Arfaptin-like domain Arfaptin interacts with ARF1, a small GTPase involved in vesicle budding at the Golgi complex and immature secretory granules. The structure of arfaptin shows that upon binding to a small GTPase, arfaptin forms an elongated, crescent-shaped dimer of three-helix coiled-coils. The N-terminal region of ICA69 is similar to arfaptin. Arg_tRNA_synt_N SM01016 Arginyl tRNA synthetase N terminal dom This domain is found at the amino terminus of Arginyl tRNA synthetase, also called additional domain 1 (Add-1). It is about 140 residues long and it has been suggested that this domain will be involved in tRNA recognition.
Arrestin_C SM01017 Arrestin (or S-antigen), C-terminal domain Ig-like beta-sandwich fold. Scop reports duplication with N-terminal domain. Arrestins comprise a family of closely-related proteins that includes beta-arrestin-1 and -2, which regulate the function of beta-adrenergic receptors by binding to their phosphorylated forms, impairing their capacity to activate G(S) proteins; Cone photoreceptors C-arrestin (arrestin-X) PUBMED:7720881, which could bind to phosphorylated red/green opsins; and Drosophila phosrestins I and II, which undergo light-induced phosphorylation, and probably play a role in photoreceptor transduction PUBMED:8452755, PUBMED:1517224, PUBMED:2158671. Asparaginase SM00870 Asparaginase, which is found in various plant, animal and bacterial cells, catalyses the deamination of asparagine to yield aspartic acid and an ammonium ion, resulting in a depletion of free circulatory asparagine in plasma PUBMED:3026924. The enzyme is effective in the treatment of human malignant lymphomas, which have a diminished capacity to produce asparagine synthetase: in order to survive, such cells absorb asparagine from blood plasma PUBMED:2407723, PUBMED:3379033 - if Asn levels have been depleted by injection of asparaginase, the lymphoma cells die. Autotransporter SM00869 Autotransporter beta-domain Secretion of protein products occurs by a number of different pathways in bacteria. One of these pathways known as the type IV pathway was first described for the IgA1 protease. The protein component that mediates secretion through the outer membrane is contained within the secreted protein itself, hence the proteins secreted in this way are called autotransporters. This family corresponds to the presumed integral membrane beta-barrel domain that transports the protein. This domain is found at the C-terminus of the proteins it occurs in. The N-terminus contains the variable passenger domain that is translocated across the membrane. 
Once the passenger domain is exported it is cleaved auto-catalytically in some proteins, in others a different peptidase is used and in some cases no cleavage occurs. B12-binding_2 SM01018 B12 binding domain Cobalamin-dependent methionine synthase is a large modular protein that catalyses methyl transfer from methyltetrahydrofolate (CH3-H4folate) to homocysteine. During the catalytic cycle, it supports three distinct methyl transfer reactions, each involving the cobalamin (vitamin B12) cofactor and a substrate bound to its own functional unit PUBMED:11731805. The cobalamin cofactor plays an essential role in this reaction, accepting the methyl group from CH3-H4folate to form methylcob(III)alamin, and in turn donating the methyl group to homocysteine to generate methionine and cob(I)alamin. Methionine synthase is a large enzyme composed of four structurally and functionally distinct modules: the first two modules bind homocysteine and CH3-H4folate, the third module binds the cobalamin cofactor and the C-terminal module binds S-adenosylmethionine. The cobalamin-binding module is composed of two structurally distinct domains: a 4-helical bundle cap domain (residues 651-740 in the Escherichia coli enzyme) and an alpha/beta B12-binding domain (residues 741-896). The 4-helical bundle forms a cap over the alpha/beta domain, which acts to shield the methyl ligand of cobalamin from solvent PUBMED:8939751. Furthermore, in the conversion to the active conformation of this enzyme, the 4-helical cap rotates to allow the cobalamin cofactor to bind the activation domain. The alpha/beta domain is a common cobalamin-binding motif, whereas the 4-helical bundle domain with its methyl cap is a distinctive feature of methionine synthases. B2-adapt-app_C SM01020 Beta2-adaptin appendage, C-terminal sub-domain Members of this family adopt a structure consisting of a 5 stranded beta-sheet, flanked by one alpha helix on the outer side, and by two alpha helices on the inner side. 
This domain is required for binding to clathrin, and its subsequent polymerisation. Furthermore, a hydrophobic patch present in the domain also binds to a subset of D-phi-F/W motif-containing proteins that are bound by the alpha-adaptin appendage domain (epsin, AP180, eps15).
B3 SM01019 B3 DNA binding domain Two DNA binding proteins, RAV1 and RAV2 from Arabidopsis thaliana, contain two distinct amino acid sequence domains found only in higher plant species. The N-terminal regions of RAV1 and RAV2 are homologous to the AP2 DNA-binding domain present in a family of transcription factors, while the C-terminal region exhibits homology to the highly conserved C-terminal domain, designated B3, of VP1/ABI3 transcription factors PUBMED:9862967. The AP2 and B3-like domains of RAV1 bind autonomously to the CAACA and CACCTG motifs, respectively, and together achieve a high affinity and specificity of binding. It has been suggested that the AP2 and B3-like domains of RAV1 are connected by a highly flexible structure enabling the two domains to bind to the CAACA and CACCTG motifs in various spacings and orientations PUBMED:9862967.
B3_4 SM00873 B3/4 domain This domain is found in tRNA synthetase beta subunits as well as in some non-tRNA synthetase proteins.
B41 SM00295 Band 4.1 homologues Also known as ezrin/radixin/moesin (ERM) protein domains. Present in myosins, ezrin, radixin, moesin, protein tyrosine phosphatases. Plasma membrane-binding domain. These proteins play structural and regulatory roles in the assembly and stabilization of specialized plasma membrane domains. Some PDZ domain containing proteins bind one or more of this family. Now includes JAKs.
B5 SM00874 tRNA synthetase B5 domain This domain is found in phenylalanine-tRNA synthetase beta subunits.
B561 SM00665 Cytochrome b-561 / ferric reductase transmembrane domain.
Cytochrome b-561 recycles ascorbate for the generation of norepinephrine by dopamine-beta-hydroxylase in the chromaffin vesicles of the adrenal gland. It is a transmembrane heme protein with the two heme groups being bound to conserved histidine residues. A cytochrome b-561 homologue, termed Dcytb, is an iron-regulated ferric reductase in the duodenal mucosa. Other homologues of these are also likely to be ferric reductases. SDR2 is proposed to be important in regulating the metabolism of iron in the onset of neurodegenerative disorders.
BACK SM00875 BTB And C-terminal Kelch The BACK domain is found juxtaposed to the BTB domain; they are separated by as little as two residues.
BAF SM01023 Barrier to autointegration factor Barrier-to-autointegration factor (BAF) is an essential protein that is highly conserved in metazoan evolution, and which may act as a DNA-bridging protein PUBMED:12902403. BAF binds directly to double-stranded DNA, to transcription activators, and to inner nuclear membrane proteins, including lamin A filament proteins that anchor nuclear-pore complexes in place, and nuclear LEM-domain proteins that bind to lamin filaments and chromatin. New findings suggest that BAF has structural roles in nuclear assembly and chromatin organization, represses gene expression and might interlink chromatin structure, nuclear architecture and gene regulation in metazoans PUBMED:15130582. BAF can be exploited by retroviruses to act as a host component of pre-integration complexes, which promote the integration of the retroviral DNA into the host chromosome by preventing autointegration of retroviral DNA PUBMED:14645565. BAF might contribute to the assembly or activity of retroviral pre-integration complexes through direct binding to the retroviral proteins p55 Gag and matrix, as well as to DNA.
BAG SM00264 BAG domains, present in regulator of Hsp70 proteins BAG domains, present in Bcl-2-associated athanogene 1 and silencer of death domains
BAH SM00439 Bromo adjacent homology domain
BAR SM00721
BASIC SM00520 Basic domain in HLH proteins of MYOD family
BATS SM00876 Biotin and Thiamin Synthesis associated domain Biotin synthase (BioB) catalyses the last step of the biotin biosynthetic pathway. The reaction consists of the introduction of a sulphur atom into dethiobiotin. BioB functions as a homodimer PUBMED:12482614. Thiamin synthesis is a complex process involving at least six gene products (ThiFSGH, ThiI and ThiJ). Two of the proteins required for the biosynthesis of the thiazole moiety of thiamine (vitamin B(1)) are ThiG and ThiH (this entry), which form a heterodimer PUBMED:12650933. Both of these reactions are thought to involve the binding of co-factors, and both enzymes function as dimers PUBMED:12482614, PUBMED:12650933. This domain therefore may be involved in co-factor binding or dimerisation.
BBC SM00502 B-Box C-terminal domain Coiled coil region C-terminal to (some) B-Box domains
BBOX SM00336 B-Box-type zinc finger
BCL SM00337 BCL (B-Cell lymphoma); contains BH1, BH2 regions (BH1, BH2, BH3 (one helix only), and not BH4 (one helix only)). Involved in apoptosis regulation
BCS1_N SM01024 This domain is found at the N-terminus of the mitochondrial ATPase BCS1. It encodes the import and intramitochondrial sorting for the protein.
BEN SM01025 The BEN domain is found in diverse animal proteins such as BANP/SMAR1, NAC1 and the Drosophila mod(mdg4) isoform C, in the chordopoxvirus virosomal protein E5R and in several proteins of polydnaviruses. Computational analysis suggests that the BEN domain mediates protein-DNA and protein-protein interactions during chromatin organisation and transcription.
BH4 SM00265 BH4 Bcl-2 homology region 4
BHD_1 SM01030 Rad4 beta-hairpin domain 1 This short domain is found in the Rad4 protein. This domain binds to DNA.
BHD_2 SM01031 Rad4 beta-hairpin domain 2 This short domain is found in the Rad4 protein. This domain binds to DNA.
BHD_3 SM01032 Rad4 beta-hairpin domain 3 This short domain is found in the Rad4 protein. This domain binds to DNA.
BHL SM00411 bacterial (prokaryotic) histone like domain
BID_1 SM00634 Bacterial Ig-like domain (group 1)
BID_2 SM00635 Bacterial Ig-like domain 2
BING4CT SM01033 BING4CT (NUC141) domain This C-terminal domain is found in the BING4 family of nucleolar WD40 repeat proteins.
BIR SM00238 Baculoviral inhibition of apoptosis protein repeat Domain found in inhibitor of apoptosis proteins (IAPs) and other proteins. Acts as a direct inhibitor of caspase enzymes.
BLUF SM01034 Sensors of blue-light using FAD The BLUF domain has been shown to bind FAD in the AppA protein (Q53119). AppA is involved in the repression of photosynthesis genes in response to blue light.
BMC SM00877 Bacterial microcompartments are primitive organelles composed entirely of protein subunits. The prototypical bacterial microcompartment is the carboxysome, a protein shell for sequestering carbon fixation reactions. These proteins form hexameric structures.
BON SM00749 bacterial OsmY and nodulation domain
BOP1NT SM01035 BOP1NT (NUC169) domain This N-terminal domain is found in BOP1-like WD40 proteins.
BP28CT SM01036 BP28CT (NUC211) domain This C-terminal domain is found in BAP28-like nucleolar proteins PUBMED:15112237.
BPI1 SM00328 BPI/LBP/CETP N-terminal domain Bactericidal permeability-increasing protein (BPI) / Lipopolysaccharide-binding protein (LBP) / Cholesteryl ester transfer protein (CETP) N-terminal domain
BPI2 SM00329 BPI/LBP/CETP C-terminal domain Bactericidal permeability-increasing protein (BPI) / Lipopolysaccharide-binding protein (LBP) / Cholesteryl ester transfer protein (CETP) C-terminal domain
BRCT SM00292 breast cancer carboxy-terminal domain
BRICHOS SM01039 The BRICHOS domain is about 100 amino acids long.
It is found in a variety of proteins implicated in dementia, respiratory distress and cancer. Its exact function is unknown; roles that have been proposed for it include (a) targeting of the protein to the secretory pathway, (b) intramolecular chaperone-like function, and (c) assisting the specialised intracellular protease processing system PUBMED:12114016. This C-terminal domain is embedded in the endoplasmic reticulum lumen, and binds to the N-terminal, transmembrane SP_C (PF08999), provided that it is in a non-helical conformation. Thus the BRICHOS domain of proSP-C is a chaperone that induces alpha-helix formation of an aggregation-prone TM region PUBMED:19472327.
BRIGHT SM00501 BRIGHT, ARID (A/T-rich interaction domain) domain DNA-binding domain containing a helix-turn-helix structure
BRK SM00592 domain in transcription and CHROMO domain helicases
BRLZ SM00338 basic region leucine zipper
BRO1 SM01041 BRO1-like domain This domain is found in a number of proteins including Rhophilin Q61085 and BRO1 P48582. It is known to have a role in endosomal targeting. ESCRT-III subunit Snf7 binds to a conserved hydrophobic patch in the BRO1 domain that is required for protein complex formation and for the protein-sorting function of BRO1 PUBMED:15935782.
BROMO SM00297 bromo domain
BSD SM00751 domain in transcription factors and synapse-associated proteins
BTAD SM01043 Bacterial transcriptional activator domain Found in the DNRI/REDD/AFSR family of regulators. This region of AFSR (P25941) along with the C-terminal region is capable of independently directing actinorhodin production. This family contains TPR repeats.
BTB SM00225 Broad-Complex, Tramtrack and Bric a brac Domain in Broad-Complex, Tramtrack and Bric a brac. Also known as POZ (poxvirus and zinc finger) domain. Known to be a protein-protein interaction motif found at the N-termini of several C2H2-type transcription factors as well as Shaw-type potassium channels.
Known structure reveals a tightly intertwined dimer formed via interactions between N-terminal strand and helix structures. However, in a subset of BTB/POZ domains, these two secondary structures appear to be missing. Be aware that SMART predicts BTB/POZ domains without the beta1- and alpha1-secondary structures.
BTK SM00107 Bruton's tyrosine kinase Cys-rich motif Zinc-binding motif containing conserved cysteines and a histidine. Always found C-terminal to PH domains (but not all PH domains are followed by BTK motifs). The crystal structure shows this motif packs against the PH domain. The PH+Btk module pair has been called the Tec homology (TH) region.
BTP SM00576 Bromodomain transcription factors and PHD domain containing proteins subdomain of archaeal histone-like transcription factors
BURP SM01045 The BURP domain is found at the C-terminus of several different plant proteins. It was named after the proteins in which it was first identified: the BNM2 clone-derived protein from Brassica napus O65009; USPs and USP-like proteins P21746 P21747 Q06765 O24482; RD22 from Arabidopsis thaliana Q08298; and PG1beta from Lycopersicon esculentum Q40161. This domain is around 230 amino acid residues long. It possesses the following conserved features: two phenylalanine residues at its N-terminus; two cysteine residues; and four repeated cysteine-histidine motifs, arranged as: CH-X(10)-CH-X(25-27)-CH-X(25-26)-CH, where X can be any amino acid PUBMED:9790599. The function of this domain is unknown.
B_lectin SM00108 Bulb-type mannose-specific lectin
Bac_DnaA_C SM00760 Bacterial dnaA protein helix-turn-helix domain Could be involved in DNA-binding.
Bac_rhodopsin SM01021 Bacteriorhodopsin-like protein The bacterial opsins are retinal-binding proteins that provide light-dependent ion transport and sensory functions to a family of halophilic bacteria PUBMED:2468194, PUBMED:2591367.
They are integral membrane proteins believed to contain seven transmembrane (TM) domains, the last of which contains the attachment point for retinal (a conserved lysine).
Beach SM01026 Beige/BEACH domain The BEACH domain was described in the BEIGE protein (D1035670) and in the highly homologous CHS protein. The BEACH domain is usually followed by a series of WD repeats. The function of the BEACH domain is unknown.
Bet_v_1 SM01037 Pathogenesis-related protein Bet v I family This family is named after Bet v 1, the major birch pollen allergen. This protein belongs to family 10 of plant pathogenesis-related proteins (PR-10), cytoplasmic proteins of 15-17 kDa that are widespread among dicotyledonous plants PUBMED:9417891. In recent years, a number of diverse plant proteins with low sequence similarity to Bet v 1 were identified. A classification by sequence similarity yielded several subfamilies related to PR-10 PUBMED:18922149: - Pathogenesis-related proteins PR-10: These proteins were identified as major tree pollen allergens in birch and related species (hazel, alder), as plant food allergens expressed in high levels in fruits, vegetables and seeds (apple, celery, hazelnut), and as pathogenesis-related proteins whose expression is induced by pathogen infection, wounding, or abiotic stress. Hyp-1 (Q8H1L1), an enzyme involved in the synthesis of the bioactive naphthodianthrone hypericin in St. John's wort (Hypericum perforatum), also belongs to this family. Most of these proteins were found in dicotyledonous plants. In addition, related sequences were identified in monocots and conifers. - Cytokinin-specific binding proteins: These legume proteins bind cytokinin plant hormones PUBMED:9874249. - (S)-Norcoclaurine synthases are enzymes catalysing the condensation of dopamine and 4-hydroxyphenylacetaldehyde to (S)-norcoclaurine, the first committed step in the biosynthesis of benzylisoquinoline alkaloids such as morphine PUBMED:15447655.
- Major latex proteins and ripening-related proteins are proteins of unknown biological function that were first discovered in the latex of opium poppy (Papaver somniferum) and later found to be upregulated during ripening of fruits such as strawberry and cucumber PUBMED:15447655. The occurrence of Bet v 1-related proteins is confined to seed plants with the exception of a cytokinin-binding protein from the moss Physcomitrella patens (Q9AXI3).
Beta-Casp SM01027 Beta-Casp domain The beta-CASP domain is found C-terminal to the beta-lactamase domain in pre-mRNA 3'-end-processing endonuclease. The active site of this enzyme is located at the interface of these two domains.
Beta-TrCP_D SM01028 D domain of beta-TrCP This domain is found in eukaryotes, and is approximately 40 amino acids in length. It is found associated with the F-box domain and the WD domain. The protein that contains this domain functions as a ubiquitin ligase. Ubiquitination is required to direct proteins towards the proteasome for degradation. This protein is part of the WD40 class of F-box proteins. The D domain of these F-box proteins is involved in mediating the dimerisation of the protein. Dimerisation is necessary to polyubiquitinate substrates, so this D domain is vital in directing substrates towards the proteasome for degradation.
BetaGal_dom2 SM01029 Beta-galactosidase, domain 2 This is the second domain of the five-domain beta-galactosidase enzyme that altogether catalyses the hydrolysis of beta(1-3) and beta(1-4) galactosyl bonds in oligosaccharides as well as the inverse reaction of enzymatic condensation and trans-glycosylation. This domain is made up of 16 antiparallel beta-strands and an alpha-helix at its C-terminus. The fold of this domain appears to be unique. In addition, the last seven strands of the domain form a subdomain with an immunoglobulin-like (I-type Ig) fold in which the first strand is divided between the two beta-sheets.
In Penicillium spp. this strand is interrupted by a 12-residue insertion which forms an additional edge-strand to the second beta-sheet of the sub-domain. The remainder of the second domain forms a series of beta-hairpins at its N-terminus, four strands of which are contiguous with part of the Ig-like sub-domain, forming in total a seven-stranded antiparallel beta-sheet. This domain is associated with family Glyco_hydro_35, which is N-terminal to it, but itself has no metazoan members.
Bgal_small_N SM01038 Beta galactosidase small chain This domain comprises the small chain of dimeric beta-galactosidases EC:3.2.1.23. This domain is also found in single chain beta-galactosidase.
Biotin_carb_C SM00878 Biotin carboxylase C-terminal domain Biotin carboxylase is a component of the acetyl-CoA carboxylase multi-component enzyme which catalyses the first committed step in fatty acid synthesis in animals, plants and bacteria. Most of the reported active site residues are in this C-terminal domain.
BowB SM00269 Bowman-Birk type proteinase inhibitor
Brix SM00879 The Brix domain is found in a number of eukaryotic proteins including SSF proteins from yeast and humans, Arabidopsis thaliana Peter Pan-like protein and several hypothetical proteins.
Bro-N SM01040 BRO family, N-terminal domain This family includes the N-terminus of baculovirus BRO and ALI motif proteins. The function of BRO proteins is unknown. It has been suggested that BRO-A and BRO-C are DNA binding proteins that influence host DNA replication and/or transcription PUBMED:10888617. This Pfam domain does not include the characteristic invariant alanine, leucine, isoleucine motif of the ALI proteins PUBMED:9847359.
Brr6_like_C_C SM01042 Disulfide bridge nucleocytoplasmic transport domain Brr6_like_C_C is the highly conserved C-terminal region of a group of proteins found in fungi. It carries four highly conserved cysteine residues.
It is suggested that members of the family interact with each other via disulfide bridges to form a complex which is involved in nucleocytoplasmic transport PUBMED:15882446.
Btz SM01044 CASC3/Barentsz eIF4AIII binding This domain is found in CASC3 (cancer susceptibility candidate gene 3 protein), which is also known as Barentsz (Btz). CASC3 is a component of the EJC (exon junction complex), a complex that is involved in post-transcriptional regulation of mRNA in metazoa. The complex is formed by the association of four proteins (eIF4AIII, Barentsz, Mago, and Y14), mRNA, and ATP. This domain wraps around eIF4AIII and stacks against the 5' nucleotide PUBMED:16923391.
C1 SM00109 Protein kinase C conserved region 1 (C1) domains (Cysteine-rich domains) Some bind phorbol esters and diacylglycerol. Some bind RasGTP. Zinc-binding domains.
C1Q SM00110 Complement component C1q domain. Globular domain found in many collagens and eponymously in complement C1q. When part of full length proteins these domains form a 'bouquet' due to the multimerization of heterotrimers. The C1q fold is similar to that of tumour necrosis factor.
C1_4 SM01047 TFIIH C1-like domain The carboxyl-terminal region of TFIIH is essential for transcription activity. This region binds three zinc atoms through two independent domains. The first contains a C4 zinc finger motif, whereas the second is characterised by a CX(2)CX(2-4)FCADCD motif. The solution structure of the second C-terminal domain revealed homology with the regulatory domain of protein kinase C PUBMED:10882739.
C2 SM00239 Protein kinase C conserved region 2 (CalB) Ca2+-binding motif present in phospholipases, protein kinases C, and synaptotagmins (among others). Some do not appear to contain Ca2+-binding sites. Particular C2s appear to bind phospholipids, inositol polyphosphates, and intracellular proteins. Unusual occurrence in perforin.
Synaptotagmin and PLC C2s are permuted in sequence with respect to N- and C-terminal beta strands. SMART detects C2 domains using one or both of two profiles.
C345C SM00643 Netrin C-terminal Domain
C4 SM00111 C-terminal tandem repeated domain in type 4 procollagens Duplicated domain in C-terminus of type 4 collagens. Mutations in alpha-5 collagen IV are associated with X-linked Alport syndrome.
C6 SM01048 This domain of unknown function is found in the C. elegans protein Q19522. It is presumed to be an extracellular domain. The C6 domain contains six conserved cysteine residues in most copies of the domain. However, some copies of the domain are missing cysteine residues 1 and 3, suggesting that these form a disulphide bridge.
C8 SM00832 This domain contains 8 conserved cysteine residues, but this family only contains 7 of them owing to overlaps with other domains. It is found in disease-related proteins including von Willebrand factor, Alpha tectorin, Zonadhesin and Mucin.
CA SM00112 Cadherin repeats. Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. Cadherin domains occur as repeats in the extracellular regions which are thought to mediate cell-cell contact when bound to calcium.
CAD SM00266 Domains present in proteins implicated in post-mortem DNA fragmentation
CADG SM00736 Dystroglycan-type cadherin-like domains. Cadherin-homologous domains present in metazoan dystroglycans and alpha/epsilon sarcoglycans, yeast Axl2p and in a very large protein from magnetotactic bacteria. Likely to bind calcium ions.
CALCITONIN SM00113 calcitonin This family is formed by calcitonin, the calcitonin gene-related peptide, and amylin. They are short polypeptide hormones.
CAMSAP_CKK SM01051 Microtubule-binding calmodulin-regulated spectrin-associated This is the C-terminal domain of a family of eumetazoan proteins collectively defined as calmodulin-regulated spectrin-associated, or CAMSAP, proteins.
CAMSAP proteins carry an N-terminal region that includes the CH domain, a central region including a predicted coiled-coil, and this C-terminal, or CKK, domain, defined as being present in CAMSAP, KIAA1078 and KIAA1543. The C-terminal domain is the part of the CAMSAP proteins that binds to microtubules. The domain appears to act by producing inhibition of neurite extension, probably by blocking microtubule function. CKK represents a domain that has evolved with the metazoa. The structure of a murine hypothetical protein from RIKEN cDNA has shown the domain to adopt a mainly beta barrel structure with an associated alpha-helical hairpin.
CAP10 SM00672 Putative lipopolysaccharide-modifying enzyme.
CAP_GLY SM01052 Cytoskeleton-associated proteins (CAPs) are involved in the organisation of microtubules and transportation of vesicles and organelles along the cytoskeletal network. A conserved motif, CAP-Gly, has been identified in a number of CAPs, including CLIP-170 and dynactins. The crystal structure of the CAP-Gly domain of Caenorhabditis elegans F53F4.3 protein Q20728 was recently solved PUBMED:12221106. The domain contains three beta-strands. The most conserved sequence, GKNDG, is located in two consecutive sharp turns on the surface, forming the entrance to a groove PUBMED:12221106.
CARD SM00114 Caspase recruitment domain Motif contained in proteins involved in apoptotic signalling. Mediates homodimerisation. Structure consists of six antiparallel helices arranged in a topology homologous to the DEATH and DED domains.
CARP SM00673 Domain in CAPs (cyclase-associated proteins) and X-linked retinitis pigmentosa 2 gene product.
CASH SM00722 Domain present in carbohydrate binding proteins and sugar hydrolases
CASc SM00115 Caspase, interleukin-1 beta converting enzyme (ICE) homologues Cysteine aspartases that mediate programmed cell death (apoptosis). Caspases are synthesised as zymogens and activated by proteolysis of the peptide backbone adjacent to an aspartate.
The resulting two subunits associate to form an (alpha)2(beta)2-tetramer which is the active enzyme. Activation of caspases can be mediated by other caspase homologues.
CAT SM01059 Chloramphenicol acetyltransferase Chloramphenicol acetyltransferase (CAT) PUBMED:1867713 catalyzes the acetyl-CoA dependent acetylation of chloramphenicol (Cm), an antibiotic which inhibits prokaryotic peptidyltransferase activity. Acetylation of Cm by CAT inactivates the antibiotic. A histidine residue, located in the C-terminal section of the enzyme, plays a central role in its catalytic mechanism. There is a second family of CAT PUBMED:1314803, evolutionarily unrelated to the main family described above. These CAT belong to the bacterial hexapeptide-repeat-containing transferases family. The crystal structure of the type III enzyme from Escherichia coli with chloramphenicol bound has been determined. CAT is a trimer of identical subunits (monomer Mr 25,000) and the trimeric structure is stabilised by a number of hydrogen bonds, some of which result in the extension of a beta-sheet across the subunit interface. Chloramphenicol binds in a deep pocket located at the boundary between adjacent subunits of the trimer, such that the majority of residues forming the binding pocket belong to one subunit while the catalytically essential histidine belongs to the adjacent subunit. His195 is appropriately positioned to act as a general base catalyst in the reaction, and the required tautomeric stabilisation is provided by an unusual interaction with a main-chain carbonyl oxygen PUBMED:2187098.
CAT_RBD SM01061 CAT RNA binding domain This RNA binding domain is found at the amino terminus of transcriptional antitermination proteins such as BglG, SacY and LicT. These proteins control the expression of sugar metabolising operons in Gram+ and Gram- bacteria. This domain has been called the CAT (Co-AntiTerminator) domain.
It binds as a dimer PUBMED:9305644 to a short Ribonucleotidic Anti-Terminator (RAT) hairpin, each monomer interacting symmetrically with both strands of the RAT hairpin PUBMED:11953318. In the full-length protein, CAT is followed by two phosphorylatable PTS regulation domains that modulate the RNA binding activity of CAT. Upon activation, the dimeric proteins bind to RAT targets in the nascent mRNA, thereby preventing abortive dissociation of the RNA polymerase from the DNA template PUBMED:10610766.
CBD_II SM00637
CBD_IV SM00606 Cellulose Binding Domain Type IV
CBF SM00521 CCAAT-Binding transcription Factor
CBM49 SM01063 Carbohydrate binding domain CBM49 This domain is found at the C-terminus of cellulases, and in vitro binding studies have shown it to bind to crystalline cellulose PUBMED:17322304.
CBM_10 SM01064 Cellulose or protein binding domain This domain is found in two distinct sets of proteins with different functions. Those found in aerobic bacteria bind cellulose (or other carbohydrates); but in anaerobic fungi they are protein binding domains, referred to as dockerin domains or docking domains. They are believed to be responsible for the assembly of a multiprotein cellulase/hemicellulase complex, similar to the cellulosome found in certain anaerobic bacteria.
CBM_2 SM01065 Starch binding domain
CBM_25 SM01066 Carbohydrate binding domain
CBM_3 SM01067 Cellulose binding domain
CBM_X SM01068 Putative carbohydrate binding domain
CBS SM00116 Domain in cystathionine beta-synthase and other proteins. Domain present in all 3 forms of cellular life. Present in two copies in inosine monophosphate dehydrogenase, of which one is disordered in the crystal structure [3]. A number of disease states are associated with CBS-containing proteins including homocystinuria, Becker's and Thomsen disease.
CCP SM00032 Domain abundant in complement control proteins; SUSHI repeat; short complement-like repeat (SCR) The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. A missense mutation in the seventh CCP domain causes deficiency of the b subunit of factor XIII.
CDC37_C SM01069 Cdc37 C terminal domain Cdc37 is a protein required for the activity of numerous eukaryotic protein kinases. This domain corresponds to the C-terminal domain, whose function is unclear. It is found C-terminal to the Hsp90 chaperone (heat shock protein 90) binding domain PF08565 and the N-terminal kinase binding domain of Cdc37 PUBMED:16098195.
CDC37_M SM01070 Cdc37 Hsp90 binding domain Cdc37 is a molecular chaperone required for the activity of numerous eukaryotic protein kinases. This domain corresponds to the Hsp90 chaperone (heat shock protein 90) binding domain of Cdc37 PUBMED:16098195. It is found between the N-terminal Cdc37 domain, which is predominantly involved in kinase binding, and the C-terminal domain of Cdc37, whose function is unclear.
CDC37_N SM01071 Cdc37 N terminal kinase binding Cdc37 is a molecular chaperone required for the activity of numerous eukaryotic protein kinases. This domain corresponds to the N-terminal domain, which binds predominantly to protein kinases PUBMED:16098195 and is found N-terminal to the Hsp90 (heat shock protein 90) binding domain. Expression of a construct consisting of only the N-terminal domain of Schizosaccharomyces pombe Cdc37 results in cellular viability. This indicates that interactions with the cochaperone Hsp90 may not be essential for Cdc37 function PUBMED:16098195.
CDC48_2 SM01072 Cell division protein 48 (CDC48) domain 2 This domain has a double psi-beta barrel fold and includes VCP-like ATPase and N-ethylmaleimide sensitive fusion protein N-terminal domains.
Both the VAT and NSF N-terminal functional domains consist of two structural domains of which this is at the C-terminus. The VAT-N domain found in AAA ATPases is a 185-residue substrate recognition domain PUBMED:10531028.
CDC48_N SM01073 Cell division protein 48 (CDC48) N-terminal domain This domain has a double psi-beta barrel fold and includes VCP-like ATPase and N-ethylmaleimide sensitive fusion protein N-terminal domains. Both the VAT and NSF N-terminal functional domains consist of two structural domains of which this is at the N-terminus. The VAT-N domain found in AAA ATPases is a 185-residue substrate recognition domain PUBMED:10531028.
CDT1 SM01075 DNA replication factor CDT1 like CDT1 is a component of the replication licensing system and promotes the loading of the mini-chromosome maintenance complex onto chromatin. Geminin is an inhibitor of CDT1 and prevents inappropriate re-initiation of replication on an already fired origin. This region of CDT1 binds to Geminin PUBMED:15286659.
CENPB SM00674 Putative DNA-binding domain in centromere protein B, mouse jerky and transposases.
CFEM SM00747 eight cysteine-containing domain present in fungal extracellular membrane proteins
CG-1 SM01076 CG-1 domains are highly conserved domains of about 130 amino-acid residues containing a predicted bipartite NLS and named after a partial cDNA clone isolated from parsley encoding a sequence-specific DNA-binding protein PUBMED:8075408. CG-1 domains are associated with CAMTA proteins (for CAlModulin-binding Transcription Activator) that are transcription factors containing a calmodulin-binding domain and ankyrin (ANK) motifs PUBMED:11925432.
CGGC SM01078 This putative domain contains a quite highly conserved sequence of CGGC in its central region. The domain has many conserved cysteines and histidines suggestive of a zinc binding function.
CH SM00033 Calponin homology domain Actin binding domains present in duplicate at the N-termini of spectrin-like proteins (including dystrophin, alpha-actinin). These domains cross-link actin filaments into bundles and networks. A calponin homology domain is predicted in yeast Cdc24p.
CHAD SM00880 The CHAD domain is an alpha-helical domain functionally associated with some members of the adenylate cyclase family. It has conserved histidines that may chelate metals.
CHASE SM01079 This domain is found in the extracellular portion of receptor-like proteins, such as serine/threonine kinases and adenylyl cyclases PUBMED:11590000, PUBMED:11590001. Predicted to be a ligand binding domain PUBMED:11590000.
CHASE2 SM01080 CHASE2 is an extracellular sensory domain, which is present in various classes of transmembrane receptors that are parts of signal transduction pathways in bacteria. Specifically, CHASE2 domains are found in histidine kinases, adenylate cyclases, serine/threonine kinases and predicted diguanylate cyclases/phosphodiesterases. Environmental factors that are recognised by CHASE2 domains are not known at this time PUBMED:12486065.
CHB_HEX SM01081 Putative carbohydrate binding domain This domain represents the N-terminal domain in chitobiases and beta-hexosaminidases EC:3.2.1.52. It is composed of a beta sandwich structure that is similar in structure to the cellulose binding domain of cellulase from Cellulomonas fimi PUBMED:8673609. This suggests that this may be a carbohydrate binding domain.
CHK SM00587 ZnF_C4 and HLH domain containing kinases domain subfamily of choline kinases
CHRD SM00754 A domain in the BMP inhibitor chordin and in microbial proteins.
CHROMO SM00298 Chromatin organization modifier domain CHZ SM01082 Histone chaperone domain CHZ This domain is highly conserved from yeasts to humans and is part of the chaperone protein HIRIP3 in vertebrates which interacts with the H3.3 chaperone HIRA, implicated in histone replacement during transcription. N- and C-termini of Chz family members are relatively divergent but do contain similar acidic stretches rich in Glu/Asp residues, characteristic of all histone chaperones PUBMED:17289584. CKS SM01084 Cyclin-dependent kinase regulatory subunit Cyclin-dependent kinase regulatory subunit. CK_II_beta SM01085 Casein kinase II regulatory subunit CLECT SM00034 C-type lectin (CTL) or carbohydrate-recognition domain (CRD) Many of these domains function as calcium-dependent carbohydrate binding modules. CLH SM00299 Clathrin heavy chain repeat homology CLIP SM00680 Clip or disulphide knot domain Present in horseshoe crab proclotting enzyme N-terminal domain, Drosophila Easter and silkworm prophenoloxidase-activating enzyme. CLa SM00035 CLUSTERIN alpha chain CLb SM00030 CLUSTERIN Beta chain CM_2 SM00830 Chorismate mutase type II Chorismate mutase catalyses the conversion of chorismate to prephenate in the pathway of tyrosine and phenylalanine biosynthesis. This enzyme is negatively regulated by tyrosine, tryptophan and phenylalanine PUBMED:9642265, PUBMED:9497350. CNH SM00036 Domain found in NIK1-like kinases, mouse citron and yeast ROM1, ROM2 Unpublished observations. CNX SM00037 Connexin homologues Connexin channels participate in the regulation of signaling between developing and differentiated cell types. COG6 SM01087 Conserved oligomeric complex COG6 COG6 is a component of the conserved oligomeric Golgi complex, which is composed of eight different subunits and is required for normal Golgi morphology and localisation.
COLFI SM00038 Fibrillar collagens C-terminal domain Found at C-termini of fibrillar collagens: Ephydatia muelleri procollagen EMF1alpha, vertebrate collagens alpha(1)III, alpha(1)II, alpha(2)V etc. COLIPASE SM00023 Colipase Colipase is a protein that functions as a cofactor for pancreatic lipase, with which it forms a stoichiometric complex. It also binds to the bile-salt covered triacylglycerol interface thus allowing the enzyme to anchor itself to the water-lipid interface. Colipase is a small protein of approximately 100 amino-acid residues with five conserved disulfide bonds. CO_deh_flav_C SM01092 CO dehydrogenase flavoprotein C-terminal domain CP12 SM01093 CPDc SM00577 catalytic domain of ctd-like phosphatases CPSF73-100_C SM01098 This is the C-terminal conserved region of the pre-mRNA 3'-end-processing of the polyadenylation factor CPSF-73/CPSF-100 proteins. The exact function of this domain is not known. CPSase_L_D3 SM01096 Carbamoyl-phosphate synthetase large chain, oligomerisation domain Carbamoyl-phosphate synthase catalyses the ATP-dependent synthesis of carbamyl-phosphate from glutamine or ammonia and bicarbonate. The carbamoyl-phosphate synthase (CPS) enzyme in prokaryotes is a heterodimer of a small and large chain. CPSase_sm_chain SM01097 Carbamoyl-phosphate synthase small chain, CPSase domain The carbamoyl-phosphate synthase domain is in the amino terminus of protein. Carbamoyl-phosphate synthase catalyses the ATP-dependent synthesis of carbamyl-phosphate from glutamine or ammonia and bicarbonate. This important enzyme initiates both the urea cycle and the biosynthesis of arginine and/or pyrimidines PUBMED:1972379. The carbamoyl-phosphate synthase (CPS) enzyme in prokaryotes is a heterodimer of a small and large chain. The small chain promotes the hydrolysis of glutamine to ammonia, which is used by the large chain to synthesise carbamoyl phosphate. The small chain has a GATase domain in the carboxyl terminus. 
CPW_WPC SM01099 This group of sequences is defined by a domain of about 61 residues in length with six well-conserved cysteine residues and six well-conserved aromatic sites. The domain can be found in tandem repeats, and is known so far only in Plasmodium falciparum. It is named for motifs of CPxxW and (less well conserved) WPC. Its function is unknown. CRA SM00757 CT11-RanBPM protein-protein interaction domain present in crown eukaryotes (plants, animals, fungi) CRAL_TRIO_N SM01100 CRAL/TRIO, N-terminal domain CRF SM00039 corticotropin-releasing factor CRISPR_assoc SM01101 This domain forms an anti-parallel beta strand structure with flanking alpha helical regions. CRM1_C SM01102 CRM1 C terminal CRM1 (also known as Exportin1) mediates the nuclear export of proteins bearing a leucine-rich nuclear export signal (NES). CRM1 forms a complex with the NES containing protein and the small GTPase Ran. This region forms an alpha helical structure formed by six helical hairpin motifs that are structurally similar to the HEAT repeat, but share little sequence similarity to the HEAT repeat PUBMED:15574331. CRS1_YhbY SM01103 Escherichia coli YhbY is associated with pre-50S ribosomal subunits, which implies a function in ribosome assembly. GFP fused to a single-domain CRM protein from maize localises to the nucleolus, suggesting that an analogous activity may have been retained in plants PUBMED:17105995. A CRM domain containing protein in plant chloroplasts has been shown to function in group I and II intron splicing PUBMED:18065687. In vitro experiments with an isolated maize CRM domain have shown it to have RNA binding activity. These and other results suggest that the CRM domain evolved in the context of ribosome function prior to the divergence of Archaea and Bacteria, that this function has been maintained in extant prokaryotes, and that the domain was recruited to serve as an RNA binding module during the evolution of plant genomes PUBMED:17105995. 
YhbY has a fold similar to that of the C-terminal domain of translation initiation factor 3 (IF3C), which binds to 16S rRNA in the 30S ribosome PUBMED:12429100. CSF2 SM00040 Granulocyte-macrophage colony-stimulating factor (GM-CSF) GM-CSF stimulates the development of and the cytotoxic activity of white blood cells. CSP SM00357 Cold shock protein domain RNA-binding domain that functions as an RNA chaperone in bacteria and is involved in regulating translation in eukaryotes. Contains sub-family of RNA-binding domains in the Rho transcription termination factor. CT SM00041 C-terminal cystine knot-like domain (CTCK) The structures of transforming growth factor-beta (TGFbeta), nerve growth factor (NGF), platelet-derived growth factor (PDGF) and gonadotropin all form 2 highly twisted antiparallel pairs of beta-strands and contain three disulphide bonds. The domain is non-globular and little is conserved among these presumed homologues except for their cysteine residues. CT domains are predicted to form homodimers. CTD SM01104 Spt5 C-terminal nonapeptide repeat binding Spt4 The C-terminal domain of the transcription elongation factor protein Spt5 is necessary for binding to Spt4 to form the functional complex that regulates early transcription elongation by RNA polymerase II. The complex may be involved in pre-mRNA processing through its association with mRNA capping enzymes. This CTD domain carries a regular nonapeptide repeat that can be present in up to 18 copies, as in S. pombe PUBMED:19460865. The repeat has a characteristic TPA motif. CTLH SM00668 C-terminal to LisH motif. Alpha-helical motif of unknown function. CTNS SM00679 Repeated motif present between transmembrane helices in cystinosin, yeast ERS1p, mannose-P-dolichol utilization defect 1, and other hypothetical proteins. Function unknown, but likely to be associated with the glycosylation machinery. CUB SM00042 Domain first found in C1r, C1s, uEGF, and bone morphogenetic protein.
This domain is found mostly among developmentally-regulated proteins. Spermadhesins contain only this domain. CUE SM00546 Domain that may be involved in binding ubiquitin-conjugating enzymes (UBCs) CUE domains also occur in two proteins of the IL-1 signal transduction pathway, tollip and TAB2. Ponting (Biochem. J.) "Proteins of the Endoplasmic reticulum" (in press) CULLIN SM00182 Cullin CUT SM01109 The CUT domain is a DNA-binding domain often found in combination with a downstream homeodomain. CVNH SM01111 In molecular biology, the CVNH domain (CyanoVirin-N Homology domain) is a conserved protein domain. It is found in the sugar-binding antiviral protein cyanovirin-N (CVN) as well as proteins from filamentous ascomycetes and in the fern Ceratopteris richardii (PMID 16003744). CW SM00605 CXC SM01114 Tesmin/TSO1-like CXC domain This family includes proteins that have two copies of a cysteine rich motif as follows: C-X-C-X4-C-X3-YC-X-C-X6-C-X3-C-X-C-X2-C. The family includes Tesmin Q9Y4I5 (PUBMED:10191092) and TSO1 Q9LE32 (PUBMED:10769245). This family is called a CXC domain in (PUBMED:10769245). CY SM00043 Cystatin-like domain Cystatins are a family of cysteine protease inhibitors that occur mainly as single domain proteins. However some extracellular proteins such as kininogen, His-rich glycoprotein and fetuin also contain these domains. CYCLIN SM00385 domain present in cyclins, TFIIB and Retinoblastoma A helical domain present in cyclins and TFIIB (twice) and Retinoblastoma (once). A protein recognition domain functioning in cell-cycle and transcription control. CYCc SM00044 Adenylyl- / guanylyl cyclase, catalytic domain Present in two copies in mammalian adenylyl cyclases. Eubacterial homologues are known. Two residues (Asn, Arg) are thought to be involved in catalysis. These cyclases have important roles in a diverse range of cellular processes.
CYTH SM01118 These sequences are functionally identified as members of the adenylate cyclase family, which catalyses the conversion of ATP to 3',5'-cyclic AMP and pyrophosphate. Six distinct non-homologous classes of AC have been identified. The structures of three classes of adenylyl cyclases have been solved (PUBMED:16905149). CaMBD SM01053 Calmodulin binding domain Small-conductance Ca2+-activated K+ channels (SK channels) are independent of voltage and gated solely by intracellular Ca2+. These membrane channels are heteromeric complexes that comprise pore-forming alpha-subunits and the Ca2+-binding protein calmodulin (CaM) PUBMED:11323678. CaM binds to the SK channel through the CaM-binding domain (CaMBD), which is located in an intracellular region of the alpha-subunit immediately carboxy-terminal to the pore. Channel opening is triggered when Ca2+ binds the EF hands in the N-lobe of CaM. The structure of this domain complexed with CaM is known PUBMED:11323678. This domain forms an elongated dimer with a CaM molecule bound at each end; each CaM wraps around three alpha-helices, two from one CaMBD subunit and one from the other. CaM_binding SM01054 Plant calmodulin-binding domain The sequences featured in this family are found repeated in a number of plant calmodulin-binding proteins (such as Q8W235, Q84ZT8 and Q8H6X1), and are thought to constitute the calmodulin-binding domains PUBMED:12825696, PUBMED:11684678. Binding of the proteins to calmodulin depends on the presence of calcium ions PUBMED:12825696, PUBMED:11684678. These proteins are thought to be involved in various processes, such as plant defence responses PUBMED:12825696 and stolonisation or tuberization PUBMED:11684678. Ca_chan_IQ SM01062 Voltage gated calcium channel IQ domain Voltage gated calcium channels control cellular calcium entry in response to changes in membrane potential.
The isoleucine-glutamine (IQ) motif in the voltage gated calcium channel IQ domain interacts with hydrophobic pockets of Ca2+/calmodulin PUBMED:16299511. The interaction regulates two self-regulatory calcium-dependent feedback mechanisms, calcium-dependent inactivation (CDI) and calcium-dependent facilitation (CDF). Cache_2 SM01049 Cache is an extracellular domain that is predicted to have a role in small-molecule recognition in a wide range of proteins, including the animal dihydropyridine-sensitive voltage-gated Ca2+ channel alpha-2delta subunit, and various bacterial chemotaxis receptors. The name Cache comes from CAlcium channels and CHEmotaxis receptors. This domain consists of an N-terminal part with three predicted strands and an alpha-helix, and a C-terminal part with a strand dyad followed by a relatively unstructured region. The N-terminal portion of the (unpermuted) Cache domain contains three predicted strands that could form a sheet analogous to that present in the core of the PAS domain structure. Cache domains are particularly widespread in bacteria, with Vibrio cholerae. The animal calcium channel alpha-2delta subunits might have acquired a part of their extracellular domains from a bacterial source PUBMED:11084361. The Cache domain appears to have arisen from the GAF-PAS fold despite their divergent functions PUBMED:11292341. CactinC_cactus SM01050 Cactus-binding C-terminus of cactin protein CactinC_cactus is the C-terminal 200 residues of the cactin protein which are necessary for the association of cactin with IkappaB-cactus as one of the intracellular members of the Rel complex. The Rel (NF-kappaB) pathway is conserved in invertebrates and vertebrates. In mammals, it controls the activities of the immune and inflammatory response genes as well as viral genes, and is critical for cell growth and survival.
In Drosophila, the Rel pathway functions in the innate cellular and humoral immune response, in muscle development, and in the establishment of dorsal-ventral polarity in the early embryo PUBMED:10842059. Most members of the family also have a Cactin_mid domain further upstream. Cadherin_pro SM01055 Cadherin prodomain like Cadherins are a family of proteins that mediate calcium dependent cell-cell adhesion. They are activated through cleavage of a prosequence in the late Golgi. This domain corresponds to the folded region of the prosequence, and is termed the prodomain. The prodomain shows structural resemblance to the cadherin domain, but lacks all the features known to be important for cadherin-cadherin interactions PUBMED:15130472. Calx_beta SM00237 Domains in Na-Ca exchangers and integrin-beta4 Domain in Na-Ca exchangers and integrin subunit beta4 (and some cyanobacterial proteins) Candida_ALS_N SM01056 Cell-wall agglutinin N-terminal ligand-sugar binding This is likely to be the sugar or ligand binding domain of the yeast alpha-agglutinins. CarD_TRCF SM01058 CarD-like/TRCF domain CarD is a Myxococcus xanthus protein required for the activation of light- and starvation-inducible genes PUBMED:8692912. This family includes the presumed N-terminal domain. CarD interacts with the zinc-binding protein CarG, to form a complex that regulates multiple processes in Myxococcus xanthus PUBMED:16879646. This family also includes a domain to the N-terminal side of the DEAD helicase of TRCF proteins. TRCF displaces RNA polymerase stalled at a lesion, binds to the damage recognition protein UvrA, and increases the template strand repair rate during transcription PUBMED:7876261. This domain is involved in binding to the stalled RNA polymerase PUBMED:7876261. Carb_anhydrase SM01057 Eukaryotic-type carbonic anhydrase Carbonic anhydrases are zinc metalloenzymes which catalyse the reversible hydration of carbon dioxide to bicarbonate PUBMED:18336305, PUBMED:10978542. 
CAs have essential roles in facilitating the transport of carbon dioxide and protons in the intracellular space, across biological membranes and in the layers of the extracellular space; they are also involved in many other processes, from respiration and photosynthesis in eukaryotes to cyanate degradation in prokaryotes. There are five known evolutionarily distinct CA families (alpha, beta, gamma, delta and epsilon) that have no significant sequence identity and have structurally distinct overall folds. Some CAs are membrane-bound, while others act in the cytosol; there are several related proteins that lack enzymatic activity. The active site of alpha-CAs is well described, consisting of a zinc ion coordinated through 3 histidine residues and a water molecule/hydroxide ion that acts as a potent nucleophile. The enzyme employs a two-step mechanism: in the first step, there is a nucleophilic attack of a zinc-bound hydroxide ion on carbon dioxide; in the second step, the active site is regenerated by the ionisation of the zinc-bound water molecule and the removal of a proton from the active site PUBMED:9336012. Beta- and gamma-CAs also employ a zinc hydroxide mechanism, although at least some beta-class enzymes do not have water directly coordinated to the metal ion. Catalase SM01060 Catalases are antioxidant enzymes that catalyse the conversion of hydrogen peroxide to water and molecular oxygen, serving to protect cells from its toxic effects PUBMED:11351128. Hydrogen peroxide is produced as a consequence of oxidative cellular metabolism and can be converted to the highly reactive hydroxyl radical via transition metals, this radical being able to damage a wide variety of molecules within a cell, leading to oxidative stress and cell death. Catalases act to neutralise hydrogen peroxide toxicity, and are produced by all aerobic organisms ranging from bacteria to man. 
Most catalases are mono-functional, haem-containing enzymes, although there are also bifunctional haem-containing peroxidase/catalases that are closely related to plant peroxidases, and non-haem, manganese-containing catalases that are found in bacteria PUBMED:14745498. Cation_ATPase_N SM00831 Cation transporter/ATPase, N-terminus This entry represents the conserved N-terminal region found in several classes of cation-transporting P-type ATPases, including those that transport H+, Na+, Ca2+, Na+/K+, and H+/K+. In the H+/K+- and Na+/K+-exchange P-ATPases, this domain is found in the catalytic alpha chain. In gastric H+/K+-ATPases, this domain undergoes reversible sequential phosphorylation inducing conformational changes that may be important for regulating the function of these ATPases PUBMED:12480547, PUBMED:12529322. Cdc6_C SM01074 CDC6, C terminal The C terminal domain of CDC6 assumes a winged helix fold, with a five alpha-helical bundle (alpha15-alpha19) structure, backed on one side by three beta strands (beta6-beta8). It has been shown that this domain acts as a DNA-localisation factor, however its exact function is, as yet, unknown. Putative functions include: (1) mediation of protein-protein interactions and (2) regulation of nucleotide binding and hydrolysis. Mutagenesis studies have shown that this domain is essential for appropriate Cdc6 activity PUBMED:11030343. Cg6151-P SM01077 Uncharacterized conserved protein CG6151-P This is a family of small, less than 200 residue long, proteins which are named as CG6151-P proteins that are conserved from fungi to humans. The function is unknown. The fungal members have a characteristic ICP sequence motif. Some members are annotated as putative clathrin-coated vesicle protein but this could not be defined. ChSh SM00300 Chromo Shadow Domain ChW SM00728 Clostridial hydrophobic, with a conserved W residue, domain. 
CheW SM00260 Two component signalling adaptor domain ChtBD1 SM00270 Chitin binding domain ChtBD2 SM00494 Chitin-binding domain type 2 ChtBD3 SM00495 Chitin-binding domain type 3 Cir_N SM01083 N-terminal domain of CBF1 interacting co-repressor CIR This is a 45 residue conserved region at the N-terminal end of a family of proteins referred to as CIRs (CBF1-interacting co-repressors). CBF1 (centromere-binding factor 1) acts as a transcription factor that causes repression by binding specifically to GTGGGAA motifs in responsive promoters, and it requires CIR as a co-repressor. CIR binds to histone deacetylase and to SAP30 and serves as a linker between CBF1 and the histone deacetylase complex PUBMED:9874765. Citrate_ly_lig SM00764 Citrate lyase ligase C-terminal domain Proteins of this family contain the C-terminal domain of citrate lyase ligase EC:6.2.1.22. ClpB_D2-small SM01086 C-terminal, D2-small domain, of ClpB protein This is the C-terminal domain of ClpB protein, referred to as the D2-small domain, and is a mixed alpha-beta structure. Compared with the D1-small domain (included in AAA) it lacks the long coiled-coil insertion, and instead of helix C4 contains a beta-strand (e3) that is part of a three stranded beta-pleated sheet. In Thermophilus the whole protein forms a hexamer with the D1-small and D2-small domains located on the outside of the hexamer, with the long coiled-coil being exposed on the surface. The D2-small domain is essential for oligomerisation, forming a tight interface with the D2-large domain of a neighbouring subunit and thereby providing enough binding energy to stabilise the functional assembly PUBMED:14567920. The domain is associated with two Clp_N at the N-terminus as well as AAA and AAA_2. CoA_binding SM00881 CoA binding domain This domain has a Rossmann fold and is found in a number of proteins including succinyl CoA synthetases, malate and ATP-citrate ligases. 
CoA_trans SM00882 Coenzyme A transferase Coenzyme A (CoA) transferases belong to an evolutionary conserved family of enzymes catalyzing the reversible transfer of CoA from one carboxylic acid to another. They have been identified in many prokaryotes and in mammalian tissues. The bacterial enzymes are heterodimers of two subunits (A and B) of about 25 kDa each, while eukaryotic SCOT consists of a single chain which is colinear with the two bacterial subunits. CobW_C SM00833 Cobalamin synthesis protein cobW C-terminal domain CobW proteins are generally found proximal to the trimeric cobaltochelatase subunit CobN, which is essential for vitamin B12 (cobalamin) biosynthesis PUBMED:12869542. They contain a P-loop nucleotide-binding loop in the N-terminal domain and a histidine-rich region in the C-terminal portion suggesting a role in metal binding, possibly as an intermediary between the cobalt transport and chelation systems. CobW might be involved in cobalt reduction leading to cobalt(I) corrinoids. This entry represents the C-terminal domain found in CobW, as well as in P47K, a Pseudomonas chlororaphis protein needed for nitrile hydratase expression PUBMED:7765511. Cog4 SM00762 COG4 transport protein This region is found in yeast oligomeric golgi complex component 4 which is involved in ER to Golgi and intra Golgi transport. Col_cuticle_N SM01088 Nematode cuticle collagen N-terminal domain The function of this domain is unknown. It is found in the N-terminal region of nematode cuticle collagens. Cuticle is a tough elastic structure secreted by hypodermal cells and is primarily composed of collagen proteins PUBMED:7828882. Connexin_CCC SM01089 Gap junction channel protein cysteine-rich domain Copper-fist SM01090 Copper fist is an N-terminal domain involved in copper-dependent DNA binding. It is named for its resemblance to a fist. It can be found in some fungal transcription factors.
These proteins activate the transcription of the metallothionein gene in response to copper. Metallothionein maintains copper levels in yeast. The copper fist domain is similar in structure to metallothionein itself, and on copper binding undergoes a large conformational change, which allows DNA binding. CorC_HlyC SM01091 Transporter associated domain This small domain is found at the C-terminus of a family of proteins with the DUF21 domain and two CBS domains; it is also found at the C-terminus of some Na+/H+ antiporters. This domain is also found in CorC, which is involved in magnesium and cobalt efflux. The function of this domain is uncertain but might be involved in modulating transport of ion substrates. CpcD SM01094 CpcD/allophycocyanin linker domain Cpl-7 SM01095 Cpl-7 lysozyme C-terminal domain This domain was originally found in the C-terminal moiety of the Cpl-7 lysozyme encoded by the Streptococcus pneumoniae bacteriophage Cp-7. It is assumed that these repeats represent cell wall binding motifs although no direct evidence has been obtained so far. Cpn10 SM00883 Chaperonin 10 Kd subunit The chaperonins are 'helper' molecules required for correct folding and subsequent assembly of some proteins. These are required for normal cell growth, and are stress-induced, acting to stabilise or protect disassembled polypeptides under heat-shock conditions. Type I chaperonins present in eubacteria, mitochondria and chloroplasts require the concerted action of 2 proteins, chaperonin 60 (cpn60) and chaperonin 10 (cpn10). The 10 kDa chaperonin (cpn10 - or groES in bacteria) exists as a ring-shaped oligomer of between six and eight identical subunits, while the 60 kDa chaperonin (cpn60 - or groEL in bacteria) forms a structure comprising 2 stacked rings, each ring containing 7 identical subunits. These ring structures assemble by self-stimulation in the presence of Mg2+-ATP.
The central cavity of the cylindrical cpn60 tetradecamer provides an isolated environment for protein folding whilst cpn10 binds to cpn60 and synchronizes the release of the folded protein in an Mg2+-ATP dependent manner. The binding of cpn10 to cpn60 inhibits the weak ATPase activity of cpn60. Cu_FIST SM00412 Copper-Fist binds DNA only in the presence of copper or silver Cullin_Nedd8 SM00884 Cullin protein neddylation domain This is the neddylation site of cullin proteins which are a family of structurally related proteins containing an evolutionarily conserved cullin domain. With the exception of APC2, each member of the cullin family is modified by Nedd8 and several cullins function in Ubiquitin-dependent proteolysis, a process in which the 26S proteasome recognises and subsequently degrades a target protein tagged with K48-linked poly-ubiquitin chains. Cullins are molecular scaffolds responsible for assembling the ROC1/Rbx1 RING-based E3 ubiquitin ligases, of which several play a direct role in tumorigenesis. Nedd8/Rub1 is a small ubiquitin-like protein, which was originally found to be conjugated to Cdc53, a cullin component of the SCF (Skp1-Cdc53/CUL1-F-box protein) E3 Ub ligase complex in Saccharomyces cerevisiae, and Nedd8 modification has now emerged as a regulatory pathway of fundamental importance for cell cycle control and for embryogenesis in metazoans. The only identified Nedd8 substrates are cullins. Neddylation results in covalent conjugation of a Nedd8 moiety onto a conserved cullin lysine residue. Cupin_1 SM00835 Cupin This family represents the conserved barrel domain of the 'cupin' superfamily ('cupa' is the Latin term for a small barrel). This family contains 11S and 7S plant seed storage proteins, and germins. Plant seed storage proteins provide the major nitrogen source for the developing plant. Cutinase SM01110 This enzyme belongs to the family of hydrolases, specifically those acting on carboxylic ester bonds.
The systematic name of this enzyme class is cutin hydrolase. Aerial plant organs are protected by a cuticle composed of an insoluble polymeric structural compound, cutin, which is a polyester composed of hydroxy and hydroxyepoxy fatty acids. Plant pathogenic fungi produce extracellular degradative enzymes that play an important role in pathogenesis. They include cutinase, which hydrolyses cutin, facilitating fungus penetration through the cuticle. Inhibition of the enzyme can prevent fungal infection through intact cuticles. Cutin monomers released from the cuticle by small amounts of cutinase on fungal spore surfaces can greatly increase the amount of cutinase secreted by the spore, the mechanism for which process is as yet unknown. (PMID 1557023) CxxC_CXXC_SSSS SM00834 Putative regulatory protein CxxC_CXXC_SSSS represents a region of about 41 amino acids found in a number of small proteins in a wide range of bacteria. The region usually begins with the initiator Met and contains two CxxC motifs separated by 17 amino acids. One protein in this entry has been noted as a putative regulatory protein, designated FmdB. Most proteins in this entry have a C-terminal region containing highly degenerate sequence. Cyanate_lyase SM01116 Cyanate lyase C-terminal domain, Cyanate hydratase Cyanate lyase (also known as cyanase) EC:4.2.1.104 is responsible for the hydrolysis of cyanate, allowing organisms that possess the enzyme to overcome the toxicity of environmental cyanate. This enzyme is composed of two domains, an N-terminal helix-turn-helix and this structurally unique C-terminal domain (PUBMED:10801492). CysPc SM00230 Calpain-like thiol protease family. Calpain-like thiol protease family (peptidase family C2). Calcium activated neutral protease (large subunit). Cyt-b5 SM01117 Cytochrome b5-like Heme/Steroid binding domain This family includes heme binding domains from a diverse range of proteins. This family also includes proteins that bind to steroids. 
The family includes progesterone receptors such as O00264 (PUBMED:9705155, PUBMED:8774719). Many members of this subfamily are membrane anchored by an N-terminal transmembrane alpha helix. This family also includes a domain in some chitin synthases. There is no known ligand for this domain in the chitin synthases. D-ser_dehydrat SM01119 Putative serine dehydratase domain This domain is found at the C-terminus of yeast D-serine dehydratase (PUBMED:17937657). Structures have been solved for two bacterial members of this family. The yeast protein has been shown to be a zinc-dependent enzyme. D5_N SM00885 D5 N terminal like This domain is found in D5 proteins of DNA viruses and in bacteriophage P4 DNA primases. DAGKa SM00045 Diacylglycerol kinase accessory domain (presumed) Diacylglycerol (DAG) is a second messenger that acts as a protein kinase C activator. DAG can be produced from the hydrolysis of phosphatidylinositol 4,5-bisphosphate (PIP2) by a phosphoinositide-specific phospholipase C and by the degradation of phosphatidylcholine (PC) by a phospholipase C or the concerted actions of phospholipase D and phosphatidate phosphohydrolase. This domain might either be an accessory domain or else contribute to the catalytic domain. Bacterial homologues are known. DAGKc SM00046 Diacylglycerol kinase catalytic domain (presumed) Diacylglycerol (DAG) is a second messenger that acts as a protein kinase C activator. DAG can be produced from the hydrolysis of phosphatidylinositol 4,5-bisphosphate (PIP2) by a phosphoinositide-specific phospholipase C and by the degradation of phosphatidylcholine (PC) by a phospholipase C or the concerted actions of phospholipase D and phosphatidate phosphohydrolase. This domain is presumed to be the catalytic domain. Bacterial homologues are known. DALR_1 SM00836 DALR anticodon binding domain This all alpha helical domain is the anticodon binding domain of Arginyl tRNA synthetase.
This domain is known as the DALR domain after characteristic conserved amino acids PUBMED:10447505. DALR_2 SM00840 This DALR domain is found in cysteinyl-tRNA-synthetases. DAX SM00021 Domain present in Dishevelled and axin Domain of unknown function. DBC1 SM01122 DBC1 and its homologs from diverse eukaryotes are a catalytically inactive version of the Nudix hydrolase (MutT) domain (PUBMED:18418069). DBC1 is predicted to bind NAD metabolites and regulate the activity of SIRT1 or related deacetylases by sensing the soluble products or substrates of the NAD-dependent deacetylation reaction (PUBMED:18418069). DBP10CT SM01123 DBP10CT (NUC160) domain This C terminal domain is found in the Dbp10p subfamily of hypothetical RNA helicases (PUBMED:15112237). DBR1 SM01124 Lariat debranching enzyme, C-terminal domain This presumed domain is found at the C-terminus of lariat debranching enzyme. This domain is always found in association with Metallophos PF00149. DCD SM00767 DCD is a plant specific domain in proteins involved in development and programmed cell death. The domain is shared by several proteins in the Arabidopsis and the rice genomes, which otherwise show a different protein architecture. Biological studies indicate a role of these proteins in phytohormone response, embryo development and programmed cell death by pathogens or ozone. DCP2 SM01125 Dcp2, box A domain This domain is specific to mRNA decapping protein 2 and this region has been termed Box A (PUBMED:12218187). Removal of the cap structure is catalysed by the Dcp1-Dcp2 complex (PUBMED:16341225). DCX SM00537 Domain in the Doublecortin (DCX) gene product Tandemly-repeated domain in doublin, the Doublecortin gene product. Proposed to bind tubulin. Doublecortin (DCX) is mutated in human X-linked neuronal migration defects. DDE_Tnp_IS1595 SM01126 ISXO2-like transposase domain This domain probably functions as an integrase that is found in a wide variety of transposases, including ISXO2.
DDHD SM01127 The DDHD domain is 180 residues long and contains four conserved residues that may form a metal binding site. The domain is named after these four residues. This pattern of conservation of metal binding residues is often seen in phosphoesterase domains. This domain is found in retinal degeneration B proteins, as well as a family of probable phospholipases. It has been shown that this domain is found in a longer C terminal region that binds to PYK2 tyrosine kinase. These proteins have been called N-terminal domain-interacting receptor (Nir1, Nir2 and Nir3) (PUBMED:10022914). This suggests that this region is involved in functionally important interactions in other members of this family. DDRGK SM01128 This is a family of proteins of approximately 300 residues, found in plants and vertebrates. They contain a highly conserved DDRGK motif. DDT SM00571 domain in different transcription and chromosome remodeling factors DEATH SM00005 DEATH domain, found in proteins involved in cell death (apoptosis). Alpha-helical domain present in a variety of proteins with apoptotic functions. Some (but not all) of these domains form homotypic and heterotypic dimers. DED SM00031 Death effector domain DEFSN SM00048 Defensin/corticostatin family Cysteine-rich domains that lyse bacteria, fungi and enveloped viruses by forming multimeric membrane-spanning channels. DELLA SM01129 Transcriptional regulator DELLA protein N terminal Gibberellins are plant hormones which have great impact on growth signalling. DELLA proteins are transcriptional regulators of growth related proteins which are downregulated when gibberellins bind to their receptor GID1. GID1 forms a complex with DELLA proteins and signals them towards 26S proteasome. The N terminal of DELLA proteins contains conserved DELLA and VHYNP motifs which are important for GID1 binding and proteolysis of the DELLA proteins. 
(PUBMED:19037309) DENN SM00799 Domain found in a variety of signalling proteins, always encircled by uDENN and dDENN The DENN domain is found in a variety of signalling proteins involved in Rab-mediated processes or regulation of MAPKs signalling pathways. The DENN domain is always encircled on both sides by more divergent domains, called uDENN (for upstream DENN) and dDENN (for downstream DENN). The function of the DENN domain remains to date unclear, although it appears to represent a good candidate for a GTP/GDP exchange activity. DEP SM00049 Domain found in Dishevelled, Egl-10, and Pleckstrin Domain of unknown function present in signalling proteins that contain PH, rasGEF, rhoGEF, rhoGAP, RGS, PDZ domains. DEP domain in Drosophila dishevelled is essential to rescue planar polarity defects and induce JNK signalling (Cell 94, 109-118). DEXDc SM00487 DEAD-like helicases superfamily DEXDc2 SM00488 DEXDc3 SM00489 DHDPS SM01130 Dihydrodipicolinate synthetase family This family has a TIM barrel structure. This enzyme belongs to the family of lyases, specifically the hydro-lyases, which cleave carbon-oxygen bonds. DHHA2 SM01131 This domain is often found adjacent to the DHH domain PF01368 and is called DHHA2 for DHH associated domain. This domain is diagnostic of DHH subfamily 2 members (PUBMED:9478130). The domain is about 120 residues long and contains a conserved DXK motif at its amino terminus. DIL SM01132 The DIL domain has no known function. DIRP SM01135 DIRP (Domain in Rb-related Pathway) is postulated to be involved in the Rb-related pathway, which is encoded by multiple eukaryotic genomes and is present in proteins including lin-9 of Caenorhabditis elegans, aly of fruit fly and mustard weed. Studies of lin-9 and aly of fruit fly proteins containing DIRP suggest that this domain might be involved in development. Aly, lin-9, act in parallel to, or downstream of, activation of MAPK by the RTK-Ras signalling pathway. 
DISIN SM00050 Homologues of snake disintegrins Snake disintegrins inhibit the binding of ligands to integrin receptors. They contain a 'RGD' sequence, identical to the recognition site of many adhesion proteins. Molecules containing both disintegrin and metalloprotease domains are known as ADAMs. DKCLD SM01136 DKCLD (NUC011) domain This is a TruB_N/PUA domain associated N-terminal domain of Dyskerin-like proteins (PUBMED:15112237). DM SM00301 Doublesex DNA-binding motif DM10 SM00676 Domains in hypothetical proteins in Drosophila, C. elegans and mammals. Occurs singly in some nucleoside diphosphate kinases. DM11 SM00675 Domains in hypothetical proteins in Drosophila including 2 in CG15241 and CG9329. DM13 SM00686 Domain present in fly proteins (CG14681, CG12492, CG6217), worm H06A10.1 and Arabidopsis thaliana MBG8.9. DM14 SM00685 Repeats in fly CG4713, worm Y37H9A.3 and human FLJ20241. DM15 SM00684 Tandem repeat in fly CG14066 (La related protein), human KIAA0731 and worm R144.7. Unknown function. DM16 SM00683 Repeats in sea squirt COS41.4, worm R01H10.6, fly CG1126 etc. DM3 SM00692 Zinc finger domain in CG10631, C. elegans LIN-15B and human P52rIPK. DM4_12 SM00718 DM4/DM12 family of domains in Drosophila melanogaster proteins of unknown function. DM5 SM00690 Domain of unknown function, currently peculiar to Drosophila. DM6 SM00689 Cysteine-rich domain currently specific to Drosophila. DM7 SM00688 Domain of unknown function in Drosophila CG15332, CG15333 and CG18293 DM8 SM00697 Repeats found in several Drosophila proteins. DM9 SM00696 Repeats found in Drosophila proteins. DMAP_binding SM01137 DMAP1-binding Domain This domain binds DMAP1, a transcriptional co-repressor. DNaseIc SM00476 deoxyribonuclease I Deoxyribonuclease I catalyzes the endonucleolytic cleavage of double-stranded DNA. The enzyme is secreted outside the cell and also involved in apoptosis in the nucleus. 
DP SM01138 Transcription factor DP DP forms a heterodimer with E2F and regulates genes involved in cell cycle progression. The transcriptional activity of E2F is inhibited by the retinoblastoma protein which binds to the E2F-DP heterodimer [PUBMED:16360038] and negatively regulates the G1-S transition. DPBB_1 SM00837 Rare lipoprotein A (RlpA)-like double-psi beta-barrel Rare lipoprotein A (RlpA) contains a conserved region that has the double-psi beta-barrel (DPBB) fold. The function of RlpA is not well understood, but it has been shown to act as a prc mutant suppressor in Escherichia coli. The DPBB fold is often an enzymatic domain. The members of this family are quite diverse, and if catalytic this family may contain several different functions. Another example of this domain is found in the N terminus of pollen allergen. DRY_EERY SM01141 Alternative splicing regulator This entry represents the conserved N-terminal region of SWAP (suppressor-of-white-apricot protein) proteins. This region contains two highly conserved motifs, viz: DRY and EERY, which appear to be the sites for alternative splicing of exons 2 and 3 of the SWAP mRNA [PUBMED:8206918]. These proteins are thus thought to be involved in auto-regulation of pre-mRNA splicing. Most family members are associated with two SWAP (Surp) domains SM00648 and an arginine-serine-rich binding region towards the C-terminus. DSHCT SM01142 This C terminal domain is found in DOB1/SK12/helY-like DEAD box helicases [PUBMED:15112237]. DSL SM00051 delta serrate ligand DSPc SM00195 Dual specificity phosphatase, catalytic domain DSRM SM00358 Double-stranded RNA binding motif DSX_dimer SM01143 Doublesex dimerisation domain Doublesex (DSX) is a transcription factor that regulates somatic sexual differences in Drosophila. The structure of this domain has revealed a novel dimeric arrangement of ubiquitin-associated folds that has not previously been identified in a transcription factor [PUBMED:16049008]. 
DTW SM01144 This presumed domain is found in bacterial and eukaryotic proteins. Its function is unknown. The domain contains multiple conserved motifs including a DTXW motif, after which this domain is named. DUF1041 SM01145 Domain of Unknown Function (DUF1041) This family consists of several eukaryotic domains of unknown function. Members of this family are often found in tandem repeats and co-occur with C1, C2 and PH (SM00109, SM00239, SM00233) domains. DUF1086 SM01146 Domain of Unknown Function (DUF1086) This family consists of several eukaryotic domains of unknown function which are present in chromodomain helicase DNA binding proteins. This domain is often found in conjunction with DEXDc (SM00487), HELICc (SM00490), DUF1087, CHROMO (SM00298) and PHD (SM00249). DUF1087 SM01147 Members of this family are found in various chromatin remodelling factors and transposases. Their exact function is, as yet, unknown. DUF1220 SM01148 Repeat of unknown function (DUF1220) DUF1237 SM01149 This family contains a number of hypothetical proteins of about 450 residues in length. Their function is unknown, and most are bacterial. However, structurally this family is part of the 6 hairpin glycosidase superfamily, suggesting a glycosyl hydrolase function. DUF1338 SM01150 This domain is found in a variety of bacterial and fungal hypothetical proteins of unknown function. The structure of this domain has been solved by structural genomics. The structure implies a zinc-binding function, so it is a putative metal hydrolase (PDB:3iuz). DUF1518 SM01151 This domain, which is usually found tandemly repeated, is found in various receptor co-activating proteins. DUF167 SM01152 DUF1693 SM01153 Domain of unknown function (DUF1693) This family contains many hypothetical proteins. It also includes four nematode prion-like proteins. This domain has been identified as part of the nucleotidyltransferase superfamily. DUF1704 SM01154 This family contains many hypothetical proteins. 
DUF1713 SM01155 Mitochondrial domain of unknown function (DUF1713) This domain is found at the C terminal end of mitochondrial proteins of unknown function. DUF1716 SM01156 Eukaryotic domain of unknown function (DUF1716) This domain is found in eukaryotic proteins. A human nuclear protein with this domain (Q8WYA6) is thought to have a role in apoptosis [PUBMED:12659813]. DUF1719 SM01157 This is a domain of unknown function. It may have a role in ATPase activation. DUF1741 SM01158 This is a eukaryotic domain of unknown function. DUF1744 SM01159 This domain is found on the epsilon catalytic subunit of DNA polymerase. It is found C terminal to POLBc (SM00486). DUF1751 SM01160 Eukaryotic integral membrane protein (DUF1751) This domain is found in eukaryotic integral membrane proteins. Q12239, a Saccharomyces cerevisiae protein, has been shown to localise to COP II vesicles [PUBMED:14562095]. DUF1767 SM01161 Eukaryotic domain of unknown function. This domain is found to the N-terminus of the nucleic acid binding domain. DUF1771 SM01162 This domain is always found adjacent to SMR (SM00463). DUF1785 SM01163 This region is found in argonaute [PUBMED:16216572] proteins and often co-occurs with BAG (SM00264) and Piwi (SM00950). DUF1856 SM01164 This domain has no known function. It is found in the C-terminal segment of various vasopressin receptors. DUF1866 SM01165 This domain, found in Synaptojanin, has no known function. DUF1899 SM01166 This set of domains is found in various eukaryotic proteins. Function is unknown. DUF1900 SM01167 This domain is predominantly found in the structural protein coronin, and is duplicated in some sequences. It has no known function [PUBMED:16172398]. DUF1907 SM01168 The structure of this domain displays an alpha-beta-beta-alpha four layer topology, with an HxHxxxxxxxxxH motif that coordinates a zinc ion, and an acetate anion at a site that likely supports the enzymatic activity of an ester hydrolase [PUBMED:16522806]. 
DUF1943 SM01169 Members of this family adopt a structure consisting of several large open beta-sheets. Their exact function has not, as yet, been determined [PUBMED:12135361]. DUF1944 SM01170 Members of this family adopt a structure consisting of several large open beta-sheets. Their exact function has not, as yet, been determined [PUBMED:12135361]. DUF3700 SM01172 This domain family is found in eukaryotes, and is approximately 120 amino acids in length. There are two conserved sequence motifs: YGL and LRDR. This family is related to GATase enzyme domains. DUF4187 SM01173 This family is found at the very C-terminus of proteins that carry a G-patch domain SM00443. The domain is short and cysteine-rich. DUF4205 SM01174 The proteins in this family are uncharacterized but often named FAM188B. DUF4206 SM01175 This is a family of cysteine-rich proteins. Many members also carry a pleckstrin-homology domain, SM00233. DUF4208 SM01176 This domain is found at the C-terminus of chromodomain-helicase-DNA-binding proteins. The exact function of the domain is undetermined. DUF4210 SM01177 This short domain is found in fungi, plants and animals, and the proteins appear to be necessary for chromosome segregation during meiosis. DUF4217 SM01178 This short domain is found at the C-terminus of many helicase proteins. DUF862 SM01179 PPPDE putative peptidase domain The PPPDE superfamily (after Permuted Papain fold Peptidases of DsRNA viruses and Eukaryotes), consists of predicted thiol peptidases with a circularly permuted papain-like fold. The inference of the likely DUB function of the PPPDE superfamily proteins is based on the fusions of the catalytic domain to Ub-binding PUG (PUB)/UBA domains and a novel alpha-helical Ub-associated domain (the PUL domain, after PLAP, Ufd3p and Lub1p) PUBMED:15483401. DUSP SM00695 Domain in ubiquitin-specific proteases. 
DWA SM00523 Domain A in dwarfin family proteins DWB SM00524 Domain B in dwarfin family proteins DWNN SM01180 DWNN is a ubiquitin-like domain found at the N-terminus of the RBBP6 family of splicing-associated proteins PUBMED:16396680. The DWNN domain is independently expressed in higher vertebrates so it may function as a novel ubiquitin-like modifier of other proteins PUBMED:16396680. DYNc SM00053 Dynamin, GTPase Large GTPases that mediate vesicle trafficking. Dynamin participates in the endocytic uptake of receptors, associated ligands, and plasma membrane following an exocytic event. DZF SM00572 domain in DSRM or ZnF_C2H2 domain containing proteins Dabb SM00886 Stress responsive A/B Barrel Domain The function of this domain is unknown, but it is upregulated in response to salt stress in Populus balsamifera (balsam poplar). It is also found at the C-terminus of a fructose 1,6-bisphosphate aldolase from Hydrogenophilus thermoluteolus. It is found in the pA01 plasmid, which encodes genes for molybdopterin uptake and degradation of plant alkaloid nicotine. The structure of one has been solved and the domain forms an alpha-beta barrel dimer. Although there is a clear duplication within the domain it is not obviously detectable in the sequence. Dak1_2 SM01121 This is the kinase domain of the dihydroxyacetone kinase family. Dak2 SM01120 This domain is the predicted phosphatase domain of the dihydroxyacetone kinase family. DeoC SM01133 DeoC/LacD family aldolase This family includes diverse aldolase enzymes. This family includes the enzyme deoxyribose-phosphate aldolase EC:4.1.2.4, which is involved in nucleotide metabolism. The family also includes a group of related bacterial proteins of unknown function, see examples Q57843 and P76143. The family also includes tagatose 1,6-diphosphate aldolase (EC:4.1.2.40), which is part of the tagatose-6-phosphate pathway of galactose-6-phosphate degradation (PUBMED:1655695). 
DeoRC SM01134 DeoR C terminal sensor domain The sensor domains of the DeoR are catalytically inactive versions of the ISOCOT fold, but retain the substrate binding site (PUBMED:16376935). DeoRC senses diverse sugar derivatives such as deoxyribose nucleoside (DeoR), tagatose phosphate (LacR), galactosamine (AgaR), myo-inositol (Bacillus IolR) and L-ascorbate (UlaR) (PUBMED:16376935, 18844374, 15306018). DnaG_DnaB_bind SM00766 DNA primase DnaG DnaB-binding DnaG_DnaB_bind defines a domain of primase required for functional interaction with DnaB that attracts primase to the replication fork. DnaG_DnaB_bind is responsible for the interaction between DnaG and DnaB. DnaJ SM00271 DnaJ molecular chaperone homology domain DoH SM00664 Possible catecholamine-binding domain present in a variety of eukaryotic proteins. A predominantly beta-sheet domain present as a regulatory N-terminal domain in dopamine beta-hydroxylase, mono-oxygenase X and SDR2. Its function remains unknown at present (Ponting, Human Molecular Genetics, in press). Drf_FH3 SM01139 Diaphanous FH3 Domain This region is found in the Formin-like and diaphanous proteins [PUBMED:12676083, PUBMED:9606213]. Drf_GBD SM01140 Diaphanous GTPase-binding Domain This domain is bound by GTP-attached Rho proteins, leading to activation of the Drf protein. DysFC SM00694 Dysferlin domain, C-terminal region. Domain of unknown function present in yeast peroxisomal proteins, dysferlin, myoferlin and hypothetical proteins. Due to an insertion of a dysferlin domain within a second dysferlin domain we have chosen to predict these domains in two parts: the N-terminal region and the C-terminal region. DysFN SM00693 Dysferlin domain, N-terminal region. Domain of unknown function present in yeast peroxisomal proteins, dysferlin, myoferlin and hypothetical proteins. 
Due to an insertion of a dysferlin domain within a second dysferlin domain we have chosen to predict these domains in two parts: the N-terminal region and the C-terminal region. E2_bind SM01181 E1 and E2 enzymes play a central role in ubiquitin and ubiquitin-like protein transfer cascades. This is an E2 binding domain that is found on NEDD8 activating E1 enzyme. The domain resembles ubiquitin, and recruits the catalytic core of the E2 enzyme Ubc12 in a similar manner to that in which ubiquitin interacts with ubiquitin binding domains PUBMED:15694336. EAL SM00052 Putative diguanylate phosphodiesterase Putative diguanylate phosphodiesterase, present in a variety of bacteria. EB_dh SM00887 Ethylbenzene dehydrogenase Ethylbenzene dehydrogenase is a heterotrimer of three subunits that catalyses the anaerobic degradation of hydrocarbons. The alpha subunit contains the catalytic centre as a Molybdenum cofactor-complex. This removes an electron-pair from the hydrocarbon and passes it along an electron transport system involving iron-sulphur complexes held in the beta subunit and a Haem b molecule contained in the gamma subunit. The electron-pair is then subsequently passed to an as yet unknown receiver. The enzyme is found in a variety of different bacteria. EF-1_beta_acid SM01182 Eukaryotic elongation factor 1 beta central acidic region EF1G SM01183 Elongation factor 1 gamma, conserved domain EF1_GNE SM00888 EF-1 guanine nucleotide exchange domain Translation elongation factors are responsible for two main processes during protein synthesis on the ribosome. EF1A (or EF-Tu) is responsible for the selection and binding of the cognate aminoacyl-tRNA to the A-site (acceptor site) of the ribosome. EF2 (or EF-G) is responsible for the translocation of the peptidyl-tRNA from the A-site to the P-site (peptidyl-tRNA site) of the ribosome, thereby freeing the A-site for the next aminoacyl-tRNA to bind. 
Elongation factors are responsible for achieving accuracy of translation and both EF1A and EF2 are remarkably conserved throughout evolution. Elongation factor EF1B (also known as EF-Ts or EF-1beta/gamma/delta) is a nucleotide exchange factor that is required to regenerate EF1A from its inactive form (EF1A-GDP) to its active form (EF1A-GTP). EF1A is then ready to interact with a new aminoacyl-tRNA to begin the cycle again. EF1B is more complex in eukaryotes than in bacteria, and can consist of three subunits: EF1B-alpha (or EF-1beta), EF1B-gamma (or EF-1gamma) and EF1B-beta (or EF-1delta). This entry represents the guanine nucleotide exchange domain of the beta (EF-1beta, also known as EF1B-alpha) and delta (EF-1delta, also known as EF1B-beta) chains of EF1B proteins from eukaryotes and archaea. The beta and delta chains have exchange activity, which mainly resides in their homologous guanine nucleotide exchange domains, found in the C-terminal region of the peptides. Their N-terminal regions may be involved in interactions with the gamma chain (EF-1gamma). EFG_C SM00838 Elongation factor G C-terminus This domain includes the carboxyl terminal regions of Elongation factor G, elongation factor 2 and some tetracycline resistance proteins and adopts a ferredoxin-like fold. EFG_IV SM00889 Elongation factor G, domain IV Translation elongation factors are responsible for two main processes during protein synthesis on the ribosome. EF1A (or EF-Tu) is responsible for the selection and binding of the cognate aminoacyl-tRNA to the A-site (acceptor site) of the ribosome. EF2 (or EF-G) is responsible for the translocation of the peptidyl-tRNA from the A-site to the P-site (peptidyl-tRNA site) of the ribosome, thereby freeing the A-site for the next aminoacyl-tRNA to bind. Elongation factors are responsible for achieving accuracy of translation and both EF1A and EF2 are remarkably conserved throughout evolution. Elongation factor EF2 (EF-G) is a G-protein. 
It brings about the translocation of peptidyl-tRNA and mRNA through a ratchet-like mechanism: the binding of GTP-EF2 to the ribosome causes a counter-clockwise rotation in the small ribosomal subunit; the hydrolysis of GTP to GDP by EF2 and the subsequent release of EF2 causes a clockwise rotation of the small subunit back to the starting position. This twisting action destabilises tRNA-ribosome interactions, freeing the tRNA to translocate along the ribosome upon GTP-hydrolysis by EF2. EF2 binding also affects the entry and exit channel openings for the mRNA, widening it when bound to enable the mRNA to translocate along the ribosome. EF2 has five domains. This entry represents domain IV found in EF2 (or EF-G) of both prokaryotes and eukaryotes. The EF2-GTP-ribosome complex undergoes extensive structural rearrangement for tRNA-mRNA movement to occur. Domain IV, which extends from the 'body' of the EF2 molecule much like a lever arm, appears to be essential for the structural transition to take place. EFP SM01185 Elongation factor P (EF-P) OB domain EFh SM00054 EF-hand, calcium binding motif EF-hands are calcium-binding motifs that occur at least in pairs. Links between disease states and genes encoding EF-hands, particularly the S100 subclass, are emerging. Each motif consists of a 12 residue loop flanked on either side by a 12 residue alpha-helix. EF-hands undergo a conformational change upon binding calcium ions. EGF SM00181 Epidermal growth factor-like domain. EGF_CA SM00179 Calcium-binding EGF-like domain EGF_Lam SM00180 Laminin-type epidermal growth factor-like domain EGF_like SM00001 EGF domain, unclassified subfamily EH SM00027 Eps15 homology domain Pair of EF hand motifs that recognise proteins containing Asn-Pro-Phe (NPF) sequences. 
EKR SM00890 Domain of unknown function EKR is a short, 33 residue, domain found in bacterial and some lower eukaryotic species which lies between a POR (pyruvate ferredoxin/flavodoxin oxidoreductase) and the 4Fe-4S binding domain Fer4. It contains a characteristic EKR sequence motif. The exact function of this domain is not known. ELFV_dehydrog SM00839 Glutamate/Leucine/Phenylalanine/Valine dehydrogenase Glutamate, leucine, phenylalanine and valine dehydrogenases are structurally and functionally related. They contain a Gly-rich region containing a conserved Lys residue, which has been implicated in the catalytic activity, in each case a reversible oxidative deamination reaction. ELK SM01188 This domain is required for the nuclear localisation of these proteins PUBMED:11352458. All of these proteins are members of the Tale/Knox homeodomain family, a subfamily within homeobox SM00389. ELM2 SM01189 The ELM2 (Egl-27 and MTA1 homology 2) domain is a small domain of unknown function. It is found in the MTA1 protein that is part of the NuRD complex PUBMED:10226007. The domain is usually found to the N terminus of a myb-like DNA binding domain SANT SM00717. ELM2 is also found associated with an ARID DNA binding domain SM01014 in Q84JT7. This suggests that ELM2 may also be involved in DNA binding, or perhaps is a protein-protein interaction domain. EMP24_GP25L SM01190 emp24/gp25L/p24 family/GOLD Members of this family are implicated in bringing cargo forward from the ER and binding to coat proteins by their cytoplasmic domains. This domain corresponds closely to the beta-strand rich GOLD domain described in PUBMED:12049664. The GOLD domain is always found combined with lipid- or membrane-association domains PUBMED:12049664. 
END SM00272 Endothelin ENDO3c SM00478 endonuclease III includes endonuclease III (DNA-(apurinic or apyrimidinic site) lyase), alkylbase DNA glycosidases (Alka-family) and other DNA glycosidases ENT SM01191 This presumed domain is named after Emsy N Terminus (ENT). Emsy is a protein that is amplified in breast cancer and interacts with BRCA2. The N terminus of this protein is found to be similar to other vertebrate and plant proteins of unknown function. This domain has a completely conserved histidine residue that may be functionally important. ENTH SM00273 Epsin N-terminal homology (ENTH) domain EPEND SM00026 Ependymins Ependymins are the predominant proteins in the cerebrospinal fluid (CSF) of teleost fish. They have been implicated in the neurochemistry of memory and neuronal regeneration. They are glycoproteins of about 200 amino acids that can bind calcium. Four cysteines are conserved that probably form disulfide bonds. EPH_lbd SM00615 Ephrin receptor ligand binding domain ERCC4 SM00891 ERCC4 domain This entry represents a structural motif found in several DNA repair nucleases, such as Rad1/Mus81/XPF endonucleases, and in ATP-dependent helicases. The XPF/Rad1/Mus81-dependent nuclease family specifically cleaves branched structures generated during DNA repair, replication, and recombination, and is essential for maintaining genome stability. The nuclease domain architecture exhibits remarkable similarity to those of restriction endonucleases. ETF SM00893 Electron transfer flavoprotein domain Electron transfer flavoproteins (ETFs) serve as specific electron acceptors for primary dehydrogenases, transferring the electrons to terminal respiratory systems. They can be functionally classified into constitutive, "housekeeping" ETFs, mainly involved in the oxidation of fatty acids (Group I), and ETFs produced by some prokaryotes under specific growth conditions, receiving electrons only from the oxidation of specific substrates (Group II). 
ETFs are heterodimeric proteins composed of an alpha and beta subunit, and contain an FAD cofactor and AMP. ETF consists of three domains: domains I and II are formed by the N- and C-terminal portions of the alpha subunit, respectively, while domain III is formed by the beta subunit. Domains I and III share an almost identical alpha-beta-alpha sandwich fold, while domain II forms an alpha-beta-alpha sandwich similar to that of bacterial flavodoxins. FAD is bound in a cleft between domains II and III, while domain III binds the AMP molecule. Interactions between domains I and III stabilise the protein, forming a shallow bowl where domain II resides. This entry represents the N-terminal domain of both the alpha and beta subunits from Group I and Group II ETFs. ETS SM00413 erythroblast transformation specific domain variation of the helix-turn-helix motif EXOIII SM00479 exonuclease domain in DNA-polymerase alpha and epsilon chain, ribonuclease T and other exonucleases EZ_HEAT SM00567 E-Z type HEAT repeats Present in subunits of cyanobacterial phycocyanin lyase, and other proteins. Probable scaffolding role. Elicitin SM01187 Elicitins form a novel class of plant necrotic proteins which are secreted by Phytophthora and Pythium fungi, parasites of many economically important crops. These proteins induce leaf necrosis in infected plants and elicit an incompatible hypersensitive-like reaction, leading to the development of a systemic acquired resistance against a range of fungal and bacterial plant pathogens PUBMED:8994969. Elong-fact-P_C SM00841 Elongation factor P, C-terminal These nucleic acid binding domains are predominantly found in elongation factor P, where they adopt an OB-fold, with five beta-strands forming a beta-barrel in a Greek-key topology PUBMED:15210970. 
Elp3 SM00729 Elongator protein 3, MiaB family, Radical SAM This superfamily contains MoaA, NifB, PqqE, coproporphyrinogen III oxidase, biotin synthase and MiaB families, and includes a representative in the eukaryotic elongator subunit, Elp-3. Some members of the family are methyltransferases. Endonuclease_NS SM00892 DNA/RNA non-specific endonuclease A family of bacterial and eukaryotic endonucleases share the following characteristics: they act on both DNA and RNA, cleave double-stranded and single-stranded nucleic acids and require a divalent ion such as magnesium for their activity. A histidine has been shown to be essential for the activity of the Serratia marcescens nuclease. This residue is located in a conserved region which also contains an aspartic acid residue that could be implicated in the binding of the divalent ion. Enolase_C SM01192 Enolase, C-terminal TIM barrel domain Enolase_N SM01193 Enolase, N-terminal domain Excalibur SM00894 Excalibur calcium-binding domain Extracellular Ca2+-dependent nuclease YokF from Bacillus subtilis and several other surface-exposed proteins from diverse bacteria are encoded in the genomes in two paralogous forms that differ by a ~45 amino acid fragment, which comprises a novel conserved domain. Sequence analysis of this domain revealed a conserved DxDxDGxxCE motif, which is strikingly similar to the Ca2+-binding loop of the calmodulin-like EF-hand domains, suggesting an evolutionary relationship between them. Functions of many of the other proteins in which the novel domain, named Excalibur (extracellular calcium-binding region), is found, as well as a structural model of its conserved motif are consistent with the notion that the Excalibur domain binds calcium. This domain is but one more example of the diversity of structural contexts surrounding the EF-hand-like calcium-binding loop in bacteria. 
This loop is thus more widespread than hitherto recognised and the evolution of EF-hand-like domains is probably more complex than previously appreciated. FA58C SM00231 Coagulation factor 5/8 C-terminal domain, discoidin domain Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes. FABD SM00808 F-actin binding domain (FABD) FABD is the F-actin binding domain of Bcr-Abl and its cellular counterpart c-Abl. The Bcr-Abl tyrosine kinase causes different forms of leukemia in humans. Depending on its position within the cell, Bcr-Abl differentially affects cellular growth. The FABD forms a compact left-handed four-helix bundle in solution. FAS1 SM00554 Four repeated domains in the Fasciclin I family of proteins, present in many other contexts. FBD SM00579 domain in FBox and BRCT domain containing plant proteins FBG SM00186 Fibrinogen-related domains (FReDs) Domain present at the C-termini of fibrinogen beta and gamma chains, and a variety of fibrinogen-related proteins, including tenascin and Drosophila scabrous. FBOX SM00256 A Receptor for Ubiquitination Targets FCD SM00895 This entry represents the C-terminal ligand binding domain of many members of the GntR family. This domain probably binds to a range of effector molecules that regulate the transcription of genes through the action of the N-terminal DNA-binding domain. This domain is found in and that are regulators of sugar biosynthesis operons. Many bacterial transcription regulation proteins bind DNA through a helix-turn-helix (HTH) motif, which can be classified into subfamilies on the basis of sequence similarities. The HTH GntR family has many members distributed among diverse bacterial groups that regulate various biological processes. It was named GntR after the Bacillus subtilis repressor of the gluconate operon. 
In general, these proteins contain a DNA-binding HTH domain at the N terminus, and an effector binding or oligomerisation domain at the C terminus. The winged-helix DNA-binding domain is well conserved in structure for the whole of the GntR family, and is similar in structure to other transcriptional regulator families. The C-terminal effector-binding and oligomerisation domains are more variable and are consequently used to define the subfamilies. Based on the sequence and structure of the C-terminal domains, the GntR family can be divided into four major groups, as represented by FadR, HutC, MocR and YtrA, as well as some minor groups such as those represented by AraR and PlmA. FCH SM00055 Fes/CIP4 homology domain Alignment extended from original report. Highly alpha-helical. Also known as the RAEYL motif or the S. pombe Cdc15 N-terminal domain. FDX-ACB SM00896 Ferredoxin-fold anticodon binding domain This is the anticodon binding domain found in some phenylalanyl tRNA synthetases. The domain has a ferredoxin fold, consisting of an alpha+beta sandwich with anti-parallel beta-sheets (beta-alpha-beta x2). FES SM00525 iron-sulphur binding domain in DNA-(apurinic or apyrimidinic site) lyase (subfamily of ENDO3) FF SM00441 Contains two conserved F residues A novel motif that often accompanies WW domains. Often contains two conserved Phe (F) residues. FGF SM00442 Acidic and basic fibroblast growth factor family. Mitogens that stimulate growth or differentiation of cells of mesodermal or neuroectodermal origin. Members of the family play essential roles in patterning and differentiation during vertebrate embryogenesis, and have neurotrophic activities. FH SM00339 FORKHEAD FORKHEAD, also known as a "winged helix" FH2 SM00498 Formin Homology 2 Domain FH proteins control rearrangements of the actin cytoskeleton, especially in the context of cytokinesis and cell polarisation. 
Members of this family have been found to interact with Rho-GTPases, profilin and other actin-associated proteins. These interactions are mediated by the proline-rich FH1 domain, usually located in front of FH2 (but not listed in SMART). Despite this cytosolic function, vertebrate formins have been assigned functions within the nucleus. A set of Formin-Binding Proteins (FBPs) has been shown to bind FH1 with their WW domain. FHA SM00240 Forkhead associated domain Found in eukaryotic and prokaryotic proteins. Putative nuclear signalling domain. FIMAC SM00057 factor I membrane attack complex FIST SM00897 FIST N domain The FIST N domain is a novel sensory domain, which is present in signal transduction proteins from Bacteria, Archaea and Eukarya. Chromosomal proximity of FIST-encoding genes to those coding for proteins involved in amino acid metabolism and transport suggests that FIST domains bind small ligands, such as amino acids. FMN_bind SM00900 This conserved region includes the FMN-binding site of the NqrC protein as well as the NosR and NirI regulatory proteins. FN1 SM00058 Fibronectin type 1 domain One of three types of internal repeat within the plasma protein, fibronectin. Found also in coagulation factor XII, HGF activator and tissue-type plasminogen activator. In t-PA and fibronectin, this domain type contributes to fibrin-binding. FN2 SM00059 Fibronectin type 2 domain One of three types of internal repeat within the plasma protein, fibronectin. Also occurs in coagulation factor XII, 2 type IV collagenases, PDC-109, and cation-independent mannose-6-phosphate and secretory phospholipase A2 receptors. In fibronectin, PDC-109, and the collagenases, this domain contributes to collagen-binding function. FN3 SM00060 Fibronectin type 3 domain One of three types of internal repeat within the plasma protein, fibronectin. The tenth fibronectin type III repeat contains a RGD cell recognition sequence in a flexible loop between 2 strands. 
Type III modules are present in both extracellular and intracellular proteins. FOLN SM00274 Follistatin-N-terminal domain-like Follistatin-N-terminal domain-like, EGF-like. Region distinct from the kazal-like sequence FRG SM00901 This domain contains a conserved N-terminal (F/Y)RG motif. It is functionally uncharacterised. FRI SM00063 Frizzled Drosophila melanogaster frizzled mediates signalling that polarises a precursor cell along the anteroposterior axis. Homologues of the N-terminal region of frizzled exist either as transmembrane or secreted molecules. Frizzled homologues are reported to be receptors for the Wnt growth factors. (Not yet in MEDLINE: the FRI domain occurs in several receptor tyrosine kinases [Xu, Y.K. and Nusse, Curr. Biol. 8, R405-R406 (1998); Masiakowski, P. and Yancopoulos, G.D., Curr. Biol. 8, R407 (1998)].) FTP SM00607 eel-Fucolectin Tachylectin-4 Pentraxin-1 Domain FU SM00261 Furin-like repeats FYRC SM00542 "FY-rich" domain, C-terminal region is sometimes closely juxtaposed with the N-terminal region (FYRN), but sometimes is far distant. Unknown function, but occurs frequently in chromatin-associated proteins. FYRN SM00541 "FY-rich" domain, N-terminal region is sometimes closely juxtaposed with the C-terminal region (FYRC), but sometimes is far distant. Unknown function, but occurs frequently in chromatin-associated proteins. FYVE SM00064 Protein present in Fab1, YOTB, Vac1, and EEA1 The FYVE zinc finger is named after four proteins where it was first found: Fab1, YOTB/ZK632.12, Vac1, and EEA1. The FYVE finger has been shown to bind two Zn2+ ions. The FYVE finger has eight potential zinc coordinating cysteine positions. The FYVE finger is structurally related to the PHD finger and the RING finger. Many members of this family also include two histidines in a motif R+HHC+XCG, where + represents a charged residue and X any residue. 
The FYVE finger functions in the membrane recruitment of cytosolic proteins by binding to phosphatidylinositol 3-phosphate (PI3P), which is prominent on endosomes. The R+HHC+XCG motif is critical for PI3P binding. Fapy_DNA_glyco SM00898 Formamidopyrimidine-DNA glycosylase N-terminal domain This entry represents the catalytic domain of DNA glycosylase/AP lyase enzymes, which are involved in base excision repair of DNA damaged by oxidation or by mutagenic agents. Most damage to bases in DNA is repaired by the base excision repair pathway PUBMED:15588838. These enzymes are primarily from bacteria, and have both DNA glycosylase activity and AP lyase activity. Examples include formamidopyrimidine-DNA glycosylases (Fpg; MutM) and endonuclease VIII (Nei). Formamidopyrimidine-DNA glycosylase (Fpg, MutM) is a trifunctional DNA base excision repair enzyme that removes a wide range of oxidation-damaged bases (N-glycosylase activity; ) and cleaves both the 3'- and 5'-phosphodiester bonds of the resulting apurinic/apyrimidinic site (AP lyase activity; ). Fpg has a preference for oxidised purines, excising oxidized purine bases such as 7,8-dihydro-8-oxoguanine (8-oxoG). Its AP (apurinic/apyrimidinic) lyase activity introduces nicks in the DNA strand, cleaving the DNA backbone by beta-delta elimination to generate a single-strand break at the site of the removed base with both 3'- and 5'-phosphates. Fpg is a monomer composed of 2 domains connected by a flexible hinge PUBMED:10921868. The two DNA-binding motifs (a zinc finger and the helix-two-turns-helix motifs) suggest that the oxidized base is flipped out from double-stranded DNA in the binding mode and excised by a catalytic mechanism similar to that of bifunctional base excision repair enzymes PUBMED:10921868. Fpg binds one ion of zinc at the C-terminus, which contains four conserved and essential cysteines PUBMED:8473347, PUBMED:7704272. 
Endonuclease VIII (Nei) has the same enzyme activities as Fpg above, but with a preference for oxidized pyrimidines, such as thymine glycol, 5,6-dihydrouracil and 5,6-dihydrothymine PUBMED:15232006. These proteins contain three structural domains: an N-terminal catalytic core domain, a central helix-two-turn-helix (H2TH) module and a C-terminal zinc finger (see PDB:1K82) PUBMED:11912217. The N-terminal catalytic domain and the C-terminal zinc finger straddle the DNA with the long axis of the protein oriented roughly orthogonal to the helical axis of the DNA. Residues that contact DNA are located in the catalytic domain and in a beta-hairpin loop formed by the zinc finger PUBMED:12055620. Fe_hyd_SSU SM00902 Iron hydrogenase small subunit Many microorganisms, such as methanogenic, acetogenic, nitrogen-fixing, photosynthetic, or sulphate-reducing bacteria, metabolise hydrogen. Hydrogen activation is mediated by a family of enzymes, termed hydrogenases, which either provide these organisms with reducing power from hydrogen oxidation, or act as electron sinks. There are two hydrogenase families that differ functionally from each other: NiFe hydrogenases tend to be more involved in hydrogen oxidation, while FeFe (Fe only) hydrogenases tend to be more involved in hydrogen production. Fe only hydrogenases show a common core structure, which contains a moiety, deeply buried inside the protein, with an Fe-Fe dinuclear centre, nonproteic bridging, terminal CO and CN- ligands attached to each of the iron atoms, and a dithio moiety, which also bridges the two iron atoms and has been tentatively assigned as a di(thiomethyl)amine. This common core also harbours three [4Fe-4S] iron-sulphur clusters PUBMED:11921392. In FeFe hydrogenases, as in NiFe hydrogenases, the set of iron-sulphur clusters is dispersed regularly between the dinuclear Fe-Fe centre and the molecular surface. 
These clusters are about 1.2 nm apart from each other, but the [4Fe-4S] cluster closest to the dinuclear centre is covalently bound to one of the iron atoms through a thiolate bridging ligand. The moiety including the dinuclear centre, the thiolate bridging ligand, and the proximal [4Fe-4S] cluster is known as the H-cluster. A channel, lined with hydrophobic amino acid side chains, nearly connects the dinuclear centre and the molecular surface. Furthermore, hydrogen-bonded water molecule sites have been identified at the interior and at the surface of the protein. The small subunit is comprised of alternating random coil and alpha helical structures that encompass the large subunit in a novel protein fold PUBMED:10368269. FeoA SM00899 This entry represents the core domain of the ferrous iron (Fe2+) transport protein FeoA found in bacteria. This domain also occurs at the C-terminus in related proteins. The transporter Feo is composed of three proteins: FeoA, a small, soluble SH3-domain protein probably located in the cytosol; FeoB, a large protein with a cytosolic N-terminal G-protein domain and a C-terminal integral inner-membrane domain containing two 'Gate' motifs which likely functions as the Fe2+ permease; and FeoC, a small protein apparently functioning as an [Fe-S]-dependent transcriptional repressor. Feo allows the bacterial cell to acquire iron from its environment. Flavin_Reduct SM00903 Flavin reductase like domain This entry represents the FMN-binding domain found in NAD(P)H-flavin oxidoreductases (flavin reductases), a class of enzymes capable of producing reduced flavin for bacterial bioluminescence and other biological processes. This domain is also found in various other oxidoreductase and monooxygenase enzymes PUBMED:12829278, PUBMED:15461461, PUBMED:11017201. This domain consists of a beta-barrel with Greek key topology, and is related to the ferredoxin reductase-like FAD-binding domain. 
The flavin reductases have a different dimerisation mode than that found in the PNP oxidase-like family, which also carries an FMN-binding domain with a similar topology. Flavokinase SM00904 Riboflavin kinase Riboflavin is converted into catalytically active cofactors (FAD and FMN) by the actions of riboflavin kinase, which converts it into FMN, and FAD synthetase, which adenylates FMN to FAD. Eukaryotes usually have two separate enzymes, while most prokaryotes have a single bifunctional protein that can carry out both catalyses, although exceptions occur in both cases. While eukaryotic monofunctional riboflavin kinase is orthologous to the bifunctional prokaryotic enzyme PUBMED:14580199, the monofunctional FAD synthetase differs from its prokaryotic counterpart, and is instead related to the PAPS-reductase family PUBMED:17049878. The bacterial FAD synthetase that is part of the bifunctional enzyme has remote similarity to nucleotidyl transferases and, hence, it may be involved in the adenylylation reaction of FAD synthetases PUBMED:12517446. This entry represents riboflavin kinase, which occurs as part of a bifunctional enzyme or a stand-alone enzyme. Flu_M1_C SM00759 Influenza Matrix protein (M1) C-terminal domain This region is thought to be a second domain of the M1 matrix protein. FolB SM00905 Dihydroneopterin aldolase Dihydroneopterin aldolase catalyses the conversion of 7,8-dihydroneopterin to 6-hydroxymethyl-7,8-dihydropterin in the biosynthetic pathway of tetrahydrofolate. In the opportunistic pathogen Pneumocystis carinii, dihydroneopterin aldolase function is expressed as the N-terminal portion of the multifunctional folic acid synthesis protein (Fas). This region encompasses two domains, FasA and FasB, which are 27% amino acid identical. FasA and FasB also share significant amino acid sequence similarity with bacterial dihydroneopterin aldolases. 
This region consists of two tandem sequences each homologous to folB and which form tetramers PUBMED:9709001. FtsA SM00842 Cell division protein FtsA FtsA is essential for bacterial cell division, and co-localizes to the septal ring with FtsZ. It has been suggested that the interaction of FtsA-FtsZ has arisen through coevolution in different bacterial strains PUBMED:9352931. Ftsk_gamma SM00843 This domain directs oriented DNA translocation and forms a winged helix structure. Mutated proteins with substitutions in the FtsK gamma DNA-recognition helix are impaired in DNA binding. Fungal_trans SM00906 Fungal specific transcription factor domain This domain is found in a number of fungal transcription factors including transcriptional activator xlnR, yeast regulatory protein GAL4, and other transcription proteins regulating a variety of cellular and metabolic processes. G2F SM00682 G2 nidogen domain and fibulin GA SM00844 GA module The protein G-related albumin-binding (GA) module is composed of three alpha helices PUBMED:9086265. This module is found in a range of bacterial cell surface proteins. The GA module from the Peptostreptococcus magnus albumin-binding protein (PAB) shows a strong affinity for albumin. GAF SM00065 Domain present in phytochromes and cGMP-specific phosphodiesterases. Mutations within these domains in PDE6B result in autosomal recessive inheritance of retinitis pigmentosa. GAL4 SM00066 GAL4-like Zn(II)2Cys6 (or C6 zinc) binuclear cluster DNA-binding domain Gal4 is a positive regulator for the gene expression of the galactose-induced genes of S. cerevisiae. It is present only in fungi. 
GAS2 SM00243 Growth-Arrest-Specific Protein 2 Domain GROWTH-ARREST-SPECIFIC PROTEIN 2 Domain GASTRIN SM00029 gastrin / cholecystokinin / caerulein family This family gathers small proteins of about 100-130 amino acids that act as hormones, among them gastrin, cholecystokinin and preprocaerulein which stimulate gastric, biliary, and pancreatic secretion and smooth muscle contraction. GDNF SM00907 GDNF/GAS1 domain This cysteine rich domain is found in multiple copies in GDNF and GAS1 proteins. GDNF and neurturin (NTN) receptors are potent survival factors for sympathetic, sensory and central nervous system neurons PUBMED:16551639, PUBMED:9192899. GDNF and neurturin promote neuronal survival by signaling through similar multicomponent receptors that consist of a common receptor tyrosine kinase and a member of a GPI-linked family of receptors that determines ligand specificity PUBMED:9192898. GED SM00302 Dynamin GTPase effector domain GEL SM00262 Gelsolin homology domain Gelsolin/severin/villin homology domain. Calcium-binding and actin-binding. Both intra- and extracellular domains. GGDEF SM00267 diguanylate cyclase Diguanylate cyclase, present in a variety of bacteria. GGL SM00224 G protein gamma subunit-like motifs GHA SM00067 Glycoprotein hormone alpha chain homologues. Also called gonadotropins. Glycoprotein hormones consist of two glycosylated chains (alpha and beta) of similar topology. GHB SM00068 Glycoprotein hormone beta chain homologues. Also called gonadotropins. Glycoprotein hormones consist of two glycosylated chains (alpha and beta) of similar topology. GIT SM00555 Helical motif in the GIT family of ADP-ribosylation factor GTPase-activating proteins Helical motif in the GIT family of ADP-ribosylation factor GTPase-activating proteins, and in yeast Spa2p and Sph1p (CPP; unpublished results). In p95-APP1 the N-terminal GIT motif might be involved in binding PIX. 
GIYc SM00465 GIY-YIG type nucleases (URI domain) GLA SM00069 Domain containing Gla (gamma-carboxyglutamate) residues. A hyaluronan-binding domain found in proteins associated with the extracellular matrix, cell adhesion and cell migration. GLECT SM00276 Galectin Galectin - galactose-binding lectin GLUCA SM00070 Glucagon like hormones GPS SM00303 G-protein-coupled receptor proteolytic site domain Present in latrophilin/CL-1, sea urchin REJ and polycystin. GRAM SM00568 domain in glucosyltransferases, myotubularins and other putative membrane-associated proteins GRAN SM00277 Granulin GS SM00467 GS motif An approx. 30 amino acid motif that precedes the kinase domain in types I and II TGF beta receptors. Mutation of two or more of the serines or threonines in the TTSGSGSG of TGF-beta type I receptor impairs phosphorylation and signaling activity. GYF SM00444 Contains conserved Gly-Tyr-Phe residues Proline-binding domain in CD2-binding protein. Contains conserved Gly-Tyr-Phe residues. GYR SM00713 Motif of unknown function with conserved Gly, Tyr, Arg tripeptide in Drosophila proteins. G_alpha SM00275 G protein alpha subunit Subunit of G proteins that contains the guanine nucleotide binding site G_patch SM00443 glycine rich nucleic binding domain A predicted glycine rich nucleic binding domain found in the splicing factor 45, SON DNA binding protein and D-type Retrovirus polyproteins. Gal-bind_lectin SM00908 Galactoside-binding lectin Animal lectins display a wide variety of architectures. They are classified according to the carbohydrate-recognition domain (CRD) of which there are two main types, S-type and C-type. Galectins (previously S-lectins) bind exclusively beta-galactosides like lactose. They do not require metal ions for activity. Galectins are found predominantly, but not exclusively in mammals PUBMED:8124704. Their function is unclear. They are developmentally regulated and may be involved in differentiation, cellular regulation and tissue construction. 
Galanin SM00071 Galanin Galanin [1,2,3] is a neuropeptide that controls various biological activities: it regulates the release of growth hormone, inhibits the release of insulin and somatostatin, contracts smooth muscle of the gastrointestinal and genitourinary tract and may be involved in the control of adrenal secretion. GatB_Yqey SM00845 GatB domain This domain is found in GatB and proteins related to bacterial Yqey. It is about 140 amino acid residues long. This domain is found at the C terminus of GatB which transamidates Glu-tRNA to Gln-tRNA. The function of this domain is uncertain. It does however suggest that Yqey and its relatives have a role in tRNA metabolism. Germane SM00909 Sporulation and spore germination The GerMN domain is a region of approximately 100 residues that is found, duplicated, in the Bacillus GerM protein and is implicated in both sporulation and spore germination. The domain is found in a number of different bacterial species both alone and in association with other domains such as Amidase_3 PF01520 Gmad1 and Gmad2. It is predicted to have a novel alpha-beta fold. Glyco_10 SM00633 Glycosyl hydrolase family 10 Glyco_18 SM00636 Glyco_25 SM00641 Glycosyl hydrolases family 25 Glyco_32 SM00640 Glycosyl hydrolases family 32 GoLoco SM00390 LGN motif, putative GEFs specific for G-alpha GTPases GEF specific for Galpha_i proteins Gp_dh_N SM00846 Glyceraldehyde 3-phosphate dehydrogenase, NAD binding domain GAPDH is a tetrameric NAD-binding enzyme involved in glycolysis and gluconeogenesis. N-terminal domain is a Rossmann NAD(P) binding fold. Grip SM00755 golgin-97, RanBP2alpha, Imh1p and p230/golgin-245 GuKc SM00072 Guanylate kinase homologues. Active enzymes catalyze ATP-dependent phosphorylation of GMP to GDP. Structure resembles that of adenylate kinase. So-called membrane-associated guanylate kinase homologues (MAGUKs) do not possess guanylate kinase activities; instead at least some possess protein-binding functions. 
H15 SM00526 Domain in histone families 1 and 5 H2A SM00414 Histone 2A H2B SM00427 Histone H2B H3 SM00428 Histone H3 H4 SM00417 Histone H4 HA2 SM00847 Helicase associated domain (HA2) This presumed domain is about 90 amino acid residues in length. It is found in a diverse set of RNA helicases. Its function is unknown, however it seems likely to be involved in nucleic acid binding. HALZ SM00340 homeobox associated leucine zipper HAMP SM00304 HAMP (Histidine kinases, Adenylyl cyclases, Methyl binding proteins, Phosphatases) domain HAT SM00386 HAT (Half-A-TPR) repeats Present in several RNA-binding proteins. Structurally and sequentially thought to be similar to TPRs. HATPase_c SM00387 Histidine kinase-like ATPases Histidine kinase-, DNA gyrase B-, phytochrome-like ATPases. HDAC_interact SM00761 Histone deacetylase (HDAC) interacting This domain is found on transcriptional regulators. It forms interactions with histone deacetylases. HDc SM00471 Metal dependent phosphohydrolases with conserved 'HD' motif. Includes eukaryotic cyclic nucleotide phosphodiesterases (PDEc). This profile/HMM does not detect HD homologues in bacterial glycine aminoacyl-tRNA synthetases (beta subunit). HECTc SM00119 Domain Homologous to E6-AP Carboxyl Terminus with E3 ubiquitin-protein ligases. Can bind to E2 enzymes. HELICc SM00490 helicase superfamily c-terminal domain HELICc2 SM00491 HELICc3 SM00492 HEPN SM00748 Higher Eukaryotes and Prokaryotes Nucleotide-binding domain HIRAN SM00910 The HIRAN protein (HIP116, Rad5p N-terminal) is found in the N-terminal regions of the SWI2/SNF2 proteins typified by HIP116 and Rad5p. HIRAN is found as a standalone protein in several bacteria and prophages, or fused to other catalytic domains, such as a nuclease of the restriction endonuclease fold and TDP1-like DNA phosphoesterases, in the eukaryotes PUBMED:16627993. 
It has been predicted that this protein functions as a DNA-binding domain that probably recognises features associated with damaged DNA or stalled replication forks PUBMED:16627993. HLH SM00353 helix loop helix domain HMG SM00398 high mobility group HMG17 SM00527 domain in high mobility group proteins HMG14 and HMG 17 HNHc SM00507 HNH nucleases HNS SM00528 Domain in histone-like proteins of HNS family HOLI SM00430 Ligand binding domain of hormone receptors HOX SM00389 Homeodomain DNA-binding factors that are involved in the transcriptional regulation of key developmental processes HPT SM00073 Histidine Phosphotransfer domain Contains an active histidine residue that mediates phosphotransfer reactions. Domain detected only in eubacteria. This alignment is an extension to that shown in the Cell structure paper. HRDC SM00341 Helicase and RNase D C-terminal Hypothetical role in nucleic acid binding. Mutations in the HRDC domain cause human disease. HSA SM00573 domain in helicases and associated with SANT domains HSF SM00415 heat shock factor HTH_ARAC SM00342 helix_turn_helix, arabinose operon control protein HTH_ARSR SM00418 helix_turn_helix, Arsenical Resistance Operon Repressor HTH_ASNC SM00344 helix_turn_helix ASNC type AsnC: an autogenously regulated activator of asparagine synthetase A transcription in Escherichia coli HTH_CRP SM00419 helix_turn_helix, cAMP Regulatory protein HTH_DEOR SM00420 helix_turn_helix, Deoxyribose operon repressor HTH_DTXR SM00529 Helix-turn-helix diphtheria tox regulatory element iron dependent repressor HTH_GNTR SM00345 helix_turn_helix gluconate operon transcriptional repressor HTH_ICLR SM00346 helix_turn_helix isocitrate lyase regulation HTH_LACI SM00354 helix_turn_helix lactose operon repressor HTH_LUXR SM00421 helix_turn_helix, Lux Regulon lux regulon (activates the bioluminescence operon) HTH_MARR SM00347 helix_turn_helix multiple antibiotic resistance protein HTH_MERR SM00422 helix_turn_helix, mercury resistance HTH_XRE SM00530 
Helix-turn-helix XRE-family like proteins HTTM SM00752 Horizontally Transferred TransMembrane Domain Sequence analysis of vitamin K dependent gamma-carboxylases (VKGC) revealed the presence of a novel domain, HTTM (Horizontally Transferred TransMembrane) in its N-terminus. In contrast to most known domains, HTTM contains four transmembrane regions. Its occurrence in eukaryotes, bacteria and archaea is more likely caused by horizontal gene transfer than by early invention. The conservation of VKGC catalytic sites indicates an enzymatic function also for the other family members. HWE_HK SM00911 HWE histidine kinase The HWE domain is found in a subset of two-component system kinases, belonging to the same superfamily as PUBMED:14702314. In PUBMED:14702314, the HWE family was defined by the presence of a conserved H residue and a WXE motif and was limited to members of the proteobacteria. However, many homologues of this domain lack the WXE motif. Furthermore, homologues are found in a wide range of Gram-positive and Gram-negative bacteria as well as in several archaea. HX SM00120 Hemopexin-like repeats. Hemopexin is a heme-binding protein that transports heme to the liver. Hemopexin-like repeats occur in vitronectin and some matrix metalloproteinases family (matrixins). The HX repeats of some matrixins bind tissue inhibitor of metalloproteinases (TIMPs). HYDRO SM00075 Hydrophobins Haemagg_act SM00912 haemagglutination activity domain This domain is suggested to be a carbohydrate-dependent haemagglutination activity site PUBMED:11703654. It is found in a range of haemagglutinins and haemolysins. HhH1 SM00278 Helix-hairpin-helix DNA-binding motif class 1 HhH2 SM00279 Helix-hairpin-helix class 2 (Pol1 family) motifs HintC SM00305 Hint (Hedgehog/Intein) domain C-terminal region Hedgehog/Intein domain, C-terminal region. Domain has been split to accommodate large insertions of endonucleases. 
HintN SM00306 Hint (Hedgehog/Intein) domain N-terminal region Hedgehog/Intein domain, N-terminal region. Domain has been split to accommodate large insertions of endonucleases. HisKA SM00388 His Kinase A (phosphoacceptor) domain Dimerisation and phosphoacceptor domain of histidine kinases. HormR SM00008 Domain present in hormone receptors Hr1 SM00742 Rho effector or protein kinase C-related kinase homology region 1 homologues Alpha-helical domain found in vertebrate PRK1 and yeast PKC1 protein kinases C. The HR1 in rhophilin binds RhoGTP; those in PRK1 bind RhoA and RhoB. Also called RBD - Rho-binding domain IB SM00121 Insulin growth factor-binding protein homologues High affinity binding partners of insulin-like growth factors. IBN_N SM00913 Importin-beta N-terminal domain Members of the importin-beta (karyopherin-beta) family can bind and transport cargo by themselves, or can form heterodimers with importin-alpha. As part of a heterodimer, importin-beta mediates interactions with the pore complex, while importin-alpha acts as an adaptor protein to bind the nuclear localisation signal (NLS) on the cargo through the classical NLS import of proteins. Importin-beta is a helicoidal molecule constructed from 19 HEAT repeats. Many nuclear pore proteins contain FG sequence repeats that can bind to HEAT repeats within importins PUBMED:12372823, PUBMED:17161424, which is important for importin-beta mediated transport. IBR SM00647 In Between Ring fingers the domain occurs between pairs of RING fingers IDEAL SM00914 This is a short protein of unknown function; it is found at the C-terminus of proteins in the UPF0302 family. It is named after the sequence of the most conserved region in some members. IENR1 SM00497 Intron encoded nuclease repeat motif Repeat of unknown function, but possibly DNA-binding via helix-turn-helix motif (Ponting, unpublished). IENR2 SM00496 Intron-encoded nuclease repeat 2 Short helical motif of unknown function (unpublished results). 
IFabd SM00076 Interferon alpha, beta and delta. Interferons produce antiviral and antiproliferative responses in cells. They are classified into five groups, all of them related except gamma-interferon. IG SM00409 Immunoglobulin IG_FLMN SM00557 Filamin-type immunoglobulin domains These form a rod-like structure in the actin-binding cytoskeleton protein, filamin. The C-terminal repeats of filamin bind beta1-integrin (CD29). IG_like SM00410 Immunoglobulin like IG domains that cannot be classified into one of IGv1, IGc1, IGc2, IG. IGc1 SM00407 Immunoglobulin C-Type IGc2 SM00408 Immunoglobulin C-2 Type IGv SM00406 Immunoglobulin V-Type IL1 SM00125 Interleukin-1 homologues Cytokines with various biological functions. Interleukin 1 alpha and beta are also known as hematopoietin and catabolin. IL10 SM00188 Interleukin-10 family Interleukin-10 inhibits the synthesis of a number of cytokines, including IFN-gamma, IL-2, IL-3, TNF and GM-CSF produced by activated macrophages and by helper T cells. IL2 SM00189 Interleukin-2 family Interleukin-2 is a cytokine produced by T-helper cells in response to antigenic or mitogenic stimulation. This protein is required for T-cell proliferation and other activities crucial to the regulation of the immune response. IL4_13 SM00190 Interleukins 4 and 13 Interleukins-4 and -13 are cytokines involved in inflammatory and immune responses. IL-4 stimulates B and T cells. IL6 SM00126 Interleukin-6 homologues Family includes granulocyte colony-stimulating factor (G-CSF) and myelomonocytic growth factor (MGF). IL-6 is also known as B-cell stimulatory factor 2. IL7 SM00127 Interleukin-7 and interleukin-9 family. IL-7 is a cytokine that acts as a growth factor for early lymphoid cells of both B- and T-cell lineages. IL-9 is a multifunctional cytokine; although it was originally described as a T-cell growth factor, its function in T-cell response remains unclear. ILWEQ SM00307 I/LWEQ domain Thought to possess an F-actin binding function. 
INB SM00187 Integrin beta subunits (N-terminal portion of extracellular region) Portion of beta integrins that lies N-terminal to their EGF-like repeats. Integrins are cell adhesion molecules that mediate cell-extracellular matrix and cell-cell interactions. They contain both alpha and beta subunits. Beta integrins are proposed to have a von Willebrand factor type-A "insert" or "I" -like domain (although this remains to be confirmed). IPPc SM00128 Inositol polyphosphate phosphatase, catalytic domain homologues Mg(2+)-dependent/Li(+)-sensitive enzymes. IPT SM00429 ig-like, plexins, transcription factors IQ SM00015 Short calmodulin-binding motif containing conserved Ile and Gln residues. Calmodulin-binding motif. IRF SM00348 interferon regulatory factor interferon regulatory factor, also known as tryptophan pentad repeat IRO SM00548 Motif in Iroquois-class homeodomain proteins (only). Unknown function. ITAM SM00077 Immunoreceptor tyrosine-based activation motif Motif that may be dually phosphorylated on tyrosine that links antigen receptors to downstream signalling machinery. IlGF SM00078 Insulin / insulin-like growth factor / relaxin family. Family of proteins including insulin, relaxin, and IGFs. Insulin decreases blood glucose concentration. Inhibitor_I29 SM00848 Cathepsin propeptide inhibitor domain (I29) This domain is found at the N-terminus of some C1 peptidases such as Cathepsin L where it acts as a propeptide. There are also a number of proteins that are composed solely of multiple copies of this domain such as the peptidase inhibitor salarin. This family is classified as I29 by MEROPS. Peptide proteinase inhibitors can be found as single domain proteins or as single or multiple domains within proteins; these are referred to as either simple or compound inhibitors, respectively. In many cases they are synthesised as part of a larger precursor protein, either as a prepropeptide or as an N-terminal domain associated with an inactive peptidase or zymogen. 
This domain prevents access of the substrate to the active site. Removal of the N-terminal inhibitor domain either by interaction with a second peptidase or by autocatalytic cleavage activates the zymogen. Other inhibitors interact directly with proteinases using a simple noncovalent lock and key mechanism, while yet others use a conformational change-based trapping mechanism that depends on their structural and thermodynamic properties. Int_alpha SM00191 Integrin alpha (beta-propellor repeats). Integrins are cell adhesion molecules that mediate cell-extracellular matrix and cell-cell interactions. They contain both alpha and beta subunits. Alpha integrins are proposed to contain a domain containing a 7-fold repeat that adopts a beta-propellor fold. Some of these domains contain an inserted von Willebrand factor type-A domain. Some repeats contain putative calcium-binding sites. The 7-fold repeat domain is homologous to a similar domain in phosphatidylinositol-glycan-specific phospholipase D. JAB_MPN SM00232 JAB/MPN domain Domain in Jun kinase activation domain binding protein and proteasomal subunits. Domain at Mpr1p and Pad1p N-termini. Domain of unknown function. JHBP SM00700 Juvenile hormone binding protein domains in insects. The juvenile hormone exerts pleiotropic functions during insect life cycles and its binding proteins regulate these functions. Jacalin SM00915 Jacalin-like lectin domain This entry represents a mannose-binding lectin domain with a beta-prism fold consisting of three 4-stranded beta-sheets, with an internal pseudo 3-fold symmetry. Some lectins in this group stimulate distinct T- and B-cell functions, such as Jacalin, which binds to the T-antigen and acts as an agglutinin. This domain is found in 1 to 6 copies in lectins. The domain is also found in the salt-stress induced protein from rice and an animal prostatic spermine-binding protein. JmjC SM00558 A domain family that is part of the cupin metalloenzyme superfamily.
Probable enzymes, but of unknown functions, that regulate chromatin reorganisation processes (Clissold and Ponting, in press). JmjN SM00545 Small domain found in the jumonji family of transcription factors To date, this domain always co-occurs with the JmjC domain (although the reverse is not true). KAZAL SM00280 Kazal type serine protease inhibitors Kazal type serine protease inhibitors and follistatin-like domains. KH SM00322 K homology RNA-binding domain KIND SM00750 kinase non-catalytic C-lobe domain It is an interaction domain identified as being similar to the C-terminal protein kinase catalytic fold (C lobe). Its presence at the N terminus of signalling proteins and the absence of the active-site residues in the catalytic and activation loops suggest that it folds independently and is likely to be non-catalytic. The occurrence of KIND only in metazoa implies that it has evolved from the catalytic protein kinase domain into an interaction domain, possibly by keeping the substrate-binding features. KISc SM00129 Kinesin motor, catalytic domain. ATPase. Microtubule-dependent molecular motors that play important roles in intracellular transport of organelles and in cell division. KOW SM00739 KOW (Kyprides, Ouzounis, Woese) motif. Motif in ribosomal proteins, NusG, Spt5p, KIN17 and T54. KR SM00130 Kringle domain Named after a Danish pastry. Found in several serine proteases and in ROR-like receptors. Can occur in up to 38 copies (in apolipoprotein(a)). Plasminogen-like kringles possess affinity for free lysine and lysine-containing peptides. KRAB SM00349 krueppel associated box KU SM00131 BPTI/Kunitz family of serine protease inhibitors. Serine protease inhibitors. One member of the family is encoded by an alternatively-spliced form of Alzheimer's amyloid beta-protein. Kelch SM00612 Knot1 SM00505 Knottins Knottins, representing plant lectins/antimicrobial peptides, plant proteinase/amylase inhibitors, plant gamma-thionins and arthropod defensins.
Ku78 SM00559 Ku70 and Ku80 are 70kDa and 80kDa subunits of the Lupus Ku autoantigen This is a single stranded DNA- and ATP-dependent helicase that has a role in chromosome translocation. This is a domain of unknown function C-terminal to its von Willebrand factor A domain, that also occurs in bacterial hypothetical proteins. L27 SM00569 domain in receptor targeting proteins Lin-2 and Lin-7 L51_S25_CI-B8 SM00916 Mitochondrial ribosomal protein L51 / S25 / CI-B8 domain Proteins containing this domain are located in the mitochondrion and include ribosomal protein L51, and S25. This domain is also found in mitochondrial NADH-ubiquinone oxidoreductase B8 subunit (CI-B8). It is not known whether all members of this family form part of the NADH-ubiquinone oxidoreductase and whether they are also all ribosomal proteins. LA SM00715 Domain in the RNA-binding Lupus La protein; unknown function LCCL SM00603 LDLa SM00192 Low-density lipoprotein receptor domain class A Cysteine-rich repeat in the low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. The N-terminal type A repeats in LDL receptor bind the lipoproteins. Other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement. Mutations in the LDL receptor gene cause familial hypercholesterolemia. LEM SM00540 in nuclear membrane-associated proteins LEM, domain in nuclear membrane-associated proteins, including lamina-associated polypeptide 2 and emerin. LH2 SM00308 Lipoxygenase homology 2 (beta barrel) domain LIF_OSM SM00080 leukemia inhibitory factor OSM, Oncostatin M LIGANc SM00532 Ligase N family LIM SM00132 Zinc-binding domain present in Lin-11, Isl-1, Mec-3. Zinc-binding domain family. Some LIM domains bind protein partners via tyrosine-containing motifs.
LIM domains are found in many key regulators of developmental pathways. LINK SM00445 Link (Hyaluronan-binding) LITAF SM00714 Possible membrane-associated motif in LPS-induced tumor necrosis factor alpha factor (LITAF), also known as PIG7, and other animal proteins. LMWPc SM00226 Low molecular weight phosphatase family LNS2 SM00775 This domain is found in Saccharomyces cerevisiae protein SMP2, proteins with an N-terminal lipin domain and phosphatidylinositol transfer proteins. SMP2 is involved in plasmid maintenance and respiration. Lipin proteins are involved in adipose tissue development and insulin resistance. LON SM00464 Found in ATP-dependent protease La (LON) N-terminal domain of the ATP-dependent protease La (LON), present also in other bacterial ORFs. LPD_N SM00638 Lipoprotein N-terminal Domain LRR SM00370 Leucine-rich repeats, outliers LRRCT SM00082 Leucine rich repeat C-terminal domain LRRNT SM00013 Leucine rich repeat N-terminal domain LRR_BAC SM00364 Leucine-rich repeats, bacterial type LRR_CC SM00367 Leucine-rich repeat - CC (cysteine-containing) subfamily LRR_RI SM00368 Leucine rich repeat, ribonuclease inhibitor type LRR_SD22 SM00365 Leucine-rich repeat, SDS22-like subfamily LRR_TYP SM00369 Leucine-rich repeats, typical (most populated) subfamily LRRcap SM00446 occurring C-terminal to leucine-rich repeats A motif occurring C-terminal to leucine-rich repeats in "sds22-like" and "typical" LRR-containing proteins. LU SM00134 Ly-6 antigen / uPA receptor -like domain Three-fold repeated domain in urokinase-type plasminogen activator receptor; occurs singly in other GPI-linked cell-surface glycoproteins (Ly-6 family, CD59, thymocyte B cell antigen, Sgp-2). Topology of these domains is similar to that of snake venom neurotoxins. LY SM00135 Low-density lipoprotein-receptor YWTD domain Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. 
Also present in a variety of molecules similar to gp300/megalin. LYZ1 SM00263 Alpha-lactalbumin / lysozyme C LYZ2 SM00047 Lysozyme subfamily 2 Eubacterial enzymes distantly related to eukaryotic lysozymes. Lactamase_B SM00849 Metallo-beta-lactamase superfamily Apart from the beta-lactamases a number of other proteins contain this domain PUBMED:7588620. These proteins include thiolesterases, members of the glyoxalase II family, that catalyse the hydrolysis of S-D-lactoyl-glutathione to form glutathione and D-lactic acid and a competence protein that is essential for natural transformation in Neisseria gonorrhoeae and could be a transporter involved in DNA uptake. Except for the competence protein these proteins bind two zinc ions per molecule as cofactor. LamB SM00281 Laminin B domain LamG SM00282 Laminin G domain LamGL SM00560 LamG-like jellyroll fold domain LamNT SM00136 Laminin N-terminal domain (domain VI) N-terminal domain of laminins and laminin-related proteins such as Unc-6/netrins. LeuA_dimer SM00917 LeuA allosteric (dimerisation) domain This is the C-terminal regulatory (R) domain of alpha-isopropylmalate synthase, which catalyses the first committed step in the leucine biosynthetic pathway PUBMED:15159544. This domain is an internally duplicated structure with a novel fold PUBMED:15159544. It comprises two similar units that are arranged such that the two alpha-helices pack together in the centre, crossing at an angle of 34 degrees, sandwiched between the two three-stranded, antiparallel beta-sheets. The overall domain is thus constructed as a beta-alpha-beta three-layer sandwich PUBMED:15159544. Lig_chan-Glu_bd SM00918 Ligated ion channel L-glutamate- and glycine-binding site This region, sometimes called the S1 domain, is the luminal domain just upstream of the first, M1, transmembrane region of transmembrane ion-channel proteins, and it binds L-glutamate and glycine. It is found in association with Lig_chan.
LisH SM00667 Lissencephaly type-1-like homology motif Alpha-helical motif present in Lis1, treacle, Nopp140, some katanin p60 subunits, muskelin, tonneau, LEUNIG and numerous WD40 repeat-containing proteins. It is suggested that LisH motifs contribute to the regulation of microtubule dynamics, either by mediating dimerisation, or else by binding cytoplasmic dynein heavy chain or microtubules directly. LysM SM00257 Lysin motif LytTR SM00850 LytTr DNA-binding domain This domain is found in a variety of bacterial transcriptional regulators. The domain binds to a specific DNA sequence pattern. MA SM00283 Methyl-accepting chemotaxis-like domains (chemotaxis sensory transducer). Thought to undergo reversible methylation in response to attractants or repellents during bacterial chemotaxis. MA3 SM00544 Domain in DAP-5, eIF4G, MA-3 and other proteins. Highly alpha-helical. May contain repeats and/or regions similar to MIF4G domains. Ponting (TIBS) "Novel eIF4G domain homologues", in press. MACPF SM00457 membrane-attack complex / perforin MADF SM00595 subfamily of SANT domain MADS SM00432 MAM SM00137 Domain in meprin, A5, receptor protein tyrosine phosphatase mu (and others) Likely to have an adhesive function. Mutations in the meprin MAM domain affect noncovalent associations within meprin oligomers. In receptor tyrosine phosphatase mu-like molecules the MAM domain is important for homophilic cell-cell interactions. MANEC SM00765 The MANEC domain was formerly called MANSC. This domain, comprising 8 conserved cysteines, is found in the N terminus of higher multicellular animal membrane and extracellular proteins. It is postulated that this domain may play a role in the formation of protein complexes involving various protease activators and inhibitors. It is possible that some of the cysteine residues in the MANSC domain form structurally important disulfide bridges. All of the MANSC-containing proteins contain predicted transmembrane regions and signal peptides.
It has been proposed that the MANSC domain in HAI-1 might function through binding with hepatocyte growth factor activator and matriptase. MATH SM00061 meprin and TRAF homology MBD SM00391 Methyl-CpG binding domain Methyl-CpG binding domain, also known as the TAM (TTF-IIP5, ARBP, MeCP1) domain MBT SM00561 Present in Drosophila Scm, l(3)mbt, and vertebrate SCML2 Present in Drosophila Scm, l(3)mbt, and vertebrate SCML2. These proteins are involved in transcriptional regulation. MCM SM00350 minichromosome maintenance proteins MD SM00604 MGS SM00851 MGS-like domain This domain composes the whole protein of methylglyoxal synthetase and the domain is also found in Carbamoyl phosphate synthetase (CPS) where it forms a regulatory domain that binds to the allosteric effector ornithine. This family also includes inosicase. The known structures in this family show a common phosphate binding site PUBMED:10526357. MHC_II_alpha SM00920 Class II histocompatibility antigen, alpha domain Class II MHC glycoproteins are expressed on the surface of antigen-presenting cells (APC), including macrophages, dendritic cells and B cells. MHC II proteins present peptide antigens that originate extracellularly from foreign bodies such as bacteria. Proteins from the pathogen are degraded into peptide fragments within the APC, which sequesters these fragments into the endosome so they can bind to MHC class II proteins, before being transported to the cell surface. MHC class II receptors display antigens for recognition by helper T cells (stimulate development of B cell clones) and inflammatory T cells (cause the release of lymphokines that attract other cells to site of infection) PUBMED:15120183. MHC_II_beta SM00921 Class II histocompatibility antigen, beta domain Class II MHC glycoproteins are expressed on the surface of antigen-presenting cells (APC), including macrophages, dendritic cells and B cells. 
MHC II proteins present peptide antigens that originate extracellularly from foreign bodies such as bacteria. Proteins from the pathogen are degraded into peptide fragments within the APC, which sequesters these fragments into the endosome so they can bind to MHC class II proteins, before being transported to the cell surface. MHC class II receptors display antigens for recognition by helper T cells (stimulate development of B cell clones) and inflammatory T cells (cause the release of lymphokines that attract other cells to site of infection) PUBMED:15120183. MIF4G SM00543 Middle domain of eukaryotic initiation factor 4G (eIF4G) Also occurs in NMD2p and CBP80. The domain is rich in alpha-helices and may contain multiple alpha-helical repeats. In eIF4G, this domain binds eIF4A, eIF3, RNA and DNA. Ponting (TiBS) "Novel eIF4G domain homologues" (in press). MIR SM00472 Domain in ryanodine and inositol trisphosphate receptors and protein O-mannosyltransferases MIT SM00745 Microtubule Interacting and Trafficking molecule domain ML SM00737 Domain involved in innate immunity and lipid metabolism. ML (MD-2-related lipid-recognition) is a novel domain identified in MD-1, MD-2, GM2A, Npc2 and multiple proteins of unknown function in plants, animals and fungi. These single-domain proteins were predicted to form a beta-rich fold containing multiple strands, and to mediate diverse biological functions through interacting with specific lipids. MORN SM00698 Possible plasma membrane-binding motif in junctophilins, PIP-5-kinases and protein kinases. MR_MLE SM00922 Mandelate racemase / muconate lactonizing enzyme, C-terminal domain Mandelate racemase (MR) and muconate lactonizing enzyme (MLE) are two bacterial enzymes involved in aromatic acid catabolism. They catalyze mechanistically distinct reactions yet they are related at the level of their primary, quaternary (homooctamer) and tertiary structures PUBMED:2215699, PUBMED:8256284.
This entry represents the C-terminal region of these proteins. MUTSac SM00534 ATPase domain of DNA mismatch repair MUTS family MUTSd SM00533 DNA-binding domain of DNA mismatch repair MUTS family MYSc SM00242 Myosin. Large ATPases. ATPase; molecular motor. Muscle contraction consists of a cyclical interaction between myosin and actin. The core of the myosin structure is similar in fold to that of kinesin. Mad3_BUB1_I SM00777 Mad3/BUB1 homology region 1 Proteins containing this domain are checkpoint proteins involved in cell division. This region has been shown to be essential for the binding of BUB1 and MAD3 to CDC20p. Malic_M SM00919 Malic enzyme, NAD binding domain Malic enzymes (malate oxidoreductases) catalyse the oxidative decarboxylation of malate to form pyruvate. MbtH SM00923 MbtH-like protein This domain is found in the MbtH protein as well as at the N-terminus of the antibiotic synthesis protein NIKP1. This domain is about 70 amino acids long and contains 3 fully conserved tryptophan residues. Many of the members of this family are found in known antibiotic synthesis gene clusters. MeTrc SM00138 Methyltransferase, chemotaxis proteins Methylates methyl-accepting chemotaxis proteins to form gamma-glutamyl methyl ester residues. MgtE_N SM00924 MgtE intracellular N domain This region is the integral membrane part of the eubacterial MgtE family of magnesium transporters. It is presumed to be an intracellular domain that may be involved in magnesium binding. MltA SM00925 MltA specific insert domain This beta barrel domain is found inserted in MltA, a murein-degrading transglycosylase enzyme. This domain may be involved in peptidoglycan binding. MoCF_biosynth SM00852 Probable molybdopterin binding domain This domain is found in a variety of proteins involved in biosynthesis of molybdopterin cofactor. The domain is presumed to bind molybdopterin. The structure of this domain is known, and it forms an alpha/beta structure.
In the known structure of Gephyrin this domain mediates trimerisation. Molybdop_Fe4S4 SM00926 Molybdopterin oxidoreductase Fe4S4 domain The molybdopterin oxidoreductase Fe4S4 domain is found in a number of reductase/dehydrogenase families, which include the periplasmic nitrate reductase precursor and the formate dehydrogenase alpha chain. Mterf SM00733 Mitochondrial termination factor repeats Human mitochondrial termination factor is a DNA-binding protein that acts as a transcription termination factor. Six repeats occur in human mTERF, that also are present in numerous plant proteins. MutH SM00927 DNA mismatch repair enzyme MutH MutS, MutL and MutH are the three essential proteins for initiation of methyl-directed DNA mismatch repair to correct mistakes made during DNA replication in Escherichia coli. MutH cleaves a newly synthesized and unmethylated daughter strand 5' to the sequence d(GATC) in a hemi-methylated duplex. Activation of MutH requires the recognition of a DNA mismatch by MutS and MutL PUBMED:9482749. MutL_C SM00853 MutL C terminal dimerisation domain MutL and MutS are key components of the DNA repair machinery that corrects replication errors. MutS recognises mispaired or unpaired bases in a DNA duplex and in the presence of ATP, recruits MutL to form a DNA signaling complex for repair. The N terminal region of MutL contains the ATPase domain and the C terminal is involved in dimerisation. MyTH4 SM00139 Domain in Myosin and Kinesin Tails Domain present twice in myosin-VIIa, and also present in 3 other myosins. NADH-G_4Fe-4S_3 SM00929 NADH-ubiquinone oxidoreductase-G iron-sulfur binding region NADH_4Fe-4S SM00928 NADH-ubiquinone oxidoreductase-F iron-sulfur binding region NAT_PEP SM00183 Natriuretic peptide Atrial natriuretic peptides are vertebrate hormones important in the overall control of cardiovascular homeostasis and sodium and water balance in general. 
NDK SM00562 These are enzymes that catalyze non-substrate-specific conversions of nucleoside diphosphates to nucleoside triphosphates. These enzymes play important roles in bacterial growth, signal transduction and pathogenicity. NEAT SM00725 NEAr Transporter domain NEBU SM00227 The Nebulin repeat is present also in Las1. Tandem arrays of these repeats are known to bind actin. NEUZ SM00588 domain in neuralized proteins NGF SM00140 Nerve growth factor (NGF or beta-NGF) NGF is important for the development and maintenance of the sympathetic and sensory nervous systems. NGN SM00738 In Spt5p, this domain may confer affinity for Spt4p. It possesses an RNP-like fold. In Spt5p, this domain may confer affinity for Spt4p. NH SM00003 Neurohypophysial hormones Vasopressin/oxytocin gene family. NIDO SM00539 Extracellular domain of unknown function in nidogen (entactin) and hypothetical proteins. NIL SM00930 This domain is found at the C-terminus of ABC transporter proteins involved in D-methionine transport as well as a number of ferredoxin-like proteins. This domain is likely to act as a substrate binding domain. The domain has been named after a conserved sequence in some members of the family. NL SM00004 Domain found in Notch and Lin-12 The Notch protein is essential for the proper differentiation of the Drosophila ectoderm. This protein contains 3 NL domains. NMU SM00084 Neuromedin U Neuromedin U (NmU) is a vertebrate peptide which stimulates uterine smooth muscle contraction and causes selective vasoconstriction. Like most other active peptides, it is proteolytically processed from a larger precursor protein. The mature peptides are 8 (NmU-8) to 25 (NmU-25) residues long and C-terminally amidated. The sequence of the C-terminal extremity of NmU is extremely well conserved in mammals, birds and amphibians. NOSIC SM00931 NOSIC (NUC001) domain This is the central domain in Nop56/SIK1-like proteins PUBMED:15112237.
NPCBM SM00776 This novel putative carbohydrate binding module (NPCBM) domain is found at the N-terminus of glycosyl hydrolase family 98 proteins. NRF SM00703 N-terminal domain in C. elegans NRF-6 (Nose Resistant to Fluoxetine-4) and NDG-4 (resistant to nordihydroguaiaretic acid-4). Also present in several other worm and fly proteins. NTR SM00206 Tissue inhibitor of metalloproteinase family. Form complexes with metalloproteinases, such as collagenases, and irreversibly inactivate them. NUC SM00477 DNA/RNA non-specific endonuclease prokaryotic and eukaryotic double- and single-stranded DNA and RNA endonucleases also present in phosphodiesterases Nfu_N SM00932 Scaffold protein Nfu/NifU N terminal This domain is found at the N terminus of NifU and NifU related proteins, and in the human Nfu protein. Both of these proteins are thought to be involved in the assembly of iron-sulphur clusters PUBMED:12886008. NurA SM00933 This family includes NurA a nuclease exhibiting both single-stranded endonuclease activity and 5'-3' exonuclease activity on single-stranded and double-stranded DNA from the hyperthermophilic archaeon Sulfolobus acidocaldarius PUBMED:12052775. OLF SM00284 Olfactomedin-like domains OMPdecase SM00934 Orotidine 5'-phosphate decarboxylase / HUMPS family Orotidine 5'-phosphate decarboxylase (OMPdecase) PUBMED:2835631, PUBMED:1730672 catalyzes the last step in the de novo biosynthesis of pyrimidines, the decarboxylation of OMP into UMP. In higher eukaryotes OMPdecase is part, with orotate phosphoribosyltransferase, of a bifunctional enzyme, while the prokaryotic and fungal OMPdecases are monofunctional proteins. ORANGE SM00511 Orange domain This domain confers specificity among members of the Hairy/E(SPL) family. OSTEO SM00017 Osteopontin Osteopontin is an acidic phosphorylated glycoprotein of about 40 Kd which is abundant in the mineral matrix of bones and which binds tightly to hydroxyapatite [1,2,3].
It is suggested that osteopontin might function as a cell attachment factor and could play a key role in the adhesion of osteoclasts to the mineral matrix of bone. OmpH SM00935 Outer membrane protein (OmpH-like) This family includes outer membrane proteins such as OmpH among others. Skp (OmpH) has been characterised as a molecular chaperone that interacts with unfolded proteins as they emerge in the periplasm from the Sec translocation machinery PUBMED:15304217. P-II SM00938 Nitrogen regulatory protein P-II P-II modulates the activity of glutamine synthetase. P4Hc SM00702 Prolyl 4-hydroxylase alpha subunit homologues. Mammalian enzymes catalyse hydroxylation of collagen, for example. Prokaryotic enzymes might catalyse hydroxylation of antibiotic peptides. These are 2-oxoglutarate-dependent dioxygenases, requiring 2-oxoglutarate and dioxygen as cosubstrates and ferrous iron as a cofactor. PA14 SM00758 domain in bacterial beta-glucosidases, other glycosidases, glycosyltransferases, proteases, amidases, yeast adhesins, and bacterial toxins. PA2c SM00085 Phospholipase A2 PAC SM00086 Motif C-terminal to PAS motifs (likely to contribute to PAS structural domain) PAC motif occurs C-terminal to a subset of all known PAS motifs. It is proposed to contribute to the PAS domain fold. PAH SM00309 Pancreatic hormones / neuropeptide F / peptide YY family Pancreatic hormone is a regulator of pancreatic and gastrointestinal functions. PAM SM00753 PCI/PINT associated module PAN_AP SM00473 divergent subfamily of APPLE domains Apple-like domains present in Plasminogen, C. elegans hypothetical ORFs and the extracellular portion of plant receptor-like protein kinases. Predicted to possess protein- and/or carbohydrate-binding functions. PAS SM00091 PAS domain PAS motifs appear in archaea, eubacteria and eukarya. Probably the most surprising identification of a PAS domain was that in EAG-like K+-channels ([1]; Ponting & Aravind, in press).
PASTA SM00740 PAW SM00613 domain present in PNGases and other hypothetical proteins present in several copies in proteins with unknown function in C. elegans PAX SM00351 Paired Box domain PAZ SM00949 This domain is named PAZ after the proteins Piwi, Argonaut and Zwille. This domain is found in two families of proteins that are involved in post-transcriptional gene silencing. These are the Piwi family and the Dicer family, that includes the Carpel factory protein. The function of the domains is unknown but has been suggested to mediate complex formation between proteins of the Piwi and Dicer families by hetero-dimerisation. The three-dimensional structure of this domain has been solved. The PAZ domain is composed of two subdomains. One subdomain is similar to the OB fold, albeit with a different topology. The OB-fold is well known as a single-stranded nucleic acid binding fold. The second subdomain is composed of a beta-hairpin followed by an alpha-helix. The PAZ domain shows low-affinity nucleic acid binding and appears to interact with the 3' ends of single-stranded regions of RNA in the cleft between the two subdomains. PAZ can bind the characteristic two-base 3' overhangs of siRNAs, indicating that although PAZ may not be a primary nucleic acid binding site in Dicer or RISC, it may contribute to the specific and productive incorporation of siRNAs and miRNAs into the RNAi pathway. PB1 SM00666 PB1 domain Phox and Bem1p domain, present in many eukaryotic cytoplasmic signalling proteins. The domain adopts a beta-grasp fold, similar to that found in ubiquitin and Ras-binding domains. A motif, variously termed OPR, PC and AID, represents the most conserved region of the majority of PB1 domains, and is necessary for PB1 domain function. This function is the formation of PB1 domain heterodimers, although not all PB1 domain pairs associate. PBD SM00285 P21-Rho-binding domain Small domains that bind Cdc42p- and/or Rho-like small GTPases.
Also known as the Cdc42/Rac interactive binding (CRIB). PBP5_C SM00936 Penicillin-binding protein 5, C-terminal domain Penicillin-binding protein 5 expressed by E. coli (P04287) functions as a D-alanyl-D-alanine carboxypeptidase. It is composed of two domains that are oriented at approximately right angles to each other. The N-terminal domain (PF00768) is the catalytic domain. The C-terminal domain featured in this family is organised into a sandwich of two anti-parallel beta-sheets, and has a relatively hydrophobic surface as compared to the N-terminal domain. Its precise function is unknown; it may mediate interactions with other cell wall-synthesising enzymes, thus allowing the protein to be recruited to areas of active cell wall synthesis. It may also function as a linker domain that positions the active site in the catalytic domain closer to the peptidoglycan layer, to allow it to interact with cell wall peptides PUBMED:10967102. PBPb SM00062 Bacterial periplasmic substrate-binding proteins bacterial proteins, eukaryotic ones are in PBPe PBPe SM00079 Eukaryotic homologues of bacterial periplasmic substrate binding proteins. Prokaryotic homologues are represented by a separate alignment: PBPb PCRF SM00937 This domain is found in peptide chain release factors. PD SM00018 P or trefoil or TFF domain Proposed role in renewal and pathology of mucous epithelia. PDGF SM00141 Platelet-derived and vascular endothelial growth factors (PDGF, VEGF) family Platelet-derived growth factor is a potent activator for cells of mesenchymal origin. PDGF-A and PDGF-B form AA and BB homodimers and an AB heterodimer. Members of the VEGF family are homologues of PDGF. PDZ SM00228 Domain present in PSD-95, Dlg, and ZO-1/2. Also called DHR (Dlg homologous region) or GLGF (relatively well conserved tetrapeptide in these domains). Some PDZs have been shown to bind C-terminal polypeptides; others appear to bind internal (non-C-terminal) polypeptides. 
Different PDZs possess different binding specificities. PGAM SM00855 Phosphoglycerate mutase family Phosphoglycerate mutase (PGAM) and bisphosphoglycerate mutase (BPGM) are structurally related enzymes that catalyse reactions involving the transfer of phospho groups between the three carbon atoms of phosphoglycerate PUBMED:2847721, PUBMED:2831102, PUBMED:10958932. Both enzymes can catalyse three different reactions with different specificities: the isomerization of 2-phosphoglycerate (2-PGA) to 3-phosphoglycerate (3-PGA) with 2,3-diphosphoglycerate (2,3-DPG) as the primer of the reaction, the synthesis of 2,3-DPG from 1,3-DPG with 3-PGA as a primer and the degradation of 2,3-DPG to 3-PGA (phosphatase activity). In mammals, PGAM is a dimeric protein with two isoforms, the M (muscle) and B (brain) forms. In yeast, PGAM is a tetrameric protein. PGA_cap SM00854 Bacterial capsule synthesis protein PGA_cap This protein is a putative poly-gamma-glutamate capsule biosynthesis protein found in bacteria. Poly-gamma-glutamate is a natural polymer that may be involved in virulence and may help bacteria survive in high salt concentrations. It is a surface-associated protein. PGRP SM00701 Animal peptidoglycan recognition proteins homologous to Bacteriophage T3 lysozyme. The bacteriophage molecule, but not its moth homologue, has been shown to have N-acetylmuramoyl-L-alanine amidase activity. One member of this family, Tag7, is a cytokine. PH SM00233 Pleckstrin homology domain. Domain commonly found in eukaryotic signalling proteins. The domain family possesses multiple functions including the abilities to bind inositol phosphates, and various proteins. PH domains have been found to possess inserted domains (such as in PLC gamma, syntrophins) and to be inserted within other domains. Mutations in Bruton's tyrosine kinase (Btk) within its PH domain cause X-linked agammaglobulinaemia (XLA) in patients.
Point mutations cluster into the positively charged end of the molecule around the predicted binding site for phosphatidylinositol lipids. PHB SM00244 prohibitin homologues prohibitin homologues PHD SM00249 PHD zinc finger The plant homeodomain (PHD) finger is a C4HC3 zinc-finger-like motif found in nuclear proteins thought to be involved in epigenetics and chromatin-mediated transcriptional regulation. The PHD finger binds two zinc ions using the so-called 'cross-brace' motif and is thus structurally related to the RING finger and the FYVE finger. It is not yet known if PHD fingers have a common molecular function. Several reports suggest that it can function as a protein-protein interaction domain and it was recently demonstrated that the PHD finger of p300 can cooperate with the adjacent BROMO domain in nucleosome binding in vitro. Other reports suggesting that the PHD finger is a ubiquitin ligase have been refuted as these domains were RING fingers misidentified as PHD fingers. PI3K_C2 SM00142 Phosphoinositide 3-kinase, region postulated to contain C2 domain Outlier of C2 family. PI3K_p85B SM00143 PI3-kinase family, p85-binding domain Region of p110 PI3K that binds the p85 subunit. PI3K_rbd SM00144 PI3-kinase family, Ras-binding domain Certain members of the PI3K family possess Ras-binding domains in their N-termini. These regions show some similarity (although not highly significant similarity) to Ras-binding RA domains (unpublished observation). PI3Ka SM00145 Phosphoinositide 3-kinase family, accessory domain (PIK domain) PIK domain is conserved in all PI3 and PI4-kinases. Its role is unclear but it has been suggested to be involved in substrate presentation. PI3Kc SM00146 Phosphoinositide 3-kinase, catalytic domain Phosphoinositide 3-kinase isoforms participate in a variety of processes, including cell motility, the Ras pathway, vesicle trafficking and secretion, and apoptosis.
These homologues may be either lipid kinases and/or protein kinases: the former phosphorylate the 3-position in the inositol ring of inositol phospholipids. The ataxia telangiectasia-mutated gene product, the targets of rapamycin (TOR) and the DNA-dependent kinase have not been found to possess lipid kinase activity. Some of this family possess PI-4 kinase activities. PIG-X SM00780 PIG-X / PBN1 Mammalian PIG-X and yeast PBN1 are essential components of glycosylphosphatidylinositol-mannosyltransferase I. These enzymes are involved in the transfer of sugar molecules. PINT SM00088 motif in proteasome subunits, Int-6, Nip-1 and TRIP-15 Also called the PCI (Proteasome, COP9, Initiation factor 3) domain. Unknown function. PINc SM00670 Large family of predicted nucleotide-binding domains From similarities to 5'-exonucleases, these domains are predicted to be RNases. PINc domains in nematode SMG-5 and yeast NMD4p are predicted to be involved in RNAi. PIPKc SM00330 Phosphatidylinositol phosphate kinases PKD SM00089 Repeats in polycystic kidney disease 1 (PKD1) and other proteins Polycystic kidney disease 1 protein contains 14 repeats, present elsewhere such as in microbial collagenases. PKS_AT SM00827 Acyl transferase domain in polyketide synthase (PKS) enzymes. PKS_DH SM00826 PKS_ER SM00829 Enoylreductase Enoylreductase in Polyketide synthases. PKS_KR SM00822 This enzymatic domain is part of bacterial polyketide synthases and catalyses the first step in the reductive modification of the beta-carbonyl centres in the growing polyketide chain. It uses NADPH to reduce the keto group to a hydroxy group. PKS_KS SM00825 Beta-ketoacyl synthase The structure of beta-ketoacyl synthase is similar to that of the thiolase family and also chalcone synthase. The active site of beta-ketoacyl synthase is located between the N and C-terminal domains. PKS_MT SM00828 Methyltransferase in polyketide synthase (PKS) enzymes. 
PKS_PP SM00823 Phosphopantetheine attachment site Phosphopantetheine (or pantetheine 4' phosphate) is the prosthetic group of acyl carrier proteins (ACP) in some multienzyme complexes where it serves as a 'swinging arm' for the attachment of activated fatty acid and amino-acid groups PUBMED:5321311. PKS_TE SM00824 Thioesterase Peptide synthetases are involved in the non-ribosomal synthesis of peptide antibiotics. Next to the operons encoding these enzymes, in almost all cases, are genes that encode proteins that have similarity to the type II fatty acid thioesterases of vertebrates. There are also modules within the peptide synthetases that also share this similarity. With respect to antibiotic production, thioesterases are required for the addition of the last amino acid to the peptide antibiotic, thereby forming a cyclic antibiotic. Thioesterases (non-integrated) have molecular masses of 25-29 kDa. PLAc SM00022 Cytoplasmic phospholipase A2, catalytic subunit Cytosolic phospholipases A2 hydrolyse arachidonyl phospholipids. Family includes phospholipases B isoforms. PLCXc SM00148 Phospholipase C, catalytic domain (part); domain X Phosphoinositide-specific phospholipases C. These enzymes contain 2 regions (X and Y) which together form a TIM barrel-like structure containing the active site residues. Phospholipase C enzymes (PI-PLC) act as signal transducers that generate two second messengers, inositol-1,4,5-trisphosphate and diacylglycerol. The bacterial enzyme [6] appears to be a homologue of the mammalian PLCs. PLCYc SM00149 Phospholipase C, catalytic domain (part); domain Y Phosphoinositide-specific phospholipases C. These enzymes contain 2 regions (X and Y) which together form a TIM barrel-like structure containing the active site residues. Phospholipase C enzymes (PI-PLC) act as signal transducers that generate two second messengers, inositol-1,4,5-trisphosphate and diacylglycerol. The bacterial enzyme [6] appears to be a homologue of the mammalian PLCs. 
PLDc SM00155 Phospholipase D. Active site motifs. Phosphatidylcholine-hydrolyzing phospholipase D (PLD) isoforms are activated by ADP-ribosylation factors (ARFs). PLD produces phosphatidic acid from phosphatidylcholine, which may be essential for the formation of certain types of transport vesicles or may be constitutive vesicular transport to signal transduction pathways. PC-hydrolysing PLD is a homologue of cardiolipin synthase, phosphatidylserine synthase, bacterial PLDs, and viral proteins. Each of these appears to possess a domain duplication which is apparent by the presence of two motifs containing well-conserved histidine, lysine, aspartic acid, and/or asparagine residues which may contribute to the active site. An E. coli endonuclease (nuc) and similar proteins appear to be PLD homologues but possess only one of these motifs. The profile contained here represents only the putative active site regions, since an accurate multiple alignment of the repeat units has not been achieved. PLEC SM00250 Plectin repeat PLP SM00002 Myelin proteolipid protein (PLP or lipophilin) PMEI SM00856 Plant invertase/pectin methylesterase inhibitor This domain inhibits pectin methylesterases (PMEs) and invertases through formation of a non-covalent 1:1 complex PUBMED:8521860. It has been implicated in the regulation of fruit development, carbohydrate metabolism and cell wall extension. It may also be involved in inhibiting microbial pathogen PMEs. It has been observed that it is often expressed as a large inactive preprotein PUBMED:8521860. It is also found at the N-termini of PMEs predicted from DNA sequences, suggesting that both PMEs and their inhibitors are expressed as a single polyprotein and subsequently processed. It has two disulphide bridges and is mainly alpha-helical PUBMED:10880981. 
POL3Bc SM00480 DNA polymerase III beta subunit POLAc SM00482 DNA polymerase A domain POLBc SM00486 DNA polymerase type-B family DNA polymerase alpha, delta, epsilon and zeta chain (eukaryota), DNA polymerases in archaea, DNA polymerase II in E. coli, mitochondrial DNA polymerases and virus DNA polymerases POLIIIAc SM00481 DNA polymerase alpha chain like domain DNA polymerase alpha chain like domain, incl. family of hypothetical proteins POLXc SM00483 DNA polymerase X family includes vertebrate polymerase beta and terminal deoxynucleotidyltransferases POP4 SM00538 A domain found in a protein subunit of human RNase MRP and RNase P ribonucleoprotein complexes and archaeal proteins. POU SM00352 Found in Pit-Oct-Unc transcription factors POX SM00574 domain associated with HOX domains PP2Ac SM00156 Protein phosphatase 2A homologues, catalytic domain. Large family of serine/threonine phosphatases that includes PP1, PP2A and PP2B (calcineurin) family members. PP2C_SIG SM00331 Sigma factor PP2C-like phosphatases PP2Cc SM00332 Serine/threonine phosphatases, family 2C, catalytic domain The protein architecture and deduced catalytic mechanism of PP2C phosphatases are similar to the PP1, PP2A, PP2B family of protein Ser/Thr phosphatases, with which PP2C shares no sequence similarity. PQQ SM00564 beta-propeller repeat Beta-propeller repeat occurring in enzymes with pyrrolo-quinoline quinone (PQQ) as cofactor, in Ire1p-like Ser/Thr kinases, and in prokaryotic dehydrogenases. PRE_C2HC SM00596 PROF SM00392 Profilin Binds actin monomers, membrane polyphosphoinositides and poly-L-proline. PRP SM00157 Major prion protein The prion protein is a major component of scrapie-associated fibrils in Creutzfeldt-Jakob disease, kuru, Gerstmann-Straussler syndrome and bovine spongiform encephalopathy. 
PRY SM00589 associated with SPRY domains PSA SM00639 Paramecium Surface Antigen Repeat PSI SM00423 domain found in Plexins, Semaphorins and Integrins PSN SM00730 Presenilin, signal peptide peptidase, family Presenilin 1 and presenilin 2 are polytopic membrane proteins, whose genes are mutated in some individuals with Alzheimer's disease. Distant homologues, present in eukaryotes and archaea, also contain conserved aspartic acid residues which are predicted to contribute to catalysis. At least one member of this family has been shown to possess signal peptide peptidase activity. PSP SM00581 proline-rich domain in spliceosome associated proteins PTB SM00462 Phosphotyrosine-binding domain, phosphotyrosine-interaction (PI) domain PTB/PI domain structure similar to those of pleckstrin homology (PH) and IRS-1-like PTB domains. PTBI SM00310 Phosphotyrosine-binding domain (IRS1-like) PTH SM00087 Parathyroid hormone PTI SM00286 Plant trypsin inhibitors PTN SM00193 Pleiotrophin / midkine family Heparin-binding domain family. PTPc SM00194 Protein tyrosine phosphatase, catalytic domain PTPc_DSPc SM00012 Protein tyrosine phosphatase, catalytic domain, undefined specificity Protein tyrosine phosphatases. Homologues detected by this profile and not by those of "PTPc" or "DSPc" are predicted to be protein phosphatases with a similar fold to DSPs and PTPs, yet with unpredicted specificities. PTPc_motif SM00404 Protein tyrosine phosphatase, catalytic domain motif PTX SM00159 Pentraxin / C-reactive protein / pentaxin family This family forms a discoid pentameric structure. Human serum amyloid P demonstrates calcium-mediated ligand-binding. PUA SM00359 Putative RNA-binding Domain in PseudoUridine synthase and Archaeosine transglycosylase PUG SM00580 domain in protein kinases, N-glycanases and other nuclear proteins PUR SM00712 DNA/RNA-binding repeats in PUR-alpha/beta/gamma and in hypothetical proteins from spirochetes and the Bacteroides-Cytophaga-Flexibacter bacteria. 
PWI SM00311 PWI, domain in splicing factors PWWP SM00293 domain with conserved PWWP motif conservation of Pro-Trp-Trp-Pro residues PX SM00312 PhoX homologous domain, present in p47phox and p40phox. Eukaryotic domain of unknown function present in phox proteins, PLD isoforms, a PI3K isoform. PXA SM00313 Domain associated with PX domains unpubl. observations PYNP_C SM00941 Pyrimidine nucleoside phosphorylase C-terminal domain This domain is found at the C-terminal end of the large alpha/beta domain making up various pyrimidine nucleoside phosphorylases PUBMED:9817849, PUBMED:2199449. It has slightly different conformations in different members of this family. For example, in pyrimidine nucleoside phosphorylase (PYNP) there is an added three-stranded anti-parallel beta sheet as compared to other members of the family, such as E. coli thymidine phosphorylase (TP) PUBMED:9817849. The domain contains an alpha/beta hammerhead fold and residues in this domain seem to be important in formation of the homodimer PUBMED:9817849. ParB SM00470 ParB-like nuclease domain Plasmid RK2 ParB preferentially cleaves single-stranded DNA. ParB also nicks supercoiled plasmid DNA preferably at sites with potential single-stranded character, like AT-rich regions and sequences that can form cruciform structures. ParB also exhibits 5'-->3' exonuclease activity. PbH1 SM00710 Parallel beta-helix repeats The tertiary structures of pectate lyases and rhamnogalacturonase A show a stack of parallel beta strands that are coiled into a large helix. Each coil of the helix represents a structural repeat that, in some homologues, can be recognised from sequence information alone. Conservation of asparagines might be connected with asparagine-ladders that contribute to the stability of the fold. Proteins containing these repeats most often are enzymes with polysaccharide substrates. 
PepX_C SM00939 X-Pro dipeptidyl-peptidase C-terminal non-catalytic domain This domain is found at the C-terminus of cocaine esterase CocE, several glutaryl-7-ACA acylases, and the putative diester hydrolase NonD of Streptomyces griseus (all hydrolases). The domain, which is a beta sandwich, is also found in serine peptidases belonging to MEROPS peptidase family S15: Xaa-Pro dipeptidyl-peptidases. Members of this entry that are not characterised as peptidases show extensive low-level similarity to the Xaa-Pro dipeptidyl-peptidases. PepX_N SM00940 X-Prolyl dipeptidyl aminopeptidase PepX, N-terminal This N-terminal domain adopts a secondary structure consisting of a helical bundle of eight alpha helices and three beta strands, with the last alpha helix connecting to the first strand of the catalytic domain. The first strand of the N-terminus also forms a small parallel beta sheet with strand five of the catalytic domain. This domain mediates dimerisation of the protein, with two proline residues present in the domain being critical for interaction PUBMED:12377124. Pept_C1 SM00645 Papain family cysteine protease PhBP SM00708 Insect pheromone/odorant binding protein domains. PhnA_Zn_Ribbon SM00782 PhnA Zinc-Ribbon This protein family includes an uncharacterised member designated phnA in Escherichia coli, part of a large operon associated with alkylphosphonate uptake and carbon-phosphorus bond cleavage. This protein is not related to the characterised phosphonoacetate hydrolase designated PhnA. Piwi SM00950 This domain is found in the protein Piwi and its relatives. The function of this domain is the dsRNA guided hydrolysis of ssRNA. Determination of the crystal structure of Argonaute reveals that PIWI is an RNase H domain, and identifies Argonaute as Slicer, the enzyme that cleaves mRNA in the RNAi RISC complex PUBMED:15284453. In addition, Mg2+ dependence and production of 3'-OH and 5' phosphate products are shared characteristics of RNaseH and RISC. 
The PIWI domain core has a tertiary structure belonging to the RNase H family of enzymes. RNase H fold proteins all have a five-stranded mixed beta-sheet surrounded by helices. By analogy to RNase H enzymes which cleave single-stranded RNA guided by the DNA strand in an RNA/DNA hybrid, the PIWI domain can be inferred to cleave single-stranded RNA, for example mRNA, guided by double stranded siRNA. PlsC SM00563 Phosphate acyltransferases Function in phospholipid biosynthesis and have either glycerolphosphate, 1-acylglycerolphosphate, or 2-acylglycerolphosphoethanolamine acyltransferase activities. Tafazzin, the product of the gene mutated in patients with Barth syndrome, is a member of this family. Plus3 SM00719 Short conserved domain in transcriptional regulators. Plus3 domains occur in the Saccharomyces cerevisiae Rtf1p protein, which interacts with Spt6p, and in parsley CIP, which interacts with the bZIP protein CPRF1. PolyA SM00517 C-terminal domain of Poly(A)-binding protein. Present also in Drosophila hyperplastic discs protein. Involved in homodimerisation (either directly or indirectly). PostSET SM00508 Cysteine-rich motif following a subset of SET domains PreSET SM00468 N-terminal to some SET domains A Cys-rich putative Zn2+-binding domain that occurs N-terminal to some SET domains. Function is unknown. Unpublished. PriCT_1 SM00942 Primase C terminal 1 (PriCT-1) This alpha helical domain is found at the C-terminus of primases. Prim-Pol SM00943 Bifunctional DNA primase/polymerase, N-terminal Members of this family adopt a structure consisting of a core of antiparallel beta sheets. They are found in various bacterial hypothetical proteins, and have been shown to harbour both primase and polymerase activities PUBMED:14730355. Prim_Zn_Ribbon SM00778 Zinc-binding domain of primase-helicase This region represents the zinc binding domain. 
It is found in the N-terminal region of the bacteriophage P4 alpha protein, which is a multifunctional protein with origin recognition, helicase and primase activities. Pro-kuma_activ SM00944 Pro-kumamolisin, activation domain This domain is found at the N-terminus of peptidases belonging to MEROPS peptidase family S53 (sedolisin, clan SB). The domain adopts a ferredoxin-like fold, with an alpha+beta sandwich. Cleavage of the domain results in activation of the peptidase PUBMED:15242607. ProQ SM00945 ProQ/FINO family This family includes ProQ, which is required for full activation of the osmoprotectant transporter, ProP, in Escherichia coli. ProRS-C_1 SM00946 Prolyl-tRNA synthetase, C-terminal Members of this family are predominantly found in prokaryotic prolyl-tRNA synthetase. They contain a zinc binding site, and adopt a structure consisting of alpha helices and antiparallel beta sheets arranged in 2 layers, in a beta-alpha-beta-alpha-beta motif PUBMED:12578991. Pro_CA SM00947 Carbonic anhydrase Carbonic anhydrases (CA) are zinc metalloenzymes which catalyze the reversible hydration of carbon dioxide. In Escherichia coli, CA (gene cynT) is involved in recycling carbon dioxide formed in the bicarbonate-dependent decomposition of cyanate by cyanase (gene cynS). By this action, it prevents the depletion of cellular bicarbonate PUBMED:1740425. In photosynthetic bacteria and plant chloroplast, CA is essential to inorganic carbon fixation PUBMED:1584776. Prokaryotic and plant chloroplast CA are structurally and evolutionarily related and form a family distinct from the one which groups the many different forms of eukaryotic CAs. Proteasome_A_N SM00948 Proteasome subunit A N-terminal signature This domain is conserved in the A subunits of the proteasome complex proteins. Pumilio SM00025 Pumilio-like repeats Pumilio-like repeats that bind RNA. QLQ SM00951 QLQ is named after the conserved Gln, Leu, Gln motif. 
QLQ is found at the N-terminus of SWI2/SNF2 protein, which has been shown to be involved in protein-protein interactions. QLQ has been postulated to be involved in mediating protein interactions PUBMED:12974814. R3H SM00393 Putative single-stranded nucleic acids-binding domain RA SM00314 Ras association (RalGDS/AF-6) domain RasGTP effectors (in cases of AF6, canoe and RalGDS); putative RasGTP effectors in other cases. Kalhammer et al. have shown that not all RA domains bind RasGTP. Predicted structure similar to that determined, and that of the RasGTP-binding domain of Raf kinase. Predicted RA domains in PLC210 and nore1 found to bind RasGTP. Included outliers (Grb7, Grb14, adenylyl cyclases etc.) RAB SM00175 Rab subfamily of small GTPases Rab GTPases are implicated in vesicle trafficking. RAN SM00176 Ran (Ras-related nuclear proteins) /TC4 subfamily of small GTPases Ran is involved in the active transport of proteins through nuclear pores. RAP SM00952 This domain is found in various eukaryotic species, particularly in apicomplexans such as Plasmodium falciparum, where it is found in proteins that are important in various parasite-host cell interactions. It is thought to be an RNA-binding domain PUBMED:15501674. RAS SM00173 Ras subfamily of RAS small GTPases Similar in fold and function to the bacterial EF-Tu GTPase. p21Ras couples receptor Tyr kinases and G protein receptors to protein kinase cascades RBD SM00455 Raf-like Ras-binding domain REC SM00448 cheY-homologous receiver domain CheY regulates the clockwise rotation of E. coli flagellar motors. This domain contains a phosphoacceptor site that is phosphorylated by histidine kinase homologues. RES SM00953 This presumed protein contains 3 highly conserved polar groups that could form an active site. These are an arginine, glutamate and serine, hence the RES domain. RES is widely distributed in bacteria and is about 150 residues in length. 
RGS SM00315 Regulator of G protein signalling domain RGS family members are GTPase-activating proteins for heterotrimeric G-protein alpha-subunits. RHO SM00174 Rho (Ras homology) subfamily of Ras-like small GTPases Members of this subfamily of Ras-like small GTPases include Cdc42 and Rac, as well as Rho isoforms. RHOD SM00450 Rhodanese Homology Domain An alpha beta fold found duplicated in the Rhodanese protein. The cysteine-containing enzymatically active version of the domain is also found in the CDC25 class of protein phosphatases and a variety of proteins such as sulfide dehydrogenases and stress proteins such as Senescence specific protein 1 in plants, PspE and GlpE in bacteria and cyanide and arsenate resistance proteins. Inactive versions with a loss of the cysteine are also seen in Dual specificity phosphatases, ubiquitin hydrolases from yeast and in sulfuryltransferases. These are likely to play a role in protein interactions. RIBOc SM00535 Ribonuclease III family RICIN SM00458 Ricin-type beta-trefoil Carbohydrate-binding domain formed from presumed gene triplication. RIIa SM00394 RIIalpha, Regulatory subunit portion of type II PKA R-subunit RIIalpha, Regulatory subunit portion of type II PKA R-subunit. Contains dimerisation interface and binding site for A-kinase-anchoring proteins (AKAPs). RING SM00184 Ring finger E3 ubiquitin-protein ligase activity is intrinsic to the RING domain of c-Cbl and is likely to be a general function of this domain; various RING fingers exhibit binding activity towards E2 ubiquitin-conjugating enzymes (Ubc's) RINGv SM00744 The RING-variant domain is a C4HC3 zinc-finger like motif found in a number of cellular and viral proteins. Some of these proteins have been shown both in vivo and in vitro to have ubiquitin E3 ligase activity. The RING-variant domain is reminiscent of both the RING and the PHD domains and may represent an evolutionary intermediate. 
To describe this domain the term PHD/LAP domain has been used in the past. Extended description: The RING-variant (RINGv) domain contains a C4HC3 zinc-finger-like motif similar to the PHD domain, while some of the spacing between the Cys/His residues follows a pattern somewhat closer to that found in the RING domain. The RINGv domain, similar to the RING, PHD and LIM domains, is thought to bind two zinc ions co-ordinated by the highly conserved Cys and His residues. RING variant domain: C-x(2)-C-x(10-45)-C-x(1)-C-x(7)-H-x(2)-C-x(11-25)-C-x(2)-C. As opposed to a PHD: C-x(1-2)-C-x(7-13)-C-x(2-4)-C-x(4-5)-H-x(2)-C-x(10-21)-C-x(2)-C. Classical RING domain: C-x(2)-C-x(9-39)-C-x(1-3)-H-x(2-3)-C-x(2)-C-x(4-48)-C-x(2)-C. RIO SM00090 RIO-like kinase RL11 SM00649 Ribosomal protein L11/L12 RNAse_Pc SM00092 Pancreatic ribonuclease RNB SM00955 This domain is the catalytic domain of ribonuclease II PUBMED:16806266. RPEL SM00707 Repeat in Drosophila CG10860, human KIAA0680 and C. elegans F26H9.2 RPOL4c SM00657 DNA-directed RNA-polymerase II subunit RPOL8c SM00658 RNA polymerase subunit 8 subunit of RNA polymerase I, II and III RPOL9 SM00661 RNA polymerase subunit 9 RPOLA_N SM00663 RNA polymerase I subunit A N-terminus RPOLCX SM00659 RNA polymerase subunit CX present in RNA polymerase I, II and III RPOLD SM00662 RNA polymerases D DNA-directed RNA polymerase subunit D and bacterial alpha chain RPR SM00582 domain present in proteins, which are involved in regulation of nuclear pre-mRNA RQC SM00956 This DNA-binding domain is found in the RecQ helicase among others and has a helix-turn-helix structure. The RQC domain, found only in RecQ family enzymes, is a high affinity G4 DNA binding domain PUBMED:16530788. 
RRM SM00360 RNA recognition motif RRM_1 SM00361 RNA recognition motif RRM_2 SM00362 RNA recognition motif RUN SM00593 domain involved in Ras-like GTPase signaling RWD SM00591 domain in RING finger and WD repeat containing proteins and DEXDc-like helicases subfamily related to the UBCc domain RanBD SM00160 Ran-binding domain Domain of approximately 150 residues that stabilises the GTP-bound form of Ran (the Ras-like nuclear small GTPase). RasGAP SM00323 GTPase-activator protein for Ras-like GTPases All alpha-helical domain that accelerates the GTPase activity of Ras, thereby "switching" it into an "off" position. Improved domain limits from structure. RasGEF SM00147 Guanine nucleotide exchange factor for Ras-like small GTPases RasGEFN SM00229 Guanine nucleotide exchange factor for Ras-like GTPases; N-terminal motif A subset of guanine nucleotide exchange factors for Ras-like small GTPases appears to possess this domain N-terminal to the RasGef (Cdc25-like) domain. The recent crystal structure of Sos shows that this domain is alpha-helical and plays a "purely structural role" (Nature 394, 337-343). RelA_SpoT SM00954 Region found in RelA / SpoT proteins The functions of Escherichia coli RelA and SpoT differ somewhat. RelA produces pppGpp (or ppGpp) from ATP and GTP (or GDP). SpoT degrades ppGpp, but may also act as a secondary ppGpp synthetase. The two proteins are strongly similar. In many species, a single homolog to SpoT and RelA appears responsible for both ppGpp synthesis and ppGpp degradation. (p)ppGpp is a regulatory metabolite of the stringent response, but appears also to be involved in antibiotic biosynthesis in some species. Resolvase SM00857 Resolvase, N terminal domain The N-terminal domain of the resolvase family contains the active site and the dimer interface. The extended arm at the C-terminus of this domain connects to the C-terminal helix-turn-helix domain of resolvase. 
RhoGAP SM00324 GTPase-activator protein for Rho-like GTPases GTPase activator proteins towards Rho/Rac/Cdc42-like small GTPases. Better domain limits and outliers. RhoGEF SM00325 Guanine nucleotide exchange factor for Rho/Rac/Cdc42-like GTPases Guanine nucleotide exchange factor for Rho/Rac/Cdc42-like GTPases Also called Dbl-homologous (DH) domain. It appears that PH domains invariably occur C-terminal to RhoGEF/DH domains. Improved coverage. Rho_N SM00959 Rho termination factor, N-terminal domain The Rho termination factor disengages newly transcribed RNA from its DNA template at certain, specific transcripts. It is thought that two copies of Rho bind to RNA and that Rho functions as a hexamer of protomers PUBMED:10230401. This domain is found to the N-terminus of the RNA binding domain. Robl_LC7 SM00960 Roadblock/LC7 domain This family includes proteins that are about 100 amino acids long and have been shown to be related PUBMED:11084347. Members of this family of proteins are associated with both flagellar outer arm dynein and Drosophila and rat brain cytoplasmic dynein. It is proposed that roadblock/LC7 family members may modulate specific dynein functions PUBMED:10402468. This family also includes Golgi-associated MP1 adapter protein and MglB from Myxococcus xanthus, a protein involved in gliding motility PUBMED:2464581. However the family also includes members from non-motile bacteria such as Streptomyces coelicolor, suggesting that the protein may play a structural or regulatory role. RuBisCO_small SM00961 Ribulose bisphosphate carboxylase, small chain RuBisCO (ribulose-1,5-bisphosphate carboxylase/oxygenase) is a bifunctional enzyme that catalyses both the carboxylation and oxygenation of ribulose-1,5-bisphosphate (RuBP), thus fixing carbon dioxide as the first step of the Calvin cycle. RuBisCO is the major protein in the stroma of chloroplasts, and in higher plants exists as a complex of 8 large and 8 small subunits. 
The function of the small subunit is unknown PUBMED:3012537. While the large subunit is coded for by a single gene, the small subunit is coded for by several different genes, which are distributed in a tissue specific manner. They are transcriptionally regulated by light receptor phytochrome PUBMED:3010233, which results in RuBisCO being more abundant during the day when it is required. S1 SM00316 Ribosomal protein S1-like RNA-binding domain S4 SM00363 S4 RNA-binding domain SAA SM00197 Serum amyloid A proteins Serum amyloid A proteins are induced during the acute-phase response. Secondary amyloidosis is characterised by the extracellular accumulation in tissues of SAA proteins. SAA proteins are apolipoproteins. SAF SM00858 This domain family includes a range of different proteins, such as antifreeze proteins, flagellar FlgA proteins and CpaB pilus proteins. SAM SM00454 Sterile alpha motif. Widespread domain in signalling and nuclear proteins. In EPH-related tyrosine kinases, appears to mediate cell-cell initiated signal transduction via the binding of SH2-containing proteins to a conserved tyrosine that is phosphorylated. In many cases mediates homodimerisation. SAM_PNT SM00251 SAM / Pointed domain A subfamily of the SAM domain SAND SM00258 SAND domain SANT SM00717 SANT SWI3, ADA2, N-CoR and TFIIIB'' DNA-binding domains SAP SM00513 Putative DNA-binding (bihelical) motif predicted to be involved in chromosomal organisation SAPA SM00162 Saposin/surfactant protein-B A-type DOMAIN Present as four and three degenerate copies, respectively, in prosaposin and surfactant protein B. Single copies in acid sphingomyelinase, NK-lysin, amoebapores and granulysin. Putative phospholipid membrane binding domains. SAR SM00178 Sar1p-like members of the Ras-family of small GTPases Yeast SAR1 is an essential gene required for transport of secretory proteins from the endoplasmic reticulum to the Golgi apparatus. 
SATase_N SM00971 Serine acetyltransferase, N-terminal The N-terminal domain of serine acetyltransferase has a sequence that is conserved in plants PUBMED:7608200 and bacteria PUBMED:7608200. SCAN SM00431 leucine rich region SCP SM00198 SCP / Tpx-1 / Ag5 / PR-1 / Sc7 family of extracellular domains. Human glioma pathogenesis-related protein GliPR and the plant pathogenesis-related protein represent functional links between plant defense systems and human immune system. This family has no known function. SCPU SM00972 Spore Coat Protein U domain This domain is found in a bacterial family of spore coat proteins PUBMED:1904442 as well as a family of secreted pili proteins involved in motility and biofilm formation PUBMED:1904442. SCY SM00199 Intercrine alpha family (small cytokine C-X-C) (chemokine CXC). Family of cytokines involved in cell-specific chemotaxis, mediation of cell growth, and the inflammatory response. SEA SM00200 Domain found in sea urchin sperm protein, enterokinase, agrin Proposed function of regulating or binding carbohydrate sidechains. SEC14 SM00516 Domain in homologues of a S. cerevisiae phosphatidylinositol transfer protein (Sec14p) Domain in homologues of a S. cerevisiae phosphatidylinositol transfer protein (Sec14p) and in RhoGAPs, RhoGEFs and the RasGAP, neurofibromin (NF1). Lipid-binding domain. The SEC14 domain of Dbl is known to associate with G protein beta/gamma subunits. SEL1 SM00671 Sel1-like repeats. These represent a subfamily of TPR (tetratricopeptide repeat) sequences. SEP SM00553 Domain present in Saccharomyces cerevisiae Shp1, Drosophila melanogaster eyes closed gene (eyc), and vertebrate p47. 
SERPIN SM00093 SERine Proteinase INhibitors SET SM00317 SET (Su(var)3-9, Enhancer-of-zeste, Trithorax) domain Putative methyl transferase, based on outlier plant homologues SFM SM00500 Splicing Factor Motif, present in Prp18 and Pr04 SF_P SM00019 Pulmonary surfactant proteins Pulmonary surfactant associated proteins promote alveolar stability by lowering the surface tension at the air-liquid interface in the peripheral air spaces. SP-C, a component of surfactant, is a highly hydrophobic peptide of 35 amino acid residues which is processed from a larger precursor protein. SP-C is post-translationally modified by the covalent attachment of two palmitoyl groups on two adjacent cysteines SH2 SM00252 Src homology 2 domains Src homology 2 domains bind phosphotyrosine-containing polypeptides via 2 surface pockets. Specificity is provided via interaction with residues that are distinct from the phosphotyrosine. Only a single occurrence of a SH2 domain has been found in S. cerevisiae. SH3 SM00326 Src homology 3 domains Src homology 3 (SH3) domains bind to target proteins through sequences containing proline and hydrophobic amino acids. Pro-containing polypeptides may bind to SH3 domains in 2 different binding orientations. SH3b SM00287 Bacterial SH3 domain homologues SHR3_chaperone SM00786 ER membrane protein SH3 This family of proteins are membrane localised chaperones that are required for correct plasma membrane localisation of amino acid permeases (AAPs) PUBMED:15623581. Shr3 prevents AAPs proteins from aggregating and assists in their correct folding. In the absence of Shr3, AAPs are retained in the ER. SMC_hinge SM00968 SMC proteins Flexible Hinge Domain This entry represents the hinge region of the SMC (Structural Maintenance of Chromosomes) family of proteins. The hinge region is responsible for formation of the DNA interacting dimer. 
It is also possible that its precise structure is an essential determinant of the specificity of the DNA-protein interaction PUBMED:12411491. SMI1_KNR4 SM00860 SMI1 / KNR4 family Proteins in this family are involved in the regulation of 1,3-beta-glucan synthase activity and cell-wall formation. SMR SM00463 Small MutS-related domain SNc SM00318 Staphylococcal nuclease homologues SO SM00201 Somatomedin B-like domains Somatomedin-B is a peptide, proteolytically excised from vitronectin, that is a growth hormone-dependent serum factor with protease-inhibiting activity. SOCS SM00253 suppressors of cytokine signalling suppressors of cytokine signalling SOCS_box SM00969 The SOCS box acts as a bridge between specific substrate-binding domains and more generic proteins that comprise a large family of E3 ubiquitin protein ligases. SPEC SM00150 Spectrin repeats SPK SM00583 domain in SET and PHD domain containing proteins and protein kinases SPRY SM00449 Domain in SPla and the RYanodine Receptor. Domain of unknown function. Distant homologues are domains in butyrophilin/marenostrin/pyrin homologues. SPT2 SM00784 SPT2 chromatin protein This entry includes the Saccharomyces cerevisiae protein SPT2, which is a chromatin protein involved in transcriptional regulation PUBMED:15563464. SR SM00202 Scavenger receptor Cys-rich The sea urchin egg peptide speract contains 4 repeats of SR domains that contain 6 conserved cysteines. May bind bacterial antigens in the protein MARCO. SRA SM00466 SET and RING finger associated domain. Domain of unknown function in SET domain containing proteins and in Deinococcus radiodurans DRA1533. Domain in SET domain containing proteins and in Deinococcus radiodurans DRA1533. SRP54 SM00962 SRP54-type protein, GTPase domain This entry represents the GTPase domain of the 54 kDa SRP54 component, a GTP-binding protein that interacts with the signal sequence when it emerges from the ribosome. 
SRP54 of the signal recognition particle has a three-domain structure: an N-terminal helical bundle domain, a GTPase domain, and the M-domain that binds the 7S RNA and also binds the signal sequence. The extreme C-terminal region is glycine-rich, lower in complexity and poorly conserved between species. The GTPase domain is evolutionarily related to P-loop NTPase domains found in a variety of other proteins PUBMED:7518075. SRP54_N SM00963 SRP54-type protein, helical bundle domain This entry represents the N-terminal helical bundle domain of the 54 kDa SRP54 component, a GTP-binding protein that interacts with the signal sequence when it emerges from the ribosome. SRP54 of the signal recognition particle has a three-domain structure: an N-terminal helical bundle domain, a GTPase domain, and the M-domain that binds the 7S RNA and also binds the signal sequence. The extreme C-terminal region is glycine-rich, lower in complexity and poorly conserved between species. START SM00234 in StAR and phosphatidylcholine transfer protein putative lipid-binding domain in StAR and phosphatidylcholine transfer protein STAT_int SM00964 STAT protein, protein interaction domain STAT proteins (Signal Transducers and Activators of Transcription) are a family of transcription factors that are specifically activated to regulate gene transcription when cells encounter cytokines and growth factors. STAT proteins also include an SH2 domain. STE SM00424 STE like transcription factors STI SM00452 Soybean trypsin inhibitor (Kunitz) family of protease inhibitors STI1 SM00727 Heat shock chaperonin-binding motif. STN SM00965 Secretin and TonB N terminus short domain This is a short domain found at the N-terminus of the Secretins of the bacterial type II/III secretory system as well as the TonB-dependent receptor proteins. These proteins are involved in TonB-dependent active uptake of selective substrates. STYKc SM00221 Protein kinase; unclassified specificity. Phosphotransferases. 
The specificity of this class of kinases cannot be predicted. Possible dual-specificity Ser/Thr/Tyr kinase. SWAP SM00648 Suppressor-of-White-APricot splicing regulator domain present in regulators which are responsible for pre-mRNA splicing processes SWIB SM00151 SWI complex, BAF60b domains S_TK_X SM00133 Extension to Ser/Thr-type protein kinases S_TKc SM00220 Serine/Threonine protein kinases, catalytic domain Phosphotransferases. Serine or threonine-specific kinase subfamily. SapB SM00741 Saposin (B) Domains Present in multiple copies in prosaposin and in pulmonary surfactant-associated protein B. In plant aspartic proteinases, a saposin domain is circularly permuted. This causes the prediction algorithm to predict two such domains, where only one is truly present. Sec63 SM00973 Sec63 Brl domain This domain (also known as the Brl domain) was named after the yeast Sec63 (or NPL1) protein in which it was found. This protein is required for assembly of functional endoplasmic reticulum translocons PUBMED:16368690, PUBMED:11023840. Other yeast proteins containing this domain include pre-mRNA splicing helicase BRR2, HFM1 protein and putative helicases. Sec7 SM00222 Sec7 domain Domain named after the S. cerevisiae SEC7 gene product, which is required for proper protein transport through the Golgi. The domain facilitates guanine nucleotide exchange on the small GTPases, ARFs (ADP ribosylation factors). SecA_DEAD SM00957 SecA DEAD-like domain SecA protein binds to the plasma membrane where it interacts with proOmpA to support translocation of proOmpA through the membrane. SecA protein achieves this translocation, in association with SecY protein, in an ATP-dependent manner PUBMED:9644254 PUBMED:2542029. This domain represents the N-terminal ATP-dependent helicase domain, which is related to the PUBMED:12242434. SecA_PP_bind SM00958 SecA preprotein cross-linking domain The SecA ATPase is involved in the insertion and retraction of preproteins through the plasma membrane. 
This domain has been found to cross-link to preproteins, thought to indicate a role in preprotein binding. The preprotein cross-linking domain is composed of two subdomains that are inserted within the ATPase domain PUBMED:12242434. Sema SM00630 semaphorin domain Semialdhyde_dh SM00859 Semialdehyde dehydrogenase, NAD binding domain The semialdehyde dehydrogenase family is found in N-acetyl-glutamine semialdehyde dehydrogenase (AgrC), which is involved in arginine biosynthesis, and aspartate-semialdehyde dehydrogenase, an enzyme involved in the biosynthesis of various amino acids from aspartate. This family is also found in yeast and fungal Arg5,6 protein, which is cleaved into the enzymes N-acetyl-gamma-glutamyl-phosphate reductase and acetylglutamate kinase. These are also involved in arginine biosynthesis. All proteins in this entry contain an NAD-binding region of semialdehyde dehydrogenase. ShKT SM00254 ShK toxin domain ShK toxin domain Skp1 SM00512 Found in Skp1 protein family Family of Skp1 (kinetochore protein required for cell cycle progression) and elongin C (subunit of RNA polymerase II transcription factor SIII) homologues. Sm SM00651 snRNP Sm proteins small nuclear ribonucleoprotein particles (snRNPs) involved in pre-mRNA splicing Sorb SM00459 Sorbin homologous domain First found in the peptide hormone sorbin and later in the ponsin/ArgBP2/vinexin family of proteins. Spc7 SM00787 Spc7 kinetochore protein This domain is found in cell division proteins which are required for kinetochore-spindle association. SpoU_sub_bind SM00967 RNA 2'-O ribose methyltransferase substrate binding This domain is an RNA 2'-O ribose methyltransferase substrate binding domain. SpoVT_AbrB SM00966 SpoVT / AbrB like domain This domain is found in AbrB from Bacillus subtilis. 
The product of the abrB gene is an ambiactive repressor and activator of the transcription of genes expressed during the transition state between vegetative growth and the onset of stationary phase and sporulation PUBMED:2504584. AbrB is thought to interact directly with the transcription initiation regions of genes under its control PUBMED:8755877. AbrB contains a helix-turn-helix structure, but this domain ends before the helix-turn-helix begins PUBMED:1908787. The product of the B. subtilis gene spoVT is another member of this family and is also a transcriptional regulator PUBMED:8755877. DNA-binding activity in this AbrB homologue requires hexamerisation PUBMED:10978510. Another family member has been isolated from Sulfolobus solfataricus and has been identified as a homologue of bacterial repressor-like proteins. The Escherichia coli family member SohA or Prl1F appears to be bifunctional and is able to regulate its own expression as well as relieve the export block imposed by high-level synthesis of beta-galactosidase hybrid proteins PUBMED:2152898. SprT SM00731 SprT homologues. Predicted to have roles in transcription elongation. Contains a conserved HExxH motif, indicating a metalloprotease function. SynN SM00503 Syntaxin N-terminal domain Three-helix domain that (in Sso1p) slows the rate of its reaction with the SNAP-25 homologue Sec9p T5orf172 SM00974 This entry represents the putative helicase A859L PUBMED:11897024. TAF SM00803 TATA box binding protein associated factor TAFs (TATA box binding protein associated factors) are part of the transcription initiation factor TFIID multimeric protein complex. TFIID is composed of the TATA box binding protein (TBP) and a number of TAFs. The TAFs provide binding sites for many different transcriptional activators and co-activators that modulate transcription initiation by Pol II. TAF proteins adopt a histone-like fold. 
TAFH SM00549 TAF homology Domain in Drosophila nervy, CBFA2T1, human TAF105, human TAF130, and Drosophila TAF110. Also known as nervy homology region 1 (NHR1). TAP_C SM00804 C-terminal domain of vertebrate Tap protein The vertebrate Tap protein is a member of the NXF family of shuttling transport receptors for the nuclear export of mRNA. Its most C-terminal domain is important for binding to FG repeat-containing nuclear pore proteins (FG-nucleoporins) and is sufficient to mediate shuttling. This domain forms a compact four-helix fold related to that of a UBA domain. TBC SM00164 Domain in Tre-2, BUB2p, and Cdc16p. Probable Rab-GAPs. Widespread domain present in Gyp6 and Gyp7, thereby giving rise to the notion that it performs a GTP-activator activity on Rab-like GTPases. TBOX SM00425 Domain first found in the mice T locus (Brachyury) protein TDU SM00711 Short repeats in human TONDU, fly vestigial and other proteins. Unknown function. TEA SM00426 TEA domain TECPR SM00706 Beta propeller repeats in Physarum polycephalum tectonins, Limulus lectin L-6 and animal hypothetical proteins. TFIIE SM00531 Transcription initiation factor IIE TFS2M SM00510 Domain in the central regions of transcription elongation factor S-II (and elsewhere) TFS2N SM00509 Domain in the N-terminus of transcription elongation factor S-II (and elsewhere) TGFB SM00204 Transforming growth factor-beta (TGF-beta) family Family members are active as disulphide-linked homo- or heterodimers. TGFB is a multifunctional peptide that controls proliferation, differentiation, and other functions in many cell types. TGc SM00460 Transglutaminase/protease-like homologues Transglutaminases are enzymes that establish covalent links between proteins. A subset of transglutaminase homologues appear to catalyse the reverse reaction, the hydrolysis of peptide bonds. 
Proteins with this domain are both extracellular and intracellular, and it is likely that the eukaryotic intracellular proteins are involved in signalling events. THAP SM00980 The THAP domain is a putative DNA-binding domain (DBD) and probably also binds a zinc ion. It features the conserved C2CH architecture (consensus sequence: Cys - 2-4 residues - Cys - 35-50 residues - Cys - 2 residues - His). Other universal features include the location of the domain at the N-termini of proteins, its size of about 90 residues, a C-terminal AVPTIF box and several other conserved residues. Orthologues of the human THAP domain have been identified in other vertebrates and probably worms and flies, but not in other eukaryotes or any prokaryotes PUBMED:12575992. THEG SM00705 Repeats in THEG (testicular haploid expressed gene) and several fly proteins. THN SM00205 Thaumatin family The thaumatin family gathers proteins related to plant pathogenesis. The thaumatin family includes very basic members with extracellular and vacuolar localization. Thaumatin itself is a potent sweet-tasting protein. Several members of this family display significant in vitro activity of inhibiting hyphal growth or spore germination of various fungi, probably by a membrane permeabilizing mechanism. THUMP SM00981 The THUMP domain is named after thiouridine synthases, methylases and PSUSs PUBMED:11295541. The THUMP domain consists of about 110 amino acid residues. The structure of ThiI reveals that the THUMP has a fold unlike that of previously characterised RNA-binding domains PUBMED:16343540. It is predicted that this domain is an RNA-binding domain. The THUMP domain probably functions by delivering a variety of RNA modification enzymes to their targets PUBMED:11295541. THY SM00152 Thymosin beta actin-binding motif. TIFY SM00979 This short possible domain is found in a variety of plant transcription factors that contain GATA domains as well as other motifs. 
Although previously known as the Zim domain, this is now called the tify domain after its most conserved amino acids. TIFY proteins can be further classified into two groups depending on the presence (group I) or absence (group II) of a C2C2-GATA domain. Functional annotation of these proteins is still poor, but several screens revealed a link between TIFY proteins of group II and jasmonic acid-related stress response. TIR SM00255 Toll - interleukin 1 - resistance TK SM00203 Tachykinin family Tachykinins are a group of biologically active peptides which excite neurons, evoke behavioral responses, are potent vasodilators and contract (directly or indirectly) many smooth muscles. These peptides are synthesized as longer precursors and then processed to peptides from ten to twelve residues long. TLC SM00724 TRAM, LAG1 and CLN8 homology domains. Protein domain with at least 5 transmembrane alpha-helices. Lag1p and Lac1p are essential for acyl-CoA-dependent ceramide synthesis, TRAM is a subunit of the translocon and the CLN8 gene is mutated in Northern epilepsy syndrome. The family may possess multiple functions such as lipid trafficking, metabolism, or sensing. Trh homologues possess additional homeobox domains. TLDc SM00584 domain in TBC and LysM domain containing proteins TNF SM00207 Tumour necrosis factor family. Family of cytokines that form homotrimeric or heterotrimeric complexes. TNF mediates mature T-cell receptor-induced apoptosis through the p75 TNF receptor. TNFR SM00208 Tumor necrosis factor receptor / nerve growth factor receptor repeats. Repeats in growth factor receptors that are involved in growth factor binding. 
TNF/TNFR TOP1Ac SM00437 Bacterial DNA topoisomerase I DNA-binding domain Bacterial DNA topoisomerase I and III, Eukaryotic DNA topoisomerase III, reverse gyrase alpha subunit TOP1Bc SM00436 Bacterial DNA topoisomerase I ATP-binding domain Extension of TOPRIM in Bacterial DNA topoisomerase I and III, Eukaryotic DNA topoisomerase III, reverse gyrase beta subunit TOP2c SM00433 Topoisomerase II Eukaryotic DNA topoisomerase II, GyrB, ParE TOP4c SM00434 DNA Topoisomerase IV Bacterial DNA topoisomerase IV, GyrA, ParC TOPEUc SM00435 DNA Topoisomerase I (eukaryota) DNA Topoisomerase I (eukaryota), DNA topoisomerase V, Vaccinia virus topoisomerase, Variola virus topoisomerase, Shope fibroma virus topoisomerase TOPRIM SM00493 topoisomerases, DnaG-type primases, OLD family nucleases and RecR proteins TPK_B1_binding SM00983 Thiamin pyrophosphokinase, vitamin B1 binding domain Thiamin pyrophosphokinase (TPK) catalyzes the transfer of a pyrophosphate group from ATP to vitamin B1 (thiamin) to form the coenzyme thiamin pyrophosphate (TPP). Thus, TPK is important for the formation of a coenzyme required for central metabolic functions. The structure of thiamin pyrophosphokinase suggests that the enzyme may operate by a mechanism of pyrophosphoryl transfer similar to those described for pyrophosphokinases functioning in nucleotide biosynthesis PUBMED:11435118. TPR SM00028 Tetratricopeptide repeats Repeats present in 4 or more copies in proteins. Contain a minimum of 34 amino acids each and self-associate via a "knobs and holes" mechanism. TRASH SM00746 metallochaperone-like domain TRCF SM00982 This domain is found in proteins necessary for strand-specific repair in DNA such as TRCF in Escherichia coli. A lesion in the template strand blocks the RNA polymerase complex (RNAP). The RNAP-DNA-RNA complex is specifically recognised by the transcription-repair-coupling factor (TRCF) which releases RNAP and the truncated transcript. 
TR_FER SM00094 Transferrin TR_THY SM00095 Transthyretin TSP1 SM00209 Thrombospondin type 1 repeats Type 1 repeats in thrombospondin-1 bind and activate TGF-beta. TSPN SM00210 Thrombospondin N-terminal -like domains. Heparin-binding and cell adhesion domain of thrombospondin TSPc SM00245 tail specific protease tail specific protease TUDOR SM00333 Tudor domain Domain of unknown function present in several RNA-binding proteins. 10 copies in the Drosophila Tudor protein. The initial proposal that the survival motor neuron gene product contains a Tudor domain is corroborated by more recent database search techniques such as PSI-BLAST (unpublished). TY SM00211 Thyroglobulin type I repeats. The N-terminal region of human thyroglobulin contains 11 type-1 repeats. TY repeats are proposed to be inhibitors of cysteine proteases and binding partners of heparin. TarH SM00319 Homologues of the ligand binding domain of Tar Homologues of the ligand binding domain of the wild-type bacterial aspartate receptor, Tar. Telo_bind SM00976 Telomeric single stranded DNA binding POT1/CDC13 The telomere-binding protein forms a heterodimer in ciliates consisting of an alpha and a beta subunit. This complex may function as a protective cap for the single-stranded telomeric overhang. The alpha subunit consists of 3 structural domains, all with the same beta-barrel OB fold. Telomerase_RBD SM00975 Telomerase ribonucleoprotein complex - RNA binding domain Telomeres in most organisms are comprised of tandem simple sequence repeats PUBMED:9671704. The total length of telomeric repeat sequence at each chromosome end is determined in a balance of sequence loss and sequence addition PUBMED:9671704. One major influence on telomere length is the enzyme telomerase PUBMED:9671704. It is a reverse transcriptase that adds these simple sequence repeats to chromosome ends by copying a template sequence within the RNA component of the enzyme PUBMED:9671704. 
The RNA binding domain of telomerase - TRBD - is made up of twelve alpha helices and two short beta sheets PUBMED:17997966. How telomerase and associated regulatory factors physically interact and function with each other to maintain appropriate telomere length is poorly understood. It is known, however, that TRBD is involved in formation of the holoenzyme (which performs the telomere extension) in addition to recognition and binding of RNA PUBMED:17997966. TilS_C SM00977 TilS substrate C-terminal domain This domain is found in the tRNA(Ile) lysidine synthetase (TilS) protein. Tim44 SM00978 Tim44 is an essential component of the machinery that mediates the translocation of nuclear-encoded proteins across the mitochondrial inner membrane PUBMED:10430866. Tim44 is thought to bind phospholipids of the mitochondrial inner membrane both by electrostatic interactions and by penetrating the polar head group region PUBMED:10430866. Trans_reg_C SM00862 Transcriptional regulatory protein, C terminal This domain is almost always found associated with the response regulator receiver domain. It may play a role in DNA binding. Transket_pyr SM00861 Transketolase, pyrimidine binding domain Transketolase (TK) catalyzes the reversible transfer of a two-carbon ketol unit from xylulose 5-phosphate to an aldose acceptor, such as ribose 5-phosphate, to form sedoheptulose 7-phosphate and glyceraldehyde 3-phosphate. This enzyme, together with transaldolase, provides a link between the glycolytic and pentose-phosphate pathways. TK requires thiamine pyrophosphate as a cofactor. In most sources where TK has been purified, it is a homodimer of approximately 70 kDa subunits. TK sequences from a variety of eukaryotic and prokaryotic sources show that the enzyme has been evolutionarily conserved. 
In the peroxisomes of the methylotrophic yeast Hansenula polymorpha, there is a highly related enzyme, dihydroxy-acetone synthase (DHAS) (also known as formaldehyde transketolase), which exhibits a very unusual specificity by including formaldehyde amongst its substrates. Tryp_SPc SM00020 Trypsin-like serine protease Many of these are synthesised as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. A few, however, are active as single chain molecules, and others are inactive due to substitutions of the catalytic triad residues. Tubulin SM00864 Tubulin/FtsZ family, GTPase domain This domain is found in all tubulin chains, as well as the bacterial FtsZ family of proteins. These proteins are involved in polymer formation. Tubulin is the major component of microtubules, while FtsZ is the polymer-forming protein of bacterial cell division; it is part of a ring in the middle of the dividing cell that is required for constriction of the cell membrane and cell envelope to yield two daughter cells. FtsZ and tubulin are GTPases; this entry is the GTPase domain. FtsZ can polymerise into tubes, sheets, and rings in vitro and is ubiquitous in bacteria and archaea. Tubulin_C SM00865 Tubulin/FtsZ family, C-terminal domain This domain is found in the tubulin alpha, beta and gamma chains, as well as the bacterial FtsZ family of proteins. These proteins are GTPases and are involved in polymer formation. Tubulin is the major component of microtubules, while FtsZ is the polymer-forming protein of bacterial cell division; it is part of a ring in the middle of the dividing cell that is required for constriction of the cell membrane and cell envelope to yield two daughter cells. FtsZ can polymerise into tubes, sheets, and rings in vitro and is ubiquitous in bacteria and archaea. This is the C-terminal domain. TyrKc SM00219 Tyrosine kinase, catalytic domain Phosphotransferases. Tyrosine-specific kinase subfamily. 
UAS SM00594 UBA SM00165 Ubiquitin associated domain Present in Rad23, SNF1-like kinases. The newly-found UBA in p62 is known to bind ubiquitin. UBA_e1_C SM00985 Ubiquitin-activating enzyme e1 C-terminal domain This presumed domain found at the C terminus of Ubiquitin-activating enzyme e1 proteins is functionally uncharacterised. UBCc SM00212 Ubiquitin-conjugating enzyme E2, catalytic domain homologues Proteins destined for proteasome-mediated degradation may be ubiquitinated. Ubiquitination follows conjugation of ubiquitin to a conserved cysteine residue of UBC homologues. This pathway functions in regulating many fundamental processes required for cell viability. TSG101 is one of several UBC homologues that lack this active site cysteine. UBQ SM00213 Ubiquitin homologues Ubiquitin-mediated proteolysis is involved in the regulated turnover of proteins required for controlling cell cycle progression UBX SM00166 Domain present in ubiquitin-regulatory proteins Present in FAF1 and Shp1p. UDG SM00986 Uracil DNA glycosylase superfamily UDPG_MGDP_dh_C SM00984 UDP binding domain The UDP-glucose/GDP-mannose dehydrogenases are a small group of enzymes which possess the ability to catalyse the NAD-dependent 2-fold oxidation of an alcohol to an acid without the release of an aldehyde intermediate PUBMED:2470755, PUBMED:9013585. UIM SM00726 Ubiquitin-interacting motif. Present in proteasome subunit S5a and other ubiquitin-associated proteins. UME SM00802 Domain in UVSB PI-3 kinase, MEI-41 and ESR-1 Characteristic domain in UVSB PI-3 kinase, MEI-41 and ESR-1. Found in nucleolar proteins. Associated with FAT, FATC, PI3_PI4_kinase modules. UTG SM00096 Uteroglobin UTRA SM00866 The UbiC transcription regulator-associated (UTRA) domain is a conserved ligand-binding domain that has a similar fold to PUBMED:12757941. It is believed to modulate activity of bacterial transcription factors in response to binding small molecules. 
Ubox SM00504 Modified RING finger domain Modified RING finger domain, without the full complement of Zn2+-binding ligands. Probable involvement in E2-dependent ubiquitination. UreE_C SM00987 UreE urease accessory protein, C-terminal domain UreE is a urease accessory protein. Urease hydrolyses urea into ammonia and carbamic acid. The C-terminal region of members of this family contains a His-rich nickel-binding site. UreE_N SM00988 UreE urease accessory protein, N-terminal domain UreE is a urease accessory protein. Urease hydrolyses urea into ammonia and carbamic acid. V4R SM00989 The V4R (vinyl 4 reductase) domain is a predicted small-molecule-binding domain that may bind to hydrocarbons. VHP SM00153 Villin headpiece domain VHS SM00288 Domain present in VPS-27, Hrs and STAM Unpublished observations. Domain of unknown function. VIT SM00609 Vault protein Inter-alpha-Trypsin domain VKc SM00756 Family of likely enzymes that includes the catalytic subunit of vitamin K epoxide reductase. Bacterial homologues are fused to members of the thioredoxin family of oxidoreductases. VPS10 SM00602 VPS9 SM00167 Domain present in VPS9 Domain present in yeast vacuolar sorting protein 9 and other proteins. VRR_NUC SM00990 This entry contains proteins with the VRR-NUC domain. It is associated with members of the PD-(D/E)XK nuclease superfamily, which include the type III restriction modification enzymes, for example StyLTI. VWA SM00327 von Willebrand factor (vWF) type A domain VWA domains in extracellular eukaryotic proteins mediate adhesion via metal ion-dependent adhesion sites (MIDAS). Intracellular VWA domains and homologues in prokaryotes have recently been identified. The proposed VWA domains in integrin beta subunits have recently been substantiated using sequence-based methods. 
VWC SM00214 von Willebrand factor (vWF) type C domain VWC_def SM00011 VWC_out SM00215 von Willebrand factor (vWF) type C domain VWD SM00216 von Willebrand factor (vWF) type D domain Von Willebrand factor contains several type D domains: D1 and D2 are present within the N-terminal propeptide whereas the remaining D domains are required for multimerisation. WAP SM00217 Four-disulfide core domains WD40 SM00320 WD40 repeats Note that these repeats are permuted with respect to the structural repeats (blades) of the beta propeller domain. WGR SM00773 Proposed nucleic acid binding domain This domain is named after its most conserved central motif. It is found in a variety of polyA polymerases as well as in molybdate metabolism regulators (e.g. in E. coli) and other proteins of unknown function. The domain is found in isolation in some proteins and is between 70 and 80 residues in length. It is proposed that it may be a nucleic acid binding domain. WH1 SM00461 WASP homology region 1 Region of the Wiskott-Aldrich syndrome protein (WASp) that contains point mutations in the majority of patients with WAS. Unknown function. Ena-like WH1 domains bind polyproline-containing peptides, and Homer contains a WH1 domain. WH2 SM00246 Wiskott Aldrich syndrome homology region 2 Wiskott Aldrich syndrome homology region 2 / actin-binding motif WHEP-TRS SM00991 A conserved domain of 46 amino acids, called WHEP-TRS, has been shown PUBMED:1756734 to exist in a number of higher eukaryote aminoacyl-transfer RNA synthetases. This domain is present one to six times in several enzymes. There are three copies in mammalian multifunctional aminoacyl-tRNA synthetase in a region that separates the N-terminal glutamyl-tRNA synthetase domain from the C-terminal prolyl-tRNA synthetase domain, and six copies in the intercatalytic region of the Drosophila enzyme. 
The domain is found at the N-terminal extremity of the mammalian tryptophanyl-tRNA synthetase and histidyl-tRNA synthetase, and the mammalian, insect, nematode and plant glycyl-tRNA synthetases PUBMED:8463296. This domain could contain a central alpha-helical region and may play a role in the association of tRNA-synthetases into multienzyme complexes. WHy SM00769 Water Stress and Hypersensitive response WIF SM00469 Wnt-inhibitory factor-1 like domain Occurs as extracellular domain in metazoan Ryk receptor tyrosine kinases. C. elegans Ryk is required for cell-cuticle recognition. WIF-1 binds to Wnt and inhibits its activity. WNT1 SM00097 found in Wnt-1 WR1 SM00289 Worm-specific repeat type 1 Worm-specific repeat type 1. Cysteine-rich domain apparently unique (so far) to C. elegans. Often appears with KU domains. About 3 dozen worm proteins contain this domain. WRKY SM00774 DNA binding domain The WRKY domain is a DNA binding domain found in one or two copies in a superfamily of plant transcription factors. These transcription factors are involved in the regulation of various physiological programs that are unique to plants, including pathogen defense, senescence and trichome development. The domain is a 60 amino acid region that is defined by the conserved amino acid sequence WRKYGQK at its N-terminal end, together with a novel zinc-finger-like motif. It binds specifically to the DNA sequence motif (T)(T)TGAC(C/T), which is known as the W box. The invariant TGAC core is essential for function and WRKY binding. WSC SM00321 present in yeast cell wall integrity and stress response component proteins Domain present in WSC proteins, polycystin and fungal exoglucanase WSN SM00453 Worm-specific (usually) N-terminal domain WW SM00456 Domain with 2 conserved Trp (W) residues Also known as the WWP or rsp5 domain. Binds proline-rich polypeptides. WWE SM00678 Domain in Deltex and TRIP12 homologues. Possibly involved in regulation of ubiquitin-mediated proteolysis. 
X8 SM00768 Possibly involved in carbohydrate binding The X8 domain, which may be involved in carbohydrate binding, is found in an Olive pollen antigen as well as at the C terminus of family 17 glycosyl hydrolases. It contains 6 conserved cysteine residues which presumably form three disulfide bridges. XPGI SM00484 Xeroderma pigmentosum G I-region domain in nucleases XPGN SM00485 Xeroderma pigmentosum G N-region domain in nucleases XTALbg SM00247 Beta/gamma crystallins Beta/gamma crystallins YL1_C SM00993 YL1 nuclear protein C-terminal domain This domain is found in proteins of the YL1 family. These proteins have been shown to be DNA-binding and may be transcription factors. This domain is also found in proteins that are not YL1 proteins. YccV-like SM00992 Hemimethylated DNA-binding protein YccV like YccV is a hemimethylated DNA binding protein which has been shown to regulate dnaA gene expression. The structure of one of the hypothetical proteins in this family has been solved and it forms a beta sheet structure with a terminating alpha helix. YceI SM00867 YceI-like domain E. coli YceI is a base-induced periplasmic protein. The recent structure of a member of this family shows that it binds to polyisoprenoid. The structure consists of an extended, eight-stranded, antiparallel beta-barrel that resembles the lipocalin fold. YqgFc SM00732 Likely ribonuclease with RNase H fold. YqgF proteins are likely to function as an alternative to RuvC in most bacteria, and could be the principal Holliday junction resolvases in low-GC Gram-positive bacteria. In Spt6p orthologues, the catalytic residues are substituted, indicating that they lack enzymatic functions. ZM SM00735 ZASP-like motif Short motif (26 amino acids) present in an alpha-actinin-binding protein, ZASP, and similar molecules. ZP SM00241 Zona pellucida (ZP) domain ZP proteins are responsible for sperm adhesion to the zona pellucida. 
ZP domains are also present in multidomain transmembrane proteins such as glycoprotein GP2, uromodulin and TGF-beta receptor type III (betaglycan). ZU5 SM00218 Domain present in ZO-1 and Unc5-like netrin receptors Domain of unknown function. Zalpha SM00550 Z-DNA-binding domain in adenosine deaminases. Helix-turn-helix-containing domain. Also known as Zab. ZipA_C SM00771 ZipA, C-terminal domain (FtsZ-binding) C-terminal domain of ZipA, a component of cell division in E. coli. It interacts with the FtsZ protein in one of the initial steps of septum formation. The structure of this domain is composed of three alpha-helices and a beta-sheet consisting of six antiparallel beta-strands. ZnF_A20 SM00259 A20-like zinc fingers A20- (an inhibitor of cell death)-like zinc fingers. The zinc finger mediates self-association in A20. These fingers also mediate IL-1-induced NF-kappaB activation. ZnF_AN1 SM00154 AN1-like Zinc finger Zinc finger at the C-terminus of An1, a ubiquitin-like protein in Xenopus laevis. ZnF_BED SM00614 BED zinc finger DNA-binding domain in chromatin-boundary-element-binding proteins and transposases ZnF_C2C2 SM00440 C2C2 Zinc finger Nucleic-acid-binding motif in transcriptional elongation factor TFIIS and RNA polymerases. ZnF_C2H2 SM00355 zinc finger ZnF_C2HC SM00343 zinc finger ZnF_C3H1 SM00356 zinc finger ZnF_C4 SM00399 c4 zinc finger in nuclear hormone receptors ZnF_CDGSH SM00704 CDGSH-type zinc finger. Function unknown. ZnF_CHCC SM00400 zinc finger ZnF_DBF SM00586 Zinc finger in DBF-like proteins ZnF_GATA SM00401 zinc finger binding to DNA consensus sequence [AT]GATA[AG] ZnF_NFX SM00438 Repressor of transcription ZnF_PMZ SM00575 plant mutator transposase zinc finger ZnF_RBZ SM00547 Zinc finger domain Zinc finger domain in Ran-binding proteins (RanBPs), and other proteins. In RanBPs, this domain binds RanGDP. ZnF_Rad18 SM00734 Rad18-like CCHC zinc finger Yeast Rad18p functions with Rad5p in error-free post-replicative DNA repair. 
This zinc finger is likely to bind nucleic acids. ZnF_TAZ SM00551 TAZ zinc finger, present in p300 and CBP ZnF_TTF SM00597 zinc finger in transposases and transcription factors ZnF_U1 SM00451 U1-like zinc finger Family of C2H2-type zinc fingers, present in matrin, U1 small nuclear ribonucleoprotein C and other RNA-binding proteins. ZnF_UBP SM00290 Ubiquitin Carboxyl-terminal Hydrolase-like zinc finger ZnF_UBR1 SM00396 Putative zinc finger in N-recognin, a recognition component of the N-end rule pathway Domain is involved in recognition of N-end rule substrates in yeast Ubr1p ZnF_ZZ SM00291 Zinc-binding domain, present in Dystrophin, CREB-binding protein. Putative zinc-binding domain present in dystrophin-like proteins, and CREB-binding protein/p300 homologues. The ZZ in dystrophin appears to bind calmodulin. A missense mutation of one of the conserved cysteines in dystrophin results in a patient with Duchenne muscular dystrophy [3]. ZnMc SM00235 Zinc-dependent metalloprotease Neutral zinc metallopeptidases. This alignment represents a subset of known subfamilies. Highest similarity occurs in the HExxH zinc-binding site/active site. Zn_dep_PLPC SM00770 Zinc dependent phospholipase C (alpha toxin) This domain conveys a zinc dependent phospholipase C activity (EC 3.1.4.3). It is found in a monomeric phospholipase C of Bacillus cereus as well as in the alpha toxin of Clostridium perfringens and Clostridium bifermentans, which is involved in haemolysis and cell rupture. It is also found in a lecithinase of Listeria monocytogenes, which is involved in breaking the 2-membrane vacuoles that surround the bacterium. Structure information: PDB 1ca1. Zn_pept SM00631 Zpr1 SM00709 Duplicated domain in the epidermal growth factor- and elongation factor-1alpha-binding protein Zpr1. Also present in archaeal proteins. 
acidPPc SM00014 Acid phosphatase homologues alkPPc SM00098 Alkaline phosphatase homologues btg1 SM00099 tob/btg1 family The tob/btg1 is a family of proteins that inhibit cell proliferation. c-SKI_SMAD_bind SM01046 c-SKI Smad4 binding domain c-SKI is an oncoprotein that inhibits TGF-beta signaling through interaction with Smad proteins PUBMED:15107821. This domain binds to Smad4 PUBMED:12419246. cNMP SM00100 Cyclic nucleotide-monophosphate binding domain Catabolite gene activator protein (CAP) is a prokaryotic homologue of eukaryotic cNMP-binding domains, present in ion channels, and cNMP-dependent kinases. calpain_III SM00720 cwf21 SM01115 The cwf21 family is involved in mRNA splicing. It has been isolated as a subcomplex of the spliceosome in Schizosaccharomyces pombe (PUBMED:11884590). The function of the cwf21 domain is to bind directly to the spliceosomal protein Prp8. Mutations in the cwf21 domain prevent Prp8 from binding (PUBMED:19854871). The structure of this domain has recently been solved which shows this domain to be composed of two alpha helices. dDENN SM00801 Domain always found downstream of DENN domain, found in a variety of signalling proteins The dDENN domain is part of the tripartite DENN domain. It is always found downstream of the DENN domain itself, which is found in a variety of signalling proteins involved in Rab-mediated processes or regulation of MAPKs signalling pathways. The DENN domain is always encircled on both sides by more divergent domains, called uDENN (for upstream DENN) and dDENN (for downstream DENN). The function of the DENN domain remains to date unclear, although it appears to represent a good candidate for a GTP/GDP exchange activity. eIF1a SM00652 eukaryotic translation initiation factor 1A eIF2B_5 SM00653 domain present in translation initiation factor eIF2B and eIF5 eIF3_N SM01186 eIF3 subunit 6 N terminal domain This is the N terminal domain of subunit 6 translation initiation factor eIF3. 
eIF5C SM00515 Domain at the C-termini of GCD6, eIF-2B epsilon, eIF-4 gamma and eIF-5 eIF6 SM00654 translation initiation factor 6 eRF1_1 SM01194 The release factor eRF1 terminates protein biosynthesis by recognising stop codons at the A site of the ribosome and stimulating peptidyl-tRNA bond hydrolysis at the peptidyl transferase centre. The crystal structure of human eRF1 is known PUBMED:10676813. The overall shape and dimensions of eRF1 resemble a tRNA molecule with domains 1, 2, and 3 of eRF1 corresponding to the anticodon loop, aminoacyl acceptor stem, and T stem of a tRNA molecule, respectively. The position of the essential GGQ motif at an exposed tip of domain 2 suggests that the Gln residue coordinates a water molecule to mediate the hydrolytic activity at the peptidyl transferase centre. A conserved groove on domain 1, 80 A from the GGQ motif, is proposed to form the codon recognition site PUBMED:10676813. This family also includes other proteins for which the precise molecular function is unknown. Many of them are from Archaebacteria. These proteins may also be involved in translation termination but this awaits experimental verification. efhand_Ca_insen SM01184 Ca2+ insensitive EF hand EF hands are helix-loop-helix binding motifs involved in the regulation of many cellular processes. EF hands usually bind to Ca2+ ions which causes a major conformational change that allows the protein to interact with its designated targets. This domain corresponds to an EF hand which has partially or entirely lost its calcium-binding properties. The calcium insensitive EF hand is still able to mediate protein-protein recognition PUBMED:11573089. fCBD SM00236 Fungal-type cellulose-binding domain Small four-cysteine cellulose-binding domain of fungi rADc SM00650 Ribosomal RNA adenine dimethylases s48_45 SM00970 Sexual stage antigen s48/45 domain This family contains sexual stage s48/45 antigens from Plasmodium (approximately 450 residues long). 
These are surface proteins expressed by Plasmodium male and female gametes that have been shown to play a conserved and important role in fertilisation PUBMED:11163248. small_GTPase SM00010 Small GTPase of the Ras superfamily; ill-defined subfamily SMART predicts Ras-like small GTPases of the ARF, RAB, RAN, RAS, and SAR subfamilies. Others that could not be classified in this way are predicted to be members of the small GTPase superfamily without predictions of the subfamily. tRNA_SAD SM00863 Threonyl and Alanyl tRNA synthetase second additional domain The catalytically active form of threonyl/alanyl tRNA synthetase is a dimer. Within the tRNA synthetase class II dimer, the bound tRNA interacts with both monomers making specific interactions with the catalytic domain, the C-terminal domain, and this SAD domain (the second additional domain). The second additional domain is comprised of a pair of perpendicularly orientated antiparallel beta sheets, of four and three strands, respectively, that surround a central alpha helix that forms the core of the domain. t_SNARE SM00397 Helical region found in SNAREs All alpha-helical motifs that form twisted and parallel four-helix bundles in target soluble N-ethylmaleimide-sensitive factor (NSF) attachment protein (SNAP) receptor proteins. This motif is found in "Q-SNAREs". uDENN SM00800 Domain always found upstream of DENN domain, found in a variety of signalling proteins The uDENN domain is part of the tripartite DENN domain. It is always found upstream of the DENN domain itself, which is found in a variety of signalling proteins involved in Rab-mediated processes or regulation of MAPKs signalling pathways. The DENN domain is always encircled on both sides by more divergent domains, called uDENN (for upstream DENN) and dDENN (for downstream DENN). The function of the DENN domain remains to date unclear, although it appears to represent a good candidate for a GTP/GDP exchange activity. 
zf-AD SM00868 Zinc-finger associated domain (zf-AD) The zf-AD domain, also known as ZAD, forms an atypical treble-cleft-like zinc co-ordinating fold. The zf-AD domain is thought to be involved in mediating dimer formation, but does not bind to DNA. zf-C4_ClpX SM00994 ClpX C4-type zinc finger The ClpX heat shock protein of Escherichia coli is a member of the universally conserved Hsp100 family of proteins, and possesses a putative zinc finger motif of the C4 type. This presumed zinc binding domain is found at the N-terminus of the ClpX protein. ClpX is an ATPase which functions both as a substrate specificity component of the ClpXP protease and as a molecular chaperone. The molecular function of this domain is now known.