DOMAIN ACC DEFINITION DESCRIPTION
--------------------------------------------------------------------------------
14_3_3 SM00101 14-3-3 homologues 14-3-3 homologues mediate signal transduction by binding to phosphoserine-containing proteins. They are involved in growth factor signalling and also interact with MEK kinases. 35EXOc SM00474 3'-5' exonuclease 3'-5' exonuclease proofreading domain present in DNA polymerase I, Werner syndrome helicase, RNase D and other enzymes 4.1m SM00294 putative band 4.1 homologues' binding motif 53EXOc SM00475 5'-3' exonuclease 6PGD SM01350 6-phosphogluconate dehydrogenase, C-terminal domain This family represents the C-terminal all-alpha domain of 6-phosphogluconate dehydrogenase. The domain contains two structural repeats of 5 helices each. 7TM_GPCR_Srsx SM01381 Serpentine type 7TM GPCR chemoreceptor Srsx Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type (PMID:7585938). Srsx is a solo family amongst the superfamilies of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise blind and deaf (PMID:18050473). A1pp SM00506 Appr-1"-p processing enzyme Function determined by Martzen et al. Extended family detected by reciprocal PSI-BLAST searches (unpublished results, and Pehrson & Fuji). A2M SM01360 Alpha-2-macroglobulin family This family includes the C-terminal region of the alpha-2-macroglobulin family. A2M_N_2 SM01359 Alpha-2-Macroglobulin This family includes a region of the alpha-2-macroglobulin family. A2M_recep SM01361 A-macroglobulin receptor This family includes the receptor domain region of the alpha-2-macroglobulin family. A4_EXTRA SM00006 amyloid A4 amyloid A4 precursor of Alzheimer's disease AAA SM00382 ATPases associated with a variety of cellular activities AAA - ATPases associated with a variety of cellular activities. 
This profile/alignment only detects a fraction of this vast family. The poorly conserved N-terminal helix is missing from the alignment. AAA_PrkA SM00763 PrkA AAA domain This is a family of PrkA bacterial and archaeal serine kinases approximately 630 residues long. This is the N-terminal AAA domain. AAI SM00499 Plant lipid transfer protein / seed storage protein / trypsin-alpha amylase inhibitor domain family Aamy SM00642 Alpha-amylase domain Aamy_C SM00632 A_amylase_inhib SM00783 Alpha amylase inhibitor Alpha amylase inhibitor inhibits mammalian alpha-amylases specifically, by forming a tight stoichiometric 1:1 complex with alpha-amylase. The inhibitor has no action on plant and microbial alpha amylases. AARP2CN SM00785 AARP2CN (NUC121) domain This domain is the central domain of AARP2. It is weakly similar to the GTP-binding domain of elongation factor TU PUBMED:15112237. acidPPc SM00014 Acid phosphatase homologues ACR SM00608 ADAM Cysteine-Rich Domain ACTH_domain SM01363 Corticotropin ACTH domain ACTIN SM00268 Actin ACTIN subfamily of ACTIN/mreB/sugarkinase/Hsp70 superfamily AD SM00995 Anticodon-binding domain This domain of approximately 100 residues is conserved from plants to humans. It is frequently found in association with Lsm domain-containing proteins. Ad_cyc_g-alpha SM00789 Adenylate cyclase G-alpha binding domain This fungal domain is found in adenylate cyclase and interacts with the alpha subunit of heterotrimeric G proteins. ADEAMc SM00552 tRNA-specific and double-stranded RNA adenosine deaminase (RNA-specific editase) Adenylsucc_synt SM00788 Adenylosuccinate synthetase Adenylosuccinate synthetase plays an important role in purine biosynthesis, by catalyzing the GTP-dependent conversion of IMP and aspartic acid to AMP. Adenylosuccinate synthetase has been characterized from various sources ranging from Escherichia coli (gene purA) to vertebrate tissues. 
In vertebrates, two isozymes are present - one involved in purine biosynthesis and the other in the purine nucleotide cycle. The crystal structure of adenylosuccinate synthetase from E. coli reveals that the dominant structural element of each monomer of the homodimer is a central beta-sheet of 10 strands. The first nine strands of the sheet are mutually parallel with right-handed crossover connections between the strands. The 10th strand is antiparallel with respect to the first nine strands. In addition, the enzyme has two antiparallel beta-sheets, comprised of two strands and three strands each, 11 alpha-helices and two short 3/10-helices. Further, it has been suggested that the similarities in the GTP-binding domains of the synthetase and the p21ras protein are an example of convergent evolution of two distinct families of GTP-binding proteins PUBMED:8244965. Structures of adenylosuccinate synthetase from Triticum aestivum and Arabidopsis thaliana when compared with the known structures from E. coli reveals that the overall fold is very similar to that of the E. coli protein PUBMED:10669609. ADF SM00102 Actin depolymerisation factor/cofilin -like domains Severs actin filaments and binds to actin monomers. AdoHcyase SM00996 S-adenosyl-L-homocysteine hydrolase AdoHcyase_NAD SM00997 S-adenosyl-L-homocysteine hydrolase, NAD binding domain ADSL_C SM00998 Adenylosuccinate lyase C-terminus Adenylosuccinate lyase catalyses two steps in the synthesis of purine nucleotides: the conversion of succinylaminoimidazole-carboxamide ribotide into aminoimidazole-carboxamide ribotide (the fifth step of de novo IMP biosynthesis); the formation of adenosine monophosphate (AMP) from adenylosuccinate (the final step in the synthesis of AMP from IMP). This entry represents the C-terminal, seven alpha-helical, domain of adenylosuccinate lyase. Aerolysin SM00999 Aerolysin toxin This family represents the pore forming lobe of aerolysin. 
AFOR_N SM00790 Aldehyde ferredoxin oxidoreductase, N-terminal domain Enzymes of the aldehyde ferredoxin oxidoreductase (AOR) family PUBMED:9242907 contain a tungsten cofactor and a 4Fe4S cluster and catalyse the interconversion of aldehydes to carboxylates PUBMED:8672295. This family includes AOR, formaldehyde ferredoxin oxidoreductase (FOR), glyceraldehyde-3-phosphate ferredoxin oxidoreductase (GAPOR), all isolated from hyperthermophilic archaea PUBMED:9242907; carboxylic acid reductase found in clostridia PUBMED:2550230; and hydroxycarboxylate viologen oxidoreductase from Proteus vulgaris, the sole member of the AOR family containing molybdenum PUBMED:8026480. GAPOR may be involved in glycolysis PUBMED:7721730, but the functions of the other proteins are not yet clear. AOR has been proposed to be the primary enzyme responsible for oxidising the aldehydes that are produced by the 2-keto acid oxidoreductases PUBMED:9275170. Agenet SM00743 Tudor-like domain present in plant sequences. Domain in plant sequences with possible chromatin-associated functions. Agglutinin SM00791 Amaranthus caudatus agglutinin or amaranthin is a lectin from the ancient South American crop, amaranth grain. Although its biological function is unknown, it has a high binding specificity for the methyl-glycoside of the T-antigen, found linked to serine or threonine residues of cell surface glycoproteins PUBMED:2271665. The protein is comprised of a homodimer, with each homodimer consisting of two beta-trefoil domains PUBMED:9334739. Agouti SM00792 Agouti protein The agouti protein regulates pigmentation in the mouse hair follicle producing a black hair with a subapical yellow band. A highly homologous protein agouti signal protein (ASIP) is present in humans and is expressed at highest levels in adipose tissue where it may play a role in energy homeostasis and possibly human pigmentation PUBMED:11837451, PUBMED:11833005. 
AgrB SM00793 Accessory gene regulator B The accessory gene regulator (agr) of Staphylococcus aureus is the central regulatory system that controls the gene expression for a large set of virulence factors. The agr locus consists of two transcripts: RNAII and RNAIII. RNAII encodes four genes (agrA, B, C, and D) whose gene products assemble a quorum sensing system. At low cell density, the agr genes are continuously expressed at basal levels. A signal molecule, autoinducing peptide (AIP), produced and secreted by the bacteria, accumulates outside of the cells. When the cell density increases and the AIP concentration reaches a threshold, it activates the agr response, i.e. activation of secreted protein gene expression and subsequent repression of cell wall-associated protein genes. AgrB and AgrD are essential for the production of the autoinducing peptide which functions as a signal for quorum sensing. AgrB is a transmembrane protein PUBMED:11195102. AgrB is involved in the proteolytic processing of AgrD and may act both as a proteolytic enzyme and as a transporter facilitating the export of the processed AgrD peptide PUBMED:12122003. AgrD SM00794 Staphylococcal AgrD protein This family consists of several AgrD proteins from many Staphylococcus species. The agr locus was initially described in Staphylococcus aureus as an element controlling the production of exoproteins implicated in virulence. Its pattern of action has been shown to be complex, upregulating certain extracellular toxins and enzymes expressed post-exponentially and repressing some exponential-phase surface components. AgrD encodes the precursor of the autoinducing peptide (AIP). The AIP derived from AgrD by the action of AgrB interacts with AgrC in the membrane to activate AgrA, which upregulates transcription both from promoter P2, amplifying the response, and from P3, initiating the production of a novel effector: RNAIII. In S. 
aureus, delta-hemolysin is the only translation product of RNA III and is not involved in the regulatory functions of the transcript, which is therefore the primary agent for modulating the expression of other operons controlled by agr PUBMED:11807079. Agro_virD5 SM00795 Agrobacterium VirD5 protein The virD operon in Agrobacterium encodes a site-specific endonuclease, and a number of other poorly characterised products. This family represents the VirD5 protein. AGTRAP SM00805 Angiotensin II, type I receptor-associated protein This family consists of several angiotensin II, type I receptor-associated protein (AGTRAP) sequences. AGTRAP is known to interact specifically with the C-terminal cytoplasmic region of the angiotensin II type 1 (AT(1)) receptor to regulate different aspects of AT(1) receptor physiology. The function of this family is unclear. Aha1_N SM01000 Activator of Hsp90 ATPase, N-terminal This domain is predominantly found in the protein 'Activator of Hsp90 ATPase', it adopts a secondary structure consisting of an N-terminal alpha-helix leading into a four-stranded meandering antiparallel beta-sheet, followed by a C-terminal alpha-helix. The two helices are packed together, with the beta-sheet curving around them. They bind to the molecular chaperone HSP82 and stimulate its ATPase activity PUBMED:15039704. AHS1 SM00796 Allophanate hydrolase subunit 1 This domain represents subunit 1 of allophanate hydrolase (AHS1). AHS2 SM00797 Allophanate hydrolase subunit 2 This domain represents subunit 2 of allophanate hydrolase (AHS2). AICARFT_IMPCHas SM00798 AICARFT/IMPCHase bienzyme This is a family of bifunctional enzymes catalysing the last two steps in de novo purine biosynthesis. The bifunctional enzyme is found in both prokaryotes and eukaryotes. 
The second last step is catalysed by 5-aminoimidazole-4-carboxamide ribonucleotide formyltransferase (AICARFT); this enzyme catalyses the formylation of AICAR with 10-formyl-tetrahydrofolate to yield FAICAR and tetrahydrofolate. The last step is catalysed by IMP (Inosine monophosphate) cyclohydrolase (IMPCHase), cyclizing FAICAR (5-formylaminoimidazole-4-carboxamide ribonucleotide) to IMP. AIF_C SM01353 Apoptosis-inducing factor, mitochondrion-associated, C-term This C-terminal domain appears to be a dimerization domain of the mitochondrial apoptosis-inducing factor 1 protein. The domain also appears at the C-terminus of FAD-dependent pyridine nucleotide-disulfide oxidoreductases. Apoptosis inducing factor (AIF) is a bifunctional mitochondrial flavoprotein critical for energy metabolism and induction of caspase-independent apoptosis. On reduction with NADH, AIF undergoes dimerization and forms tight, long-lived FADH2-NAD charge-transfer complexes proposed to be functionally important. AIP3 SM00806 Actin interacting protein 3 Aip3p/Bud6p is a regulator of cell and cytoskeletal polarity in Saccharomyces cerevisiae that was previously identified as an actin-interacting protein. Actin-interacting protein 3 (Aip3p) localizes at the cell cortex where cytoskeleton assembly must be achieved to execute polarized cell growth, and deletion of AIP3 causes gross defects in cell and cytoskeletal polarity. Aip3p localization is mediated by the secretory pathway; mutations in early- or late-acting components of the secretory apparatus lead to Aip3p mislocalization PUBMED:10679021. AIRC SM01001 AIR carboxylase Members of this family catalyse the decarboxylation of 1-(5-phosphoribosyl)-5-amino-4-imidazole-carboxylate (AIR). This family catalyses the sixth step of de novo purine biosynthesis. Some members of this family contain two copies of this domain. 
AKAP_110 SM00807 A-kinase anchor protein 110 kDa This family consists of several mammalian protein kinase A anchoring protein 3 (PRKA3) or A-kinase anchor protein 110 kDa (AKAP 110) sequences. Agents that increase intracellular cAMP are potent stimulators of sperm motility. Anchoring inhibitor peptides, designed to disrupt the interaction of the cAMP-dependent protein kinase A (PKA) with A kinase-anchoring proteins (AKAPs), are potent inhibitors of sperm motility. PKA anchoring is a key biochemical mechanism controlling motility. AKAP110 shares compartments with both RI and RII isoforms of PKA and may function as a regulator of both motility- and head-associated functions such as capacitation and the acrosome reaction PUBMED:10319321. ALAD SM01004 Delta-aminolevulinic acid dehydratase This entry represents porphobilinogen (PBG) synthase (PBGS, or 5-aminolaevulinic acid dehydratase, or ALAD), which functions during the second stage of tetrapyrrole biosynthesis. This enzyme catalyses a Knorr-type condensation reaction between two molecules of ALA to generate porphobilinogen, the pyrrolic building block used in later steps PUBMED:17311232. The structure of the enzyme is based on a TIM barrel topology made up of eight identical subunits, where each subunit binds to a metal ion that is essential for activity, usually zinc (in yeast, mammals and certain bacteria) or magnesium (in plants and other bacteria). A lysine has been implicated in the catalytic mechanism PUBMED:3092810. The lack of PBGS enzyme causes a rare porphyric disorder known as ALAD porphyria, which appears to involve conformational changes in the enzyme PUBMED:17236137. AlaDh_PNT_C SM01002 Alanine dehydrogenase/PNT, C-terminal domain Alanine dehydrogenase catalyzes the NAD-dependent reversible reductive amination of pyruvate into alanine. 
AlaDh_PNT_N SM01003 Alanine dehydrogenase/PNT, N-terminal domain Alanine dehydrogenase catalyzes the NAD-dependent reversible reductive amination of pyruvate into alanine. Ala_racemase_C SM01005 Alanine racemase, C-terminal domain Alanine racemase plays a role in providing the D-alanine required for cell wall biosynthesis by isomerising L-alanine to D-alanine. Proteins containing this domain are found in both prokaryotes and eukaryotes PUBMED:1676385, PUBMED:7871888. ALBUMIN SM00103 serum albumin AlcB SM01006 Siderophore biosynthesis protein domain AlcB is the conserved 45 residue region of one of the proteins of a complex which mediates alcaligin biosynthesis in Bordetella and aerobactin biosynthesis in E. coli and other bacteria. The protein appears to catalyse N-acylation of the hydroxylamine group in N-hydroxyputrescine with succinyl CoA - an activated mono-thioester derivative of succinic acid that is an intermediate in the Krebs cycle. Aldolase_II SM01007 Class II Aldolase and Adducin N-terminal domain This family includes class II aldolases and adducins which have not been ascribed any enzymatic function. Ald_Xan_dh_C SM01008 Aldehyde oxidase and xanthine dehydrogenase, a/b hammerhead domain Aldehyde oxidase catalyses the conversion of an aldehyde in the presence of oxygen and water to an acid and hydrogen peroxide. The enzyme is a homodimer, and requires FAD, molybdenum and two 2Fe-2S clusters as cofactors. Xanthine dehydrogenase catalyses the hydrogenation of xanthine to urate, and also requires FAD, molybdenum and two 2Fe-2S clusters as cofactors. This activity is often found in a bifunctional enzyme with xanthine oxidase activity too. The enzyme can be converted from the dehydrogenase form to the oxidase form irreversibly by proteolysis or reversibly through oxidation of sulphydryl groups. AlkA_N SM01009 AlkA N-terminal domain This domain is found at the N terminus of bacterial AlkA. 
AlkA (3-methyladenine-DNA glycosylase II) is a base excision repair glycosylase from Escherichia coli. It removes a variety of alkylated bases from DNA, primarily by removing alkylation damage from duplex and single stranded DNA. AlkA flips a 1-azaribose abasic nucleotide out of DNA. This produces a 66 degrees bend in the DNA and a marked widening of the minor groove PUBMED:10675345. alkPPc SM00098 Alkaline phosphatase homologues Alpha_adaptinC2 SM00809 Adaptin C-terminal domain Adaptins are components of the adaptor complexes which link clathrin to receptors in coated vesicles. Clathrin-associated protein complexes are believed to interact with the cytoplasmic tails of membrane proteins, leading to their selection and concentration. Gamma-adaptin is a subunit of the golgi adaptor. Alpha adaptin is a heterotetramer that regulates clathrin-bud formation. The carboxyl-terminal appendage of the alpha subunit regulates translocation of endocytic accessory proteins to the bud site. This Ig-fold domain is found in alpha, beta and gamma adaptins and consists of a beta-sandwich containing 7 strands in 2 beta-sheets in a greek-key topology PUBMED:10430869, PUBMED:12176391. The adaptor appendage contains an additional N-terminal strand. Alpha-amyl_C2 SM00810 Alpha-amylase C-terminal beta-sheet domain This entry represents the beta-sheet domain that is found in several alpha-amylases, usually at the C-terminus. This domain is organised as a five-stranded anti-parallel beta-sheet. Alpha_kinase SM00811 Alpha-kinase family This family is a novel family of eukaryotic protein kinase catalytic domains, which have no detectable similarity to conventional kinases. The family contains myosin heavy chain kinases and Elongation Factor-2 kinase and a bifunctional ion channel. This family is known as the alpha-kinase family. 
The structure of the kinase domain revealed unexpected similarity to eukaryotic protein kinases in the catalytic core as well as to metabolic enzymes with ATP-grasp domains. Alpha-L-AF_C SM00813 Alpha-L-arabinofuranosidase C-terminus This entry represents the C terminus (approximately 200 residues) of bacterial and eukaryotic alpha-L-arabinofuranosidase. This catalyses the hydrolysis of non-reducing terminal alpha-L-arabinofuranosidic linkages in L-arabinose-containing polysaccharides. Alpha_L_fucos SM00812 Alpha-L-fucosidase O-Glycosyl hydrolases (EC 3.2.1.-) are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site PUBMED:. Because the fold of proteins is better conserved than their sequences, some of the families can be grouped in 'clans'. Family 29 encompasses alpha-L-fucosidases, which are lysosomal enzymes responsible for hydrolyzing the alpha-1,6-linked fucose joined to the reducing-end N-acetylglucosamine of the carbohydrate moieties of glycoproteins. Deficiency of alpha-L-fucosidase results in the lysosomal storage disease fucosidosis. Alpha-mann_mid SM00872 Alpha mannosidase, middle domain Members of this entry belong to the glycosyl hydrolase family 38. This domain, which is found in the central region, adopts a structure consisting of three alpha helices, in an immunoglobulin/albumin-binding domain-like fold. The domain is predominantly found in the enzyme alpha-mannosidase PUBMED:12634058. 
Alpha_TIF SM00814 Alpha trans-inducing protein (Alpha-TIF) Alpha-TIF (VP16) from Herpes Simplex virus is an essential tegument protein involved in the transcriptional activation of viral immediate early (IE) promoters (alpha genes) during the lytic phase of viral infection. VP16 associates with cellular transcription factors to enhance transcription rates, including the general transcription factor TFIIB and the transcriptional coactivator PC4. The N-terminal residues of VP16 confer specificity for the IE genes, while the C-terminal residues are responsible for transcriptional activation. Within the C-terminal region are two activation regions that can independently and cooperatively activate transcription. VP16 forms a transcriptional regulatory complex with two cellular proteins, the POU-domain transcription factor Oct-1 and the cell-proliferation factor HCF-1. VP16 is an alpha/beta protein with an unusual fold. Other transcription factors may have a similar topology. AMA-1 SM00815 Apical membrane antigen 1 Apical membrane antigen 1 (AMA-1) is a Plasmodium asexual blood-stage antigen. It has been suggested that positive selection operates on the AMA-1 gene in regions coding for antigenic sites. Amb_all SM00656 Amb_V_allergen SM00816 Amb V Allergen Amb V is an Ambrosia sp (ragweed) pollen allergen. Amb t V has been shown to contain a C-terminal helix as the major T cell epitope. Free sulphydryl groups also play a major role in the T cell recognition of cross-reactive T cell epitopes within these related allergens. Amelin SM00817 Ameloblastin precursor (Amelin) This family consists of several mammalian Ameloblastin precursor (Amelin) proteins. Matrix proteins of tooth enamel consist mainly of amelogenin but also of non-amelogenin proteins, which, although their volumetric percentage is low, have an important role in enamel mineralisation. One of the non-amelogenin proteins is ameloblastin, also known as amelin and sheathlin. 
Ameloblastin (AMBN) is one of the enamel sheath proteins which is thought to have a role in determining the prismatic structure of growing enamel crystals. Amelogenin SM00818 Amelogenins, cell adhesion proteins, play a role in the biomineralisation of teeth. They seem to regulate formation of crystallites during the secretory stage of tooth enamel development and are thought to play a major role in the structural organisation and mineralisation of developing enamel. The extracellular matrix of the developing enamel comprises two major classes of protein: the hydrophobic amelogenins and the acidic enamelins. Circular dichroism studies of porcine amelogenin have shown that the protein consists of 3 discrete folding units: the N-terminal region appears to contain beta-strand structures, while the C-terminal region displays characteristics of a random coil conformation. Subsequent studies on the bovine protein have indicated the amelogenin structure to contain a repetitive beta-turn segment and a "beta-spiral" between Gln112 and Leu138, which sequester a (Pro, Leu, Gln) rich region. The beta-spiral offers a probable site for interactions with Ca2+ ions. Mutations in the human amelogenin gene (AMGX) cause X-linked hypoplastic amelogenesis imperfecta, a disease characterised by defective enamel. A 9bp deletion in exon 2 of AMGX results in the loss of codons for Ile5, Leu6, Phe7 and Ala8, and replacement by a new threonine codon, disrupting the 16-residue (Met1-Ala16) amelogenin signal peptide. Ami_2 SM00644 Ami_3 SM00646 AMOP SM00723 Adhesion-associated domain present in MUC4 and other proteins AMPKBI SM01010 5'-AMP-activated protein kinase beta subunit, interaction domain This region is found in the beta subunit of the 5'-AMP-activated protein kinase complex, and its yeast homologues Sip1, Sip2 and Gal83, which are found in the SNF1 kinase complex. 
This region is sufficient for interaction of this subunit with the kinase complex, but is not solely responsible for the interaction, and the interaction partner is not known. The isoamylase N-terminal domain is sometimes found in proteins belonging to this family. AMP_N SM01011 Aminopeptidase P, N-terminal domain This domain is structurally very similar to the creatinase N-terminal domain. However, little or no sequence similarity exists between the two families. ANATO SM00104 Anaphylatoxin homologous domain C3a, C4a and C5a anaphylatoxins are protein fragments generated enzymatically in serum during activation of complement molecules C3, C4, and C5. They induce smooth muscle contraction. These fragments are homologous to a three-fold repeat in fibulins. ANK SM00248 ankyrin repeats Ankyrin repeats are about 33 amino acids long and occur in at least four consecutive copies. They are involved in protein-protein interactions. The core of the repeat seems to be a helix-loop-helix structure. ANTAR SM01012 ANTAR (AmiR and NasR transcription antitermination regulators) is an RNA-binding domain found in bacterial transcription antitermination regulatory proteins. The majority of the domain consists of a coiled-coil. Antimicrobial21 SM01357 Plant antimicrobial peptide This family includes plant antimicrobial peptides. They adopt an alpha-helical hairpin fold stabilised by two disulphide bonds (PMID:21561864). ANX SM00335 Annexin repeats AP2 SM00380 DNA-binding domain in plant proteins such as APETALA2 and EREBPs AP2Ec SM00518 AP endonuclease family 2 These endonucleases play a role in DNA repair. They cleave phosphodiester bonds at apurinic or apyrimidinic sites. AP3B1_C SM01355 Clathrin-adaptor complex-3 beta-1 subunit C-terminal This domain lies at the C-terminus of the clathrin-adaptor protein complex-3 beta-1 subunit. The AP-3 complex is associated with the Golgi region of the cell as well as with more peripheral structures. 
The AP-3 complex may be directly involved in trafficking to lysosomes or alternatively it may be involved in another pathway, but that mis-sorting in that pathway may indirectly lead to defects in pigment granules (PMID:10024875). AP4E_app_platf SM01356 Adaptin AP4 complex epsilon appendage platform This domain is found at the C terminal of clathrin-adaptor epsilon subunit, and at the C-terminus of the appendage on the platform domain. APC10 SM01337 Anaphase-promoting complex, subunit 10 (APC10) APC2 SM01013 Anaphase promoting complex (APC) subunit 2 The anaphase promoting complex or cyclosome (APC2) is an E3 ubiquitin ligase which is part of the SCF family of ubiquitin ligases. Ubiquitin ligases catalyse the transfer of ubiquitin from the ubiquitin conjugating enzyme (E2), to the substrate protein. APCDDC SM01352 Adenomatosis polyposis coli down-regulated 1 The domain is duplicated in most members of this family. APCDD is directly regulated by the beta-catenin/Tcf complex, and its elevated expression promotes proliferation of colonic epithelial cells in vitro and in vivo (PMID:12384519). APCDD1 has an N-terminal signal-peptide and a C-terminal transmembrane region. The domain is rich in cysteines, there being up to 12 such residues, a structural motif important for interaction between Wnt ligands and their receptors. APCDD1 is expressed in a broad repertoire of cell types, indicating that it may regulate a diverse range of biological processes controlled by Wnt signalling (PMID:20393562). APPLE SM00223 APPLE domain Four-fold repeat in plasma kallikrein and coagulation factor XI. Factor XI apple 3 mediates binding to platelets. Factor XI apple 1 binds high-molecular-mass kininogen. Apple 4 in factor XI mediates dimer formation and binds to factor XIIa. Mutations in apple 4 cause factor XI deficiency, an inherited bleeding disorder. 
AraC_E_bind SM00871 Bacterial transcription activator, effector binding domain This domain is found in the probable effector binding domain of a number of different bacterial transcription activators PUBMED:10802742 and is also present in some DNA gyrase inhibitors. The absence of a HTH motif in the DNA gyrase inhibitors is thought to indicate that these do not bind DNA. Arch1 SM01446 sub-family of tyrosine recombinases tyrosine recombinase sub-family mediates DNA cleavage through a tyrosine residue at the active site https://doi.org/10.1101/542381 Arch2 SM01456 sub-family of tyrosine recombinases tyrosine recombinase sub-family mediates DNA cleavage through a tyrosine residue at the active site https://doi.org/10.1101/542382 ARF SM00177 ARF-like small GTPases; ARF, ADP-ribosylation factor Ras homologues involved in vesicular transport. Activator of phospholipase D isoforms. Unlike Ras proteins they lack cysteine residues at their C-termini and therefore are unlikely to be prenylated. ARFs are N-terminally myristoylated. Contains ATP/GTP-binding motif (P-loop). Arfaptin SM01015 Arfaptin-like domain Arfaptin interacts with ARF1, a small GTPase involved in vesicle budding at the Golgi complex and immature secretory granules. The structure of arfaptin shows that upon binding to a small GTPase, arfaptin forms an elongated, crescent-shaped dimer of three-helix coiled-coils. The N-terminal region of ICA69 is similar to arfaptin. ArfGap SM00105 Putative GTP-ase activating proteins for the small GTPase, ARF Putative zinc fingers with GTPase activating proteins (GAPs) towards the small GTPase, Arf. The GAP of ARD1 stimulates GTPase hydrolysis for ARD1 but not ARFs. Arg_tRNA_synt_N SM01016 Arginyl tRNA synthetase N terminal dom This domain is found at the amino terminus of Arginyl tRNA synthetase, also called additional domain 1 (Add-1). It is about 140 residues long and it has been suggested that this domain will be involved in tRNA recognition. 
ARID SM01014 ARID/BRIGHT DNA binding domain Members of the recently discovered ARID (AT-rich interaction domain) family of DNA-binding proteins are found in fungi and invertebrate and vertebrate metazoans. ARID-encoding genes are involved in a variety of biological processes including embryonic development, cell lineage gene regulation and cell cycle control. Although the specific roles of this domain and of ARID-containing proteins in transcriptional regulation are yet to be elucidated, they include both positive and negative transcriptional regulation and a likely involvement in the modification of chromatin structure PUBMED:10838570. The basic structure of the ARID domain appears to be a series of six alpha-helices separated by beta-strands, loops, or turns, but the structured region may extend to an additional helix at either or both ends of the basic six. Based on primary sequence homology, they can be partitioned into three structural classes: Minimal ARID proteins that consist of a core domain formed by six alpha helices; ARID proteins that supplement the core domain with an N-terminal alpha-helix; and Extended-ARID proteins, which contain the core domain and additional alpha-helices at their N- and C-termini. ARM SM00185 Armadillo/beta-catenin-like repeats Approx. 40 amino acid repeat. Tandem repeats form superhelix of helices that is proposed to mediate interaction of beta-catenin with its ligands. Involved in transducing the Wingless/Wnt signal. In plakoglobin arm repeats bind alpha-catenin and N-cadherin. Arrestin_C SM01017 Arrestin (or S-antigen), C-terminal domain Ig-like beta-sandwich fold. Scop reports duplication with N-terminal domain. 
Arrestins comprise a family of closely-related proteins that includes beta-arrestin-1 and -2, which regulate the function of beta-adrenergic receptors by binding to their phosphorylated forms, impairing their capacity to activate G(S) proteins; Cone photoreceptors C-arrestin (arrestin-X) PUBMED:7720881, which could bind to phosphorylated red/green opsins; and Drosophila phosrestins I and II, which undergo light-induced phosphorylation, and probably play a role in photoreceptor transduction PUBMED:8452755, PUBMED:1517224, PUBMED:2158671. ASCH SM01022 The ASCH domain adopts a beta-barrel fold similar to that of the PUA domain. It is thought to function as an RNA-binding domain during coactivation, RNA-processing and possibly during prokaryotic translation regulation PUBMED:16322048. Asparaginase SM00870 Asparaginase, which is found in various plant, animal and bacterial cells, catalyses the deamination of asparagine to yield aspartic acid and an ammonium ion, resulting in a depletion of free circulatory asparagine in plasma PUBMED:3026924. The enzyme is effective in the treatment of human malignant lymphomas, which have a diminished capacity to produce asparagine synthetase: in order to survive, such cells absorb asparagine from blood plasma PUBMED:2407723, PUBMED:3379033 - if Asn levels have been depleted by injection of asparaginase, the lymphoma cells die. Aspzincin_M35 SM01351 Lysine-specific metallo-endopeptidase This is the catalytic region of aspzincins, a group of lysine-specific metallo-endopeptidases in the M35 family. They exhibit the following active-site architecture. The active site is composed of two helices and a loop region and includes the HExxH and GTxDxxYG motifs. In UniProt:P81054 His117, His121 and Asp130 coordinate the catalytic zinc ion. An electrostatically negative region composed of Asp154 and Glu157 attracts a positively charged Lys side chain of a substrate in a specific manner (PMID:11679721).
AT_hook SM00384 DNA binding domain with preference for A/T rich regions Small DNA-binding motif first described in the high mobility group non-histone chromosomal protein HMG-I(Y). Autotransporter SM00869 Autotransporter beta-domain Secretion of protein products occurs by a number of different pathways in bacteria. One of these pathways, known as the type V pathway (referred to as type IV in older literature), was first described for the IgA1 protease. The protein component that mediates secretion through the outer membrane is contained within the secreted protein itself, hence the proteins secreted in this way are called autotransporters. This family corresponds to the presumed integral membrane beta-barrel domain that transports the protein. This domain is found at the C-terminus of the proteins it occurs in. The N-terminus contains the variable passenger domain that is translocated across the membrane. Once the passenger domain is exported it is cleaved auto-catalytically in some proteins; in others a different peptidase is used, and in some cases no cleavage occurs. AWS SM00570 associated with SET domains subdomain of PRESET AXH SM00536 domain in Ataxins and HMG containing proteins unknown function B12-binding_2 SM01018 B12 binding domain Cobalamin-dependent methionine synthase is a large modular protein that catalyses methyl transfer from methyltetrahydrofolate (CH3-H4folate) to homocysteine. During the catalytic cycle, it supports three distinct methyl transfer reactions, each involving the cobalamin (vitamin B12) cofactor and a substrate bound to its own functional unit PUBMED:11731805. The cobalamin cofactor plays an essential role in this reaction, accepting the methyl group from CH3-H4folate to form methylcob(III)alamin, and in turn donating the methyl group to homocysteine to generate methionine and cob(I)alamin.
Methionine synthase is a large enzyme composed of four structurally and functionally distinct modules: the first two modules bind homocysteine and CH3-H4folate, the third module binds the cobalamin cofactor and the C-terminal module binds S-adenosylmethionine. The cobalamin-binding module is composed of two structurally distinct domains: a 4-helical bundle cap domain (residues 651-740 in the Escherichia coli enzyme) and an alpha/beta B12-binding domain (residues 741-896). The 4-helical bundle forms a cap over the alpha/beta domain, which acts to shield the methyl ligand of cobalamin from solvent PUBMED:8939751. Furthermore, in the conversion to the active conformation of this enzyme, the 4-helical cap rotates to allow the cobalamin cofactor to bind the activation domain. The alpha/beta domain is a common cobalamin-binding motif, whereas the 4-helical bundle domain with its methyl cap is a distinctive feature of methionine synthases. B2-adapt-app_C SM01020 Beta2-adaptin appendage, C-terminal sub-domain Members of this family adopt a structure consisting of a 5 stranded beta-sheet, flanked by one alpha helix on the outer side, and by two alpha helices on the inner side. This domain is required for binding to clathrin, and its subsequent polymerisation. Furthermore, a hydrophobic patch present in the domain also binds to a subset of D-phi-F/W motif-containing proteins that are bound by the alpha-adaptin appendage domain (epsin, AP180, eps15). B3 SM01019 B3 DNA binding domain Two DNA binding proteins, RAV1 and RAV2 from Arabidopsis thaliana, contain two distinct amino acid sequence domains found only in higher plant species. The N-terminal regions of RAV1 and RAV2 are homologous to the AP2 DNA-binding domain present in a family of transcription factors, while the C-terminal region exhibits homology to the highly conserved C-terminal domain, designated B3, of VP1/ABI3 transcription factors PUBMED:9862967.
The AP2 and B3-like domains of RAV1 bind autonomously to the CAACA and CACCTG motifs, respectively, and together achieve a high affinity and specificity of binding. It has been suggested that the AP2 and B3-like domains of RAV1 are connected by a highly flexible structure enabling the two domains to bind to the CAACA and CACCTG motifs in various spacings and orientations PUBMED:9862967. B3_4 SM00873 B3/4 domain This domain is found in tRNA synthetase beta subunits as well as in some non tRNA synthetase proteins. B41 SM00295 Band 4.1 homologues Also known as ezrin/radixin/moesin (ERM) protein domains. Present in myosins, ezrin, radixin, moesin, protein tyrosine phosphatases. Plasma membrane-binding domain. These proteins play structural and regulatory roles in the assembly and stabilization of specialized plasma membrane domains. Some PDZ domain containing proteins bind one or more of this family. Now includes JAKs. B5 SM00874 tRNA synthetase B5 domain This domain is found in phenylalanine-tRNA synthetase beta subunits. B561 SM00665 Cytochrome b-561 / ferric reductase transmembrane domain. Cytochrome b-561 recycles ascorbate for the generation of norepinephrine by dopamine-beta-hydroxylase in the chromaffin vesicles of the adrenal gland. It is a transmembrane heme protein with the two heme groups being bound to conserved histidine residues. A cytochrome b-561 homologue, termed Dcytb, is an iron-regulated ferric reductase in the duodenal mucosa. Other homologues of these are also likely to be ferric reductases. SDR2 is proposed to be important in regulating the metabolism of iron in the onset of neurodegenerative disorders. Bac_DnaA_C SM00760 Bacterial dnaA protein helix-turn-helix domain Could be involved in DNA-binding. BACK SM00875 BTB And C-terminal Kelch The BACK domain is found juxtaposed to the BTB domain; they are separated by as little as two residues.
Bac_rhodopsin SM01021 Bacteriorhodopsin-like protein The bacterial opsins are retinal-binding proteins that provide light-dependent ion transport and sensory functions to a family of halophilic bacteria PUBMED:2468194, PUBMED:2591367. They are integral membrane proteins believed to contain seven transmembrane (TM) domains, the last of which contains the attachment point for retinal (a conserved lysine). BAF SM01023 Barrier to autointegration factor Barrier-to-autointegration factor (BAF) is an essential protein that is highly conserved in metazoan evolution, and which may act as a DNA-bridging protein PUBMED:12902403. BAF binds directly to double-stranded DNA, to transcription activators, and to inner nuclear membrane proteins, including lamin A filament proteins that anchor nuclear-pore complexes in place, and nuclear LEM-domain proteins that bind to lamin filaments and chromatin. New findings suggest that BAF has structural roles in nuclear assembly and chromatin organization, represses gene expression and might interlink chromatin structure, nuclear architecture and gene regulation in metazoans PUBMED:15130582. BAF can be exploited by retroviruses to act as a host component of pre-integration complexes, which promote the integration of the retroviral DNA into the host chromosome by preventing autointegration of retroviral DNA PUBMED:14645565. BAF might contribute to the assembly or activity of retroviral pre-integration complexes through direct binding to the retroviral proteins p55 Gag and matrix, as well as to DNA. BAG SM00264 BAG domains, present in regulator of Hsp70 proteins BAG domains, present in Bcl-2-associated athanogene 1 and silencer of death domains BAH SM00439 Bromo adjacent homology domain BAR SM00721 BASIC SM00520 Basic domain in HLH proteins of MYOD family BATS SM00876 Biotin and Thiamin Synthesis associated domain Biotin synthase (BioB) catalyses the last step of the biotin biosynthetic pathway.
The reaction consists of the introduction of a sulphur atom into dethiobiotin. BioB functions as a homodimer PUBMED:12482614. Thiamin synthesis is a complex process involving at least six gene products (ThiFSGH, ThiI and ThiJ). Two of the proteins required for the biosynthesis of the thiazole moiety of thiamine (vitamin B(1)) are ThiG and ThiH (this entry), which form a heterodimer PUBMED:12650933. Both of these reactions are thought to involve the binding of co-factors, and both function as dimers PUBMED:12482614, PUBMED:12650933. This domain therefore may be involved in co-factor binding or dimerisation. BBC SM00502 B-Box C-terminal domain Coiled coil region C-terminal to (some) B-Box domains BBOX SM00336 B-Box-type zinc finger BC10 SM01396 Bladder cancer-related protein BC10 This family consists of a series of short proteins of around 90 residues in length. The human protein O60629 or BC10 has been implicated in bladder cancer where the transcription of the gene coding for this protein is nearly completely abolished in highly invasive transitional cell carcinomas (TCCs) PMID:11920613. The protein is a small globular protein containing two transmembrane helices, and it is a multiply edited transcript. All the editing sites are found in either the 5'-UTR or the N-terminal section of the protein, which is predicted to be outside the membrane. The three coding edits are all non-synonymous and predicted to encode exposed residues PMID:15797904. The function of this family is unknown. BCL SM00337 BCL (B-Cell lymphoma); contains BH1, BH2 regions (BH1, BH2, (BH3 (one helix only)) and not BH4(one helix only)). Involved in apoptosis regulation BCS1_N SM01024 This domain is found at the N terminal of the mitochondrial ATPase BCS1. It encodes the import and intramitochondrial sorting for the protein. Beach SM01026 Beige/BEACH domain The BEACH domain was described in the BEIGE protein (D1035670) and in the highly homologous CHS protein.
The BEACH domain is usually followed by a series of WD repeats. The function of the BEACH domain is unknown. BEN SM01025 The BEN domain is found in diverse animal proteins such as BANP/SMAR1, NAC1 and the Drosophila mod(mdg4) isoform C, in the chordopoxvirus virosomal protein E5R and in several proteins of polydnaviruses. Computational analysis suggests that the BEN domain mediates protein-DNA and protein-protein interactions during chromatin organisation and transcription. Beta-Casp SM01027 Beta-Casp domain The beta-CASP domain is found C terminal to the beta-lactamase domain in pre-mRNA 3'-end-processing endonuclease. The active site of this enzyme is located at the interface of these two domains. BetaGal_dom2 SM01029 Beta-galactosidase, domain 2 This is the second domain of the five-domain beta-galactosidase enzyme that altogether catalyses the hydrolysis of beta(1-3) and beta(1-4) galactosyl bonds in oligosaccharides as well as the inverse reaction of enzymatic condensation and trans-glycosylation. This domain is made up of 16 antiparallel beta-strands and an alpha-helix at its C terminus. The fold of this domain appears to be unique. In addition, the last seven strands of the domain form a subdomain with an immunoglobulin-like (I-type Ig) fold in which the first strand is divided between the two beta-sheets. In Penicillium spp. this strand is interrupted by a 12-residue insertion which forms an additional edge-strand to the second beta-sheet of the sub-domain. The remainder of the second domain forms a series of beta-hairpins at its N terminus, four strands of which are contiguous with part of the Ig-like sub-domain, forming in total a seven-stranded antiparallel beta-sheet. This domain is associated with family Glyco_hydro_35, which is N-terminal to it, but itself has no metazoan members. Beta-TrCP_D SM01028 D domain of beta-TrCP This domain is found in eukaryotes, and is approximately 40 amino acids in length.
It is found associated with F-box domain, WD domain. The protein that contains this domain functions as a ubiquitin ligase. Ubiquitination is required to direct proteins towards the proteasome for degradation. This protein is part of the WD40 class of F box proteins. The D domain of these F box proteins is involved in mediating the dimerisation of the protein. Dimerisation is necessary to polyubiquitinate substrates so this D domain is vital in directing substrates towards the proteasome for degradation. Bet_v_1 SM01037 Pathogenesis-related protein Bet v I family This family is named after Bet v 1, the major birch pollen allergen. This protein belongs to family 10 of plant pathogenesis-related proteins (PR-10), cytoplasmic proteins of 15-17 kDa that are widespread among dicotyledonous plants PUBMED:9417891. In recent years, a number of diverse plant proteins with low sequence similarity to Bet v 1 were identified. A classification by sequence similarity yielded several subfamilies related to PR-10 PUBMED:18922149 - Pathogenesis-related proteins PR-10: These proteins were identified as major tree pollen allergens in birch and related species (hazel, alder), as plant food allergens expressed in high levels in fruits, vegetables and seeds (apple, celery, hazelnut), and as pathogenesis-related proteins whose expression is induced by pathogen infection, wounding, or abiotic stress. Hyp-1 (Q8H1L1), an enzyme involved in the synthesis of the bioactive naphthodianthrone hypericin in St. John's wort (Hypericum perforatum) also belongs to this family. Most of these proteins were found in dicotyledonous plants. In addition, related sequences were identified in monocots and conifers. - Cytokinin-specific binding proteins: These legume proteins bind cytokinin plant hormones PUBMED:9874249.
- (S)-Norcoclaurine synthases are enzymes catalysing the condensation of dopamine and 4-hydroxyphenylacetaldehyde to (S)-norcoclaurine, the first committed step in the biosynthesis of benzylisoquinoline alkaloids such as morphine PUBMED:15447655. - Major latex proteins and ripening-related proteins are proteins of unknown biological function that were first discovered in the latex of opium poppy (Papaver somniferum) and later found to be upregulated during ripening of fruits such as strawberry and cucumber PUBMED:15447655. The occurrence of Bet v 1-related proteins is confined to seed plants with the exception of a cytokinin-binding protein from the moss Physcomitrella patens (Q9AXI3). Bgal_small_N SM01038 Beta galactosidase small chain This domain comprises the small chain of dimeric beta-galactosidases EC:3.2.1.23. This domain is also found in single chain beta-galactosidase. BH4 SM00265 BH4 Bcl-2 homology region 4 BHD_1 SM01030 Rad4 beta-hairpin domain 1 This short domain is found in the Rad4 protein. This domain binds to DNA. BHD_2 SM01031 Rad4 beta-hairpin domain 2 This short domain is found in the Rad4 protein. This domain binds to DNA. BHD_3 SM01032 Rad4 beta-hairpin domain 3 This short domain is found in the Rad4 protein. This domain binds to DNA. BHL SM00411 bacterial (prokaryotic) histone like domain BID_1 SM00634 Bacterial Ig-like domain (group 1) BID_2 SM00635 Bacterial Ig-like domain 2 BING4CT SM01033 BING4CT (NUC141) domain This C terminal domain is found in the BING4 family of nucleolar WD40 repeat proteins. Biotin_carb_C SM00878 Biotin carboxylase C-terminal domain Biotin carboxylase is a component of the acetyl-CoA carboxylase multi-component enzyme which catalyses the first committed step in fatty acid synthesis in animals, plants and bacteria. Most of the active site residues reported in reference are in this C-terminal domain.
BIR SM00238 Baculoviral inhibition of apoptosis protein repeat Domain found in inhibitor of apoptosis proteins (IAPs) and other proteins. Acts as a direct inhibitor of caspase enzymes. B_lectin SM00108 Bulb-type mannose-specific lectin BLUF SM01034 Sensors of blue-light using FAD The BLUF domain has been shown to bind FAD in the AppA protein (Q53119). AppA is involved in the repression of photosynthesis genes in response to blue-light. BLVR SM01354 Bovine leukaemia virus receptor (BLVR) This family consists of several bovine specific leukaemia virus receptors which are thought to function as transmembrane proteins, although their exact function is unknown (PMID:12692298). BMC SM00877 Bacterial microcompartments are primitive organelles composed entirely of protein subunits. The prototypical bacterial microcompartment is the carboxysome, a protein shell for sequestering carbon fixation reactions. These proteins form hexameric structures. BON SM00749 bacterial OsmY and nodulation domain BOP1NT SM01035 BOP1NT (NUC169) domain This N terminal domain is found in BOP1-like WD40 proteins. BowB SM00269 Bowman-Birk type proteinase inhibitor BP28CT SM01036 BP28CT (NUC211) domain This C-terminal domain is found in BAP28-like nucleolar proteins PUBMED:15112237. BPI1 SM00328 BPI/LBP/CETP N-terminal domain Bactericidal permeability-increasing protein (BPI) / Lipopolysaccharide-binding protein (LBP) / Cholesteryl ester transfer protein (CETP) N-terminal domain BPI2 SM00329 BPI/LBP/CETP C-terminal domain Bactericidal permeability-increasing protein (BPI) / Lipopolysaccharide-binding protein (LBP) / Cholesteryl ester transfer protein (CETP) C-terminal domain BRCT SM00292 breast cancer carboxy-terminal domain BRICHOS SM01039 The BRICHOS domain is about 100 amino acids long. It is found in a variety of proteins implicated in dementia, respiratory distress and cancer.
Its exact function is unknown; roles that have been proposed for it include (a) targeting of the protein to the secretory pathway, (b) intramolecular chaperone-like function, and (c) assisting the specialised intracellular protease processing system PUBMED:12114016. This C-terminal domain is embedded in the endoplasmic reticulum lumen, and binds to the N-terminal, transmembrane, SP_C, PF08999 provided that it is in non-helical conformation. Thus the Brichos domain of proSP-C is a chaperone that induces alpha-helix formation of an aggregation-prone TM region PUBMED:19472327. BRIGHT SM00501 BRIGHT, ARID (A/T-rich interaction domain) domain DNA-binding domain containing a helix-turn-helix structure Brix SM00879 The Brix domain is found in a number of eukaryotic proteins including SSF proteins from yeast and humans, Arabidopsis thaliana Peter Pan-like protein and several hypothetical proteins. BRK SM00592 domain in transcription and CHROMO domain helicases BRLZ SM00338 basic region leucine zipper BRO1 SM01041 BRO1-like domain This domain is found in a number of proteins including Rhophilin Q61085 and BRO1 P48582. It is known to have a role in endosomal targeting. ESCRT-III subunit Snf7 binds to a conserved hydrophobic patch in the BRO1 domain that is required for protein complex formation and for the protein-sorting function of BRO1 PUBMED:15935782. BROMO SM00297 bromo domain Bro-N SM01040 BRO family, N-terminal domain This family includes the N-terminus of baculovirus BRO and ALI motif proteins. The function of BRO proteins is unknown. It has been suggested that BRO-A and BRO-C are DNA binding proteins that influence host DNA replication and/or transcription PUBMED:10888617. This Pfam domain does not include the characteristic invariant alanine, leucine, isoleucine motif of the ALI proteins PUBMED:9847359.
Brr6_like_C_C SM01042 Di-sulfide bridge nucleocytoplasmic transport domain Brr6_like_C_C is the highly conserved C-terminal region of a group of proteins found in fungi. It carries four highly conserved cysteine residues. It is suggested that members of the family interact with each other via di-sulfide bridges to form a complex which is involved in nucleocytoplasmic transport PUBMED:15882446. BSD SM00751 domain in transcription factors and synapse-associated proteins BTAD SM01043 Bacterial transcriptional activator domain Found in the DNRI/REDD/AFSR family of regulators. This region of AFSR (P25941) along with the C terminal region is capable of independently directing actinorhodin production. This family contains TPR repeats. BTB SM00225 Broad-Complex, Tramtrack and Bric a brac Domain in Broad-Complex, Tramtrack and Bric a brac. Also known as POZ (poxvirus and zinc finger) domain. Known to be a protein-protein interaction motif found at the N-termini of several C2H2-type transcription factors as well as Shaw-type potassium channels. Known structure reveals a tightly intertwined dimer formed via interactions between N-terminal strand and helix structures. However in a subset of BTB/POZ domains, these two secondary structures appear to be missing. Be aware SMART predicts BTB/POZ domains without the beta1- and alpha1-secondary structures. BTD SM01268 Beta-trefoil DNA-binding domain Members of this family of DNA binding domains adopt a beta-trefoil fold, that is, a capped beta-barrel with internal pseudo threefold symmetry. In the DNA-binding protein LAG-1, it also is the site of mutually exclusive interactions with NotchIC (and the viral protein EBNA2) and co-repressors (SMRT/N-Cor and CIR) PUBMED:15297877. btg1 SM00099 tob/btg1 family The tob/btg1 is a family of proteins that inhibit cell proliferation. BTK SM00107 Bruton's tyrosine kinase Cys-rich motif Zinc-binding motif containing conserved cysteines and a histidine.
Always found C-terminal to PH domains (but not all PH domains are followed by BTK motifs). The crystal structure shows this motif packs against the PH domain. The PH+Btk module pair has been called the Tec homology (TH) region. BTP SM00576 Bromodomain transcription factors and PHD domain containing proteins subdomain of archaeal histone-like transcription factors Btz SM01044 CASC3/Barentsz eIF4AIII binding This domain is found on CASC3 (cancer susceptibility candidate gene 3 protein) which is also known as Barentsz (Btz). CASC3 is a component of the EJC (exon junction complex) which is a complex that is involved in post-transcriptional regulation of mRNA in metazoa. The complex is formed by the association of four proteins (eIF4AIII, Barentsz, Mago, and Y14), mRNA, and ATP. This domain wraps around eIF4AIII and stacks against the 5' nucleotide PUBMED:16923391. BURP SM01045 The BURP domain is found at the C-terminus of several different plant proteins. It was named after the proteins in which it was first identified: the BNM2 clone-derived protein from Brassica napus O65009; USPs and USP-like proteins P21746 P21747 Q06765 O24482; RD22 from Arabidopsis thaliana Q08298; and PG1beta from Lycopersicon esculentum Q40161. This domain is around 230 amino acid residues long. It possesses the following conserved features: two phenylalanine residues at its N-terminus; two cysteine residues; and four repeated cysteine-histidine motifs, arranged as: CH-X(10)-CH-X(25-27)-CH-X(25-26)-CH, where X can be any amino acid PUBMED:9790599. The function of this domain is unknown. C1 SM00109 Protein kinase C conserved region 1 (C1) domains (Cysteine-rich domains) Some bind phorbol esters and diacylglycerol. Some bind RasGTP. Zinc-binding domains. C1_4 SM01047 TFIIH C1-like domain The carboxyl-terminal region of TFIIH is essential for transcription activity. This region binds three zinc atoms through two independent domains.
The first contains a C4 zinc finger motif, whereas the second is characterised by a CX(2)CX(2-4)FCADCD motif. The solution structure of the second C-terminal domain revealed homology with the regulatory domain of protein kinase C PUBMED:10882739. C1Q SM00110 Complement component C1q domain. Globular domain found in many collagens and eponymously in complement C1q. When part of full length proteins these domains form a 'bouquet' due to the multimerization of heterotrimers. The C1q fold is similar to that of tumour necrosis factor. C2 SM00239 Protein kinase C conserved region 2 (CalB) Ca2+-binding motif present in phospholipases, protein kinases C, and synaptotagmins (among others). Some do not appear to contain Ca2+-binding sites. Particular C2s appear to bind phospholipids, inositol polyphosphates, and intracellular proteins. Unusual occurrence in perforin. Synaptotagmin and PLC C2s are permuted in sequence with respect to N- and C-terminal beta strands. SMART detects C2 domains using one or both of two profiles. C345C SM00643 Netrin C-terminal Domain C4 SM00111 C-terminal tandem repeated domain in type 4 procollagens Duplicated domain in C-terminus of type 4 collagens. Mutations in alpha-5 collagen IV are associated with X-linked Alport syndrome. C6 SM01048 This domain of unknown function is found in the C. elegans protein Q19522. It is presumed to be an extracellular domain. The C6 domain contains six conserved cysteine residues in most copies of the domain. However some copies of the domain are missing cysteine residues 1 and 3 suggesting that these form a disulphide bridge. C8 SM00832 This domain contains 8 conserved cysteine residues, but this family only contains 7 of them to avoid overlaps with other domains. It is found in disease-related proteins including von Willebrand factor, Alpha tectorin, Zonadhesin and Mucin. CA SM00112 Cadherin repeats. Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion.
Cadherin domains occur as repeats in the extracellular regions which are thought to mediate cell-cell contact when bound to calcium. Ca_chan_IQ SM01062 Voltage gated calcium channel IQ domain Voltage gated calcium channels control cellular calcium entry in response to changes in membrane potential. The isoleucine-glutamine (IQ) motif in the voltage gated calcium channel IQ domain interacts with hydrophobic pockets of Ca2+/calmodulin PUBMED:16299511. The interaction regulates two self-regulatory calcium-dependent feedback mechanisms, calcium-dependent inactivation (CDI) and calcium-dependent facilitation (CDF). Cache_2 SM01049 Cache is an extracellular domain that is predicted to have a role in small-molecule recognition in a wide range of proteins, including the animal dihydropyridine-sensitive voltage-gated Ca2+ channel alpha-2delta subunit, and various bacterial chemotaxis receptors. The name Cache comes from CAlcium channels and CHEmotaxis receptors. This domain consists of an N-terminal part with three predicted strands and an alpha-helix, and a C-terminal part with a strand dyad followed by a relatively unstructured region. The N-terminal portion of the (unpermuted) Cache domain contains three predicted strands that could form a sheet analogous to that present in the core of the PAS domain structure. Cache domains are particularly widespread in bacteria, with Vibrio cholerae. The animal calcium channel alpha-2delta subunits might have acquired a part of their extracellular domains from a bacterial source PUBMED:11084361. The Cache domain appears to have arisen from the GAF-PAS fold despite their divergent functions PUBMED:11292341. CactinC_cactus SM01050 Cactus-binding C-terminus of cactin protein CactinC_cactus is the C-terminal 200 residues of the cactin protein which are necessary for the association of cactin with IkappaB-cactus as one of the intracellular members of the Rel complex.
The Rel (NF-kappaB) pathway is conserved in invertebrates and vertebrates. In mammals, it controls the activities of the immune and inflammatory response genes as well as viral genes, and is critical for cell growth and survival. In Drosophila, the Rel pathway functions in the innate cellular and humoral immune response, in muscle development, and in the establishment of dorsal-ventral polarity in the early embryo PUBMED:10842059. Most members of the family also have a Cactin_mid domain further upstream. CAD SM00266 Domains present in proteins implicated in post-mortem DNA fragmentation CADG SM00736 Dystroglycan-type cadherin-like domains. Cadherin-homologous domains present in metazoan dystroglycans and alpha/epsilon sarcoglycans, yeast Axl2p and in a very large protein from magnetotactic bacteria. Likely to bind calcium ions. Cadherin_pro SM01055 Cadherin prodomain like Cadherins are a family of proteins that mediate calcium dependent cell-cell adhesion. They are activated through cleavage of a prosequence in the late Golgi. This domain corresponds to the folded region of the prosequence, and is termed the prodomain. The prodomain shows structural resemblance to the cadherin domain, but lacks all the features known to be important for cadherin-cadherin interactions PUBMED:15130472. CALCITONIN SM00113 calcitonin This family is formed by calcitonin, the calcitonin gene-related peptide, and amylin. They are short polypeptide hormones. calpain_III SM00720 Calx_beta SM00237 Domains in Na-Ca exchangers and integrin-beta4 Domain in Na-Ca exchangers and integrin subunit beta4 (and some cyanobacterial proteins) CaMBD SM01053 Calmodulin binding domain Small-conductance Ca2+-activated K+ channels (SK channels) are independent of voltage and gated solely by intracellular Ca2+. These membrane channels are heteromeric complexes that comprise pore-forming alpha-subunits and the Ca2+-binding protein calmodulin (CaM) PUBMED:11323678. 
CaM binds to the SK channel through the CaM-binding domain (CaMBD), which is located in an intracellular region of the alpha-subunit immediately carboxy-terminal to the pore. Channel opening is triggered when Ca2+ binds the EF hands in the N-lobe of CaM. The structure of this domain complexed with CaM is known PUBMED:11323678. This domain forms an elongated dimer with a CaM molecule bound at each end; each CaM wraps around three alpha-helices, two from one CaMBD subunit and one from the other. CaM_binding SM01054 Plant calmodulin-binding domain The sequences featured in this family are found repeated in a number of plant calmodulin-binding proteins (such as Q8W235 Q84ZT8 and Q8H6X1), and are thought to constitute the calmodulin-binding domains PUBMED:12825696, PUBMED:11684678. Binding of the proteins to calmodulin depends on the presence of calcium ions PUBMED:12825696, PUBMED:11684678. These proteins are thought to be involved in various processes, such as plant defence responses PUBMED:12825696 and stolonisation or tuberization PUBMED:11684678. CAMSAP_CKK SM01051 Microtubule-binding calmodulin-regulated spectrin-associated This is the C-terminal domain of a family of eumetazoan proteins collectively defined as calmodulin-regulated spectrin-associated, or CAMSAP, proteins. CAMSAP proteins carry an N-terminal region that includes the CH domain, a central region including a predicted coiled-coil and this C-terminal, or CKK, domain - defined as being present in CAMSAP, KIAA1078 and KIAA1543. The C-terminal domain is the part of the CAMSAP proteins that binds to microtubules. The domain appears to act by producing inhibition of neurite extension, probably by blocking microtubule function. CKK represents a domain that has evolved with the metazoa. The structure of a murine hypothetical protein from RIKEN cDNA has shown the domain to adopt a mainly beta barrel structure with an associated alpha-helical hairpin.
Candida_ALS_N SM01056 Cell-wall agglutinin N-terminal ligand-sugar binding This is likely to be the sugar or ligand binding domain of the yeast alpha-agglutinins. Candidate SM01447 sub-family of tyrosine recombinases tyrosine recombinase sub-family closely related to Xer subfamily found in Candidate phyla. Involved in DNA cleavage through a tyrosine residue at the active site https://doi.org/10.1101/542381 CAP10 SM00672 Putative lipopolysaccharide-modifying enzyme. CAP_GLY SM01052 Cytoskeleton-associated proteins (CAPs) are involved in the organisation of microtubules and transportation of vesicles and organelles along the cytoskeletal network. A conserved motif, CAP-Gly, has been identified in a number of CAPs, including CLIP-170 and dynactins. The crystal structure of Caenorhabditis elegans F53F4.3 protein Q20728 CAP-Gly domain was recently solved PUBMED:12221106. The domain contains three beta-strands. The most conserved sequence, GKNDG, is located in two consecutive sharp turns on the surface, forming the entrance to a groove PUBMED:12221106. Carb_anhydrase SM01057 Eukaryotic-type carbonic anhydrase Carbonic anhydrases are zinc metalloenzymes which catalyse the reversible hydration of carbon dioxide to bicarbonate PUBMED:18336305, PUBMED:10978542. CAs have essential roles in facilitating the transport of carbon dioxide and protons in the intracellular space, across biological membranes and in the layers of the extracellular space; they are also involved in many other processes, from respiration and photosynthesis in eukaryotes to cyanate degradation in prokaryotes. There are five known evolutionarily distinct CA families (alpha, beta, gamma, delta and epsilon) that have no significant sequence identity and have structurally distinct overall folds. Some CAs are membrane-bound, while others act in the cytosol; there are several related proteins that lack enzymatic activity.
The active site of alpha-CAs is well described, consisting of a zinc ion coordinated through 3 histidine residues and a water molecule/hydroxide ion that acts as a potent nucleophile. The enzyme employs a two-step mechanism: in the first step, there is a nucleophilic attack of a zinc-bound hydroxide ion on carbon dioxide; in the second step, the active site is regenerated by the ionisation of the zinc-bound water molecule and the removal of a proton from the active site PUBMED:9336012. Beta- and gamma-CAs also employ a zinc hydroxide mechanism, although at least some beta-class enzymes do not have water directly coordinated to the metal ion. CARD SM00114 Caspase recruitment domain Motif contained in proteins involved in apoptotic signalling. Mediates homodimerisation. Structure consists of six antiparallel helices arranged in a topology homologous to the DEATH and DED domains. CarD_TRCF SM01058 CarD-like/TRCF domain CarD is a Myxococcus xanthus protein required for the activation of light- and starvation-inducible genes PUBMED:8692912. This family includes the presumed N-terminal domain. CarD interacts with the zinc-binding protein CarG, to form a complex that regulates multiple processes in Myxococcus xanthus PUBMED:16879646. This family also includes a domain to the N-terminal side of the DEAD helicase of TRCF proteins. TRCF displaces RNA polymerase stalled at a lesion, binds to the damage recognition protein UvrA, and increases the template strand repair rate during transcription PUBMED:7876261. This domain is involved in binding to the stalled RNA polymerase PUBMED:7876261. CARP SM00673 Domain in CAPs (cyclase-associated proteins) and X-linked retinitis pigmentosa 2 gene product. cas1 SM01431 cas1 solo endonuclease cas1 solo endonuclease is involved in integration and excision of Casposons (a type of mobile element) within a genome.
Its mechanism of integration is similar to integration of CRISPR spacer regions PMID:24884953 and PMID:21219465 CASc SM00115 Caspase, interleukin-1 beta converting enzyme (ICE) homologues Cysteine aspartases that mediate programmed cell death (apoptosis). Caspases are synthesised as zymogens and activated by proteolysis of the peptide backbone adjacent to an aspartate. The resulting two subunits associate to form an (alpha)2(beta)2-tetramer which is the active enzyme. Activation of caspases can be mediated by other caspase homologues. CASH SM00722 Domain present in carbohydrate binding proteins and sugar hydrolases CAT SM01059 Chloramphenicol acetyltransferase Chloramphenicol acetyltransferase (CAT) PUBMED:1867713 catalyzes the acetyl-CoA dependent acetylation of chloramphenicol (Cm), an antibiotic which inhibits prokaryotic peptidyltransferase activity. Acetylation of Cm by CAT inactivates the antibiotic. A histidine residue, located in the C-terminal section of the enzyme, plays a central role in its catalytic mechanism. There is a second family of CAT PUBMED:1314803, evolutionarily unrelated to the main family described above. These CAT belong to the bacterial hexapeptide-repeat containing-transferases family (see ). The crystal structure of the type III enzyme from Escherichia coli with chloramphenicol bound has been determined. CAT is a trimer of identical subunits (monomer Mr 25,000) and the trimeric structure is stabilised by a number of hydrogen bonds, some of which result in the extension of a beta-sheet across the subunit interface. Chloramphenicol binds in a deep pocket located at the boundary between adjacent subunits of the trimer, such that the majority of residues forming the binding pocket belong to one subunit while the catalytically essential histidine belongs to the adjacent subunit.
His195 is appropriately positioned to act as a general base catalyst in the reaction, and the required tautomeric stabilisation is provided by an unusual interaction with a main-chain carbonyl oxygen PUBMED:2187098. Catalase SM01060 Catalases are antioxidant enzymes that catalyse the conversion of hydrogen peroxide to water and molecular oxygen, serving to protect cells from its toxic effects PUBMED:11351128. Hydrogen peroxide is produced as a consequence of oxidative cellular metabolism and can be converted to the highly reactive hydroxyl radical via transition metals, this radical being able to damage a wide variety of molecules within a cell, leading to oxidative stress and cell death. Catalases act to neutralise hydrogen peroxide toxicity, and are produced by all aerobic organisms ranging from bacteria to man. Most catalases are mono-functional, haem-containing enzymes, although there are also bifunctional haem-containing peroxidase/catalases that are closely related to plant peroxidases, and non-haem, manganese-containing catalases that are found in bacteria PUBMED:14745498. Cation_ATPase_N SM00831 Cation transporter/ATPase, N-terminus This entry represents the conserved N-terminal region found in several classes of cation-transporting P-type ATPases, including those that transport H+, Na+, Ca2+, Na+/K+, and H+/K+. In the H+/K+- and Na+/K+-exchange P-ATPases, this domain is found in the catalytic alpha chain. In gastric H+/K+-ATPases, this domain undergoes reversible sequential phosphorylation inducing conformational changes that may be important for regulating the function of these ATPases PUBMED:12480547, PUBMED:12529322. CAT_RBD SM01061 CAT RNA binding domain This RNA binding domain is found at the amino terminus of transcriptional antitermination proteins such as BglG, SacY and LicT. These proteins control the expression of sugar metabolising operons in Gram+ and Gram- bacteria. This domain has been called the CAT (Co-AntiTerminator) domain. 
It binds as a dimer PUBMED:9305644 to a short Ribonucleotidic Anti-Terminator (RAT) hairpin, each monomer interacting symmetrically with both strands of the RAT hairpin PUBMED:11953318. In the full-length protein, CAT is followed by two phosphorylatable PTS regulation domains that modulate the RNA binding activity of CAT. Upon activation, the dimeric proteins bind to RAT targets in the nascent mRNA, thereby preventing abortive dissociation of the RNA polymerase from the DNA template PUBMED:10610766. CBD_II SM00637 CBD_IV SM00606 Cellulose Binding Domain Type IV CBF SM00521 CCAAT-Binding transcription Factor CBM_10 SM01064 Cellulose or protein binding domain This domain is found in two distinct sets of proteins with different functions. Those found in aerobic bacteria bind cellulose (or other carbohydrates); but in anaerobic fungi they are protein binding domains, referred to as dockerin domains or docking domains. They are believed to be responsible for the assembly of a multiprotein cellulase/hemicellulase complex, similar to the cellulosome found in certain anaerobic bacteria. CBM_2 SM01065 Starch binding domain CBM_25 SM01066 Carbohydrate binding domain CBM_3 SM01067 Cellulose binding domain CBM49 SM01063 Carbohydrate binding domain CBM49 This domain is found at the C terminus of cellulases and in vitro binding studies have shown it to bind to crystalline cellulose PUBMED:17322304. CBM_X SM01068 Putative carbohydrate binding domain CBS SM00116 Domain in cystathionine beta-synthase and other proteins. Domain present in all 3 forms of cellular life. Present in two copies in inosine monophosphate dehydrogenase, of which one is disordered in the crystal structure [3]. A number of disease states are associated with CBS-containing proteins including homocystinuria, Becker's and Thomsen disease. c-clamp SM01366 Sequence-specific DNA binding domain in TCFs.
CCP SM00032 Domain abundant in complement control proteins; SUSHI repeat; short complement-like repeat (SCR) The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. A missense mutation in the seventh CCP domain causes deficiency of the b subunit of factor XIII. CDC37_C SM01069 Cdc37 C terminal domain Cdc37 is a protein required for the activity of numerous eukaryotic protein kinases. This domain corresponds to the C terminal domain whose function is unclear. It is found C terminal to the Hsp90 chaperone (heat shock protein 90) binding domain PF08565 and the N terminal kinase binding domain of Cdc37 PUBMED:16098195. CDC37_M SM01070 Cdc37 Hsp90 binding domain Cdc37 is a molecular chaperone required for the activity of numerous eukaryotic protein kinases. This domain corresponds to the Hsp90 chaperone (heat shock protein 90) binding domain of Cdc37 PUBMED:16098195. It is found between the N terminal Cdc37 domain which is predominantly involved in kinase binding, and the C terminal domain of Cdc37 whose function is unclear. CDC37_N SM01071 Cdc37 N terminal kinase binding Cdc37 is a molecular chaperone required for the activity of numerous eukaryotic protein kinases. This domain corresponds to the N terminal domain which binds predominantly to protein kinases PUBMED:16098195 and is found N terminal to the Hsp (heat shock protein) 90-binding domain. Expression of a construct consisting of only the N-terminal domain of Schizosaccharomyces pombe Cdc37 results in cellular viability. This indicates that interactions with the cochaperone Hsp90 may not be essential for Cdc37 function PUBMED:16098195. CDC48_2 SM01072 Cell division protein 48 (CDC48) domain 2 This domain has a double psi-beta barrel fold and includes VCP-like ATPase and N-ethylmaleimide sensitive fusion protein N-terminal domains.
Both the VAT and NSF N-terminal functional domains consist of two structural domains of which this is at the C-terminus. The VAT-N domain found in AAA ATPases is a 185-residue substrate recognition domain PUBMED:10531028. CDC48_N SM01073 Cell division protein 48 (CDC48) N-terminal domain This domain has a double psi-beta barrel fold and includes VCP-like ATPase and N-ethylmaleimide sensitive fusion protein N-terminal domains. Both the VAT and NSF N-terminal functional domains consist of two structural domains of which this is at the N-terminus. The VAT-N domain found in AAA ATPases is a 185-residue substrate recognition domain PUBMED:10531028. Cdc6_C SM01074 CDC6, C terminal The C terminal domain of CDC6 assumes a winged helix fold, with a five alpha-helical bundle (alpha15-alpha19) structure, backed on one side by three beta strands (beta6-beta8). It has been shown that this domain acts as a DNA-localisation factor; however, its exact function is, as yet, unknown. Putative functions include: (1) mediation of protein-protein interactions and (2) regulation of nucleotide binding and hydrolysis. Mutagenesis studies have shown that this domain is essential for appropriate Cdc6 activity PUBMED:11030343. CDT1 SM01075 DNA replication factor CDT1 like CDT1 is a component of the replication licensing system and promotes the loading of the mini-chromosome maintenance complex onto chromatin. Geminin is an inhibitor of CDT1 and prevents inappropriate re-initiation of replication on an already fired origin. This region of CDT1 binds to Geminin PUBMED:15286659. CENPB SM00674 Putative DNA-binding domain in centromere protein B, mouse jerky and transposases.
CFEM SM00747 eight cysteine-containing domain present in fungal extracellular membrane proteins CG-1 SM01076 CG-1 domains are highly conserved domains of about 130 amino-acid residues containing a predicted bipartite NLS and named after a partial cDNA clone isolated from parsley encoding a sequence-specific DNA-binding protein PUBMED:8075408. CG-1 domains are associated with CAMTA proteins (for CAlModulin-binding Transcription Activator) that are transcription factors containing a calmodulin-binding domain and ankyrin (ANK) motifs PUBMED:11925432. Cg6151-P SM01077 Uncharacterized conserved protein CG6151-P This is a family of small, less than 200 residue long, proteins which are named as CG6151-P proteins that are conserved from fungi to humans. The function is unknown. The fungal members have a characteristic ICP sequence motif. Some members are annotated as putative clathrin-coated vesicle protein but this could not be defined. CGGC SM01078 This putative domain contains a quite highly conserved sequence of CGGC in its central region. The domain has many conserved cysteines and histidines suggestive of a zinc binding function. CH SM00033 Calponin homology domain Actin binding domains present in duplicate at the N-termini of spectrin-like proteins (including dystrophin, alpha-actinin). These domains cross-link actin filaments into bundles and networks. A calponin homology domain is predicted in yeast Cdc24p. CHAD SM00880 The CHAD domain is an alpha-helical domain functionally associated with some members of the adenylate cyclase family. It has conserved histidines that may chelate metals. CHASE SM01079 This domain is found in the extracellular portion of receptor-like proteins - such as serine/threonine kinases and adenylyl cyclases PUBMED:11590000, PUBMED:11590001. Predicted to be a ligand binding domain PUBMED:11590000.
CHASE2 SM01080 CHASE2 is an extracellular sensory domain, which is present in various classes of transmembrane receptors that are parts of signal transduction pathways in bacteria. Specifically, CHASE2 domains are found in histidine kinases, adenylate cyclases, serine/threonine kinases and predicted diguanylate cyclases/phosphodiesterases. Environmental factors that are recognised by CHASE2 domains are not known at this time PUBMED:12486065. CHB_HEX SM01081 Putative carbohydrate binding domain This domain represents the N terminal domain in chitobiases and beta-hexosaminidases EC:3.2.1.52. It is composed of a beta sandwich structure that is similar in structure to the cellulose binding domain of cellulase from Cellulomonas fimi PUBMED:8673609. This suggests that this may be a carbohydrate binding domain. CheW SM00260 Two component signalling adaptor domain CHK SM00587 ZnF_C4 and HLH domain containing kinases domain subfamily of choline kinases CHRD SM00754 A domain in the BMP inhibitor chordin and in microbial proteins. CHROMO SM00298 Chromatin organization modifier domain ChSh SM00300 Chromo Shadow Domain ChtBD1 SM00270 Chitin binding domain ChtBD2 SM00494 Chitin-binding domain type 2 ChtBD3 SM00495 Chitin-binding domain type 3 ChW SM00728 Clostridial hydrophobic, with a conserved W residue, domain. CHZ SM01082 Histone chaperone domain CHZ This domain is highly conserved from yeasts to humans and is part of the chaperone protein HIRIP3 in vertebrates which interacts with the H3.3 chaperone HIRA, implicated in histone replacement during transcription. N- and C- termini of Chz family members are relatively divergent but do contain similar acidic stretches rich in Glu/Asp residues, characteristic of all histone chaperones PUBMED:17289584. CIMR SM01404 Cation-independent mannose-6-phosphate receptor repeat The cation-independent mannose-6-phosphate receptor contains 15 copies of a repeat.
Cir_N SM01083 N-terminal domain of CBF1 interacting co-repressor CIR This is a 45 residue conserved region at the N-terminal end of a family of proteins referred to as CIRs (CBF1-interacting co-repressors). CBF1 (centromere-binding factor 1) acts as a transcription factor that causes repression by binding specifically to GTGGGAA motifs in responsive promoters, and it requires CIR as a co-repressor. CIR binds to histone deacetylase and to SAP30 and serves as a linker between CBF1 and the histone deacetylase complex PUBMED:9874765. Citrate_ly_lig SM00764 Citrate lyase ligase C-terminal domain Proteins of this family contain the C-terminal domain of citrate lyase ligase EC:6.2.1.22. CK_II_beta SM01085 Casein kinase II regulatory subunit CKS SM01084 Cyclin-dependent kinase regulatory subunit Cyclin-dependent kinase regulatory subunit. CLa SM00035 CLUSTERIN alpha chain CLb SM00030 CLUSTERIN Beta chain CLECT SM00034 C-type lectin (CTL) or carbohydrate-recognition domain (CRD) Many of these domains function as calcium-dependent carbohydrate binding modules. CLH SM00299 Clathrin heavy chain repeat homology CLIP SM00680 Clip or disulphide knot domain Present in horseshoe crab proclotting enzyme N-terminal domain, Drosophila Easter and silkworm prophenoloxidase-activating enzyme. ClpB_D2-small SM01086 C-terminal, D2-small domain, of ClpB protein This is the C-terminal domain of ClpB protein, referred to as the D2-small domain, and is a mixed alpha-beta structure. Compared with the D1-small domain (included in AAA) it lacks the long coiled-coil insertion, and instead of helix C4 contains a beta-strand (e3) that is part of a three-stranded beta-pleated sheet. In Thermus thermophilus the whole protein forms a hexamer with the D1-small and D2-small domains located on the outside of the hexamer, with the long coiled-coil being exposed on the surface.
The D2-small domain is essential for oligomerisation, forming a tight interface with the D2-large domain of a neighbouring subunit and thereby providing enough binding energy to stabilise the functional assembly PUBMED:14567920. The domain is associated with two Clp_N at the N-terminus as well as AAA and AAA_2. CM_2 SM00830 Chorismate mutase type II Chorismate mutase catalyses the conversion of chorismate to prephenate in the pathway of tyrosine and phenylalanine biosynthesis. This enzyme is negatively regulated by tyrosine, tryptophan and phenylalanine PUBMED:9642265, PUBMED:9497350. CNH SM00036 Domain found in NIK1-like kinases, mouse citron and yeast ROM1, ROM2 Unpublished observations. cNMP SM00100 Cyclic nucleotide-monophosphate binding domain Catabolite gene activator protein (CAP) is a prokaryotic homologue of eukaryotic cNMP-binding domains, present in ion channels, and cNMP-dependent kinases. CNX SM00037 Connexin homologues Connexin channels participate in the regulation of signaling between developing and differentiated cell types. CoA_binding SM00881 CoA binding domain This domain has a Rossmann fold and is found in a number of proteins including succinyl CoA synthetases, malate and ATP-citrate ligases. CoA_trans SM00882 Coenzyme A transferase Coenzyme A (CoA) transferases belong to an evolutionarily conserved family of enzymes catalyzing the reversible transfer of CoA from one carboxylic acid to another. They have been identified in many prokaryotes and in mammalian tissues. The bacterial enzymes are heterodimers of two subunits (A and B) of about 25 Kd each while eukaryotic SCOT consists of a single chain which is colinear with the two bacterial subunits. CobW_C SM00833 Cobalamin synthesis protein cobW C-terminal domain CobW proteins are generally found proximal to the trimeric cobaltochelatase subunit CobN, which is essential for vitamin B12 (cobalamin) biosynthesis PUBMED:12869542.
They contain a P-loop nucleotide-binding loop in the N-terminal domain and a histidine-rich region in the C-terminal portion suggesting a role in metal binding, possibly as an intermediary between the cobalt transport and chelation systems. CobW might be involved in cobalt reduction leading to cobalt(I) corrinoids. This entry represents the C-terminal domain found in CobW, as well as in P47K, a Pseudomonas chlororaphis protein needed for nitrile hydratase expression PUBMED:7765511. CO_deh_flav_C SM01092 CO dehydrogenase flavoprotein C-terminal domain Cog4 SM00762 COG4 transport protein This region is found in yeast oligomeric golgi complex component 4 which is involved in ER to Golgi and intra Golgi transport. COG6 SM01087 Conserved oligomeric complex COG6 COG6 is a component of the conserved oligomeric golgi complex, which is composed of eight different subunits and is required for normal golgi morphology and localisation. Col_cuticle_N SM01088 Nematode cuticle collagen N-terminal domain The function of this domain is unknown. It is found in the N-terminal region of nematode cuticle collagens. Cuticle is a tough elastic structure secreted by hypodermal cells and is primarily composed of collagen proteins PUBMED:7828882. COLFI SM00038 Fibrillar collagens C-terminal domain Found at C-termini of fibrillar collagens: Ephydatia muelleri procollagen EMF1alpha, vertebrate collagens alpha(1)III, alpha(1)II, alpha(2)V etc. COLIPASE SM00023 Colipase Colipase is a protein that functions as a cofactor for pancreatic lipase, with which it forms a stoichiometric complex. It also binds to the bile-salt covered triacylglycerol interface thus allowing the enzyme to anchor itself to the water-lipid interface. Colipase is a small protein of approximately 100 amino-acid residues with five conserved disulfide bonds. Connexin_CCC SM01089 Gap junction channel protein cysteine-rich domain Copper-fist SM01090 Copper fist is an N-terminal domain involved in copper-dependent DNA binding. 
It is named for its resemblance to a fist. It can be found in some fungal transcription factors. These proteins activate the transcription of the metallothionein gene in response to copper. Metallothionein maintains copper levels in yeast. The copper fist domain is similar in structure to metallothionein itself, and on copper binding undergoes a large conformational change, which allows DNA binding. CorC_HlyC SM01091 Transporter associated domain This small domain is found in a family of proteins with the DUF21 domain and two CBS domains, with this domain found at the C-terminus of the proteins; the domain is also found at the C terminus of some Na+/H+ antiporters. This domain is also found in CorC, which is involved in magnesium and cobalt efflux. The function of this domain is uncertain but might be involved in modulating transport of ion substrates. Cornichon SM01398 Costars SM01283 This domain is found both alone and at the C-terminus of actin-binding Rho-activating protein (ABRA). It binds to actin, and in muscle regulates the actin cytoskeleton and cell motility PMID:11983702, PMID:20940261. It has a winged helix-like fold consisting of three alpha-helices and four antiparallel beta strands. Unlike typical winged helix proteins it does not bind to DNA, but contains a hydrophobic groove which may be responsible for interaction with other proteins PMID:21082705. CP12 SM01093 CpcD SM01094 CpcD/allophycocyanin linker domain CPDc SM00577 catalytic domain of ctd-like phosphatases Cpl-7 SM01095 Cpl-7 lysozyme C-terminal domain This domain was originally found in the C-terminal moiety of the Cpl-7 lysozyme encoded by the Streptococcus pneumoniae bacteriophage Cp-7. It is assumed that these repeats represent cell wall binding motifs although no direct evidence has been obtained so far. Cpn10 SM00883 Chaperonin 10 Kd subunit The chaperonins are 'helper' molecules required for correct folding and subsequent assembly of some proteins.
These are required for normal cell growth, and are stress-induced, acting to stabilise or protect disassembled polypeptides under heat-shock conditions. Type I chaperonins present in eubacteria, mitochondria and chloroplasts require the concerted action of 2 proteins, chaperonin 60 (cpn60) and chaperonin 10 (cpn10). The 10 kDa chaperonin (cpn10 - or groES in bacteria) exists as a ring-shaped oligomer of six to eight identical subunits, while the 60 kDa chaperonin (cpn60 - or groEL in bacteria) forms a structure comprising 2 stacked rings, each ring containing 7 identical subunits. These ring structures assemble by self-stimulation in the presence of Mg2+-ATP. The central cavity of the cylindrical cpn60 tetradecamer provides an isolated environment for protein folding whilst cpn10 binds to cpn60 and synchronizes the release of the folded protein in an Mg2+-ATP dependent manner. The binding of cpn10 to cpn60 inhibits the weak ATPase activity of cpn60. CPSase_L_D3 SM01096 Carbamoyl-phosphate synthetase large chain, oligomerisation domain Carbamoyl-phosphate synthase catalyses the ATP-dependent synthesis of carbamyl-phosphate from glutamine or ammonia and bicarbonate. The carbamoyl-phosphate synthase (CPS) enzyme in prokaryotes is a heterodimer of a small and large chain. CPSase_sm_chain SM01097 Carbamoyl-phosphate synthase small chain, CPSase domain The carbamoyl-phosphate synthase domain is in the amino terminus of the protein. Carbamoyl-phosphate synthase catalyses the ATP-dependent synthesis of carbamyl-phosphate from glutamine or ammonia and bicarbonate. This important enzyme initiates both the urea cycle and the biosynthesis of arginine and/or pyrimidines PUBMED:1972379. The carbamoyl-phosphate synthase (CPS) enzyme in prokaryotes is a heterodimer of a small and large chain. The small chain promotes the hydrolysis of glutamine to ammonia, which is used by the large chain to synthesise carbamoyl phosphate.
The small chain has a GATase domain in the carboxyl terminus. CPSF73-100_C SM01098 This is the C-terminal conserved region of the pre-mRNA 3'-end-processing of the polyadenylation factor CPSF-73/CPSF-100 proteins. The exact function of this domain is not known. CPW_WPC SM01099 This group of sequences is defined by a domain of about 61 residues in length with six well-conserved cysteine residues and six well-conserved aromatic sites. The domain can be found in tandem repeats, and is known so far only in Plasmodium falciparum. It is named for motifs of CPxxW and (less well conserved) WPC. Its function is unknown. CRA SM00757 CT11-RanBPM protein-protein interaction domain present in crown eukaryotes (plants, animals, fungi) CRAL_TRIO_N SM01100 CRAL/TRIO, N-terminal domain CRF SM00039 corticotropin-releasing factor CRISPR_assoc SM01101 This domain forms an anti-parallel beta strand structure with flanking alpha helical regions. CRM1_C SM01102 CRM1 C terminal CRM1 (also known as Exportin1) mediates the nuclear export of proteins bearing a leucine-rich nuclear export signal (NES). CRM1 forms a complex with the NES containing protein and the small GTPase Ran. This region forms an alpha helical structure formed by six helical hairpin motifs that are structurally similar to the HEAT repeat, but share little sequence similarity to the HEAT repeat PUBMED:15574331. CRS1_YhbY SM01103 Escherichia coli YhbY is associated with pre-50S ribosomal subunits, which implies a function in ribosome assembly. GFP fused to a single-domain CRM protein from maize localises to the nucleolus, suggesting that an analogous activity may have been retained in plants PUBMED:17105995. A CRM domain containing protein in plant chloroplasts has been shown to function in group I and II intron splicing PUBMED:18065687. In vitro experiments with an isolated maize CRM domain have shown it to have RNA binding activity. 
These and other results suggest that the CRM domain evolved in the context of ribosome function prior to the divergence of Archaea and Bacteria, that this function has been maintained in extant prokaryotes, and that the domain was recruited to serve as an RNA binding module during the evolution of plant genomes PUBMED:17105995. YhbY has a fold similar to that of the C-terminal domain of translation initiation factor 3 (IF3C), which binds to 16S rRNA in the 30S ribosome PUBMED:12429100. CSF2 SM00040 Granulocyte-macrophage colony-stimulating factor (GM-CSF) GM-CSF stimulates the development of and the cytotoxic activity of white blood cells. c-SKI_SMAD_bind SM01046 c-SKI Smad4 binding domain c-SKI is an oncoprotein that inhibits TGF-beta signaling through interaction with Smad proteins PUBMED:15107821. This domain binds to Smad4 PUBMED:12419246. CSP SM00357 Cold shock protein domain RNA-binding domain that functions as an RNA chaperone in bacteria and is involved in regulating translation in eukaryotes. Contains sub-family of RNA-binding domains in the Rho transcription termination factor. CT SM00041 C-terminal cystine knot-like domain (CTCK) The structures of transforming growth factor-beta (TGFbeta), nerve growth factor (NGF), platelet-derived growth factor (PDGF) and gonadotropin all form 2 highly twisted antiparallel pairs of beta-strands and contain three disulphide bonds. The domain is non-globular and little is conserved among these presumed homologues except for their cysteine residues. CT domains are predicted to form homodimers. CTD SM01104 Spt5 C-terminal nonapeptide repeat binding Spt4 The C-terminal domain of the transcription elongation factor protein Spt5 is necessary for binding to Spt4 to form the functional complex that regulates early transcription elongation by RNA polymerase II. The complex may be involved in pre-mRNA processing through its association with mRNA capping enzymes.
This CTD domain carries a regular nonapeptide repeat that can be present in up to 18 copies, as in S. pombe PUBMED:19460865. The repeat has a characteristic TPA motif. CTLH SM00668 C-terminal to LisH motif. Alpha-helical motif of unknown function. CTNS SM00679 Repeated motif present between transmembrane helices in cystinosin, yeast ERS1p, mannose-P-dolichol utilization defect 1, and other hypothetical proteins. Function unknown, but likely to be associated with the glycosylation machinery. CUB SM00042 Domain first found in C1r, C1s, uEGF, and bone morphogenetic protein. This domain is found mostly among developmentally-regulated proteins. Spermadhesins contain only this domain. CUE SM00546 Domain that may be involved in binding ubiquitin-conjugating enzymes (UBCs) CUE domains also occur in two proteins of the IL-1 signal transduction pathway, tollip and TAB2. Ponting (Biochem. J.) "Proteins of the Endoplasmic reticulum" (in press) Cu_FIST SM00412 Copper-Fist binds DNA only in the presence of copper or silver CULLIN SM00182 Cullin Cullin_Nedd8 SM00884 Cullin protein neddylation domain This is the neddylation site of cullin proteins which are a family of structurally related proteins containing an evolutionarily conserved cullin domain. With the exception of APC2, each member of the cullin family is modified by Nedd8 and several cullins function in Ubiquitin-dependent proteolysis, a process in which the 26S proteasome recognises and subsequently degrades a target protein tagged with K48-linked poly-ubiquitin chains. Cullins are molecular scaffolds responsible for assembling the ROC1/Rbx1 RING-based E3 ubiquitin ligases, of which several play a direct role in tumorigenesis.
Nedd8/Rub1 is a small ubiquitin-like protein, which was originally found to be conjugated to Cdc53, a cullin component of the SCF (Skp1-Cdc53/CUL1-F-box protein) E3 Ub ligase complex in Saccharomyces cerevisiae, and Nedd8 modification has now emerged as a regulatory pathway of fundamental importance for cell cycle control and for embryogenesis in metazoans. The only identified Nedd8 substrates are cullins. Neddylation results in covalent conjugation of a Nedd8 moiety onto a conserved cullin lysine residue. Cupin_1 SM00835 Cupin This family represents the conserved barrel domain of the 'cupin' superfamily ('cupa' is the Latin term for a small barrel). This family contains 11S and 7S plant seed storage proteins, and germins. Plant seed storage proteins provide the major nitrogen source for the developing plant. CUT SM01109 The CUT domain is a DNA-binding domain often found in combination with a downstream homeodomain. Cutinase SM01110 This enzyme belongs to the family of hydrolases, specifically those acting on carboxylic ester bonds. The systematic name of this enzyme class is cutin hydrolase. Aerial plant organs are protected by a cuticle composed of an insoluble polymeric structural compound, cutin, which is a polyester composed of hydroxy and hydroxyepoxy fatty acids. Plant pathogenic fungi produce extracellular degradative enzymes that play an important role in pathogenesis. They include cutinase, which hydrolyses cutin, facilitating fungus penetration through the cuticle. Inhibition of the enzyme can prevent fungal infection through intact cuticles. Cutin monomers released from the cuticle by small amounts of cutinase on fungal spore surfaces can greatly increase the amount of cutinase secreted by the spore, the mechanism for which process is as yet unknown. (PMID 1557023) CVNH SM01111 In molecular biology, the CVNH domain (CyanoVirin-N Homology domain) is a conserved protein domain. 
It is found in the sugar-binding antiviral protein cyanovirin-N (CVN) as well as proteins from filamentous ascomycetes and in the fern Ceratopteris richardii. (PMID 16003744) CW SM00605 cwf21 SM01115 The cwf21 family is involved in mRNA splicing. It has been isolated as a subcomplex of the spliceosome in Schizosaccharomyces pombe (PUBMED:11884590). The function of the cwf21 domain is to bind directly to the spliceosomal protein Prp8. Mutations in the cwf21 domain prevent Prp8 from binding (PUBMED:19854871). The structure of this domain has recently been solved and shows it to be composed of two alpha helices. CXC SM01114 Tesmin/TSO1-like CXC domain This family includes proteins that have two copies of a cysteine rich motif as follows: C-X-C-X4-C-X3-YC-X-C-X6-C-X3-C-X-C-X2-C. The family includes Tesmin Q9Y4I5 (PUBMED:10191092) and TSO1 Q9LE32 (PUBMED:10769245). This family is called a CXC domain in (PUBMED:10769245). CxxC_CXXC_SSSS SM00834 Putative regulatory protein CxxC_CXXC_SSSS represents a region of about 41 amino acids found in a number of small proteins in a wide range of bacteria. The region usually begins with the initiator Met and contains two CxxC motifs separated by 17 amino acids. One protein in this entry has been noted as a putative regulatory protein, designated FmdB. Most proteins in this entry have a C-terminal region containing highly degenerate sequence. CY SM00043 Cystatin-like domain Cystatins are a family of cysteine protease inhibitors that occur mainly as single domain proteins. However some extracellular proteins such as kininogen, His-rich glycoprotein and fetuin also contain these domains.
Cyan SM01452 sub-family of tyrosine recombinases tyrosine recombinase sub-family found predominantly in Cyanobacteria and involved in DNA cleavage through a tyrosine residue at the active site https://doi.org/10.1101/542381 Cyanate_lyase SM01116 Cyanate lyase C-terminal domain, Cyanate hydratase Cyanate lyase (also known as cyanase) EC:4.2.1.104 is responsible for the hydrolysis of cyanate, allowing organisms that possess the enzyme to overcome the toxicity of environmental cyanate. This enzyme is composed of two domains, an N-terminal helix-turn-helix and this structurally unique C-terminal domain (PUBMED:10801492). CYCc SM00044 Adenylyl- / guanylyl cyclase, catalytic domain Present in two copies in mammalian adenylyl cyclases. Eubacterial homologues are known. Two residues (Asn, Arg) are thought to be involved in catalysis. These cyclases have important roles in a diverse range of cellular processes. CYCLIN SM00385 domain present in cyclins, TFIIB and Retinoblastoma A helical domain present in cyclins and TFIIB (twice) and Retinoblastoma (once). A protein recognition domain functioning in cell-cycle and transcription control. Cyclin_C SM01332 Cyclins are a family of proteins that control the progression of cells through the cell cycle by activating cyclin-dependent kinase (Cdk) enzymes. CysPc SM00230 Calpain-like thiol protease family. Calpain-like thiol protease family (peptidase family C2). Calcium activated neutral protease (large subunit). Cyt-b5 SM01117 Cytochrome b5-like Heme/Steroid binding domain This family includes heme binding domains from a diverse range of proteins. This family also includes proteins that bind to steroids. The family includes progesterone receptors such as O00264 (PUBMED:9705155,PUBMED:8774719). Many members of this subfamily are membrane anchored by an N-terminal transmembrane alpha helix. This family also includes a domain in some chitin synthases. There is no known ligand for this domain in the chitin synthases.
CYTH SM01118 These sequences are functionally identified as members of the adenylate cyclase family, which catalyses the conversion of ATP to 3',5'-cyclic AMP and pyrophosphate. Six distinct non-homologous classes of AC have been identified. The structures of three classes of adenylyl cyclases have been solved (PUBMED:16905149). D5_N SM00885 D5 N terminal like This domain is found in D5 proteins of DNA viruses and bacteriophage P4 DNA primases. Dabb SM00886 Stress responsive A/B Barrel Domain The function of this domain is unknown, but it is upregulated in response to salt stress in Populus balsamifera (balsam poplar). It is also found at the C-terminus of a fructose 1,6-bisphosphate aldolase from Hydrogenophilus thermoluteolus. It is found in the pA01 plasmid, which encodes genes for molybdopterin uptake and degradation of plant alkaloid nicotine. The structure of one has been solved and the domain forms an alpha-beta barrel dimer. Although there is a clear duplication within the domain it is not obviously detectable in the sequence. DAGKa SM00045 Diacylglycerol kinase accessory domain (presumed) Diacylglycerol (DAG) is a second messenger that acts as a protein kinase C activator. DAG can be produced from the hydrolysis of phosphatidylinositol 4,5-bisphosphate (PIP2) by a phosphoinositide-specific phospholipase C and by the degradation of phosphatidylcholine (PC) by a phospholipase C or the concerted actions of phospholipase D and phosphatidate phosphohydrolase. This domain might either be an accessory domain or else contribute to the catalytic domain. Bacterial homologues are known. DAGKc SM00046 Diacylglycerol kinase catalytic domain (presumed) Diacylglycerol (DAG) is a second messenger that acts as a protein kinase C activator.
DAG can be produced from the hydrolysis of phosphatidylinositol 4,5-bisphosphate (PIP2) by a phosphoinositide-specific phospholipase C and by the degradation of phosphatidylcholine (PC) by a phospholipase C or the concerted actions of phospholipase D and phosphatidate phosphohydrolase. This domain is presumed to be the catalytic domain. Bacterial homologues are known. Dak1_2 SM01121 This is the kinase domain of the dihydroxyacetone kinase family. Dak2 SM01120 This domain is the predicted phosphatase domain of the dihydroxyacetone kinase family. DALR_1 SM00836 DALR anticodon binding domain This all alpha helical domain is the anticodon binding domain of Arginyl tRNA synthetase. This domain is known as the DALR domain after characteristic conserved amino acids PUBMED:10447505. DALR_2 SM00840 This DALR domain is found in cysteinyl-tRNA-synthetases. DAX SM00021 Domain present in Dishevelled and axin Domain of unknown function. DBB SM01282 Dof, BCAP, and BANK (DBB) motif The DBB domain is named after the Drosophila protein Dof (Downstream of FGFR, also known as Heartbroken or Stumps) and the BANK and BCAP proteins, both of which signal in B-cell pathways. This domain defines a minimal region required for mediating Dof dimerisation. Since this domain can interact both with itself and with a region in the C-terminal part of the molecule, it may mediate either intermolecular or intramolecular interactions PMID:12767830. Mutants lacking this domain disrupt FGFR signal transduction and fibroblast growth-factor signalling PMID:14993266. DBC1 SM01122 DBC1 and its homologs from diverse eukaryotes are a catalytically inactive version of the Nudix hydrolase (MutT) domain (PUBMED:18418069). DBC1 is predicted to bind NAD metabolites and regulate the activity of SIRT1 or related deacetylases by sensing the soluble products or substrates of the NAD-dependent deacetylation reaction (PUBMED:18418069).
DBP10CT SM01123 DBP10CT (NUC160) domain This C terminal domain is found in the Dbp10p subfamily of hypothetical RNA helicases (PUBMED:15112237). DBR1 SM01124 Lariat debranching enzyme, C-terminal domain This presumed domain is found at the C-terminus of lariat debranching enzyme. This domain is always found in association with Metallophos PF00149. DCD SM00767 DCD is a plant-specific domain in proteins involved in development and programmed cell death. The domain is shared by several proteins in the Arabidopsis and the rice genomes, which otherwise show a different protein architecture. Biological studies indicate a role of these proteins in phytohormone response, embryo development and programmed cell death by pathogens or ozone. DCP2 SM01125 Dcp2, box A domain This domain is specific to mRNA decapping protein 2 and this region has been termed Box A (PUBMED:12218187). Removal of the cap structure is catalysed by the Dcp1-Dcp2 complex (PUBMED:16341225). DCX SM00537 Domain in the Doublecortin (DCX) gene product Tandemly-repeated domain in doublin, the Doublecortin gene product. Proposed to bind tubulin. Doublecortin (DCX) is mutated in human X-linked neuronal migration defects. DDE SM01503 family of Ribonuclease H like domains involved in transposition transposase DDE domain involved in insertion and excision of transposons PMID:10207011 and PMID:6268937 DDE_1 SM01500 DDE transposase sub-family endonucleases of DDE superfamily involved in transposition DDE_2 SM01498 DDE Mu phage transposase sub-family bacteriophage Mu transposase PMID:7628012 DDE_3 SM01480 DDE transposase sub-family endonucleases of DDE superfamily involved in transposition DDE_5 SM01486 DDE transposase sub-family endonucleases of DDE superfamily involved in transposition dDENN SM00801 Domain always found downstream of DENN domain, found in a variety of signalling proteins The dDENN domain is part of the tripartite DENN domain.
It is always found downstream of the DENN domain itself, which is found in a variety of signalling proteins involved in Rab-mediated processes or regulation of MAPKs signalling pathways. The DENN domain is always encircled on both sides by more divergent domains, called uDENN (for upstream DENN) and dDENN (for downstream DENN). The function of the DENN domain remains to date unclear, although it appears to represent a good candidate for a GTP/GDP exchange activity. DDE_Tnp_1 SM01492 DDE transposase sub-family transposase DDE domain involved in transposition PMID:10207011 DDE_Tnp_1_2 SM01485 DDE transposase sub-family transposase DDE domain involved in transposition DDE_Tnp_1_3 SM01484 DDE transposase sub-family transposase DDE domain involved in transposition DDE_Tnp_1_4 SM01479 DDE transposase sub-family transposase DDE domain involved in transposition DDE_Tnp_1_5 SM01483 DDE transposase sub-family transposase DDE domain involved in transposition PMID:6268937 DDE_Tnp_1_6 SM01489 DDE transposase sub-family transposase DDE domain involved in transposition DDE_Tnp_1_7 SM01495 DDE transposase sub-family IS4 transposase DDE domain involved in transposition DDE_Tnp_2 SM01499 DDE transposase sub-family transposase DDE domain involved in transposition DDE_Tnp_4 SM01490 DDE transposase sub-family transposase DDE domain involved in transposition DDE_Tnp_IS1 SM01497 DDE transposase sub-family transposase DDE domain involved in transposition of IS1 transposon PMID:11274106 DDE_Tnp_IS1595 SM01482 DDE transposase sub-family transposase DDE domain involved in transposition of a variety of transposons including ISXO2 DDE_Tnp_IS240 SM01481 DDE transposase sub-family transposase DDE domain involved in transposition of a variety of transposons including IS240, IS26 and IS6100.
DDE_Tnp_IS66 SM01493 DDE transposase sub-family transposase DDE domain involved in transposition of IS66 and other transposons DDE_Tnp_ISAZ013 SM01491 DDE transposase sub-family transposase DDE domain involved in transposition DDE_Tnp_ISL3 SM01496 DDE transposase sub-family transposase DDE domain involved in transposition of IS204, IS1001, IS1096 and IS1165 transposons DDE_Tnp_Tn3 SM01502 DDE transposase sub-family Tn3 transposase DDE domain PMID:8932514 DDHD SM01127 The DDHD domain is 180 residues long and contains four conserved residues that may form a metal binding site. The domain is named after these four residues. This pattern of conservation of metal binding residues is often seen in phosphoesterase domains. This domain is found in retinal degeneration B proteins, as well as a family of probable phospholipases. It has been shown that this domain is found in a longer C terminal region that binds to PYK2 tyrosine kinase. These proteins have been called N-terminal domain-interacting receptor (Nir1, Nir2 and Nir3) (PUBMED:10022914). This suggests that this region is involved in functionally important interactions in other members of this family. DDRGK SM01128 This is a family of proteins of approximately 300 residues, found in plants and vertebrates. They contain a highly conserved DDRGK motif. DDT SM00571 domain in different transcription and chromosome remodeling factors DEATH SM00005 DEATH domain, found in proteins involved in cell death (apoptosis). Alpha-helical domain present in a variety of proteins with apoptotic functions. Some (but not all) of these domains form homotypic and heterotypic dimers. DED SM00031 Death effector domain DEDD SM01463 DEDD transposase transposase DEDD_Tnp_IS110 SM01462 DEDD transposase family transposase Defensin_propep SM01418 Defensin propeptide DEFSN SM00048 Defensin/corticostatin family Cysteine-rich domains that lyse bacteria, fungi and enveloped viruses by forming multimeric membrane-spanning channels.
DELLA SM01129 Transcriptional regulator DELLA protein N terminal Gibberellins are plant hormones which have great impact on growth signalling. DELLA proteins are transcriptional regulators of growth related proteins which are downregulated when gibberellins bind to their receptor GID1. GID1 forms a complex with DELLA proteins and signals them towards the 26S proteasome. The N terminal of DELLA proteins contains conserved DELLA and VHYNP motifs which are important for GID1 binding and proteolysis of the DELLA proteins. (PUBMED:19037309) DENN SM00799 Domain found in a variety of signalling proteins, always encircled by uDENN and dDENN The DENN domain is found in a variety of signalling proteins involved in Rab-mediated processes or regulation of MAPKs signalling pathways. The DENN domain is always encircled on both sides by more divergent domains, called uDENN (for upstream DENN) and dDENN (for downstream DENN). The function of the DENN domain remains to date unclear, although it appears to represent a good candidate for a GTP/GDP exchange activity. DeoC SM01133 DeoC/LacD family aldolase This family includes diverse aldolase enzymes. This family includes the enzyme deoxyribose-phosphate aldolase EC:4.1.2.4, which is involved in nucleotide metabolism. The family also includes a group of related bacterial proteins of unknown function, see examples Q57843 and P76143. The family also includes tagatose 1,6-diphosphate aldolase (EC:4.1.2.40), which is part of the tagatose-6-phosphate pathway of galactose-6-phosphate degradation (PUBMED:1655695). DeoRC SM01134 DeoR C terminal sensor domain The sensor domains of the DeoR are catalytically inactive versions of the ISOCOT fold, but retain the substrate binding site (PUBMED:16376935). DeoRC senses diverse sugar derivatives such as deoxyribose nucleoside (DeoR), tagatose phosphate (LacR), galactosamine (AgaR), myo-inositol (Bacillus IolR) and L-ascorbate (UlaR) (PUBMED:16376935, 18844374, 15306018).
DEP SM00049 Domain found in Dishevelled, Egl-10, and Pleckstrin Domain of unknown function present in signalling proteins that contain PH, rasGEF, rhoGEF, rhoGAP, RGS, PDZ domains. DEP domain in Drosophila dishevelled is essential to rescue planar polarity defects and induce JNK signalling (Cell 94, 109-118). DEXDc SM00487 DEAD-like helicases superfamily DEXDc2 SM00488 DEXDc3 SM00489 DHDPS SM01130 Dihydrodipicolinate synthetase family This family has a TIM barrel structure. This enzyme belongs to the family of lyases, specifically the hydro-lyases, which cleave carbon-oxygen bonds. DHHA2 SM01131 This domain is often found adjacent to the DHH domain PF01368 and is called DHHA2 for DHH associated domain. This domain is diagnostic of DHH subfamily 2 members (PUBMED:9478130). The domain is about 120 residues long and contains a conserved DXK motif at its amino terminus. DIL SM01132 The DIL domain has no known function. DIM1 SM01410 Mitosis protein DIM1 DIRP SM01135 DIRP (Domain in Rb-related Pathway) is postulated to be involved in the Rb-related pathway, which is encoded by multiple eukaryotic genomes and is present in proteins including lin-9 of Caenorhabditis elegans, aly of fruit fly and mustard weed. Studies of lin-9 and aly of fruit fly proteins containing DIRP suggest that this domain might be involved in development. Aly, lin-9, act in parallel to, or downstream of, activation of MAPK by the RTK-Ras signalling pathway. DISIN SM00050 Homologues of snake disintegrins Snake disintegrins inhibit the binding of ligands to integrin receptors. They contain a 'RGD' sequence, identical to the recognition site of many adhesion proteins. Molecules containing both disintegrin and metalloprotease domains are known as ADAMs. DKCLD SM01136 DKCLD (NUC011) domain This is a TruB_N/PUA domain associated N-terminal domain of Dyskerin-like proteins (PUBMED:15112237). DM SM00301 Doublesex DNA-binding motif DM10 SM00676 Domains in hypothetical proteins in Drosophila, C. 
elegans and mammals. Occurs singly in some nucleoside diphosphate kinases. DM11 SM00675 Domains in hypothetical proteins in Drosophila including 2 in CG15241 and CG9329. DM13 SM00686 Domain present in fly proteins (CG14681, CG12492, CG6217), worm H06A10.1 and Arabidopsis thaliana MBG8.9. DM14 SM00685 Repeats in fly CG4713, worm Y37H9A.3 and human FLJ20241. DM15 SM00684 Tandem repeat in fly CG14066 (La related protein), human KIAA0731 and worm R144.7. Unknown function. DM16 SM00683 Repeats in sea squirt COS41.4, worm R01H10.6, fly CG1126 etc. DM3 SM00692 Zinc finger domain in CG10631, C. elegans LIN-15B and human P52rIPK. DM4_12 SM00718 DM4/DM12 family of domains in Drosophila melanogaster proteins of unknown function. DM5 SM00690 Domain of unknown function, currently peculiar to Drosophila. DM6 SM00689 Cysteine-rich domain currently specific to Drosophila. DM7 SM00688 Domain of unknown function in Drosophila CG15332, CG15333 and CG18293 DM8 SM00697 Repeats found in several Drosophila proteins. DM9 SM00696 Repeats found in Drosophila proteins. DMAP_binding SM01137 DMAP1-binding Domain This domain binds DMAP1, a transcriptional co-repressor. DnaG_DnaB_bind SM00766 DNA primase DnaG DnaB-binding DnaG_DnaB_bind defines a domain of primase required for functional interaction with DnaB that attracts primase to the replication fork. DnaG_DnaB_bind is responsible for the interaction between DnaG and DnaB. DnaJ SM00271 DnaJ molecular chaperone homology domain DNA_mis_repair SM01340 DNA mismatch repair protein, C-terminal domain DNA mismatch repair protein, C-terminal domain. DNaseIc SM00476 deoxyribonuclease I Deoxyribonuclease I catalyzes the endonucleolytic cleavage of double-stranded DNA. The enzyme is secreted outside the cell and also involved in apoptosis in the nucleus. DoH SM00664 Possible catecholamine-binding domain present in a variety of eukaryotic proteins. 
A predominantly beta-sheet domain present as a regulatory N-terminal domain in dopamine beta-hydroxylase, mono-oxygenase X and SDR2. Its function remains unknown at present (Ponting, Human Molecular Genetics, in press). DP SM01138 Transcription factor DP DP forms a heterodimer with E2F and regulates genes involved in cell cycle progression. The transcriptional activity of E2F is inhibited by the retinoblastoma protein which binds to the E2F-DP heterodimer [PUBMED:16360038] and negatively regulates the G1-S transition. DPBB_1 SM00837 Rare lipoprotein A (RlpA)-like double-psi beta-barrel Rare lipoprotein A (RlpA) contains a conserved region that has the double-psi beta-barrel (DPBB) fold. The function of RlpA is not well understood, but it has been shown to act as a prc mutant suppressor in Escherichia coli. The DPBB fold is often an enzymatic domain. The members of this family are quite diverse, and if catalytic this family may contain several different functions. Another example of this domain is found in the N terminus of pollen allergen. Drf_FH3 SM01139 Diaphanous FH3 Domain This region is found in the Formin-like and diaphanous proteins [PUBMED:12676083,PUBMED:9606213] Drf_GBD SM01140 Diaphanous GTPase-binding Domain This domain is bound to by GTP-attached Rho proteins, leading to activation of the Drf protein. DRY_EERY SM01141 Alternative splicing regulator This entry represents the conserved N-terminal region of SWAP (suppressor-of-white-apricot protein) proteins. This region contains two highly conserved motifs, viz: DRY and EERY, which appear to be the sites for alternative splicing of exons 2 and 3 of the SWAP mRNA [PUBMED:8206918]. These proteins are thus thought to be involved in auto-regulation of pre-mRNA splicing. Most family members are associated with two SWAP (Surp) domains SM00648 and an arginine-serine-rich binding region towards the C-terminus.
D-ser_dehydrat SM01119 Putative serine dehydratase domain This domain is found at the C-terminus of yeast D-serine dehydratase (PUBMED:17937657). Structures have been solved for two bacterial members of this family. The yeast protein has been shown to be a zinc-dependent enzyme. DSHCT SM01142 This C terminal domain is found in DOB1/SK12/helY-like DEAD box helicases [PUBMED:15112237]. DSL SM00051 delta serrate ligand DSPc SM00195 Dual specificity phosphatase, catalytic domain DSRM SM00358 Double-stranded RNA binding motif DSS1_SEM1 SM01385 This family contains the breast cancer tumour suppressor BRCA2-interacting protein DSS1 and its homologue SEM1, both of which are short acidic proteins. DSS1 has been shown to be a conserved component of the Rae1 mediated mRNA export pathway in Schizosaccharomyces pombe (PMID:15990877). DSX_dimer SM01143 Doublesex dimerisation domain Doublesex (DSX) is a transcription factor that regulates somatic sexual differences in Drosophila. The structure of this domain has revealed a novel dimeric arrangement of ubiquitin-associated folds that has not previously been identified in a transcription factor [PUBMED:16049008]. DTW SM01144 This presumed domain is found in bacterial and eukaryotic proteins. Its function is unknown. The domain contains multiple conserved motifs including a DTXW motif that this domain has been named after. DUF1041 SM01145 Domain of Unknown Function (DUF1041) This family consists of several eukaryotic domains of unknown function. Members of this family are often found in tandem repeats and co-occur with C1, C2 and PH (SM00109, SM00239, SM00233) domains. DUF106 SM01415 Integral membrane protein DUF106 This archaebacterial protein family has no known function. Members are predicted to be integral membrane proteins. DUF1086 SM01146 Domain of Unknown Function (DUF1086) This family consists of several eukaryotic domains of unknown function which are present in chromodomain helicase DNA binding proteins.
This domain is often found in conjunction with DEXDc (SM00487), HELICc (SM00490), DUF1087, CHROMO (SM00298) and PHD (SM00249). DUF1087 SM01147 Members of this family are found in various chromatin remodelling factors and transposases. Their exact function is, as yet, unknown. DUF1220 SM01148 Repeat of unknown function (DUF1220) DUF1237 SM01149 This family contains a number of hypothetical proteins of about 450 residues in length. Their function is unknown, and most are bacterial. However, structurally this family is part of the 6 hairpin glycosidase superfamily, suggesting a glycosyl hydrolase function. DUF1338 SM01150 This domain is found in a variety of bacterial and fungal hypothetical proteins of unknown function. The structure of this domain has been solved by structural genomics. The structure implies a zinc-binding function, so it is a putative metal hydrolase (PDB:3iuz). DUF1518 SM01151 This domain, which is usually found tandemly repeated, is found in various receptor co-activating proteins. DUF167 SM01152 DUF1693 SM01153 Domain of unknown function (DUF1693) This family contains many hypothetical proteins. It also includes four nematode prion-like proteins. This domain has been identified as part of the nucleotidyltransferase superfamily. DUF1704 SM01154 This family contains many hypothetical proteins. DUF1713 SM01155 Mitochondrial domain of unknown function (DUF1713) This domain is found at the C terminal end of mitochondrial proteins of unknown function. DUF1716 SM01156 Eukaryotic domain of unknown function (DUF1716) This domain is found in eukaryotic proteins. A human nuclear protein with this domain (Q8WYA6) is thought to have a role in apoptosis [PUBMED:12659813]. DUF1719 SM01157 This is a domain of unknown function. It may have a role in ATPase activation. DUF1741 SM01158 This is a eukaryotic domain of unknown function. DUF1744 SM01159 This domain is found on the epsilon catalytic subunit of DNA polymerase. It is found C terminal to POLBc (SM00486).
DUF1751 SM01160 Eukaryotic integral membrane protein (DUF1751) This domain is found in eukaryotic integral membrane proteins. Q12239, a Saccharomyces cerevisiae protein, has been shown to localise to COP II vesicles [PUBMED:14562095]. DUF1767 SM01161 Eukaryotic domain of unknown function. This domain is found to the N-terminus of the nucleic acid binding domain. DUF1771 SM01162 This domain is always found adjacent to SMR (SM00463). DUF1785 SM01163 This region is found in argonaute [PUBMED:16216572] proteins and often co-occurs with BAG (SM00264) and Piwi (SM00950). DUF1856 SM01164 This domain has no known function. It is found in the C-terminal segment of various vasopressin receptors. DUF1866 SM01165 This domain, found in Synaptojanin, has no known function. DUF1899 SM01166 This set of domains is found in various eukaryotic proteins. Function is unknown. DUF1900 SM01167 This domain is predominantly found in the structural protein coronin, and is duplicated in some sequences. It has no known function [PUBMED:16172398]. DUF1907 SM01168 The structure of this domain displays an alpha-beta-beta-alpha four layer topology, with an HxHxxxxxxxxxH motif that coordinates a zinc ion, and an acetate anion at a site that likely supports the enzymatic activity of an ester hydrolase [PUBMED:16522806]. DUF1943 SM01169 Members of this family adopt a structure consisting of several large open beta-sheets. Their exact function has not, as yet, been determined [PUBMED:12135361]. DUF1944 SM01170 Members of this family adopt a structure consisting of several large open beta-sheets. Their exact function has not, as yet, been determined [PUBMED:12135361]. DUF3160 SM01325 This family of proteins has no known function. DUF3385 SM01346 Domain of unknown function This domain is found in eukaryotes. DUF3402 SM01293 Domain of unknown function (DUF3402) This domain is functionally uncharacterised. This domain is found in eukaryotes.
This presumed domain is typically between 350 and 473 amino acids in length. This domain is found associated with N1221. DUF3452 SM01367 Domain of unknown function (DUF3452) This domain is functionally uncharacterised. This domain is found in bacteria and eukaryotes. DUF3454 SM01334 Domain of unknown function DUF3480 SM01421 Domain of unknown function (DUF3480) This presumed domain is functionally uncharacterised. This domain is found in eukaryotes. DUF3585 SM01203 This domain is found in eukaryotes. This domain is typically between 135 and 149 amino acids in length and is found associated with the CH domain. DUF3635 SM01331 Domain of unknown function This family may be a potential Haspin-related leucine-zipper. A leucine zipper was proposed to be present towards the C-terminus of human Haspin (up-stream of the current family) (PMID:10358056); however, as this domain would appear to span several helices and be largely within a loop structure (PMID:12737306) the actual zipper might be further downstream, and be this family, which is the very C-terminal part of the Sch. pombe sequence. DUF3700 SM01172 This domain family is found in eukaryotes, and is approximately 120 amino acids in length. There are two conserved sequence motifs: YGL and LRDR. This family is related to GATase enzyme domains. DUF4187 SM01173 This family is found at the very C-terminus of proteins that carry a G-patch domain SM00443. The domain is short and cysteine-rich. DUF4205 SM01174 The proteins in this family are uncharacterized but often named FAM188B. DUF4206 SM01175 This is a family of cysteine-rich proteins. Many members also carry a pleckstrin-homology domain, SM00233. DUF4208 SM01176 This domain is found at the C-terminus of chromodomain-helicase-DNA-binding proteins. The exact function of the domain is undetermined. DUF4210 SM01177 This short domain is found in fungi, plants and animals, and the proteins appear to be necessary for chromosome segregation during meiosis.
DUF4217 SM01178 This short domain is found at the C-terminus of many helicase proteins. DUF663 SM01362 Protein of unknown function (DUF663) This family contains several uncharacterised eukaryotic proteins. DUF862 SM01179 PPPDE putative peptidase domain The PPPDE superfamily (after Permuted Papain fold Peptidases of DsRNA viruses and Eukaryotes), consists of predicted thiol peptidases with a circularly permuted papain-like fold. The inference of the likely DUB function of the PPPDE superfamily proteins is based on the fusions of the catalytic domain to Ub-binding PUG (PUB)/UBA domains and a novel alpha-helical Ub-associated domain (the PUL domain, after PLAP, Ufd3p and Lub1p) PUBMED:15483401. DUSP SM00695 Domain in ubiquitin-specific proteases. DWA SM00523 Domain A in dwarfin family proteins DWB SM00524 Domain B in dwarfin family proteins DWNN SM01180 DWNN is a ubiquitin like domain found at the N-terminus of the RBBP6 family of splicing-associated proteins PUBMED:16396680. The DWNN domain is independently expressed in higher vertebrates so it may function as a novel ubiquitin-like modifier of other proteins PUBMED:16396680. DYNc SM00053 Dynamin, GTPase Large GTPases that mediate vesicle trafficking. Dynamin participates in the endocytic uptake of receptors, associated ligands, and plasma membrane following an exocytic event. Dynein_light SM01375 Dynein light chain type 1 DysFC SM00694 Dysferlin domain, C-terminal region. Domain of unknown function present in yeast peroxisomal proteins, dysferlin, myoferlin and hypothetical proteins. Due to an insertion of a dysferlin domain within a second dysferlin domain we have chosen to predict these domains in two parts: the N-terminal region and the C-terminal region. DysFN SM00693 Dysferlin domain, N-terminal region. Domain of unknown function present in yeast peroxisomal proteins, dysferlin, myoferlin and hypothetical proteins. 
Due to an insertion of a dysferlin domain within a second dysferlin domain we have chosen to predict these domains in two parts: the N-terminal region and the C-terminal region. DZF SM00572 domain in DSRM or ZnF_C2H2 domain containing proteins E2_bind SM01181 E1 and E2 enzymes play a central role in ubiquitin and ubiquitin-like protein transfer cascades. This is an E2 binding domain that is found on NEDD8 activating E1 enzyme. The domain resembles ubiquitin, and recruits the catalytic core of the E2 enzyme Ubc12 in a similar manner to that in which ubiquitin interacts with ubiquitin binding domains PUBMED:15694336. E2F_TDP SM01372 E2F/DP family winged-helix DNA-binding domain This family contains the transcription factor E2F and its dimerization partners TDP1 and TDP2, which stimulate E2F-dependent transcription. E2F binds to DNA as a homodimer or as a heterodimer in association with TDP1/2, the heterodimer having increased binding efficiency. The crystal structure of an E2F4-DP2-DNA complex shows that the DNA-binding domains of the E2F and DP proteins both have a fold related to the winged-helix DNA-binding motif. Recognition of the central c/gGCGCg/c sequence of the consensus DNA-binding site is symmetric, and amino acids that contact these bases are conserved among all known E2F and DP proteins. EAL SM00052 Putative diguanylate phosphodiesterase Putative diguanylate phosphodiesterase, present in a variety of bacteria. EB_dh SM00887 Ethylbenzene dehydrogenase Ethylbenzene dehydrogenase is a heterotrimer of three subunits that catalyses the anaerobic degradation of hydrocarbons. The alpha subunit contains the catalytic centre as a Molybdenum cofactor-complex. This removes an electron-pair from the hydrocarbon and passes it along an electron transport system involving iron-sulphur complexes held in the beta subunit and a Haem b molecule contained in the gamma subunit. The electron-pair is then subsequently passed to an as yet unknown receiver. 
The enzyme is found in a variety of different bacteria. ECSIT_Cterm SM01284 C-terminal domain of the ECSIT protein This family represents the C-terminal domain of the evolutionarily conserved signaling intermediate in Toll pathway protein, an adapter protein of the Toll-like and IL-1 receptor signaling pathway, which is involved in the activation of NF-kappa-B via MAP3K1. This domain is missing in isoform 2. Fold recognition suggests that this domain may be distantly homologous to the pleckstrin homology domain. EF-1_beta_acid SM01182 Eukaryotic elongation factor 1 beta central acidic region EF1G SM01183 Elongation factor 1 gamma, conserved domain EF1_GNE SM00888 EF-1 guanine nucleotide exchange domain Translation elongation factors are responsible for two main processes during protein synthesis on the ribosome. EF1A (or EF-Tu) is responsible for the selection and binding of the cognate aminoacyl-tRNA to the A-site (acceptor site) of the ribosome. EF2 (or EF-G) is responsible for the translocation of the peptidyl-tRNA from the A-site to the P-site (peptidyl-tRNA site) of the ribosome, thereby freeing the A-site for the next aminoacyl-tRNA to bind. Elongation factors are responsible for achieving accuracy of translation and both EF1A and EF2 are remarkably conserved throughout evolution. Elongation factor EF1B (also known as EF-Ts or EF-1beta/gamma/delta) is a nucleotide exchange factor that is required to regenerate EF1A from its inactive form (EF1A-GDP) to its active form (EF1A-GTP). EF1A is then ready to interact with a new aminoacyl-tRNA to begin the cycle again. EF1B is more complex in eukaryotes than in bacteria, and can consist of three subunits: EF1B-alpha (or EF-1beta), EF1B-gamma (or EF-1gamma) and EF1B-beta (or EF-1delta). This entry represents the guanine nucleotide exchange domain of the beta (EF-1beta, also known as EF1B-alpha) and delta (EF-1delta, also known as EF1B-beta) chains of EF1B proteins from eukaryotes and archaea. 
The beta and delta chains have exchange activity, which mainly resides in their homologous guanine nucleotide exchange domains, found in the C-terminal region of the peptides. Their N-terminal regions may be involved in interactions with the gamma chain (EF-1gamma). EFG_C SM00838 Elongation factor G C-terminus This domain includes the carboxyl terminal regions of Elongation factor G, elongation factor 2 and some tetracycline resistance proteins and adopts a ferredoxin-like fold. EFG_IV SM00889 Elongation factor G, domain IV Translation elongation factors are responsible for two main processes during protein synthesis on the ribosome. EF1A (or EF-Tu) is responsible for the selection and binding of the cognate aminoacyl-tRNA to the A-site (acceptor site) of the ribosome. EF2 (or EF-G) is responsible for the translocation of the peptidyl-tRNA from the A-site to the P-site (peptidyl-tRNA site) of the ribosome, thereby freeing the A-site for the next aminoacyl-tRNA to bind. Elongation factors are responsible for achieving accuracy of translation and both EF1A and EF2 are remarkably conserved throughout evolution. Elongation factor EF2 (EF-G) is a G-protein. It brings about the translocation of peptidyl-tRNA and mRNA through a ratchet-like mechanism: the binding of GTP-EF2 to the ribosome causes a counter-clockwise rotation in the small ribosomal subunit; the hydrolysis of GTP to GDP by EF2 and the subsequent release of EF2 causes a clockwise rotation of the small subunit back to the starting position. This twisting action destabilises tRNA-ribosome interactions, freeing the tRNA to translocate along the ribosome upon GTP-hydrolysis by EF2. EF2 binding also affects the entry and exit channel openings for the mRNA, widening it when bound to enable the mRNA to translocate along the ribosome. EF2 has five domains. This entry represents domain IV found in EF2 (or EF-G) of both prokaryotes and eukaryotes. 
The EF2-GTP-ribosome complex undergoes extensive structural rearrangement for tRNA-mRNA movement to occur. Domain IV, which extends from the 'body' of the EF2 molecule much like a lever arm, appears to be essential for the structural transition to take place. EFh SM00054 EF-hand, calcium binding motif EF-hands are calcium-binding motifs that occur at least in pairs. Links between disease states and genes encoding EF-hands, particularly the S100 subclass, are emerging. Each motif consists of a 12 residue loop flanked on either side by a 12 residue alpha-helix. EF-hands undergo a conformational change upon binding calcium ions. efhand_Ca_insen SM01184 Ca2+ insensitive EF hand EF hands are helix-loop-helix binding motifs involved in the regulation of many cellular processes. EF hands usually bind to Ca2+ ions, which causes a major conformational change that allows the protein to interact with its designated targets. This domain corresponds to an EF hand which has partially or entirely lost its calcium-binding properties. The calcium insensitive EF hand is still able to mediate protein-protein recognition PUBMED:11573089. EFP SM01185 Elongation factor P (EF-P) OB domain EGF SM00181 Epidermal growth factor-like domain. EGF_CA SM00179 Calcium-binding EGF-like domain EGF_Lam SM00180 Laminin-type epidermal growth factor-like domain EGF_like SM00001 EGF domain, unclassified subfamily EH SM00027 Eps15 homology domain Pair of EF hand motifs that recognise proteins containing Asn-Pro-Phe (NPF) sequences. eIF1a SM00652 eukaryotic translation initiation factor 1A eIF2B_5 SM00653 domain present in translation initiation factor eIF2B and eIF5 eIF3_N SM01186 eIF3 subunit 6 N terminal domain This is the N terminal domain of subunit 6 translation initiation factor eIF3. eIF-5a SM01376 Eukaryotic elongation factor 5A hypusine, DNA-binding OB fold eIF5A, previously thought to be an initiation factor, has been shown to be required for peptide chain elongation in yeast (PMID:9753699). 
eIF5C SM00515 Domain at the C-termini of GCD6, eIF-2B epsilon, eIF-4 gamma and eIF-5 eIF6 SM00654 translation initiation factor 6 EKR SM00890 Domain of unknown function EKR is a short, 33 residue, domain found in bacterial and some lower eukaryotic species which lies between a POR (pyruvate ferredoxin/flavodoxin oxidoreductase) and the 4Fe-4S binding domain Fer4. It contains a characteristic EKR sequence motif. The exact function of this domain is not known. ELFV_dehydrog SM00839 Glutamate/Leucine/Phenylalanine/Valine dehydrogenase Glutamate, leucine, phenylalanine and valine dehydrogenases are structurally and functionally related. They contain a Gly-rich region containing a conserved Lys residue, which has been implicated in the catalytic activity, in each case a reversible oxidative deamination reaction. Elicitin SM01187 Elicitins form a novel class of plant necrotic proteins which are secreted by Phytophthora and Pythium fungi, parasites of many economically important crops. These proteins induce leaf necrosis in infected plants and elicit an incompatible hypersensitive-like reaction, leading to the development of a systemic acquired resistance against a range of fungal and bacterial plant pathogens PUBMED:8994969. ELK SM01188 This domain is required for the nuclear localisation of these proteins PUBMED:11352458. All of these proteins are members of the Tale/Knox homeodomain family, a subfamily within homeobox SM00389. ELM2 SM01189 The ELM2 (Egl-27 and MTA1 homology 2) domain is a small domain of unknown function. It is found in the MTA1 protein that is part of the NuRD complex PUBMED:10226007. The domain is usually found to the N terminus of a myb-like DNA binding domain SANT SM00717. ELM2 is also found associated with an ARID DNA binding domain SM01014 in Q84JT7. This suggests that ELM2 may also be involved in DNA binding, or perhaps is a protein-protein interaction domain. 
Elong-fact-P_C SM00841 Elongation factor P, C-terminal These nucleic acid binding domains are predominantly found in elongation factor P, where they adopt an OB-fold, with five beta-strands forming a beta-barrel in a Greek-key topology PUBMED:15210970. Elp3 SM00729 Elongator protein 3, MiaB family, Radical SAM This superfamily contains MoaA, NifB, PqqE, coproporphyrinogen III oxidase, biotin synthase and MiaB families, and includes a representative in the eukaryotic elongator subunit, Elp-3. Some members of the family are methyltransferases. EMP24_GP25L SM01190 emp24/gp25L/p24 family/GOLD Members of this family are implicated in bringing cargo forward from the ER and binding to coat proteins by their cytoplasmic domains. This domain corresponds closely to the beta-strand rich GOLD domain described in PUBMED:12049664. The GOLD domain is always found combined with lipid- or membrane-association domains PUBMED:12049664. END SM00272 Endothelin ENDO3c SM00478 endonuclease III includes endonuclease III (DNA-(apurinic or apyrimidinic site) lyase), alkylbase DNA glycosidases (Alka-family) and other DNA glycosidases Endonuclease_NS SM00892 DNA/RNA non-specific endonuclease A family of bacterial and eukaryotic endonucleases share the following characteristics: they act on both DNA and RNA, cleave double-stranded and single-stranded nucleic acids and require a divalent ion such as magnesium for their activity. A histidine has been shown to be essential for the activity of the Serratia marcescens nuclease. This residue is located in a conserved region which also contains an aspartic acid residue that could be implicated in the binding of the divalent ion. Enolase_C SM01192 Enolase, C-terminal TIM barrel domain Enolase_N SM01193 Enolase, N-terminal domain ENT SM01191 This presumed domain is named after Emsy N Terminus (ENT). Emsy is a protein that is amplified in breast cancer and interacts with BRCA2. 
The N terminus of this protein is found to be similar to other vertebrate and plant proteins of unknown function. This domain has a completely conserved histidine residue that may be functionally important. ENTH SM00273 Epsin N-terminal homology (ENTH) domain EPEND SM00026 Ependymins Ependymins are the predominant proteins in the cerebrospinal fluid (CSF) of teleost fish. They have been implicated in the neurochemistry of memory and neuronal regeneration. They are glycoproteins of about 200 amino acids that can bind calcium. Four cysteines are conserved that probably form disulfide bonds. EPH_lbd SM00615 Ephrin receptor ligand binding domain Ephrin_rec_like SM01411 Putative ephrin-receptor like This family has repeats of a region rich in cysteines. ERCC4 SM00891 ERCC4 domain This entry represents a structural motif found in several DNA repair nucleases, such as Rad1/Mus81/XPF endonucleases, and in ATP-dependent helicases. The XPF/Rad1/Mus81-dependent nuclease family specifically cleaves branched structures generated during DNA repair, replication, and recombination, and is essential for maintaining genome stability. The nuclease domain architecture exhibits remarkable similarity to those of restriction endonucleases. eRF1_1 SM01194 The release factor eRF1 terminates protein biosynthesis by recognising stop codons at the A site of the ribosome and stimulating peptidyl-tRNA bond hydrolysis at the peptidyl transferase centre. The crystal structure of human eRF1 is known PUBMED:10676813. The overall shape and dimensions of eRF1 resemble a tRNA molecule with domains 1, 2, and 3 of eRF1 corresponding to the anticodon loop, aminoacyl acceptor stem, and T stem of a tRNA molecule, respectively. The position of the essential GGQ motif at an exposed tip of domain 2 suggests that the Gln residue coordinates a water molecule to mediate the hydrolytic activity at the peptidyl transferase centre. 
A conserved groove on domain 1, 80 A from the GGQ motif, is proposed to form the codon recognition site PUBMED:10676813. This family also includes other proteins for which the precise molecular function is unknown. Many of them are from Archaebacteria. These proteins may also be involved in translation termination but this awaits experimental verification. EsV_1_7 SM01425 EsV-1-7 repeat The EsV-1-7 repeat is a cysteine-rich motif of unknown function. The EsV-1-7 repeat motif was originally identified in the Ectocarpus IMMEDIATE UPRIGHT protein, which has an EsV-1-7 domain that contains five EsV-1-7 repeats (PMID: 28049657). The name is derived from the Ectocarpus virus EsV-1 protein EsV-1-7, which possesses six EsV-1-7 repeats. Ectocarpus has a large family of 91 EsV-1-7 domain proteins with between one and 19 copies of the motif (C-X4-C-X16-C-X2-H-X12). In addition to brown algae, EsV-1-7 domain proteins have been found in eustigmatophytes, oomycetes, cryptophytes, two families of green algae (Coccomyxaceae and Selenastraceae) and in two additional viral genomes, Emiliania huxleyi virus PS401 and Pithovirus sibericum. Based on this unusual distribution, it has been proposed that EsV-1-7 domain genes have been exchanged between lineages by horizontal gene transfer during evolution (PMID: 28049657). ETF SM00893 Electron transfer flavoprotein domain Electron transfer flavoproteins (ETFs) serve as specific electron acceptors for primary dehydrogenases, transferring the electrons to terminal respiratory systems. They can be functionally classified into constitutive, "housekeeping" ETFs, mainly involved in the oxidation of fatty acids (Group I), and ETFs produced by some prokaryotes under specific growth conditions, receiving electrons only from the oxidation of specific substrates (Group II). ETFs are heterodimeric proteins composed of an alpha and beta subunit, and contain an FAD cofactor and AMP. 
ETF consists of three domains: domains I and II are formed by the N- and C-terminal portions of the alpha subunit, respectively, while domain III is formed by the beta subunit. Domains I and III share an almost identical alpha-beta-alpha sandwich fold, while domain II forms an alpha-beta-alpha sandwich similar to that of bacterial flavodoxins. FAD is bound in a cleft between domains II and III, while domain III binds the AMP molecule. Interactions between domains I and III stabilise the protein, forming a shallow bowl where domain II resides. This entry represents the N-terminal domain of both the alpha and beta subunits from Group I and Group II ETFs. ETS SM00413 erythroblast transformation specific domain variation of the helix-turn-helix motif Excalibur SM00894 Excalibur calcium-binding domain Extracellular Ca2+-dependent nuclease YokF from Bacillus subtilis and several other surface-exposed proteins from diverse bacteria are encoded in the genomes in two paralogous forms that differ by a ~45 amino acid fragment, which comprises a novel conserved domain. Sequence analysis of this domain revealed a conserved DxDxDGxxCE motif, which is strikingly similar to the Ca2+-binding loop of the calmodulin-like EF-hand domains, suggesting an evolutionary relationship between them. Functions of many of the other proteins in which the novel domain, named Excalibur (extracellular calcium-binding region), is found, as well as a structural model of its conserved motif are consistent with the notion that the Excalibur domain binds calcium. This domain is but one more example of the diversity of structural contexts surrounding the EF-hand-like calcium-binding loop in bacteria. This loop is thus more widespread than hitherto recognised and the evolution of EF-hand-like domains is probably more complex than previously appreciated. 
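The conserved DxDxDGxxCE motif named in the Excalibur entry can be read as a simple pattern in which each "x" stands for any residue. A minimal sketch of scanning a protein sequence for it is given below. The function name, the constant name and the toy sequence are hypothetical illustrations and are not taken from SMART; a plain regular expression is used here in place of the profile-based detection that a domain database would apply.

```python
import re

# "DxDxDGxxCE" from the Excalibur entry, with each "x" read as any single residue.
# Assumption: a literal regex stands in for the real profile-based domain search.
EXCALIBUR_MOTIF = re.compile(r"D.D.DG..CE")


def find_motif(sequence: str) -> list[int]:
    """Return 0-based start positions of non-overlapping motif matches."""
    return [m.start() for m in EXCALIBUR_MOTIF.finditer(sequence.upper())]


# Hypothetical toy sequence, not a real protein: the motif starts at index 3.
toy = "MKTDADKDGRVCEQL"
print(find_motif(toy))
```

A match only indicates that the short motif is present; it does not by itself show that the sequence contains the full ~45 amino acid domain.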
EXOIII SM00479 exonuclease domain in DNA-polymerase alpha and epsilon chain, ribonuclease T and other exonucleases EZ_HEAT SM00567 E-Z type HEAT repeats Present in subunits of cyanobacterial phycocyanin lyase, and other proteins. Probable scaffolding role. FA SM01195 FERM adjacent (FA) This region is found adjacent to Band 4.1 / FERM domains in a subset of FERM-containing proteins. The region has been hypothesised to play a role in regulatory adaptation, based on similarity to other protein kinase PUBMED:16626485. FA58C SM00231 Coagulation factor 5/8 C-terminal domain, discoidin domain Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes. FABD SM00808 F-actin binding domain (FABD) FABD is the F-actin binding domain of Bcr-Abl and its cellular counterpart c-Abl. The Bcr-Abl tyrosine kinase causes different forms of leukemia in humans. Depending on its position within the cell, Bcr-Abl differentially affects cellular growth. The FABD forms a compact left-handed four-helix bundle in solution. FACT-Spt16_Nlob SM01285 FACT complex subunit SPT16 N-terminal lobe domain The FACT or facilitator of chromatin transcription complex binds to and alters the properties of nucleosomes. This family represents the N-terminal lobe of the NTD, or N-terminal domain, and acts as a protein-protein interaction domain presumably with partners outside of the FACT complex PMID:18089575. Knockout of the whole NTD domain, residues 1-450 in UniProt:P32558 in yeast, serves to render the cells sensitive to DNA replication stress but is not lethal. 
The C-terminal half of NTD is structurally similar to aminopeptidases, and the most highly conserved surface residues line a cleft equivalent to the aminopeptidase substrate-binding site, family peptidase_M24, (PFAM:PF00557) PMID:18089575 FANCL_C SM01197 FANCL C-terminal domain This domain is found at the C-terminus of the Fancl protein in humans which is the putative E3 ubiquitin ligase subunit of the FA complex (Fanconi anaemia). Eight subunits of the Fanconi anaemia gene products form a multisubunit nuclear complex which is required for mono-ubiquitination of a downstream FA protein, FANCD2. Fapy_DNA_glyco SM00898 Formamidopyrimidine-DNA glycosylase N-terminal domain This entry represents the catalytic domain of DNA glycosylase/AP lyase enzymes, which are involved in base excision repair of DNA damaged by oxidation or by mutagenic agents. Most damage to bases in DNA is repaired by the base excision repair pathway PUBMED:15588838. These enzymes are primarily from bacteria, and have both DNA glycosylase activity and AP lyase activity. Examples include formamidopyrimidine-DNA glycosylases (Fpg; MutM) and endonuclease VIII (Nei). Formamidopyrimidine-DNA glycosylase (Fpg, MutM) is a trifunctional DNA base excision repair enzyme that removes a wide range of oxidation-damaged bases (N-glycosylase activity) and cleaves both the 3'- and 5'-phosphodiester bonds of the resulting apurinic/apyrimidinic site (AP lyase activity). Fpg has a preference for oxidised purines, excising oxidized purine bases such as 7,8-dihydro-8-oxoguanine (8-oxoG). Its AP (apurinic/apyrimidinic) lyase activity introduces nicks in the DNA strand, cleaving the DNA backbone by beta-delta elimination to generate a single-strand break at the site of the removed base with both 3'- and 5'-phosphates. Fpg is a monomer composed of 2 domains connected by a flexible hinge PUBMED:10921868. 
The two DNA-binding motifs (a zinc finger and the helix-two-turns-helix motifs) suggest that the oxidized base is flipped out from double-stranded DNA in the binding mode and excised by a catalytic mechanism similar to that of bifunctional base excision repair enzymes PUBMED:10921868. Fpg binds one ion of zinc at the C-terminus, which contains four conserved and essential cysteines PUBMED:8473347, PUBMED:7704272. Endonuclease VIII (Nei) has the same enzyme activities as Fpg above, but with a preference for oxidized pyrimidines, such as thymine glycol, 5,6-dihydrouracil and 5,6-dihydrothymine PUBMED:15232006. These proteins contain three structural domains: an N-terminal catalytic core domain, a central helix-two-turn-helix (H2TH) module and a C-terminal zinc finger (see PDB:1K82) PUBMED:11912217. The N-terminal catalytic domain and the C-terminal zinc finger straddle the DNA with the long axis of the protein oriented roughly orthogonal to the helical axis of the DNA. Residues that contact DNA are located in the catalytic domain and in a beta-hairpin loop formed by the zinc finger PUBMED:12055620. FAS1 SM00554 Four repeated domains in the Fasciclin I family of proteins, present in many other contexts. FATC SM01343 The FATC domain is named after FRAP, ATM, TRRAP C-terminal (PMID:10782091). The solution structure of the FATC domain suggests it plays a role in redox-dependent structural and cellular stability (PMID:15772072). FBA SM01198 F-box associated region Members of this family are associated with F-box domains, hence the name FBA. This domain is probably involved in binding other proteins that will be targeted for ubiquitination. Q9UK22 is involved in binding to N-glycosylated proteins. FBD SM00579 domain in FBox and BRCT domain containing plant proteins FBG SM00186 Fibrinogen-related domains (FReDs) Domain present at the C-termini of fibrinogen beta and gamma chains, and a variety of fibrinogen-related proteins, including tenascin and Drosophila scabrous. 
FBOX SM00256 A Receptor for Ubiquitination Targets fCBD SM00236 Fungal-type cellulose-binding domain Small four-cysteine cellulose-binding domain of fungi FCD SM00895 This entry represents the C-terminal ligand binding domain of many members of the GntR family. This domain probably binds to a range of effector molecules that regulate the transcription of genes through the action of the N-terminal DNA-binding domain. This domain is found in and that are regulators of sugar biosynthesis operons. Many bacterial transcription regulation proteins bind DNA through a helix-turn-helix (HTH) motif, which can be classified into subfamilies on the basis of sequence similarities. The HTH GntR family has many members distributed among diverse bacterial groups that regulate various biological processes. It was named GntR after the Bacillus subtilis repressor of the gluconate operon. In general, these proteins contain a DNA-binding HTH domain at the N terminus, and an effector binding or oligomerisation domain at the C terminus. The winged-helix DNA-binding domain is well conserved in structure for the whole of the GntR family, and is similar in structure to other transcriptional regulator families. The C-terminal effector-binding and oligomerisation domains are more variable and are consequently used to define the subfamilies. Based on the sequence and structure of the C-terminal domains, the GntR family can be divided into four major groups, as represented by FadR, HutC, MocR and YtrA, as well as some minor groups such as those represented by AraR and PlmA. FCH SM00055 Fes/CIP4 homology domain Alignment extended from original report. Highly alpha-helical. Also known as the RAEYL motif or the S. pombe Cdc15 N-terminal domain. FDF SM01199 The FDF domain, so called because of the conserved FDF at its N termini, is an entirely alpha-helical domain with multiple exposed hydrophilic loops. It is found at the C terminus of Scd6p-like SM domains. 
It is also found with other divergent Sm domains and in proteins such as Dcp3p and FLJ21128, where it is found N terminal to the YjeF-N domain, a novel Rossmann fold domain PMID:15257761. FDX-ACB SM00896 Ferredoxin-fold anticodon binding domain This is the anticodon binding domain found in some phenylalanyl tRNA synthetases. The domain has a ferredoxin fold, consisting of an alpha+beta sandwich with anti-parallel beta-sheets (beta-alpha-beta x2). Fe_hyd_SSU SM00902 Iron hydrogenase small subunit Many microorganisms, such as methanogenic, acetogenic, nitrogen-fixing, photosynthetic, or sulphate-reducing bacteria, metabolise hydrogen. Hydrogen activation is mediated by a family of enzymes, termed hydrogenases, which either provide these organisms with reducing power from hydrogen oxidation, or act as electron sinks. There are two hydrogenase families that differ functionally from each other: NiFe hydrogenases tend to be more involved in hydrogen oxidation, while Iron-only FeFe (Fe only) hydrogenases in hydrogen production. Fe only hydrogenases show a common core structure, which contains a moiety, deeply buried inside the protein, with an Fe-Fe dinuclear centre, nonproteic bridging, terminal CO and CN- ligands attached to each of the iron atoms, and a dithio moiety, which also bridges the two iron atoms and has been tentatively assigned as a di(thiomethyl)amine. This common core also harbours three [4Fe-4S] iron-sulphur clusters PUBMED:11921392. In FeFe hydrogenases, as in NiFe hydrogenases, the set of iron-sulphur clusters is dispersed regularly between the dinuclear Fe-Fe centre and the molecular surface. These clusters are distant by about 1.2 nm from each other but the [4Fe-4S] cluster closest to the dinuclear centre is covalently bound to one of the iron atoms through a thiolate bridging ligand. The moiety including the dinuclear centre, the thiolate bridging ligand, and the proximal [4Fe-4S] cluster is known as the H-cluster. 
A channel, lined with hydrophobic amino acid side chains, nearly connects the dinuclear centre and the molecular surface. Furthermore hydrogen-bonded water molecule sites have been identified at the interior and at the surface of the protein. The small subunit is comprised of alternating random coil and alpha helical structures that encompass the large subunit in a novel protein fold PUBMED:10368269. FeoA SM00899 This entry represents the core domain of the ferrous iron (Fe2+) transport protein FeoA found in bacteria. This domain also occurs at the C-terminus in related proteins. The transporter Feo is composed of three proteins: FeoA, a small, soluble SH3-domain protein probably located in the cytosol; FeoB, a large protein with a cytosolic N-terminal G-protein domain and a C-terminal integral inner-membrane domain containing two 'Gate' motifs which likely functions as the Fe2+ permease; and FeoC, a small protein apparently functioning as an [Fe-S]-dependent transcriptional repressor. Feo allows the bacterial cell to acquire iron from its environment. FerA SM01200 This is central domain A in proteins of the Ferlin family. PMID:15112237 FerB SM01201 This is central domain B in proteins of the Ferlin family. PMID:15112237 FerI SM01202 This domain is present in proteins of the Ferlin family. It is often located between two C2 domains PMID:15112237. FERM_C SM01196 FERM C-terminal PH-like domain FES SM00525 iron-sulphur binding domain in DNA-(apurinic or apyrimidinic site) lyase (subfamily of ENDO3) FF SM00441 Contains two conserved F residues A novel motif that often accompanies WW domains. Often contains two conserved Phe (F) residues. FGF SM00442 Acidic and basic fibroblast growth factor family. Mitogens that stimulate growth or differentiation of cells of mesodermal or neuroectodermal origin. The family plays essential roles in patterning and differentiation during vertebrate embryogenesis, and has neurotrophic activities. 
FH SM00339 FORKHEAD FORKHEAD, also known as a "winged helix" FH2 SM00498 Formin Homology 2 Domain FH proteins control rearrangements of the actin cytoskeleton, especially in the context of cytokinesis and cell polarisation. Members of this family have been found to interact with Rho-GTPases, profilin and other actin-associated proteins. These interactions are mediated by the proline-rich FH1 domain, usually located in front of FH2 (but not listed in SMART). Despite this cytosolic function, vertebrate formins have been assigned functions within the nucleus. A set of Formin-Binding Proteins (FBPs) has been shown to bind FH1 with their WW domain. FHA SM00240 Forkhead associated domain Found in eukaryotic and prokaryotic proteins. Putative nuclear signalling domain. Fib_alpha SM01212 Fibrinogen alpha/beta chain family Fibrinogen is a protein involved in platelet aggregation and is essential for the coagulation of blood. This domain forms part of the central coiled-coil region of the protein which is formed from two sets of three non-identical chains (alpha, beta and gamma). Fibrillarin SM01206 Filament SM01391 Intermediate filament protein FIMAC SM00057 factor I membrane attack complex FISNA SM01288 Fish-specific NACHT associated domain This domain is frequently found associated with the NACHT domain (PFAM: PF05729) in fish and other vertebrates PMID:18039395. FIST SM00897 FIST N domain The FIST N domain is a novel sensory domain, which is present in signal transduction proteins from Bacteria, Archaea and Eukarya. Chromosomal proximity of FIST-encoding genes to those coding for proteins involved in amino acid metabolism and transport suggests that FIST domains bind small ligands, such as amino acids. FIST_C SM01204 The FIST C domain is a novel sensory domain, which is present in signal transduction proteins from Bacteria, Archaea and Eukarya. 
Chromosomal proximity of FIST-encoding genes to those coding for proteins involved in amino acid metabolism and transport suggest that FIST domains bind small ligands, such as amino acids. PMID:17855421 FKS1_dom1 SM01205 1,3-beta-glucan synthase subunit FKS1 The FKS1_dom1 domain is likely to be the 'Class I' region just N-terminal to the first set of transmembrane helices that is involved in 1,3-beta-glucan synthesis itself. PMID:20124029 This family is found on proteins with family Glucan_synthase, Pfam: PF02364. Flavin_Reduct SM00903 Flavin reductase like domain This entry represents the FMN-binding domain found in NAD(P)H-flavin oxidoreductases (flavin reductases), a class of enzymes capable of producing reduced flavin for bacterial bioluminescence and other biological processes. This domain is also found in various other oxidoreductase and monooxygenase enzymes PUBMED:12829278, PUBMED:15461461, PUBMED:11017201. This domain consists of a beta-barrel with Greek key topology, and is related to the ferredoxin reductase-like FAD-binding domain. The flavin reductases have a different dimerisation mode than that found in the PNP oxidase-like family, which also carries an FMN-binding domain with a similar topology. Flavokinase SM00904 Riboflavin kinase Riboflavin is converted into catalytically active cofactors (FAD and FMN) by the actions of riboflavin kinase, which converts it into FMN, and FAD synthetase, which adenylates FMN to FAD. Eukaryotes usually have two separate enzymes, while most prokaryotes have a single bifunctional protein that can carry out both catalyses, although exceptions occur in both cases. While eukaryotic monofunctional riboflavin kinase is orthologous to the bifunctional prokaryotic enzyme PUBMED:14580199, the monofunctional FAD synthetase differs from its prokaryotic counterpart, and is instead related to the PAPS-reductase family PUBMED:17049878. 
The bacterial FAD synthetase that is part of the bifunctional enzyme has remote similarity to nucleotidyl transferases and, hence, it may be involved in the adenylylation reaction of FAD synthetases PUBMED:12517446. This entry represents riboflavin kinase, which occurs as part of a bifunctional enzyme or a stand-alone enzyme. Flo11 SM01213 This presumed domain is found at the N-terminus of the S. cerevisiae Flo11 protein. Flo11 is required for diploid pseudohyphal formation and haploid invasive growth. It belongs to a family of proteins involved in invasive growth, cell-cell adhesion, and mating, many of which can substitute for each other under abnormal conditions PUBMED:11027318. Flu_M1_C SM00759 Influenza Matrix protein (M1) C-terminal domain This region is thought to be a second domain of the M1 matrix protein. FMN_bind SM00900 This conserved region includes the FMN-binding site of the NqrC protein as well as the NosR and NirI regulatory proteins. Fmp27_GFWDK SM01214 RNA pol II promoter Fmp27 protein domain Fmp27_GFWDK is a conserved domain of a family of proteins involved in RNA polymerase II transcription initiation PUBMED:15314641. It contains characteristic GFWDK sequence motifs. Some members are associated with domain Fmp27_SW towards the N terminus. Fmp27_SW SM01215 RNA pol II promoter Fmp27 protein domain Fmp27_SW is a conserved domain of a family of proteins involved in RNA polymerase II transcription initiation PUBMED:15314641. It contains characteristic SW and GKG sequence motifs. Fmp27_WPPW SM01216 RNA pol II promoter Fmp27 protein domain Fmp27_WPPW is a conserved domain of a family of proteins involved in RNA polymerase II transcription initiation PUBMED:15314641. It contains characteristic HQR and WPPW sequence motifs and is towards the C-terminal in members which contain Fmp27_SW. FN1 SM00058 Fibronectin type 1 domain One of three types of internal repeat within the plasma protein, fibronectin. 
Found also in coagulation factor XII, HGF activator and tissue-type plasminogen activator. In t-PA and fibronectin, this domain type contributes to fibrin-binding. FN2 SM00059 Fibronectin type 2 domain One of three types of internal repeat within the plasma protein, fibronectin. Also occurs in coagulation factor XII, 2 type IV collagenases, PDC-109, and cation-independent mannose-6-phosphate and secretory phospholipase A2 receptors. In fibronectin, PDC-109, and the collagenases, this domain contributes to collagen-binding function. FN3 SM00060 Fibronectin type 3 domain One of three types of internal repeat within the plasma protein, fibronectin. The tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Type III modules are present in both extracellular and intracellular proteins. Fn3_like SM01217 Fibronectin type III-like domain This domain has a fibronectin type III-like structure PUBMED:10368285. It is often found in association with Glycoside hydrolase family 3. Its function is unknown. FolB SM00905 Dihydroneopterin aldolase Dihydroneopterin aldolase catalyses the conversion of 7,8-dihydroneopterin to 6-hydroxymethyl-7,8-dihydropterin in the biosynthetic pathway of tetrahydrofolate. In the opportunistic pathogen Pneumocystis carinii, dihydroneopterin aldolase function is expressed as the N-terminal portion of the multifunctional folic acid synthesis protein (Fas). This region encompasses two domains, FasA and FasB, which are 27% amino acid identical. FasA and FasB also share significant amino acid sequence similarity with bacterial dihydroneopterin aldolases. This region consists of two tandem sequences each homologous to folB and which form tetramers PUBMED:9709001. FOLN SM00274 Follistatin-N-terminal domain-like Follistatin-N-terminal domain-like, EGF-like. 
Region distinct from the kazal-like sequence FoP_duplication SM01218 C-terminal duplication domain of Friend of PRMT1 Fop, or Friend of Prmt1, proteins are conserved from fungi and plants to vertebrates. There is little that is actually conserved except for this C-terminal LDXXLDAYM region (where X is any amino acid). The Fop proteins themselves are nuclear proteins localised to regions with low levels of DAPI, with a punctate/speckle-like distribution. Fop is a chromatin-associated protein and it colocalises with facultative heterochromatin. It is critical for oestrogen-dependent gene activation PUBMED:19858291. Frataxin_Cyay SM01219 Frataxin-like domain This family contains proteins that have a domain related to the globular C-terminus of Frataxin, the protein that is mutated in Friedreich's ataxia. This domain is found in a family of bacterial proteins. The function of this domain is currently unknown. It has been suggested that this family is involved in iron transport. FRG SM00901 This domain contains a conserved N-terminal (F/Y)RG motif. It is functionally uncharacterised. FRI SM00063 Frizzled Drosophila melanogaster frizzled mediates signalling that polarises a precursor cell along the anteroposterior axis. Homologues of the N-terminal region of frizzled exist either as transmembrane or secreted molecules. Frizzled homologues are reported to be receptors for the Wnt growth factors. (Not yet in MEDLINE: the FRI domain occurs in several receptor tyrosine kinases [Xu, Y.K. and Nusse, Curr. Biol. 8 R405-R406 (1998); Masiakowski, P. and Yanopoulos, G.D., Curr. Biol. 8, R407 (1998)].) Frizzled SM01330 Frizzled/Smoothened family membrane region Frizzled is a family of G protein-coupled receptor proteins (PMID: 14977528) that serves as receptors in the Wnt signaling pathway and other signaling pathways. When activated, Frizzled leads to activation of Dishevelled in the cytosol. 
FSA_C SM01220 Fragile site-associated protein C-terminus This is the conserved C-terminal half of the protein KIAA1109 which is the fragile site-associated protein FSA PUBMED:16545529. Genome-wide-association studies showed this protein linked to the susceptibility to coeliac disease PUBMED:17558408. The protein may also be associated with polycystic kidney disease PUBMED:16632497. FTCD SM01221 Formiminotransferase domain Formiminotransferase domain. FTCD_N SM01222 Formiminotransferase domain, N-terminal subdomain The formiminotransferase (FT) domain of formiminotransferase-cyclodeaminase (FTCD) forms a homodimer, and each protomer comprises two subdomains. The N-terminal subdomain is made up of a six-stranded mixed beta-pleated sheet and five alpha helices, which are arranged on the external surface of the beta sheet. This, in turn, faces the beta-sheet of the C-terminal subdomain to form a double beta-sheet layer. The two subdomains are separated by a short linker sequence, which is not thought to be any more flexible than the remainder of the molecule. The substrate is predicted to form a number of contacts with residues found in both the N-terminal and C-terminal subdomains PUBMED:10673422. FTO_NTD SM01223 FTO catalytic domain This domain is the catalytic AlkB-like domain from the FTO protein PUBMED:20376003. This domain catalyses a demethylase activity with a preference for 3-methylthymidine. FTP SM00607 eel-Fucolectin Tachylectin-4 Pentraxin-1 Domain FtsA SM00842 Cell division protein FtsA FtsA is essential for bacterial cell division, and co-localizes to the septal ring with FtsZ. It has been suggested that the interaction of FtsA-FtsZ has arisen through coevolution in different bacterial strains PUBMED:9352931. Ftsk_gamma SM00843 This domain directs oriented DNA translocation and forms a winged helix structure. Mutated proteins with substitutions in the FtsK gamma DNA-recognition helix are impaired in DNA binding. 
FU SM00261 Furin-like repeats Fungal_trans SM00906 Fungal specific transcription factor domain This domain is found in a number of fungal transcription factors including transcriptional activator xlnR, yeast regulatory protein GAL4, and other transcription proteins regulating a variety of cellular and metabolic processes. FYRC SM00542 "FY-rich" domain, C-terminal region is sometimes closely juxtaposed with the N-terminal region (FYRN), but sometimes is far distant. Unknown function, but occurs frequently in chromatin-associated proteins. FYRN SM00541 "FY-rich" domain, N-terminal region is sometimes closely juxtaposed with the C-terminal region (FYRC), but sometimes is far distant. Unknown function, but occurs frequently in chromatin-associated proteins. FYVE SM00064 Protein present in Fab1, YOTB, Vac1, and EEA1 The FYVE zinc finger is named after four proteins where it was first found: Fab1, YOTB/ZK632.12, Vac1, and EEA1. The FYVE finger has been shown to bind two Zn2+ ions. The FYVE finger has eight potential zinc coordinating cysteine positions. The FYVE finger is structurally related to the PHD finger and the RING finger. Many members of this family also include two histidines in a motif R+HHC+XCG, where + represents a charged residue and X any residue. The FYVE finger functions in the membrane recruitment of cytosolic proteins by binding to phosphatidylinositol 3-phosphate (PI3P), which is prominent on endosomes. The R+HHC+XCG motif is critical for PI3P binding. G2F SM00682 G2 nidogen domain and fibulin G3P_acyltransf SM01207 Glycerol-3-phosphate acyltransferase This family of enzymes catalyses the transfer of an acyl group from acyl-ACP to glycerol-3-phosphate to form lysophosphatidic acid. PMID: 16949372 G5 SM01208 This domain is found in a wide range of extracellular proteins. It is found tandemly repeated in up to 8 copies. It is found in the N-terminus of peptidases belonging to the M26 family which cleave human IgA. 
The domain is also found in proteins involved in metabolism of bacterial cell walls suggesting this domain may have an adhesive function. G8 SM01225 This domain is found in disease proteins PKHD1 and KIAA1199 and is named G8 after its 8 conserved glycines. It is predicted to contain 10 beta strands and an alpha helix. GA SM00844 GA module The protein G-related albumin-binding (GA) module is composed of three alpha helices PUBMED:9086265. This module is found in a range of bacterial cell surface proteins. The GA module from the Peptostreptococcus magnus albumin-binding protein (PAB) shows a strong affinity for albumin. GAF SM00065 Domain present in phytochromes and cGMP-specific phosphodiesterases. Mutations within these domains in PDE6B result in autosomal recessive inheritance of retinitis pigmentosa. GAGA_bind SM01226 GAGA binding protein-like family This family includes gbp, a protein from soybean that binds to GAGA element dinucleotide repeat DNA PUBMED:12177492. It seems likely that this domain mediates DNA binding. This putative domain contains several conserved cysteines and a histidine suggesting this may be a zinc-binding DNA interaction domain. GAGE SM01379 This family consists of several GAGE and XAGE proteins which are found exclusively in humans. The function of this family is unknown although they have been implicated in human cancers (PMID:11992404) GAL4 SM00066 GAL4-like Zn(II)2Cys6 (or C6 zinc) binuclear cluster DNA-binding domain Gal4 is a positive regulator for the gene expression of the galactose-induced genes of S. cerevisiae. It is present only in fungi. 
Galanin SM00071 Galanin Galanin [1,2,3] is a neuropeptide that controls various biological activities: it regulates the release of growth hormone, inhibits the release of insulin and somatostatin, contracts smooth muscle of the gastrointestinal and genitourinary tract and may be involved in the control of adrenal secretion. Gal-bind_lectin SM00908 Galactoside-binding lectin Animal lectins display a wide variety of architectures. They are classified according to the carbohydrate-recognition domain (CRD) of which there are two main types, S-type and C-type. Galectins (previously S-lectins) bind exclusively beta-galactosides like lactose. They do not require metal ions for activity. Galectins are found predominantly, but not exclusively in mammals PUBMED:8124704. Their function is unclear. They are developmentally regulated and may be involved in differentiation, cellular regulation and tissue construction. G_alpha SM00275 G protein alpha subunit Subunit of G proteins that contains the guanine nucleotide binding site GARS_A SM01209 Phosphoribosylglycinamide synthetase, ATP-grasp (A) domain Phosphoribosylglycinamide synthetase catalyses the second step in the de novo biosynthesis of purine. The reaction catalysed by Phosphoribosylglycinamide synthetase is the ATP-dependent addition of 5-phosphoribosylamine to glycine to form 5'-phosphoribosylglycinamide. This domain is related to the ATP-grasp domain of biotin carboxylase/carbamoyl phosphate synthetase (Pfam PF02786). GARS_C SM01210 Phosphoribosylglycinamide synthetase, C domain Phosphoribosylglycinamide synthetase catalyses the second step in the de novo biosynthesis of purine. The reaction catalysed by Phosphoribosylglycinamide synthetase is the ATP-dependent addition of 5-phosphoribosylamine to glycine to form 5'-phosphoribosylglycinamide. This domain is related to the C-terminal domain of biotin carboxylase/carbamoyl phosphate synthetase (Pfam PF02787). 
GAS2 SM00243 Growth-Arrest-Specific Protein 2 Domain GROWTH-ARREST-SPECIFIC PROTEIN 2 Domain GASTRIN SM00029 gastrin / cholecystokinin / caerulein family This family gathers small proteins of about 100-130 amino acids that act as hormones, among them gastrin, cholecystokinin and preprocaerulein which stimulate gastric, biliary, and pancreatic secretion and smooth muscle contraction. GATase_5 SM01211 CobB/CobQ-like glutamine amidotransferase domain GatB_Yqey SM00845 GatB domain This domain is found in GatB and proteins related to bacterial Yqey. It is about 140 amino acid residues long. This domain is found at the C terminus of GatB which transamidates Glu-tRNA to Gln-tRNA. The function of this domain is uncertain. It does however suggest that Yqey and its relatives have a role in tRNA metabolism. GCK SM01227 This domain is found in proteins carrying other domains known to be involved in intracellular signalling pathways indicating that it might also be involved in these pathways. It has 4 highly conserved cysteine residues, suggesting that it can bind zinc ions. Moreover, it is found repeated in some members of this family. This may indicate that these domains are able to interact with one another, raising the possibility that this domain mediates heterodimerisation. GDNF SM00907 GDNF/GAS1 domain This cysteine rich domain is found in multiple copies in GDNF and GAS1 proteins. GDNF and neurturin (NTN) receptors are potent survival factors for sympathetic, sensory and central nervous system neurons PUBMED:16551639, PUBMED:9192899. GDNF and neurturin promote neuronal survival by signaling through similar multicomponent receptors that consist of a common receptor tyrosine kinase and a member of a GPI-linked family of receptors that determines ligand specificity PUBMED:9192898. GED SM00302 Dynamin GTPase effector domain GEL SM00262 Gelsolin homology domain Gelsolin/severin/villin homology domain. Calcium-binding and actin-binding. Both intra- and extracellular domains. 
Gemini_AL1 SM01475 HUH phage replication sub-family replication initiator protein of geminiviruses PMID:12130667 Germane SM00909 Sporulation and spore germination The GerMN domain is a region of approximately 100 residues that is found, duplicated, in the Bacillus GerM protein and is implicated in both sporulation and spore germination. The domain is found in a number of different bacterial species both alone and in association with other domains such as Amidase_3 PF01520 Gmad1 and Gmad2. It is predicted to have a novel alpha-beta fold. G_gamma SM01224 GGL domain G-protein gamma like domains (GGL) are found in the gamma subunit of the heterotrimeric G protein complex and in regulators of G protein signaling (RGS) proteins PUBMED:9789084. It is also found fused to an inactive Galpha in the Dictyostelium protein gbqA PUBMED:21182906. G-gamma likely shares a common origin with the helical N-terminal unit of G-beta PUBMED:21182906. All organisms that possess a G-beta possess a G-gamma PUBMED:21182906. GGDEF SM00267 diguanylate cyclase Diguanylate cyclase, present in a variety of bacteria. GGL SM00224 G protein gamma subunit-like motifs GHA SM00067 Glycoprotein hormone alpha chain homologues. Also called gonadotropins. Glycoprotein hormones consist of two glycosylated chains (alpha and beta) of similar topology. GHB SM00068 Glycoprotein hormone beta chain homologues. Also called gonadotropins. Glycoprotein hormones consist of two glycosylated chains (alpha and beta) of similar topology. GIDA_assoc_3 SM01228 GidA associated domain 3 The GidA associated domain 3 is a motif that has been identified at the C-terminus of protein GidA. It consists of 4 helices, the last three being rather short and forming a small bundle at the top end of the first longer one. It is here named helical domain 3 because in GidA it is preceded by two other C-terminal helical domains (based on crystal structures PMID:18565343, 19446527). 
GidA is a tRNA modification enzyme found in bacteria and mitochondria. Based on mutational analysis this domain has been suggested to be implicated in binding of the D-stem of tRNA (PMID:19446527) and to be responsible for the interaction with protein MnmE PMID:18565343. Structures of GidA in complex with either tRNA or MnmE are missing. Reported to bind to Pfam family MnmE, PF12631. GIT SM00555 Helical motif in the GIT family of ADP-ribosylation factor GTPase-activating proteins Helical motif in the GIT family of ADP-ribosylation factor GTPase-activating proteins, and in yeast Spa2p and Sph1p (CPP; unpublished results). In p95-APP1 the N-terminal GIT motif might be involved in binding PIX. GIYc SM00465 GIY-YIG type nucleases (URI domain) GLA SM00069 Domain containing Gla (gamma-carboxyglutamate) residues. A hyaluronan-binding domain found in proteins associated with the extracellular matrix, cell adhesion and cell migration. GLECT SM00276 Galectin Galectin - galactose-binding lectin Gln-synt_C SM01230 Glutamine synthetase, catalytic domain GLUCA SM00070 Glucagon like hormones GluR_Homer-bdg SM01229 This is the proline-rich region of metabotropic glutamate receptor proteins that binds Homer-related synaptic proteins. The Homer proteins form a physical tether linking mGluRs with the inositol trisphosphate receptors (IP3R) that appears to be due to the proline-rich "Homer ligand" (PPXXFr). Activation of PI turnover triggers intracellular calcium release (PMID:9808459). MGluR function is altered in the mouse model of human Fragile X syndrome mental retardation, a disorder caused by loss of function mutations in the Fragile X mental retardation gene Fmr1. 
Homer 3 (and to a lesser extent Homer 1b/c) has been shown to form a multimeric complex with mGlu1a and the IP3 receptor, indicating that Homers may play a role in the localisation of receptors to their signalling partners (PMID:18184796) Glyco_10 SM00633 Glycosyl hydrolase family 10 Glyco_18 SM00636 Glyco_25 SM00641 Glycosyl hydrolases family 25 Glyco_32 SM00640 Glycosyl hydrolases family 32 GoLoco SM00390 LGN motif, putative GEFs specific for G-alpha GTPases GEF specific for Galpha_i proteins G_patch SM00443 glycine rich nucleic binding domain A predicted glycine rich nucleic binding domain found in the splicing factor 45, SON DNA binding protein and D-type retrovirus polyproteins. Gp_dh_N SM00846 Glyceraldehyde 3-phosphate dehydrogenase, NAD binding domain GAPDH is a tetrameric NAD-binding enzyme involved in glycolysis and gluconeogenesis. N-terminal domain is a Rossmann NAD(P) binding fold. GPS SM00303 G-protein-coupled receptor proteolytic site domain Present in latrophilin/CL-1, sea urchin REJ and polycystin. GRAM SM00568 domain in glucosyltransferases, myotubularins and other putative membrane-associated proteins GRAN SM00277 Granulin Grip SM00755 golgin-97, RanBP2alpha, Imh1p and p230/golgin-245 GS SM00467 GS motif An approx. 30 amino acid motif that precedes the kinase domain in types I and II TGF beta receptors. Mutation of two or more of the serines or threonines in the TTSGSGSG of TGF-beta type I receptor impairs phosphorylation and signaling activity. GuKc SM00072 Guanylate kinase homologues. Active enzymes catalyze ATP-dependent phosphorylation of GMP to GDP. Structure resembles that of adenylate kinase. So-called membrane-associated guanylate kinase homologues (MAGUKs) do not possess guanylate kinase activities; instead at least some possess protein-binding functions. GYF SM00444 Contains conserved Gly-Tyr-Phe residues Proline-binding domain in CD2-binding protein. Contains conserved Gly-Tyr-Phe residues. 
GYR SM00713 Motif of unknown function with conserved Gly, Tyr, Arg tripeptide in Drosophila proteins. H15 SM00526 Domain in histone families 1 and 5 H2A SM00414 Histone 2A H2B SM00427 Histone H2B H2TH SM01232 Formamidopyrimidine-DNA glycosylase H2TH domain Formamidopyrimidine-DNA glycosylase (Fpg) is a DNA repair enzyme that excises oxidised purines from damaged DNA. This family is the central domain containing the DNA-binding helix-two-turn-helix domain (PMID:11912217). H3 SM00428 Histone H3 H4 SM00417 Histone H4 HA2 SM00847 Helicase associated domain (HA2) This presumed domain is about 90 amino acid residues in length. It is found in a diverse set of RNA helicases. Its function is unknown, however it seems likely to be involved in nucleic acid binding. HABP4_PAI-RBP1 SM01233 Hyaluronan / mRNA binding family This family includes the HABP4 family of hyaluronan-binding proteins, and the PAI-1 mRNA-binding protein, PAI-RBP1. HABP4 has been observed to bind hyaluronan (a glycosaminoglycan), but it is not known whether this is its primary role in vivo. It has also been observed to bind RNA, but with a lower affinity than that for hyaluronan (PMID:10887182). PAI-1 mRNA-binding protein specifically binds the mRNA of type-1 plasminogen activator inhibitor (PAI-1), and is thought to be involved in regulation of mRNA stability (PMID:11001948). However, in both cases, the sequence motifs predicted to be important for ligand binding are not conserved throughout the family, so it is not known whether members of this family share a common function. Haemagg_act SM00912 haemagglutination activity domain This domain is suggested to be a carbohydrate-dependent haemagglutination activity site PUBMED:11703654. It is found in a range of haemagglutinins and haemolysins. Haem_bd SM01235 Haem-binding domain This domain contains a potential haem-binding motif, CXXCH (PMID:17288564). Haemolytic SM01234 This domain has haemolytic activity (PMID:19306042). 
It is found in short (73-103 amino acid) proteins and contains three conserved cysteine residues. Haem_oxygenase_2 SM01236 Iron-containing redox enzyme The CADD, Chlamydia protein associating with death domains, crystal structure reveals a dimer of seven-helical bundles. Each bundle contains a di-iron centre adjacent to an internal cavity that forms an active site similar to that of methane mono-oxygenase hydrolase (PMID:15087448). HALZ SM00340 homeobox associated leucine zipper HAMP SM00304 HAMP (Histidine kinases, Adenylyl cyclases, Methyl binding proteins, Phosphatases) domain HAP1_N SM01424 HAP1 N-terminal conserved region This family represents an N-terminal conserved region found in several huntingtin-associated protein 1 (HAP1) homologues. HAP1 binds to huntingtin in a polyglutamine repeat-length-dependent manner. However, its possible role in the pathogenesis of Huntington's disease is unclear PMID:7477378,9599014,9285789. This family also includes a similar N-terminal conserved region from hypothetical protein products of ALS2CR3 genes found in the human juvenile amyotrophic lateral sclerosis critical region 2q33-2q34 PMID:11161814. HAT SM00386 HAT (Half-A-TPR) repeats Present in several RNA-binding proteins. Structurally and sequentially thought to be similar to TPRs. HATPase_c SM00387 Histidine kinase-like ATPases Histidine kinase-, DNA gyrase B-, phytochrome-like ATPases. HBM SM01358 helical bimodular (HBM) domain The HBM domain was identified in Bacteria and Archaea and forms part of chemoreceptors and histidine kinases. The domain was characterized by a bimodular architecture, where ligand binding to each module caused a chemotactic response. The conservation of amino acids in the ligand binding sites of both modules suggests that HBM family members recognize similar ligands. (PMID:24347303) HDAC_interact SM00761 Histone deacetylase (HDAC) interacting This domain is found on transcriptional regulators. It forms interactions with histone deacetylases. 
HDc SM00471 Metal dependent phosphohydrolases with conserved 'HD' motif. Includes eukaryotic cyclic nucleotide phosphodiesterases (PDEc). This profile/HMM does not detect HD homologues in bacterial glycine aminoacyl-tRNA synthetases (beta subunit). HECTc SM00119 Domain Homologous to E6-AP Carboxyl Terminus with E3 ubiquitin-protein ligases. Can bind to E2 enzymes. HELICc SM00490 helicase superfamily c-terminal domain HELICc2 SM00491 HELICc3 SM00492 HEPN SM00748 Higher Eukaryotes and Prokaryotes Nucleotide-binding domain HhH1 SM00278 Helix-hairpin-helix DNA-binding motif class 1 HhH2 SM00279 Helix-hairpin-helix class 2 (Pol1 family) motifs HintC SM00305 Hint (Hedgehog/Intein) domain C-terminal region Hedgehog/Intein domain, C-terminal region. Domain has been split to accommodate large insertions of endonucleases. HintN SM00306 Hint (Hedgehog/Intein) domain N-terminal region Hedgehog/Intein domain, N-terminal region. Domain has been split to accommodate large insertions of endonucleases. HIRAN SM00910 The HIRAN protein (HIP116, Rad5p N-terminal) is found in the N-terminal regions of the SWI2/SNF2 proteins typified by HIP116 and Rad5p. HIRAN is found as a standalone protein in several bacteria and prophages, or fused to other catalytic domains, such as a nuclease of the restriction endonuclease fold and TDP1-like DNA phosphoesterases, in the eukaryotes PUBMED:16627993. It has been predicted that this protein functions as a DNA-binding domain that probably recognises features associated with damaged DNA or stalled replication forks PUBMED:16627993 HisKA SM00388 His Kinase A (phosphoacceptor) domain Dimerisation and phosphoacceptor domain of histidine kinases. H-kinase_dim SM01231 Signal transducing histidine kinase, homodimeric domain This helical bundle domain is the homodimer interface of the signal transducing histidine kinase family. 
HLH SM00353 helix loop helix domain HMG SM00398 high mobility group HMG17 SM00527 domain in high mobility group proteins HMG14 and HMG 17 HNHc SM00507 HNH nucleases HNS SM00528 Domain in histone-like proteins of HNS family HOLI SM00430 Ligand binding domain of hormone receptors HormR SM00008 Domain present in hormone receptors HOX SM00389 Homeodomain DNA-binding factors that are involved in the transcriptional regulation of key developmental processes HPT SM00073 Histidine Phosphotransfer domain Contains an active histidine residue that mediates phosphotransfer reactions. Domain detected only in eubacteria. This alignment is an extension to that shown in the Cell structure paper. Hr1 SM00742 Rho effector or protein kinase C-related kinase homology region 1 homologues Alpha-helical domain found in vertebrate PRK1 and yeast PKC1 protein kinases C. The HR1 in rhophilin bind RhoGTP; those in PRK1 bind RhoA and RhoB. Also called RBD - Rho-binding domain HRDC SM00341 Helicase and RNase D C-terminal Hypothetical role in nucleic acid binding. Mutations in the HRDC domain cause human disease. 
HSA SM00573 domain in helicases and associated with SANT domains HSF SM00415 heat shock factor HTH_ARAC SM00342 helix_turn_helix, arabinose operon control protein HTH_ARSR SM00418 helix_turn_helix, Arsenical Resistance Operon Repressor HTH_ASNC SM00344 helix_turn_helix ASNC type AsnC: an autogenously regulated activator of asparagine synthetase A transcription in Escherichia coli HTH_CRP SM00419 helix_turn_helix, cAMP Regulatory protein HTH_DEOR SM00420 helix_turn_helix, Deoxyribose operon repressor HTH_DTXR SM00529 Helix-turn-helix diphtheria tox regulatory element iron dependent repressor HTH_GNTR SM00345 helix_turn_helix gluconate operon transcriptional repressor HTH_ICLR SM00346 helix_turn_helix isocitrate lyase regulation HTH_LACI SM00354 helix_turn_helix lactose operon repressor HTH_LUXR SM00421 helix_turn_helix, Lux Regulon lux regulon (activates the bioluminescence operon) HTH_MARR SM00347 helix_turn_helix multiple antibiotic resistance protein HTH_MERR SM00422 helix_turn_helix, mercury resistance HTH_XRE SM00530 Helix-turn-helix XRE-family like proteins HTTM SM00752 Horizontally Transferred TransMembrane Domain Sequence analysis of vitamin K dependent gamma-carboxylases (VKGC) revealed the presence of a novel domain, HTTM (Horizontally Transferred TransMembrane) in its N-terminus. In contrast to most known domains, HTTM contains four transmembrane regions. Its occurrence in eukaryotes, bacteria and archaea is more likely caused by horizontal gene transfer than by early invention. The conservation of VKGC catalytic sites indicates an enzymatic function also for the other family members. 
HUH SM01476 virus and plasmid replication protein with conserved HXH motif domain involved in replication/mobilisation of plasmid, viruses and transposons through active site HUH residues PMID:9350859, PMID:12130667 and PMID: 16209952 huh_y1 SM01465 HUH transposase Y1 type HUH endonuclease sub-family involved in site-specific DNA cleavage and ligation contains a single catalytic tyrosine residue PMID: 16209952 huh_y2 SM01464 HUH transposase Y2 type HUH endonuclease sub-family involved in site-specific DNA cleavage and ligation contains two catalytic tyrosine residues PMID: 16209952 HWE_HK SM00911 HWE histidine kinase The HWE domain is found in a subset of two-component system kinases, belonging to the same superfamily as PUBMED:14702314. In PUBMED:14702314, the HWE family was defined by the presence of a conserved H residue and a WXE motif and was limited to members of the proteobacteria. However, many homologues of this domain lack the WXE motif. Furthermore, homologues are found in a wide range of Gram-positive and Gram-negative bacteria as well as in several archaea. HX SM00120 Hemopexin-like repeats. Hemopexin is a heme-binding protein that transports heme to the liver. Hemopexin-like repeats occur in vitronectin and some members of the matrix metalloproteinase family (matrixins). The HX repeats of some matrixins bind tissue inhibitor of metalloproteinases (TIMPs). HYDRO SM00075 Hydrophobins IB SM00121 Insulin growth factor-binding protein homologues High affinity binding partners of insulin-like growth factors. IBN_N SM00913 Importin-beta N-terminal domain Members of the importin-beta (karyopherin-beta) family can bind and transport cargo by themselves, or can form heterodimers with importin-alpha. As part of a heterodimer, importin-beta mediates interactions with the pore complex, while importin-alpha acts as an adaptor protein to bind the nuclear localisation signal (NLS) on the cargo through the classical NLS import of proteins. 
Importin-beta is a helicoidal molecule constructed from 19 HEAT repeats. Many nuclear pore proteins contain FG sequence repeats that can bind to HEAT repeats within importins PUBMED:12372823, PUBMED:17161424, which is important for importin-beta mediated transport. IBR SM00647 In Between Ring fingers the domain occurs between pairs of RING fingers ICA69 SM01237 Islet cell autoantigen ICA69, C-terminal domain This family includes a 69 kD protein which has been identified as an islet cell autoantigen in type I diabetes mellitus (PMID:8975715). Its precise function is unknown. IDEAL SM00914 This is a short protein of unknown function; it is found at the C-terminus of proteins in the UPF0302 family. It is named after the sequence of the most conserved region in some members. IENR1 SM00497 Intron encoded nuclease repeat motif Repeat of unknown function, but possibly DNA-binding via helix-turn-helix motif (Ponting, unpublished). IENR2 SM00496 Intron-encoded nuclease repeat 2 Short helical motif of unknown function (unpublished results). IFabd SM00076 Interferon alpha, beta and delta. Interferons produce antiviral and antiproliferative responses in cells. They are classified into five groups, all of them related but gamma-interferon. IG SM00409 Immunoglobulin IGc1 SM00407 Immunoglobulin C-Type IGc2 SM00408 Immunoglobulin C-2 Type IG_FLMN SM00557 Filamin-type immunoglobulin domains These form a rod-like structure in the actin-binding cytoskeleton protein, filamin. The C-terminal repeats of filamin bind beta1-integrin (CD29). IG_like SM00410 Immunoglobulin like IG domains that cannot be classified into one of IGv1, IGc1, IGc2, IG. IGR SM01238 This domain is found in fungal proteins and contains a conserved IGR motif. Its function is unknown. IGv SM00406 Immunoglobulin V-Type IKKbetaNEMObind SM01239 I-kappa-kinase-beta NEMO binding domain This domain family is found in eukaryotes, and is approximately 40 amino acids in length. The family is found in association with S_TKc. 
These proteins are involved in inflammatory reactions. They cause release of NF-kappa-B into the nucleus of inflammatory cells and upregulation of transcription of proinflammatory cytokines. They perform this function by phosphorylating I-kappa-B proteins which are targeted for degradation to release NF-kappa-B. This kinase (I-kappa-kinase-beta) is found in association with IKK-alpha and NEMO (NF-kappa-B essential modulator). This domain is the binding site of IKK-beta for NEMO. IL1 SM00125 Interleukin-1 homologues Cytokines with various biological functions. Interleukin 1 alpha and beta are also known as hematopoietin and catabolin. IL10 SM00188 Interleukin-10 family Interleukin-10 inhibits the synthesis of a number of cytokines, including IFN-gamma, IL-2, IL-3, TNF and GM-CSF produced by activated macrophages and by helper T cells. IL2 SM00189 Interleukin-2 family Interleukin-2 is a cytokine produced by T-helper cells in response to antigenic or mitogenic stimulation. This protein is required for T-cell proliferation and other activities crucial to the regulation of the immune response. IL4_13 SM00190 Interleukins 4 and 13 Interleukins-4 and -13 are cytokines involved in inflammatory and immune responses. IL-4 stimulates B and T cells. IL6 SM00126 Interleukin-6 homologues Family includes granulocyte colony-stimulating factor (G-CSF) and myelomonocytic growth factor (MGF). IL-6 is also known as B-cell stimulatory factor 2. IL7 SM00127 Interleukin-7 and interleukin-9 family. IL-7 is a cytokine that acts as a growth factor for early lymphoid cells of both B- and T-cell lineages. IL-9 is a multifunctional cytokine; although it was originally described as a T-cell growth factor, its function in T-cell response remains unclear. IlGF SM00078 Insulin / insulin-like growth factor / relaxin family. Family of proteins including insulin, relaxin, and IGFs. Insulin decreases blood glucose concentration. ILWEQ SM00307 I/LWEQ domain Thought to possess an F-actin binding function. 
IMPDH SM01240 IMP dehydrogenase / GMP reductase domain This family is involved in biosynthesis of guanosine nucleotide. Members of this family contain a TIM barrel structure. In the inosine monophosphate dehydrogenases 2 CBS domains are inserted in the TIM barrel (PMID: 10200156). This family is a member of the common phosphate binding site TIM barrel family. INB SM00187 Integrin beta subunits (N-terminal portion of extracellular region) Portion of beta integrins that lies N-terminal to their EGF-like repeats. Integrins are cell adhesion molecules that mediate cell-extracellular matrix and cell-cell interactions. They contain both alpha and beta subunits. Beta integrins are proposed to have a von Willebrand factor type-A "insert" or "I" -like domain (although this remains to be confirmed). ING SM01408 Inhibitor of growth proteins N-terminal histone-binding Histones undergo numerous post-translational modifications, including acetylation and methylation, at residues which are then probable docking sites for various chromatin remodelling complexes. Inhibitor of growth proteins (INGs) specifically bind to residues that have been thus modified. INGs carry a well-characterised C-terminal PHD-type zinc-finger domain, binding with lysine 4-tri-methylated histone H3 (H3K4me3), as well as this N-terminal domain that binds unmodified H3 tails. Although these two regions can bind histones independently, together they increase the apparent association of the ING for the H3 tail. Inhibitor_I29 SM00848 Cathepsin propeptide inhibitor domain (I29) This domain is found at the N-terminus of some C1 peptidases such as Cathepsin L where it acts as a propeptide. There are also a number of proteins that are composed solely of multiple copies of this domain such as the peptidase inhibitor salarin. This family is classified as I29 by MEROPS. 
Peptide proteinase inhibitors can be found as single domain proteins or as single or multiple domains within proteins; these are referred to as either simple or compound inhibitors, respectively. In many cases they are synthesised as part of a larger precursor protein, either as a prepropeptide or as an N-terminal domain associated with an inactive peptidase or zymogen. This domain prevents access of the substrate to the active site. Removal of the N-terminal inhibitor domain either by interaction with a second peptidase or by autocatalytic cleavage activates the zymogen. Other inhibitors interact directly with proteinases using a simple noncovalent lock and key mechanism, while yet others use a conformational change-based trapping mechanism that depends on their structural and thermodynamic properties. Int_alpha SM00191 Integrin alpha (beta-propellor repeats). Integrins are cell adhesion molecules that mediate cell-extracellular matrix and cell-cell interactions. They contain both alpha and beta subunits. Alpha integrins are proposed to contain a domain containing a 7-fold repeat that adopts a beta-propellor fold. Some of these domains contain an inserted von Willebrand factor type-A domain. Some repeats contain putative calcium-binding sites. The 7-fold repeat domain is homologous to a similar domain in phosphatidylinositol-glycan-specific phospholipase D. 
Int_BPP-1 SM01448 sub-family of tyrosine recombinases tyrosine recombinase sub-family mediates DNA cleavage through tyrosine residue at the active site predominantly in phages https://doi.org/10.1101/542381 Int_Brujita SM01449 sub-family of tyrosine recombinases tyrosine recombinase sub-family mediates phages/ICE integration and excision https://doi.org/10.1101/542381 and PMID:27113630 Int_CTnDOT SM01454 sub-family of tyrosine recombinases tyrosine recombinase sub-family mediates DNA cleavage through tyrosine residue at the active site in Integrative Conjugative Elements (ICE) https://doi.org/10.1101/542381 Int_Des SM01457 sub-family of tyrosine recombinases sub-family of tyrosine recombinases present in mobile elements, mediates DNA cleavage through tyrosine residue at the active site https://doi.org/10.1101/542381 Integrin_b_cyt SM01241 Integrins are a group of transmembrane proteins which function as extracellular matrix receptors and in cell adhesion. Integrins are ubiquitously expressed and are heterodimeric, each composed of an alpha and beta subunit. Several variations of the alpha and beta subunits exist, and association of different alpha and beta subunits can have a different binding specificity. This domain corresponds to the cytoplasmic domain of the beta subunit. Integrin_B_tail SM01242 This is the beta tail domain of the Integrin protein. Integrins are receptors which are involved in cell-cell and cell-extracellular matrix interactions. 
Integron SM01444 sub-family of tyrosine recombinases sub-family of tyrosine recombinases predominantly present in integrons and mediates DNA cleavage via tyrosine residue at the active site https://doi.org/10.1101/542381 IntKX SM01450 sub-family of tyrosine recombinases tyrosine recombinase sub-family mediates DNA cleavage through active site tyrosine residue in Integrative Conjugative Elements (ICE) https://doi.org/10.1101/542381 Int_P2 SM01458 sub-family of tyrosine recombinases tyrosine recombinase sub-family mediates DNA cleavage predominantly in proteobacterial phages through tyrosine residue at the active site https://doi.org/10.1101/542381 Int_SXT SM01440 sub-family of tyrosine recombinases tyrosine recombinase sub-family found in a variety of Integrative Conjugative Elements and phages https://doi.org/10.1101/542383 Int_Tn916 SM01455 sub-family of tyrosine recombinases tyrosine recombinase sub-family found in phages and Integrative Conjugative Elements (ICE) https://doi.org/10.1101/542383 IPPc SM00128 Inositol polyphosphate phosphatase, catalytic domain homologues Mg(2+)-dependent/Li(+)-sensitive enzymes. IPT SM00429 ig-like, plexins, transcription factors IQ SM00015 Short calmodulin-binding motif containing conserved Ile and Gln residues. Calmodulin-binding motif. IRF SM00348 interferon regulatory factor interferon regulatory factor, also known as tryptophan pentad repeat IRF-3 SM01243 Interferon-regulatory factor 3 This is the interferon-regulatory factor 3 chain of the hetero-dimeric structure which also contains the shorter chain CREB-binding protein. These two subunits make up the DRAF1 (double-stranded RNA-activated factor 1). Viral dsRNA produced during viral transcription or replication leads to the activation of DRAF1. The DNA-binding specificity of DRAF1 correlates with transcriptional induction of ISG (interferon-alpha,beta-stimulated gene). IRF-3 preexists in the cytoplasm of uninfected cells and translocates to the nucleus following viral infection. 
Translocation of IRF-3 is accompanied by an increase in serine and threonine phosphorylation, and association with the CREB coactivator occurs only after infection. IRO SM00548 Motif in Iroquois-class homeodomain proteins (only). Unknown function. IRS SM01244 Phosphotyrosine-binding domain Iso_dh SM01329 Isocitrate/isopropylmalate dehydrogenase Isocitrate dehydrogenase (IDH), is an important enzyme of carbohydrate metabolism which catalyses the oxidative decarboxylation of isocitrate into alpha-ketoglutarate. ITAM SM00077 Immunoreceptor tyrosine-based activation motif Motif that may be dually phosphorylated on tyrosine that links antigen receptors to downstream signalling machinery. JAB_MPN SM00232 JAB/MPN domain Domain in Jun kinase activation domain binding protein and proteasomal subunits. Domain at Mpr1p and Pad1p N-termini. Domain of unknown function. Jacalin SM00915 Jacalin-like lectin domain This entry represents a mannose-binding lectin domain with a beta-prism fold consisting of three 4-stranded beta-sheets, with an internal pseudo 3-fold symmetry. Some lectins in this group stimulate distinct T- and B- cell functions, such as Jacalin, which binds to the T-antigen and acts as an agglutinin. This domain is found in 1 to 6 copies in lectins. The domain is also found in the salt-stress induced protein from rice and an animal prostatic spermine-binding protein. Jag_N SM01245 This domain is found at the N-terminus of jag proteins. JHBP SM00700 Juvenile hormone binding protein domains in insects. The juvenile hormone exerts pleiotropic functions during insect life cycles and its binding proteins regulate these functions. JmjC SM00558 A domain family that is part of the cupin metalloenzyme superfamily. Probable enzymes, but of unknown functions, that regulate chromatin reorganisation processes (Clissold and Ponting, in press). 
JmjN SM00545 Small domain found in the jumonji family of transcription factors To date, this domain always co-occurs with the JmjC domain (although the reverse is not true). Josephin SM01246 K167R SM01295 K167/Chmadrin repeat This family represents the K167/Chmadrin repeat. (PMID:15112237) The function of this repeat is unknown. KaiA SM01247 The cyanobacterial clock proteins KaiA and KaiB are proposed as regulators of the circadian rhythm in cyanobacteria. The overall fold of the KaiA monomer is that of a four-helix bundle, which forms a dimer in the known structure (PMID:15071498). KaiB SM01248 The cyanobacterial clock proteins KaiA and KaiB are proposed as regulators of the circadian rhythm in cyanobacteria. Mutations in both proteins have been reported to alter or abolish circadian rhythmicity. KaiB adopts an alpha-beta meander motif and is found to be a dimer (PMID: 15071498). KAP SM01297 Kinesin-associated protein (KAP) This family consists of several eukaryotic kinesin-associated (KAP) proteins. Kinesins are intracellular multimeric transport motor proteins that move cellular cargo on microtubule tracks. It has been shown that the sea urchin KRP85/95 holoenzyme associates with a KAP115 non-motor protein, forming a heterotrimeric complex in vitro, called the Kinesin-II PMID:10819327. KapB SM01298 This bacterial protein forms an anti-parallel beta sheet with an extending alpha helical region. KASH SM01249 Nuclear envelope localisation domain The KASH (for Klarsicht/ANC-1/Syne-1 homology) or KLS domain is a highly hydrophobic nuclear envelope localisation domain of approximately 60 amino acids comprising a 20-amino-acid transmembrane region and a 30-35-residue C-terminal region that lies between the inner and the outer nuclear membranes (PMID:12169658). During meiotic prophase, telomeres cluster to form a bouquet arrangement of chromosomes. SUN and KASH domain proteins form complexes that span both membranes of the nuclear envelope. 
The KASH domain links the dynein motor complex of the microtubules, through the outer nuclear membrane to the Sad1 domain in the inner nuclear membrane which then interacts with the bouquet proteins Bqt1 and Bqt2 that are complexed with Bqt4, Rap1 and Taz1 and attached to the telomere (PMID:19948484). SUN domain-containing proteins are essential for recruiting KASH domain proteins at the outer nuclear membrane, and KASH domains provide a generic NE tethering device for functionally distinct proteins whose cytoplasmic domains mediate nuclear positioning, maintain physical connections with other cellular organelles, and possibly even influence chromosome dynamics (PMID:19687252). KAT11 SM01250 Histone acetylation protein Histone acetylation is required in many cellular processes including transcription, DNA repair, and chromatin assembly. This family contains the fungal KAT11 protein (previously known as RTT109) which is required for H3K56 acetylation. Loss of KAT11 results in the loss of H3K56 acetylation, both on bulk histone and on chromatin (PMID:17046836). KAT11 and H3K56 acetylation appear to correlate with actively transcribed genes and associate with the elongating form of Pol II in yeast (PMID:17046836). This family also incorporates the p300/CBP histone acetyltransferase domain which has different catalytic properties and cofactor regulation to KAT11 (PMID:18568037). KAZAL SM00280 Kazal type serine protease inhibitors Kazal type serine protease inhibitors and follistatin-like domains. KbaA SM01251 KinB-signalling pathway activation in sporulation This family of small proteins is found in the membrane and is necessary for kinase KinB signalling during sporulation. There is a conserved GFF sequence motif. The initiation of sporulation in Bacillus subtilis is dependent on the phosphorylation of the Spo0A transcription factor mediated by the phospho-relay and by two major kinases, KinA and KinB. 
Kelch SM00612 KH SM00322 K homology RNA-binding domain KilA-N SM01252 Conserved DNA-binding domain that is found in a wide range of proteins of large bacterial and eukaryotic DNA viruses. Kin17_mid SM01253 Domain of Kin17 curved DNA-binding protein Kin17_mid is the conserved central 169 residue region of a family of Kin17 proteins. Towards the N-terminal end there is a zinc-finger domain, and in human and mouse members there is a RecA-like domain further downstream. The Kin17 protein in humans forms intra-nuclear foci during cell proliferation and is re-distributed in the nucleoplasm during the cell cycle PUBMED:10964102. KIND SM00750 kinase non-catalytic C-lobe domain It is an interaction domain identified as being similar to the C-terminal protein kinase catalytic fold (C lobe). Its presence at the N terminus of signalling proteins and the absence of the active-site residues in the catalytic and activation loops suggest that it folds independently and is likely to be non-catalytic. The occurrence of KIND only in metazoa implies that it has evolved from the catalytic protein kinase domain into an interaction domain possibly by keeping the substrate-binding features KISc SM00129 Kinesin motor, catalytic domain. ATPase. Microtubule-dependent molecular motors that play important roles in intracellular transport of organelles and in cell division. KLRAQ SM01254 Predicted coiled-coil domain-containing protein This is the N-terminal 100 amino acid domain of a family of proteins conserved from nematodes to humans. It carries a characteristic KLRAQ sequence-motif. The function is unknown. Knot1 SM00505 Knottins Knottins, representing plant lectins/antimicrobial peptides, plant proteinase/amylase inhibitors, plant gamma-thionins and arthropod defensins. KNOX1 SM01255 The MEINOX region is comprised of two domains, KNOX1 and KNOX2. KNOX1 plays a role in suppressing target gene expression. 
KNOX2, essential for function, is thought to be necessary for homo-dimerisation PUBMED:11549765. KNOX2 SM01256 The MEINOX region is comprised of two domains, KNOX1 and KNOX2. KNOX1 plays a role in suppressing target gene expression. KNOX2, essential for function, is thought to be necessary for homo-dimerisation PUBMED:11549765. KOW SM00739 KOW (Kyrpides, Ouzounis, Woese) motif. Motif in ribosomal proteins, NusG, Spt5p, KIN17 and T54. KR SM00130 Kringle domain Named after a Danish pastry. Found in several serine proteases and in ROR-like receptors. Can occur in up to 38 copies (in apolipoprotein(a)). Plasminogen-like kringles possess affinity for free lysine and lysine-containing peptides. KRAB SM00349 krueppel associated box KRAP_IP3R_bind SM01257 Ki-ras-induced actin-interacting protein-IP3R-interacting domain This family includes the N-terminus of the actin-interacting protein sperm-specific antigen 2, or KRAP (Ki-ras-induced actin-interacting protein) PUBMED:14673706. This region contains the residues that interact with inositol 1,4,5-trisphosphate receptor (IP3R). KRAP was first localized as a membrane-bound form with extracellular regions suggesting it might be involved in the regulation of filamentous actin and signals from the outside of the cells PUBMED:14673706. It has now been shown to be critical for the proper subcellular localization and function of IP3R. Inositol 1,4,5-trisphosphate receptor functions as the Ca2+ release channel on specialized endoplasmic reticulum membranes, so the subcellular localisation of IP3R is crucial for its proper function PUBMED:21501587. KRBA1 SM01258 KRBA1 family repeat KRBA1 is a short repeating motif found in mammalian proteins. It is characterised by a highly conserved sequence of residues, SSPLxxLxxCLK. The function of the repeat, which can be present in up to seven copies, is unknown as is the function of the full length proteins. KU SM00131 BPTI/Kunitz family of serine protease inhibitors. 
Serine protease inhibitors. One member of the family is encoded by an alternatively-spliced form of Alzheimer's amyloid beta-protein. Ku78 SM00559 Ku70 and Ku80 are 70kDa and 80kDa subunits of the Lupus Ku autoantigen This is a single stranded DNA- and ATP-dependent helicase that has a role in chromosome translocation. This is a domain of unknown function C-terminal to its von Willebrand factor A domain, that also occurs in bacterial hypothetical proteins. L27 SM00569 domain in receptor targeting proteins Lin-2 and Lin-7 L51_S25_CI-B8 SM00916 Mitochondrial ribosomal protein L51 / S25 / CI-B8 domain Proteins containing this domain are located in the mitochondrion and include ribosomal protein L51, and S25. This domain is also found in mitochondrial NADH-ubiquinone oxidoreductase B8 subunit (CI-B8). It is not known whether all members of this family form part of the NADH-ubiquinone oxidoreductase and whether they are also all ribosomal proteins. LA SM00715 Domain in the RNA-binding Lupus La protein; unknown function LAB_N SM01259 Lipid A Biosynthesis N-terminal domain This family is found at the N-terminus of a group of Chlamydial Lipid A biosynthesis proteins. It is also found by itself in a family of proteins of unknown function. Lactamase_B SM00849 Metallo-beta-lactamase superfamily Apart from the beta-lactamases a number of other proteins contain this domain PUBMED:7588620. These proteins include thiolesterases, members of the glyoxalase II family, that catalyse the hydrolysis of S-D-lactoyl-glutathione to form glutathione and D-lactic acid and a competence protein that is essential for natural transformation in Neisseria gonorrhoeae and could be a transporter involved in DNA uptake. Except for the competence protein these proteins bind two zinc ions per molecule as cofactor. LAG1_DNAbind SM01267 LAG1, DNA binding Members of this family are found in various eukaryotic hypothetical proteins and in the DNA-binding protein LAG-1. 
They adopt a beta sandwich structure, with nine strands in two beta-sheets, in a Greek-key topology, and allow for DNA binding PUBMED:15297877. This domain is also known as RHR-N (Rel-homology region) as it is related to Rel domain proteins. LamB SM00281 Laminin B domain LamG SM00282 Laminin G domain LamGL SM00560 LamG-like jellyroll fold domain LamNT SM00136 Laminin N-terminal domain (domain VI) N-terminal domain of laminins and laminin-related protein such as Unc-6/ netrins. LAMTOR SM01262 Late endosomal/lysosomal adaptor and MAPK and MTOR activator LAMTOR is a family of eukaryotic proteins that have otherwise been referred to as Lipid raft adaptor protein p18, Late endosomal/lysosomal adaptor and MAPK and MTOR activator 1, and Protein associated with DRMs and endosomes. It is found to be one of three small proteins constituting the Rag complex or Ragulator that interact with each other, localise to endosomes and lysosomes, and play positive roles in the MAPK pathway. The complex does this by interacting with the Rag GTPases, recruiting them to lysosomes, and bringing about mTORC1 activation. LANC_like SM01260 Lanthionine synthetase C-like protein Lanthionines are thioether bridges that are putatively generated by dehydration of Ser and Thr residues followed by addition of cysteine residues within the peptide. This family contains the lanthionine synthetase C-like proteins 1 and 2 which are related to the bacterial lanthionine synthetase components C (LanC). LANCL1 (P40 seven-transmembrane-domain protein) and LANCL2 (testes-specific adriamycin sensitivity protein) are thought to be peptide-modifying enzyme components in eukaryotic cells. Both proteins are produced in large quantities in the brain and testes and may have a role in the immune surveillance of these organs PUBMED:11376939. Lanthionines are found in lantibiotics, which are peptide-derived, post-translationally modified antimicrobials produced by several bacterial strains PUBMED:12127987. 
This region contains seven internal repeats. LCCL SM00603 LDLa SM00192 Low-density lipoprotein receptor domain class A Cysteine-rich repeat in the low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. The N-terminal type A repeats in LDL receptor bind the lipoproteins. Other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement. Mutations in the LDL receptor gene cause familial hypercholesterolemia. LEM SM00540 in nuclear membrane-associated proteins LEM, domain in nuclear membrane-associated proteins, including lamina-associated polypeptide 2 and emerin. LeuA_dimer SM00917 LeuA allosteric (dimerisation) domain This is the C-terminal regulatory (R) domain of alpha-isopropylmalate synthase, which catalyses the first committed step in the leucine biosynthetic pathway PUBMED:15159544. This domain is an internally duplicated structure with a novel fold PUBMED:15159544. It comprises two similar units that are arranged such that the two alpha-helices pack together in the centre, crossing at an angle of 34 degrees, sandwiched between the two three-stranded, antiparallel beta-sheets. The overall domain is thus constructed as a beta-alpha-beta three-layer sandwich PUBMED:15159544. Leuk-A4-hydro_C SM01263 Leukotriene A4 hydrolase, C-terminal Members of this family adopt a structure consisting of two layers of parallel alpha-helices, five in the inner layer and four in the outer, arranged in an antiparallel manner, with perpendicular loops containing short helical segments on top. They are required for the formation of a deep cleft harbouring the catalytic Zn2+ site in Leukotriene A4 hydrolase PUBMED:11175901. 
LH2 SM00308 Lipoxygenase homology 2 (beta barrel) domain LIF_OSM SM00080 leukemia inhibitory factor OSM, Oncostatin M LIGANc SM00532 Ligase N family Lig_chan-Glu_bd SM00918 Ligated ion channel L-glutamate- and glycine-binding site This region, sometimes called the S1 domain, is the luminal domain just upstream of the first, M1, transmembrane region of transmembrane ion-channel proteins, and it binds L-glutamate and glycine. It is found in association with Lig_chan. LIM SM00132 Zinc-binding domain present in Lin-11, Isl-1, Mec-3. Zinc-binding domain family. Some LIM domains bind protein partners via tyrosine-containing motifs. LIM domains are found in many key regulators of developmental pathways. LINK SM00445 Link (Hyaluronan-binding) Lipid_DES SM01269 Sphingolipid Delta4-desaturase (DES) Sphingolipids are important membrane signalling molecules involved in many different cellular functions in eukaryotes. Sphingolipid delta 4-desaturase catalyses the formation of (E)-sphing-4-enine PUBMED:11937514. Some proteins in this family have bifunctional delta 4-desaturase/C-4-hydroxylase activity. Delta 4-desaturated sphingolipids may play a role in early signalling required for entry into meiotic and spermatid differentiation pathways during Drosophila spermatogenesis PUBMED:119375141. This small domain associates with FA_desaturase (PFAM PF00487) and appears to be specific to sphingolipid delta 4-desaturase. LisH SM00667 Lissencephaly type-1-like homology motif Alpha-helical motif present in Lis1, treacle, Nopp140, some katanin p60 subunits, muskelin, tonneau, LEUNIG and numerous WD40 repeat-containing proteins. It is suggested that LisH motifs contribute to the regulation of microtubule dynamics, either by mediating dimerisation, or else by binding cytoplasmic dynein heavy chain or microtubules directly. LITAF SM00714 Possible membrane-associated motif in LPS-induced tumor necrosis factor alpha factor (LITAF), also known as PIG7, and other animal proteins. 
LMWPc SM00226 Low molecular weight phosphatase family LNS2 SM00775 This domain is found in Saccharomyces cerevisiae protein SMP2, proteins with an N-terminal lipin domain and phosphatidylinositol transfer proteins. SMP2 is involved in plasmid maintenance and respiration. Lipin proteins are involved in adipose tissue development and insulin resistance. LON SM00464 Found in ATP-dependent protease La (LON) N-terminal domain of the ATP-dependent protease La (LON), present also in other bacterial ORFs. Longin SM01270 Regulated-SNARE-like domain Longin is one of the approximately 26 components required for transporting proteins from the ER to the plasma membrane, via the Golgi apparatus. It is necessary for the steps of the transfer from the ER to the Golgi complex PUBMED:16855025. Longins are the only R-SNAREs that are common to all eukaryotes, and they are characterised by a conserved N-terminal domain with a profilin-like fold called a longin domain PUBMED:15544955. LPD_N SM00638 Lipoprotein N-terminal Domain LRR SM00370 Leucine-rich repeats, outliers LRR_BAC SM00364 Leucine-rich repeats, bacterial type LRRcap SM00446 occurring C-terminal to leucine-rich repeats A motif occurring C-terminal to leucine-rich repeats in "sds22-like" and "typical" LRR-containing proteins. LRR_CC SM00367 Leucine-rich repeat - CC (cysteine-containing) subfamily LRRCT SM00082 Leucine rich repeat C-terminal domain LRRNT SM00013 Leucine rich repeat N-terminal domain LRR_RI SM00368 Leucine rich repeat, ribonuclease inhibitor type LRR_SD22 SM00365 Leucine-rich repeat, SDS22-like subfamily LRR_TYP SM00369 Leucine-rich repeats, typical (most populated) subfamily LSM14 SM01271 Scd6-like Sm domain The Scd6-like Sm domain is found in Scd6p from S. cerevisiae, Rap55 from the newt Pleurodeles waltl, and its orthologs from fungi, animals, plants and apicomplexans PMID:15257761. 
The domain is also found in Dcp3p and the human EDC3/FLJ21128 protein where it is fused to the Rossmanoid YjeF-N domain PMID: 15257761,15225602. In addition both EDC3 and Scd6p are found fused to the FDF domain PMID: 15257761,15225602. LsmAD SM01272 This domain is found associated with Lsm domain PMID:16115810. LU SM00134 Ly-6 antigen / uPA receptor -like domain Three-fold repeated domain in urokinase-type plasminogen activator receptor; occurs singly in other GPI-linked cell-surface glycoproteins (Ly-6 family, CD59, thymocyte B cell antigen, Sgp-2). Topology of these domains is similar to that of snake venom neurotoxins. LY SM00135 Low-density lipoprotein-receptor YWTD domain Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. LysM SM00257 Lysin motif LytTR SM00850 LytTr DNA-binding domain This domain is found in a variety of bacterial transcriptional regulators. The domain binds to a specific DNA sequence pattern. LYZ1 SM00263 Alpha-lactalbumin / lysozyme C LYZ2 SM00047 Lysozyme subfamily 2 Eubacterial enzymes distantly related to eukaryotic lysozymes. M16C_associated SM01264 Peptidase M16C associated This domain appears in eukaryotes as well as bacteria and tends to be found near the C-terminus of the metalloprotease (PFAM PF05193). M60-like SM01276 Peptidase M60-like family This family of peptidases contains a zinc metallopeptidase motif (HEXXHX(8,28)E) and possesses mucinase activity PMID:22299034. MA SM00283 Methyl-accepting chemotaxis-like domains (chemotaxis sensory transducer). Thought to undergo reversible methylation in response to attractants or repellants during bacterial chemotaxis. MA3 SM00544 Domain in DAP-5, eIF4G, MA-3 and other proteins. Highly alpha-helical. 
May contain repeats and/or regions similar to MIF4G domains Ponting (TIBS) "Novel eIF4G domain homologues" in press Mab-21 SM01265 This family contains Mab-21 and Mab-21 like proteins. In C. elegans these proteins are required for several aspects of embryonic development PUBMED:8582275; PUBMED:15385160. Mac SM01266 Maltose acetyltransferase This domain family is found in bacteria, archaea and eukaryotes, and is approximately 50 amino acids in length. The family is found in association with Bacterial transferase hexapeptide (PFAM PF00132). Mac uses acetyl-CoA as acetyl donor to acetylate cytoplasmic maltose. MACPF SM00457 membrane-attack complex / perforin Mad3_BUB1_I SM00777 Mad3/BUB1 homology region 1 Proteins containing this domain are checkpoint proteins involved in cell division. This region has been shown to be essential for the binding of BUB1 and MAD3 to CDC20p. MADF SM00595 subfamily of SANT domain MADS SM00432 MAGE SM01373 Melanoma-associated antigen The MAGE (melanoma antigen-encoding gene) family are expressed in a wide variety of tumors but not in normal cells, with the exception of the male germ cells, placenta, and, possibly, cells of the developing embryo. The cellular function of this family is unknown. This family also contains the yeast protein, Nse3. The Nse3 protein is part of the Smc5-6 complex (PMID:15601840, 15331764). Nse3 has been demonstrated to be important for (PMID:15331764). MAGE_N SM01392 Melanoma associated antigen family N terminal This domain family is found in eukaryotes, and is typically between 82 and 96 amino acids in length. This family is the N terminal of various melanoma associated antigens. These are tumor rejection antigens which are expressed on HLA-A1 of tumor cells and they are recognized by cytotoxic T lymphocytes (CTLs). 
Mago-bind SM01273 Mago binding Members of this family adopt a structure consisting of a small globular all-beta-domain, with a three-stranded beta-sheet and a contiguous beta-hairpin. They bind to Mago alpha-helices via extensive electrostatic interactions and at a beta2-beta3 loop via hydrophobic interactions PMID:14968132. MAGUK_N_PEST SM01277 Polyubiquitination (PEST) N-terminal domain of MAGUK The residues upstream of this domain are the probable palmitoylation sites, particularly two cysteines. The domain has a putative PEST site at the very start that seems to be responsible for poly-ubiquitination PMID:8755249. PEST domains are polypeptide sequences enriched in proline (P), glutamic acid (E), serine (S) and threonine (T) that target proteins for rapid destruction. The whole domain, in conjunction with a C-terminal domain of the longer protein, is necessary for dimerisation of the whole protein PMID:18215622. malic SM01274 Malic enzyme, N-terminal domain Malic_M SM00919 Malic enzyme, NAD binding domain Malic enzymes (malate oxidoreductases) catalyse the oxidative decarboxylation of malate to form pyruvate. MAM SM00137 Domain in meprin, A5, receptor protein tyrosine phosphatase mu (and others) Likely to have an adhesive function. Mutations in the meprin MAM domain affect noncovalent associations within meprin oligomers. In receptor tyrosine phosphatase mu-like molecules the MAM domain is important for homophilic cell-cell interactions. MamL-1 SM01275 MamL-1 domain The MamL-1 domain is a polypeptide of up to 70 residues, numbers 15-67 of which adopt an elongated kinked helix that wraps around ANK and CSL forming one of the complexes in the build-up of the Notch transcriptional complex for recruiting general transcription factors. MANEC SM00765 The MANEC domain was formerly called MANSC. This domain, comprising 8 conserved cysteines, is found in the N terminus of higher multicellular animal membrane and extracellular proteins. 
It is postulated that this domain may play a role in the formation of protein complexes involving various protease activators and inhibitors. It is possible that some of the cysteine residues in the MANSC domain form structurally important disulfide bridges. All of the MANSC-containing proteins contain predicted transmembrane regions and signal peptides. It has been proposed that the MANSC domain in HAI-1 might function through binding with hepatocyte growth factor activator and matriptase. MAPKK1_Int SM01278 Mitogen-activated protein kinase kinase 1 interacting Mitogen-activated protein kinase kinase 1 interacting protein is a small subcellular adaptor protein required for MAPK signaling and ERK1/2 activation. The overall topology of this domain has a central five-stranded beta-sheet sandwiched between a two alpha-helix and a one alpha-helix layer PMID:15263099. MATH SM00061 meprin and TRAF homology Matrilin_ccoil SM01279 Trimeric coiled-coil oligomerisation domain of matrilin This short domain is a coiled coil structure and has a single cysteine residue at the start which is likely to form a di-sulfide bridge with a corresponding cysteine in an upstream EGF (SM00181) domain thereby spanning a VWA (SM00327) domain. All three domains can be associated together as in the cartilage matrix protein matrilin, where this domain is likely to be responsible for oligomerisation PMID:9287130. MBD SM00391 Methyl-CpG binding domain Methyl-CpG binding domain, also known as the TAM (TTF-IIP5, ARBP, MeCP1) domain MBT SM00561 Present in Drosophila Scm, l(3)mbt, and vertebrate SCML2 Present in Drosophila Scm, l(3)mbt, and vertebrate SCML2. These proteins are involved in transcriptional regulation. MbtH SM00923 MbtH-like protein This domain is found in the MbtH protein as well as at the N-terminus of the antibiotic synthesis protein NIKP1. This domain is about 70 amino acids long and contains 3 fully conserved tryptophan residues. 
Many of the members of this family are found in known antibiotic synthesis gene clusters. MCM SM00350 minichromosome maintenance proteins Mcm10 SM01280 Mcm10 replication factor Mcm10 is a eukaryotic DNA replication factor that regulates the stability and chromatin association of DNA polymerase alpha PMID:15494305. MD SM00604 Med12 SM01281 Transcription mediator complex subunit Med12 Med12 is a negative regulator of the Gli3-dependent sonic hedgehog signalling pathway via its interaction with Gli3 within the RNA polymerase II transcriptional Mediator. A complex is formed between Med12, Med13, CDK8 and CycC which is responsible for suppression of transcription. This subunit forms part of the Kinase section of Mediator PMID:15175151. MeTrc SM00138 Methyltransferase, chemotaxis proteins Methylates methyl-accepting chemotaxis proteins to form gamma-glutamyl methyl ester residues. MGS SM00851 MGS-like domain This domain composes the whole protein of methylglyoxal synthetase and the domain is also found in Carbamoyl phosphate synthetase (CPS) where it forms a regulatory domain that binds to the allosteric effector ornithine. This family also includes inosicase. The known structures in this family show a common phosphate binding site PUBMED:10526357. MgtE_N SM00924 MgtE intracellular N domain This region is the integral membrane part of the eubacterial MgtE family of magnesium transporters. It is presumed to be an intracellular domain, that may be involved in magnesium binding. MHC_II_alpha SM00920 Class II histocompatibility antigen, alpha domain Class II MHC glycoproteins are expressed on the surface of antigen-presenting cells (APC), including macrophages, dendritic cells and B cells. MHC II proteins present peptide antigens that originate extracellularly from foreign bodies such as bacteria. 
Proteins from the pathogen are degraded into peptide fragments within the APC, which sequesters these fragments into the endosome so they can bind to MHC class II proteins, before being transported to the cell surface. MHC class II receptors display antigens for recognition by helper T cells (stimulate development of B cell clones) and inflammatory T cells (cause the release of lymphokines that attract other cells to site of infection) PUBMED:15120183. MHC_II_beta SM00921 Class II histocompatibility antigen, beta domain Class II MHC glycoproteins are expressed on the surface of antigen-presenting cells (APC), including macrophages, dendritic cells and B cells. MHC II proteins present peptide antigens that originate extracellularly from foreign bodies such as bacteria. Proteins from the pathogen are degraded into peptide fragments within the APC, which sequesters these fragments into the endosome so they can bind to MHC class II proteins, before being transported to the cell surface. MHC class II receptors display antigens for recognition by helper T cells (stimulate development of B cell clones) and inflammatory T cells (cause the release of lymphokines that attract other cells to site of infection) PUBMED:15120183. MIF4G SM00543 Middle domain of eukaryotic initiation factor 4G (eIF4G) Also occurs in NMD2p and CBP80. The domain is rich in alpha-helices and may contain multiple alpha-helical repeats. In eIF4G, this domain binds eIF4A, eIF3, RNA and DNA. Ponting (TiBS) "Novel eIF4G domain homologues" (in press) Milton SM01423 Kinesin associated protein This domain family is found in eukaryotes, and is typically between 143 and 173 amino acids in length. This family is a region of the protein milton. Milton recruits the heavy chain of kinesin to mitochondria to allow the motor movement function of kinesin.
MIR SM00472 Domain in ryanodine and inositol trisphosphate receptors and protein O-mannosyltransferases MIT SM00745 Microtubule Interacting and Trafficking molecule domain ML SM00737 Domain involved in innate immunity and lipid metabolism. ML (MD-2-related lipid-recognition) is a novel domain identified in MD-1, MD-2, GM2A, Npc2 and multiple proteins of unknown function in plants, animals and fungi. These single-domain proteins were predicted to form a beta-rich fold containing multiple strands, and to mediate diverse biological functions through interacting with specific lipids. MltA SM00925 MltA specific insert domain This beta barrel domain is found inserted in MltA, a murein-degrading transglycosylase enzyme. This domain may be involved in peptidoglycan binding. Mob1_phocein SM01388 Mob1/phocein family Mob1 is an essential Saccharomyces cerevisiae protein, identified from a two-hybrid screen, that binds Mps1p, a protein kinase essential for spindle pole body duplication and mitotic checkpoint regulation. Mob1 contains no known structural motifs; however MOB1 is a member of a conserved gene family and shares sequence similarity with a nonessential yeast gene, MOB2. Mob1 is a phosphoprotein in vivo and a substrate for the Mps1p kinase in vitro. Conditional alleles of MOB1 cause a late nuclear division arrest at restrictive temperature (PMID:9436989). This family also includes phocein Q9QYW3 a rat protein that by yeast two hybrid interacts with striatin (PMID:11251078). MobA_MobL SM01468 HUH conjugative element mobility domain mobilisation protein involved in plasmid transfer Mob_Pre SM01470 HUH conjugative element mobility domain Plasmid recombination enzyme PMID:28739894 MoCF_biosynth SM00852 Probable molybdopterin binding domain This domain is found in a variety of proteins involved in biosynthesis of molybdopterin cofactor. The domain is presumed to bind molybdopterin. The structure of this domain is known, and it forms an alpha/beta structure.
In the known structure of Gephyrin this domain mediates trimerisation. Molybdop_Fe4S4 SM00926 Molybdopterin oxidoreductase Fe4S4 domain The molybdopterin oxidoreductase Fe4S4 domain is found in a number of reductase/dehydrogenase families, which include the periplasmic nitrate reductase precursor and the formate dehydrogenase alpha chain. MORN SM00698 Possible plasma membrane-binding motif in junctophilins, PIP-5-kinases and protein kinases. Mre11_DNA_bind SM01347 Mre11 DNA-binding presumed domain The Mre11 complex is a multi-subunit nuclease that is composed of Mre11, Rad50 and Nbs1/Xrs2, and is involved in checkpoint signalling and DNA replication (PMID:11988766). Mre11 has an intrinsic DNA-binding activity that is stimulated by Rad50 on its own or in combination with Nbs1 (PMID:10823903). MR_MLE SM00922 Mandelate racemase / muconate lactonizing enzyme, C-terminal domain Mandelate racemase (MR) and muconate lactonizing enzyme (MLE) are two bacterial enzymes involved in aromatic acid catabolism. They catalyze mechanistically distinct reactions yet they are related at the level of their primary, quaternary (homooctamer) and tertiary structures PUBMED:2215699, PUBMED:8256284. This entry represents the C-terminal region of these proteins. Mterf SM00733 Mitochondrial termination factor repeats Human mitochondrial termination factor is a DNA-binding protein that acts as a transcription termination factor. Six repeats occur in human mTERF, that also are present in numerous plant proteins. MutH SM00927 DNA mismatch repair enzyme MutH MutS, MutL and MutH are the three essential proteins for initiation of methyl-directed DNA mismatch repair to correct mistakes made during DNA replication in Escherichia coli. MutH cleaves a newly synthesized and unmethylated daughter strand 5' to the sequence d(GATC) in a hemi-methylated duplex. Activation of MutH requires the recognition of a DNA mismatch by MutS and MutL PUBMED:9482749. 
MutL_C SM00853 MutL C terminal dimerisation domain MutL and MutS are key components of the DNA repair machinery that corrects replication errors. MutS recognises mispaired or unpaired bases in a DNA duplex and in the presence of ATP, recruits MutL to form a DNA signaling complex for repair. The N terminal region of MutL contains the ATPase domain and the C terminal is involved in dimerisation. MUTSac SM00534 ATPase domain of DNA mismatch repair MUTS family MUTSd SM00533 DNA-binding domain of DNA mismatch repair MUTS family Myc SM01459 sub-family of tyrosine recombinases tyrosine recombinase sub-family mediates DNA cleavage through a tyrosine residue at the active site https://doi.org/10.1101/542381 MYSc SM00242 Myosin. Large ATPases. ATPase; molecular motor. Muscle contraction consists of a cyclical interaction between myosin and actin. The core of the myosin structure is similar in fold to that of kinesin. MyTH4 SM00139 Domain in Myosin and Kinesin Tails Domain present twice in myosin-VIIa, and also present in 3 other myosins. N1221 SM01292 N1221-like protein The sequences featured in this family are similar to a hypothetical protein product of ORF N1221 in the CPT1-SPC98 intergenic region of the yeast genome. This encodes an acidic polypeptide with several possible transmembrane regions PMID:8619318. N2227 SM01296 This family features sequences that are similar to a region of hypothetical yeast gene product N2227. This is thought to be expressed during meiosis and may be involved in the defence response to stressful conditions PMID:8771715. NAC SM01407 NADH_4Fe-4S SM00928 NADH-ubiquinone oxidoreductase-F iron-sulfur binding region NADH-G_4Fe-4S_3 SM00929 NADH-ubiquinone oxidoreductase-G iron-sulfur binding region NAT_PEP SM00183 Natriuretic peptide Atrial natriuretic peptides are vertebrate hormones important in the overall control of cardiovascular homeostasis and sodium and water balance in general.
Nbs1_C SM01348 DNA damage repair protein Nbs1 This C terminal region of the DNA damage repair protein Nbs1 has been identified to be necessary for the binding of Mre11 and Tel1 (PMID:15964794). NDK SM00562 These are enzymes that catalyze nonsubstrate specific conversions of nucleoside diphosphates to nucleoside triphosphates. These enzymes play important roles in bacterial growth, signal transduction and pathogenicity. NEAT SM00725 NEAr Transporter domain NEBU SM00227 The Nebulin repeat is present also in Las1. Tandem arrays of these repeats are known to bind actin. NEUZ SM00588 domain in neuralized proteins Nfu_N SM00932 Scaffold protein Nfu/NifU N terminal This domain is found at the N terminus of NifU and NifU related proteins, and in the human Nfu protein. Both of these proteins are thought to be involved in the assembly of iron-sulphur clusters PUBMED:12886008. NGF SM00140 Nerve growth factor (NGF or beta-NGF) NGF is important for the development and maintenance of the sympathetic and sensory nervous systems. N-glycanase_N SM01290 Peptide-N-glycosidase F, N terminal Members of this family adopt an eight-stranded antiparallel beta jelly roll configuration, with the beta strands arranged into two sheets. They are similar in topology to many viral capsid proteins, as well as lectins and several glucanases. The domain allows the protein to bind sugars and catalyses the complete removal of N-linked oligosaccharide chains from glycoproteins PMID:7881905. NGN SM00738 In Spt5p, this domain may confer affinity for Spt4p. It possesses a RNP-like fold. In Spt5p, this domain may confer affinity for Spt4p. NH SM00003 Neurohypophysial hormones Vasopressin/oxytocin gene family. NIDO SM00539 Extracellular domain of unknown function in nidogen (entactin) and hypothetical proteins. NIL SM00930 This domain is found at the C-terminus of ABC transporter proteins involved in D-methionine transport as well as a number of ferredoxin-like proteins.
This domain is likely to act as a substrate binding domain. The domain has been named after a conserved sequence in some members of the family. NL SM00004 Domain found in Notch and Lin-12 The Notch protein is essential for the proper differentiation of the Drosophila ectoderm. This protein contains 3 NL domains. NMU SM00084 Neuromedin U Neuromedin U (NmU) is a vertebrate peptide which stimulates uterine smooth muscle contraction and causes selective vasoconstriction. Like most other active peptides, it is proteolytically processed from a larger precursor protein. The mature peptides are 8 (NmU-8) to 25 (NmU-25) residues long and C-terminally amidated. The sequence of the C-terminal extremity of NmU is extremely well conserved in mammals, birds and amphibians. NOD SM01338 NOTCH signalling plays a fundamental role during a great number of developmental processes in multicellular animals (PMID:10221902, 11112321). NOD and NODP represent a region present in many NOTCH proteins and NOTCH homologs in multiple species such as NOTCH2 and NOTCH3, LIN12, SC1 and TAN1. Role of NOD domain remains to be elucidated. NODP SM01339 NOTCH signalling plays a fundamental role during a great number of developmental processes in multicellular animals (PMID:10221902, 11112321). NOD and NODP represent a region present in many NOTCH proteins and NOTCH homologs in multiple species such as NOTCH2 and NOTCH3, LIN12, SC1 and TAN1. The role of the NOD and NODP domains remains to be elucidated. NOSIC SM00931 NOSIC (NUC001) domain This is the central domain in Nop56/SIK1-like proteins PUBMED:15112237. NPCBM SM00776 This novel putative carbohydrate binding module (NPCBM) domain is found at the N-terminus of glycosyl hydrolase family 98 proteins. NPP SM01364 Pro-opiomelanocortin, N-terminal region This family features the N-terminal peptide of pro-opiomelanocortin (NPP).
It is thought to represent an important pituitary peptide, given its high yield from pituitary glands, and exhibits a potent in vitro aldosterone-stimulating activity (PMID:6945581). NRF SM00703 N-terminal domain in C. elegans NRF-6 (Nose Resistant to Fluoxetine-4) and NDG-4 (resistant to nordihydroguaiaretic acid-4). Also present in several other worm and fly proteins. N-SET SM01291 COMPASS (Complex proteins associated with Set1p) component N The n-SET or N-SET domain is a component of the COMPASS complex, associated with SET1, conserved in yeasts and in other eukaryotes up to humans. The COMPASS complex functions to methylate the fourth lysine of Histone 3 and for the silencing of genes close to the telomeres of chromosomes PMID:11805083. This domain promotes trimethylation in conjunction with an RRM domain PMID:15775977 and is necessary for binding of the Spp1 component of COMPASS into the complex PMID:16921172. NTR SM00206 Tissue inhibitor of metalloproteinase family. Form complexes with metalloproteinases, such as collagenases, and irreversibly inactivate them. NUC SM00477 DNA/RNA non-specific endonuclease prokaryotic and eukaryotic double- and single-stranded DNA and RNA endonucleases also present in phosphodiesterases NUC194 SM01344 This is domain B in the catalytic subunit of DNA-dependent protein kinases. NurA SM00933 This family includes NurA a nuclease exhibiting both single-stranded endonuclease activity and 5'-3' exonuclease activity on single-stranded and double-stranded DNA from the hyperthermophilic archaeon Sulfolobus acidocaldarius PUBMED:12052775. OLF SM00284 Olfactomedin-like domains OMPdecase SM00934 Orotidine 5'-phosphate decarboxylase / HUMPS family Orotidine 5'-phosphate decarboxylase (OMPdecase) PUBMED:2835631, PUBMED:1730672 catalyzes the last step in the de novo biosynthesis of pyrimidines, the decarboxylation of OMP into UMP.
In higher eukaryotes OMPdecase is part, with orotate phosphoribosyltransferase, of a bifunctional enzyme, while the prokaryotic and fungal OMPdecases are monofunctional proteins. OmpH SM00935 Outer membrane protein (OmpH-like) This family includes outer membrane proteins such as OmpH among others. Skp (OmpH) has been characterised as a molecular chaperone that interacts with unfolded proteins as they emerge in the periplasm from the Sec translocation machinery PUBMED:15304217. Op_neuropeptide SM01365 Opioids neuropeptide This family corresponds to the conserved YGG motif that is found in a wide variety of opioid neuropeptides such as enkephalin. ORANGE SM00511 Orange domain This domain confers specificity among members of the Hairy/E(SPL) family. OrfB_IS605 SM01487 DDE transposase sub-family DDE domain containing transposase PMID:2553665 OSTEO SM00017 Osteopontin Osteopontin is an acidic phosphorylated glycoprotein of about 40 Kd which is abundant in the mineral matrix of bones and which binds tightly to hydroxyapatite [1,2,3]. It is suggested that osteopontin might function as a cell attachment factor and could play a key role in the adhesion of osteoclasts to the mineral matrix of bone P4Hc SM00702 Prolyl 4-hydroxylase alpha subunit homologues. Mammalian enzymes catalyse hydroxylation of collagen, for example. Prokaryotic enzymes might catalyse hydroxylation of antibiotic peptides. These are 2-oxoglutarate-dependent dioxygenases, requiring 2-oxoglutarate and dioxygen as cosubstrates and ferrous iron as a cofactor. P68HR SM01414 P68HR (NUC004) repeat This short region is found in two copies in p68-like RNA helicases PMID:15112237. PA14 SM00758 domain in bacterial beta-glucosidases other glycosidases, glycosyltransferases, proteases, amidases, yeast adhesins, and bacterial toxins. PA2c SM00085 Phospholipase A2 PAC SM00086 Motif C-terminal to PAS motifs (likely to contribute to PAS structural domain) PAC motif occurs C-terminal to a subset of all known PAS motifs.
It is proposed to contribute to the PAS domain fold. PADR1 SM01335 This domain is found in poly(ADP-ribose)-synthetases (PMID:15112237). The function of this domain is unknown. PAH SM00309 Pancreatic hormones / neuropeptide F / peptide YY family Pancreatic hormone is a regulator of pancreatic and gastrointestinal functions. PAM SM00753 PCI/PINT associated module PAN_AP SM00473 divergent subfamily of APPLE domains Apple-like domains present in Plasminogen, C. elegans hypothetical ORFs and the extracellular portion of plant receptor-like protein kinases. Predicted to possess protein- and/or carbohydrate-binding functions. PAPA-1 SM01406 Family of proteins with a conserved region found in PAPA-1, a PAP-1 binding protein. ParB SM00470 ParB-like nuclease domain Plasmid RK2 ParB preferentially cleaves single-stranded DNA. ParB also nicks supercoiled plasmid DNA preferably at sites with potential single-stranded character, like AT-rich regions and sequences that can form cruciform structures. ParB also exhibits 5'-->3' exonuclease activity. PAS SM00091 PAS domain PAS motifs appear in archaea, eubacteria and eukarya. Probably the most surprising identification of a PAS domain was that in EAG-like K+-channels ([1]; Ponting & Aravind, in press). PASTA SM00740 PAW SM00613 domain present in PNGases and other hypothetical proteins present in several copies in proteins with unknown function in C. elegans PAX SM00351 Paired Box domain PAZ SM00949 This domain is named PAZ after the proteins Piwi Argonaut and Zwille. This domain is found in two families of proteins that are involved in post-transcriptional gene silencing. These are the Piwi family and the Dicer family, that includes the Carpel factory protein. The function of the domains is unknown but has been suggested to mediate complex formation between proteins of the Piwi and Dicer families by hetero-dimerisation. The three-dimensional structure of this domain has been solved. The PAZ domain is composed of two subdomains.
One subdomain is similar to the OB fold, albeit with a different topology. The OB-fold is well known as a single-stranded nucleic acid binding fold. The second subdomain is composed of a beta-hairpin followed by an alpha-helix. The PAZ domain shows low-affinity nucleic acid binding and appears to interact with the 3' ends of single-stranded regions of RNA in the cleft between the two subdomains. PAZ can bind the characteristic two-base 3' overhangs of siRNAs, indicating that although PAZ may not be a primary nucleic acid binding site in Dicer or RISC, it may contribute to the specific and productive incorporation of siRNAs and miRNAs into the RNAi pathway. PB1 SM00666 PB1 domain Phox and Bem1p domain, present in many eukaryotic cytoplasmic signalling proteins. The domain adopts a beta-grasp fold, similar to that found in ubiquitin and Ras-binding domains. A motif, variously termed OPR, PC and AID, represents the most conserved region of the majority of PB1 domains, and is necessary for PB1 domain function. This function is the formation of PB1 domain heterodimers, although not all PB1 domain pairs associate. PBD SM00285 P21-Rho-binding domain Small domains that bind Cdc42p- and/or Rho-like small GTPases. Also known as the Cdc42/Rac interactive binding (CRIB). PbH1 SM00710 Parallel beta-helix repeats The tertiary structures of pectate lyases and rhamnogalacturonase A show a stack of parallel beta strands that are coiled into a large helix. Each coil of the helix represents a structural repeat that, in some homologues, can be recognised from sequence information alone. Conservation of asparagines might be connected with asparagine-ladders that contribute to the stability of the fold. Proteins containing these repeats most often are enzymes with polysaccharide substrates. PBP5_C SM00936 Penicillin-binding protein 5, C-terminal domain Penicillin-binding protein 5 expressed by E. coli (P04287) functions as a D-alanyl-D-alanine carboxypeptidase.
It is composed of two domains that are oriented at approximately right angles to each other. The N-terminal domain (PF00768) is the catalytic domain. The C-terminal domain featured in this family is organised into a sandwich of two anti-parallel beta-sheets, and has a relatively hydrophobic surface as compared to the N-terminal domain. Its precise function is unknown; it may mediate interactions with other cell wall-synthesising enzymes, thus allowing the protein to be recruited to areas of active cell wall synthesis. It may also function as a linker domain that positions the active site in the catalytic domain closer to the peptidoglycan layer, to allow it to interact with cell wall peptides PUBMED:10967102. PBPb SM00062 Bacterial periplasmic substrate-binding proteins bacterial proteins, eukaryotic ones are in PBPe PBPe SM00079 Eukaryotic homologues of bacterial periplasmic substrate binding proteins. Prokaryotic homologues are represented by a separate alignment: PBPb PCRF SM00937 This domain is found in peptide chain release factors. PD SM00018 P or trefoil or TFF domain Proposed role in renewal and pathology of mucous epithelia. PDDEXK SM01435 PD-(D/E)XK nuclease transposase transposase of PDDEXK nuclease superfamily PMID:15972856 PDDEXK_2 SM01432 PD-(D/E)XK nuclease transposase sub-family transposase of PDDEXK nuclease superfamily PMID:15972856 PDGF SM00141 Platelet-derived and vascular endothelial growth factors (PDGF, VEGF) family Platelet-derived growth factor is a potent activator for cells of mesenchymal origin. PDGF-A and PDGF-B form AA and BB homodimers and an AB heterodimer. Members of the VEGF family are homologues of PDGF. PDZ SM00228 Domain present in PSD-95, Dlg, and ZO-1/2. Also called DHR (Dlg homologous region) or GLGF (relatively well conserved tetrapeptide in these domains). Some PDZs have been shown to bind C-terminal polypeptides; others appear to bind internal (non-C-terminal) polypeptides.
Different PDZs possess different binding specificities. PEHE SM01300 This domain was first identified in Drosophila MSL1 (male-specific lethal 1) PMID:12698291. In Drosophila it binds to the histone acetyltransferase males-absent on the first protein (MOF) and to protein male-specific lethal-3 (MSL3) PMID:15141166;PMID:21217699. Pept_C1 SM00645 Papain family cysteine protease PepX_C SM00939 X-Pro dipeptidyl-peptidase C-terminal non-catalytic domain This domain is found at the C-terminus of cocaine esterase CocE, several glutaryl-7-ACA acylases, and the putative diester hydrolase NonD of Streptomyces griseus (all hydrolases). The domain, which is a beta sandwich, is also found in serine peptidases belonging to MEROPS peptidase family S15: Xaa-Pro dipeptidyl-peptidases. Members of this entry, that are not characterised as peptidases, show extensive low-level similarity to the Xaa-Pro dipeptidyl-peptidases. PepX_N SM00940 X-Prolyl dipeptidyl aminopeptidase PepX, N-terminal This N-terminal domain adopts a secondary structure consisting of a helical bundle of eight alpha helices and three beta strands, with the last alpha helix connecting to the first strand of the catalytic domain. The first strand of the N-terminus also forms a small parallel beta sheet with strand five of the catalytic domain. This domain mediates dimerisation of the protein, with two proline residues present in the domain being critical for interaction PUBMED:12377124. PGA_cap SM00854 Bacterial capsule synthesis protein PGA_cap This protein is a putative poly-gamma-glutamate capsule biosynthesis protein found in bacteria. Poly-gamma-glutamate is a natural polymer that may be involved in virulence and may help bacteria survive in high salt concentrations. It is a surface-associated protein.
PGAM SM00855 Phosphoglycerate mutase family Phosphoglycerate mutase (PGAM) and bisphosphoglycerate mutase (BPGM) are structurally related enzymes that catalyse reactions involving the transfer of phospho groups between the three carbon atoms of phosphoglycerate PUBMED:2847721, PUBMED:2831102, PUBMED:10958932. Both enzymes can catalyse three different reactions with different specificities: the isomerization of 2-phosphoglycerate (2-PGA) to 3-phosphoglycerate (3-PGA) with 2,3-diphosphoglycerate (2,3-DPG) as the primer of the reaction, the synthesis of 2,3-DPG from 1,3-DPG with 3-PGA as a primer and the degradation of 2,3-DPG to 3-PGA (phosphatase activity). In mammals, PGAM is a dimeric protein with two isoforms, the M (muscle) and B (brain) forms. In yeast, PGAM is a tetrameric protein. PGRP SM00701 Animal peptidoglycan recognition proteins homologous to Bacteriophage T3 lysozyme. The bacteriophage molecule, but not its moth homologue, has been shown to have N-acetylmuramoyl-L-alanine amidase activity. One member of this family, Tag7, is a cytokine. PH SM00233 Pleckstrin homology domain. Domain commonly found in eukaryotic signalling proteins. The domain family possesses multiple functions including the abilities to bind inositol phosphates, and various proteins. PH domains have been found to possess inserted domains (such as in PLC gamma, syntrophins) and to be inserted within other domains. Mutations in Bruton's tyrosine kinase (Btk) within its PH domain cause X-linked agammaglobulinaemia (XLA) in patients. Point mutations cluster into the positively charged end of the molecule around the predicted binding site for phosphatidylinositol lipids.
Phage_GPA SM01466 HUH phage replication sub-family bacteriophage replication protein PMID:7997180 Phage_integrase SM01460 Pfam phage integrase family site-specific recombinase of tyrosine recombinase family involved in DNA cleavage PMID:9288963 PHB SM00244 prohibitin homologues prohibitin homologues PhBP SM00708 Insect pheromone/odorant binding protein domains. PHD SM00249 PHD zinc finger The plant homeodomain (PHD) finger is a C4HC3 zinc-finger-like motif found in nuclear proteins thought to be involved in epigenetics and chromatin-mediated transcriptional regulation. The PHD finger binds two zinc ions using the so-called 'cross-brace' motif and is thus structurally related to the RING finger and the FYVE finger. It is not yet known if PHD fingers have a common molecular function. Several reports suggest that it can function as a protein-protein interaction domain and it was recently demonstrated that the PHD finger of p300 can cooperate with the adjacent BROMO domain in nucleosome binding in vitro. Other reports suggesting that the PHD finger is a ubiquitin ligase have been refuted as these domains were RING fingers misidentified as PHD fingers. PhnA_Zn_Ribbon SM00782 PhnA Zinc-Ribbon This protein family includes an uncharacterised member designated phnA in Escherichia coli, part of a large operon associated with alkylphosphonate uptake and carbon-phosphorus bond cleavage. This protein is not related to the characterised phosphonoacetate hydrolase designated PhnA. PI3Ka SM00145 Phosphoinositide 3-kinase family, accessory domain (PIK domain) PIK domain is conserved in all PI3 and PI4-kinases. Its role is unclear but it has been suggested to be involved in substrate presentation. PI3Kc SM00146 Phosphoinositide 3-kinase, catalytic domain Phosphoinositide 3-kinase isoforms participate in a variety of processes, including cell motility, the Ras pathway, vesicle trafficking and secretion, and apoptosis.
These homologues may be either lipid kinases and/or protein kinases: the former phosphorylate the 3-position in the inositol ring of inositol phospholipids. The ataxia telangiectasia-mutated gene product, the targets of rapamycin (TOR) and the DNA-dependent kinase have not been found to possess lipid kinase activity. Some of this family possess PI-4 kinase activities. PI3K_C2 SM00142 Phosphoinositide 3-kinase, region postulated to contain C2 domain Outlier of C2 family. PI3K_p85B SM00143 PI3-kinase family, p85-binding domain Region of p110 PI3K that binds the p85 subunit. PI3K_rbd SM00144 PI3-kinase family, Ras-binding domain Certain members of the PI3K family possess Ras-binding domains in their N-termini. These regions show some similarity (although not highly significant similarity) to Ras-binding RA domains (unpublished observation). PIG-X SM00780 PIG-X / PBN1 Mammalian PIG-X and yeast PBN1 are essential components of glycosylphosphatidylinositol-mannosyltransferase I. These enzymes are involved in the transfer of sugar molecules. P-II SM00938 Nitrogen regulatory protein P-II P-II modulates the activity of glutamine synthetase. PINc SM00670 Large family of predicted nucleotide-binding domains From similarities to 5'-exonucleases, these domains are predicted to be RNases. PINc domains in nematode SMG-5 and yeast NMD4p are predicted to be involved in RNAi. PINT SM00088 motif in proteasome subunits, Int-6, Nip-1 and TRIP-15 Also called the PCI (Proteasome, COP9, Initiation factor 3) domain. Unknown function. PIP49_N SM01299 N-term cysteine-rich ER, FAM69 The FAM69 family of cysteine-rich type II transmembrane proteins localise to the endoplasmic reticulum (ER) in cultured cells, probably via N-terminal di-arginine motifs. These proteins carry at least 14 luminal cysteines which are conserved in all FAM69s. There are currently few indications of the involvement of FAM69 members in human diseases PMID:21334309.
FAM69 proteins are predicted to have a protein kinase structure and function. Analysis of three-dimensional structure models and conservation of the classic catalytic motifs of protein kinases in four human FAM69 proteins suggests they might have retained catalytic phosphotransferase activity. An EF-hand Ca2+-binding domain, inserted within the structure of the kinase domain, suggests they function as Ca2+-dependent kinases (unpublished). PIPKc SM00330 Phosphatidylinositol phosphate kinases Piwi SM00950 This domain is found in the protein Piwi and its relatives. The function of this domain is the dsRNA guided hydrolysis of ssRNA. Determination of the crystal structure of Argonaute reveals that PIWI is an RNase H domain, and identifies Argonaute as Slicer, the enzyme that cleaves mRNA in the RNAi RISC complex PUBMED:15284453. In addition, Mg2+ dependence and production of 3'-OH and 5' phosphate products are shared characteristics of RNase H and RISC. The PIWI domain core has a tertiary structure belonging to the RNase H family of enzymes. RNase H fold proteins all have a five-stranded mixed beta-sheet surrounded by helices. By analogy to RNase H enzymes which cleave single-stranded RNA guided by the DNA strand in an RNA/DNA hybrid, the PIWI domain can be inferred to cleave single-stranded RNA, for example mRNA, guided by double stranded siRNA. PKD SM00089 Repeats in polycystic kidney disease 1 (PKD1) and other proteins Polycystic kidney disease 1 protein contains 14 repeats, present elsewhere such as in microbial collagenases. PKS_AT SM00827 Acyl transferase domain in polyketide synthase (PKS) enzymes. PKS_DH SM00826 PKS_ER SM00829 Enoylreductase Enoylreductase in Polyketide synthases. PKS_KR SM00822 This enzymatic domain is part of bacterial polyketide synthases and catalyses the first step in the reductive modification of the beta-carbonyl centres in the growing polyketide chain. It uses NADPH to reduce the keto group to a hydroxy group. 
PKS_KS SM00825 Beta-ketoacyl synthase The structure of beta-ketoacyl synthase is similar to that of the thiolase family and also chalcone synthase. The active site of beta-ketoacyl synthase is located between the N and C-terminal domains. PKS_MT SM00828 Methyltransferase in polyketide synthase (PKS) enzymes. PKS_PP SM00823 Phosphopantetheine attachment site Phosphopantetheine (or pantetheine 4' phosphate) is the prosthetic group of acyl carrier proteins (ACP) in some multienzyme complexes where it serves as a 'swinging arm' for the attachment of activated fatty acid and amino-acid groups PUBMED:5321311. PKS_PP_betabranch SM01294 ACPs in this domain have a fully conserved tryptophan. PKS_TE SM00824 Thioesterase Peptide synthetases are involved in the non-ribosomal synthesis of peptide antibiotics. Next to the operons encoding these enzymes, in almost all cases, are genes that encode proteins that have similarity to the type II fatty acid thioesterases of vertebrates. There are also modules within the peptide synthetases that also share this similarity. With respect to antibiotic production, thioesterases are required for the addition of the last amino acid to the peptide antibiotic, thereby forming a cyclic antibiotic. Thioesterases (non-integrated) have molecular masses of 25-29 kDa. PLAc SM00022 Cytoplasmic phospholipase A2, catalytic subunit Cytosolic phospholipases A2 hydrolyse arachidonyl phospholipids. Family includes phospholipases B isoforms. PLCXc SM00148 Phospholipase C, catalytic domain (part); domain X Phosphoinositide-specific phospholipases C. These enzymes contain 2 regions (X and Y) which together form a TIM barrel-like structure containing the active site residues. Phospholipase C enzymes (PI-PLC) act as signal transducers that generate two second messengers, inositol-1,4,5-trisphosphate and diacylglycerol. The bacterial enzyme [6] appears to be a homologue of the mammalian PLCs. 
PLCYc SM00149 Phospholipase C, catalytic domain (part); domain Y Phosphoinositide-specific phospholipases C. These enzymes contain 2 regions (X and Y) which together form a TIM barrel-like structure containing the active site residues. Phospholipase C enzymes (PI-PLC) act as signal transducers that generate two second messengers, inositol-1,4,5-trisphosphate and diacylglycerol. The bacterial enzyme [6] appears to be a homologue of the mammalian PLCs. PLDc SM00155 Phospholipase D. Active site motifs. Phosphatidylcholine-hydrolyzing phospholipase D (PLD) isoforms are activated by ADP-ribosylation factors (ARFs). PLD produces phosphatidic acid from phosphatidylcholine, which may be essential for the formation of certain types of transport vesicles or may be constitutive vesicular transport to signal transduction pathways. PC-hydrolysing PLD is a homologue of cardiolipin synthase, phosphatidylserine synthase, bacterial PLDs, and viral proteins. Each of these appears to possess a domain duplication which is apparent by the presence of two motifs containing well-conserved histidine, lysine, aspartic acid, and/or asparagine residues which may contribute to the active site. An E. coli endonuclease (nuc) and similar proteins appear to be PLD homologues but possess only one of these motifs. The profile contained here represents only the putative active site regions, since an accurate multiple alignment of the repeat units has not been achieved. PLEC SM00250 Plectin repeat PLP SM00002 Myelin proteolipid protein (PLP or lipophilin) PlsC SM00563 Phosphate acyltransferases Function in phospholipid biosynthesis and have either glycerolphosphate, 1-acylglycerolphosphate, or 2-acylglycerolphosphoethanolamine acyltransferase activities. Tafazzin, the product of the gene mutated in patients with Barth syndrome, is a member of this family. Plus3 SM00719 Short conserved domain in transcriptional regulators. 
Plus3 domains occur in the Saccharomyces cerevisiae Rtf1p protein, which interacts with Spt6p, and in parsley CIP, which interacts with the bZIP protein CPRF1. PMEI SM00856 Plant invertase/pectin methylesterase inhibitor This domain inhibits pectin methylesterases (PMEs) and invertases through formation of a non-covalent 1:1 complex PUBMED:8521860. It has been implicated in the regulation of fruit development, carbohydrate metabolism and cell wall extension. It may also be involved in inhibiting microbial pathogen PMEs. It has been observed that it is often expressed as a large inactive preprotein PUBMED:8521860. It is also found at the N-termini of PMEs predicted from DNA sequences, suggesting that both PMEs and their inhibitors are expressed as a single polyprotein and subsequently processed. It has two disulphide bridges and is mainly alpha-helical PUBMED:10880981. POL3Bc SM00480 DNA polymerase III beta subunit POLAc SM00482 DNA polymerase A domain POLBc SM00486 DNA polymerase type-B family DNA polymerase alpha, delta, epsilon and zeta chain (eukaryota), DNA polymerases in archaea, DNA polymerase II in E. coli, mitochondrial DNA polymerases and virus DNA polymerases POLIIIAc SM00481 DNA polymerase alpha chain like domain DNA polymerase alpha chain like domain, incl. family of hypothetical proteins POLXc SM00483 DNA polymerase X family includes vertebrate polymerase beta and terminal deoxynucleotidyltransferases PolyA SM00517 C-terminal domain of Poly(A)-binding protein. Present also in Drosophila hyperplastic discs protein. Involved in homodimerisation (either directly or indirectly) POP4 SM00538 A domain found in a protein subunit of human RNase MRP and RNase P ribonucleoprotein complexes and archaeal proteins. PostSET SM00508 Cysteine-rich motif following a subset of SET domains POU SM00352 Found in Pit-Oct-Unc transcription factors POX SM00574 domain associated with HOX domains PP2Ac SM00156 Protein phosphatase 2A homologues, catalytic domain. 
Large family of serine/threonine phosphatases, that includes PP1, PP2A and PP2B (calcineurin) family members. PP2Cc SM00332 Serine/threonine phosphatases, family 2C, catalytic domain The protein architecture and deduced catalytic mechanism of PP2C phosphatases are similar to the PP1, PP2A, PP2B family of protein Ser/Thr phosphatases, with which PP2C shares no sequence similarity. PP2C_SIG SM00331 Sigma factor PP2C-like phosphatases PQQ SM00564 beta-propeller repeat Beta-propeller repeat occurring in enzymes with pyrrolo-quinoline quinone (PQQ) as cofactor, in Ire1p-like Ser/Thr kinases, and in prokaryotic dehydrogenases. PRE_C2HC SM00596 PreSET SM00468 N-terminal to some SET domains A Cys-rich putative Zn2+-binding domain that occurs N-terminal to some SET domains. Function is unknown. Unpublished. Pribosyltran_N SM01400 N-terminal domain of ribose phosphate pyrophosphokinase This family is frequently found N-terminal to the Pribosyltran. PriCT_1 SM00942 Primase C terminal 1 (PriCT-1) This alpha helical domain is found at the C terminal of primases. Prim-Pol SM00943 Bifunctional DNA primase/polymerase, N-terminal Members of this family adopt a structure consisting of a core of antiparallel beta sheets. They are found in various bacterial hypothetical proteins, and have been shown to harbour both primase and polymerase activities PUBMED:14730355. Prim_Zn_Ribbon SM00778 Zinc-binding domain of primase-helicase This region represents the zinc binding domain. It is found in the N-terminal region of the bacteriophage P4 alpha protein, which is a multifunctional protein with origin recognition, helicase and primase activities. Pro_CA SM00947 Carbonic anhydrase Carbonic anhydrases (CA) are zinc metalloenzymes which catalyze the reversible hydration of carbon dioxide. In Escherichia coli, CA (gene cynT) is involved in recycling carbon dioxide formed in the bicarbonate-dependent decomposition of cyanate by cyanase (gene cynS). 
By this action, it prevents the depletion of cellular bicarbonate PUBMED:1740425. In photosynthetic bacteria and plant chloroplast, CA is essential to inorganic carbon fixation PUBMED:1584776. Prokaryotic and plant chloroplast CA are structurally and evolutionarily related and form a family distinct from the one which groups the many different forms of eukaryotic CAs. PROF SM00392 Profilin Binds actin monomers, membrane polyphosphoinositides and poly-L-proline. Pro-kuma_activ SM00944 Pro-kumamolisin, activation domain This domain is found at the N-terminus of peptidases belonging to MEROPS peptidase family S53 (sedolisin, clan SB). The domain adopts a ferredoxin-like fold, with an alpha+beta sandwich. Cleavage of the domain results in activation of the peptidase PUBMED:15242607. ProQ SM00945 ProQ/FINO family This family includes ProQ, which is required for full activation of the osmoprotectant transporter, ProP, in Escherichia coli. Pro-rich SM01412 Proline-rich ProRS-C_1 SM00946 Prolyl-tRNA synthetase, C-terminal Members of this family are predominantly found in prokaryotic prolyl-tRNA synthetase. They contain a zinc binding site, and adopt a structure consisting of alpha helices and antiparallel beta sheets arranged in 2 layers, in a beta-alpha-beta-alpha-beta motif PUBMED:12578991. Proteasome_A_N SM00948 Proteasome subunit A N-terminal signature This domain is conserved in the A subunits of the proteasome complex proteins. PRP SM00157 Major prion protein The prion protein is a major component of scrapie-associated fibrils in Creutzfeldt-Jakob disease, kuru, Gerstmann-Straussler syndrome and bovine spongiform encephalopathy. 
PRY SM00589 associated with SPRY domains PSA SM00639 Paramecium Surface Antigen Repeat PSI SM00423 domain found in Plexins, Semaphorins and Integrins PSN SM00730 Presenilin, signal peptide peptidase, family Presenilin 1 and presenilin 2 are polytopic membrane proteins, whose genes are mutated in some individuals with Alzheimer's disease. Distant homologues, present in eukaryotes and archaea, also contain conserved aspartic acid residues which are predicted to contribute to catalysis. At least one member of this family has been shown to possess signal peptide peptidase activity. PSP SM00581 proline-rich domain in spliceosome associated proteins PTB SM00462 Phosphotyrosine-binding domain, phosphotyrosine-interaction (PI) domain PTB/PI domain structure similar to those of pleckstrin homology (PH) and IRS-1-like PTB domains. PTBI SM00310 Phosphotyrosine-binding domain (IRS1-like) PTEN_C2 SM01326 C2 domain of PTEN tumour-suppressor protein This is the C2 domain-like domain, in greek key form, of the PTEN protein, phosphatidyl-inositol triphosphate phosphatase, and it is the C-terminus. This domain may well include a CBR3 loop which means it plays a central role in membrane binding. This domain associates across an extensive interface with the N-terminal phosphatase domain DSPc (Pfam PF00782) suggesting that the C2 domain productively positions the catalytic part of the protein onto the membrane (PMID:10555148). PTH SM00087 Parathyroid hormone PTI SM00286 Plant trypsin inhibitors PTN SM00193 Pleiotrophin / midkine family Heparin-binding domain family. PTPc SM00194 Protein tyrosine phosphatase, catalytic domain PTPc_DSPc SM00012 Protein tyrosine phosphatase, catalytic domain, undefined specificity Protein tyrosine phosphatases. Homologues detected by this profile and not by those of "PTPc" or "DSPc" are predicted to be protein phosphatases with a similar fold to DSPs and PTPs, yet with unpredicted specificities. 
PTPc_motif SM00404 Protein tyrosine phosphatase, catalytic domain motif PTPlike_phytase SM01301 Inositol hexakisphosphate Inositol hexakisphosphate, often called phytate, is found in abundance in seeds and acts as an inorganic phosphate reservoir. Phytases are phosphatases that hydrolyze phytate to less-phosphorylated myo-inositol derivatives and inorganic phosphate. The active-site sequence (HCXXGXGR) of the phytase identified from the gut micro-organism Selenomonas ruminantium forms a loop (P loop) at the base of a substrate binding pocket that is characteristic of protein tyrosine phosphatases (PTPs). The depth of this pocket is an important determinant of the substrate specificity of PTPs. In humans this enzyme is thought to aid bone mineralization and salvage the inositol moiety prior to apoptosis PMID:9923613. PTX SM00159 Pentraxin / C-reactive protein / pentaxin family This family forms a discoid pentameric structure. Human serum amyloid P demonstrates calcium-mediated ligand-binding. PUA SM00359 Putative RNA-binding Domain in PseudoUridine synthase and Archaeosine transglycosylase PUG SM00580 domain in protein kinases, N-glycanases and other nuclear proteins Pumilio SM00025 Pumilio-like repeats Pumilio-like repeats that bind RNA. PUR SM00712 DNA/RNA-binding repeats in PUR-alpha/beta/gamma and in hypothetical proteins from spirochetes and the Bacteroides-Cytophaga-Flexibacter bacteria. PWI SM00311 PWI, domain in splicing factors PWWP SM00293 domain with conserved PWWP motif conservation of Pro-Trp-Trp-Pro residues PX SM00312 PhoX homologous domain, present in p47phox and p40phox. Eukaryotic domain of unknown function present in phox proteins, PLD isoforms, a PI3K isoform. PXA SM00313 Domain associated with PX domains unpubl. 
observations PYNP_C SM00941 Pyrimidine nucleoside phosphorylase C-terminal domain This domain is found at the C-terminal end of the large alpha/beta domain making up various pyrimidine nucleoside phosphorylases PUBMED:9817849, PUBMED:2199449. It has slightly different conformations in different members of this family. For example, in pyrimidine nucleoside phosphorylase (PYNP) there is an added three-stranded anti-parallel beta sheet as compared to other members of the family, such as E. coli thymidine phosphorylase (TP) PUBMED:9817849. The domain contains an alpha/beta hammerhead fold and residues in this domain seem to be important in formation of the homodimer PUBMED:9817849. PYRIN SM01289 PAAD/DAPIN/Pyrin domain This domain is predicted to contain 6 alpha helices and to have the same fold as the Death domain SM00005. This similarity may mean that this is a protein-protein interaction domain. QLQ SM00951 QLQ is named after the conserved Gln, Leu, Gln motif. QLQ is found at the N-terminus of SWI2/SNF2 protein, which has been shown to be involved in protein-protein interactions. QLQ has been postulated to be involved in mediating protein interactions PUBMED:12974814. R3H SM00393 Putative single-stranded nucleic acids-binding domain RA SM00314 Ras association (RalGDS/AF-6) domain RasGTP effectors (in cases of AF6, canoe and RalGDS); putative RasGTP effectors in other cases. Kalhammer et al. have shown that not all RA domains bind RasGTP. Predicted structure similar to that determined, and that of the RasGTP-binding domain of Raf kinase. Predicted RA domains in PLC210 and nore1 found to bind RasGTP. Included outliers (Grb7, Grb14, adenylyl cyclases etc.) RAB SM00175 Rab subfamily of small GTPases Rab GTPases are implicated in vesicle trafficking. rADc SM00650 Ribosomal RNA adenine dimethylases RAN SM00176 Ran (Ras-related nuclear proteins) /TC4 subfamily of small GTPases Ran is involved in the active transport of proteins through nuclear pores. 
RanBD SM00160 Ran-binding domain Domain of approximately 150 residues that stabilises the GTP-bound form of Ran (the Ras-like nuclear small GTPase). RAP SM00952 This domain is found in various eukaryotic species, particularly in apicomplexans such as Plasmodium falciparum, where it is found in proteins that are important in various parasite-host cell interactions. It is thought to be an RNA-binding domain PUBMED:15501674. Rapamycin_bind SM01345 Rapamycin binding domain This domain forms an alpha helical structure and binds to rapamycin (PMID:8662507). Raptor_N SM01302 Raptor N-terminal CASPase like domain This domain is found at the N-terminus of the Raptor protein. It has been identified to have a CASPase like structure PMID:15450605. It conserves the characteristic cys/his dyad of the caspases suggesting it may have a peptidase activity. RAS SM00173 Ras subfamily of RAS small GTPases Similar in fold and function to the bacterial EF-Tu GTPase. p21Ras couples receptor Tyr kinases and G protein receptors to protein kinase cascades Ras_bdg_2 SM01304 Ras-binding domain of Byr2 This domain is the binding/interacting region of several protein kinases, such as the Schizosaccharomyces pombe Byr2. Byr2 is a Ser/Thr-specific protein kinase acting as mediator of signals for sexual differentiation in S. pombe by initiating a MAPK module, which is a highly conserved element in eukaryotes. Byr2 is activated by interacting with Ras, which then translocates the molecule to the plasma membrane. Ras proteins are key elements in intracellular signaling and are involved in a variety of vital processes such as DNA transcription, growth control, and differentiation. They function like molecular switches cycling between GTP-bound on and GDP-bound off states PMID:11709168. RasGAP SM00323 GTPase-activator protein for Ras-like GTPases All alpha-helical domain that accelerates the GTPase activity of Ras, thereby "switching" it into an "off" position. Improved domain limits from structure. 
RasGEF SM00147 Guanine nucleotide exchange factor for Ras-like small GTPases RasGEFN SM00229 Guanine nucleotide exchange factor for Ras-like GTPases; N-terminal motif A subset of guanine nucleotide exchange factors for Ras-like small GTPases appear to possess this domain N-terminal to the RasGEF (Cdc25-like) domain. The recent crystal structure of Sos shows that this domain is alpha-helical and plays a "purely structural role" (Nature 394, 337-343). RasGEF_N_2 SM01303 Rapamycin-insensitive companion of mTOR RasGEF_N domain Rictor appears to serve as a scaffolding protein that is important for maintaining mTORC2 integrity. The mammalian target of rapamycin (mTOR) is a conserved Ser/Thr kinase that forms two functionally distinct complexes, mTORC1 and mTORC2, important for nutrient and growth-factor signalling. This region is the more conserved central section that may include several individual domains. Rictor can be inhibited in the short-term by rapamycin. RB_A SM01368 Retinoblastoma-associated protein A domain This domain has the cyclin fold as predicted PMID:8152925, 9495340. Rb_C SM01369 Rb C-terminal domain The Rb C-terminal domain is required for high-affinity binding to E2F-DP complexes and for maximal repression of E2F-responsive promoters, thereby acting as a growth suppressor by blocking the G1-S transition of the cell cycle. This domain has a strand-loop-helix structure, which directly interacts with both E2F1 and DP1, followed by a tail segment that lacks regular secondary structure (PMID:16360038). RBD SM00455 Raf-like Ras-binding domain REC SM00448 cheY-homologous receiver domain CheY regulates the clockwise rotation of E. coli flagellar motors. This domain contains a phosphoacceptor site that is phosphorylated by histidine kinase homologues. RelA_SpoT SM00954 Region found in RelA / SpoT proteins The functions of Escherichia coli RelA and SpoT differ somewhat. RelA produces pppGpp (or ppGpp) from ATP and GTP (or GDP). 
SpoT degrades ppGpp, but may also act as a secondary ppGpp synthetase. The two proteins are strongly similar. In many species, a single homolog to SpoT and RelA appears responsible for both ppGpp synthesis and ppGpp degradation. (p)ppGpp is a regulatory metabolite of the stringent response, but appears also to be involved in antibiotic biosynthesis in some species. Relaxase SM01473 HUH conjugative element mobility domain relaxase/mobilisation protein part of relaxosome involved in bacterial conjugation PMID:9350859 Rep_1 SM01467 HUH conjugative element mobility domain replication protein involved in rolling circle replication of plasmids PMID:9570403 Rep_2 SM01471 HUH conjugative element mobility domain bacterial plasmid replication protein PMID:1904536 Rep_N SM01474 HUH phage replication sub-family Adeno-associated virus replication protein PMID:12191478 RES SM00953 This presumed protein contains 3 highly conserved polar groups that could form an active site. These are an arginine, glutamate and serine, hence the RES domain. RES is widely distributed in bacteria and is about 150 residues in length. Resolvase SM00857 Resolvase, N terminal domain The N-terminal domain of the resolvase family contains the active site and the dimer interface. The extended arm at the C-terminus of this domain connects to the C-terminal helix-turn-helix domain of resolvase. RESP18 SM01305 This domain is found in the glucocorticoid-responsive protein regulated endocrine-specific protein 18 (RESP18) and in the N-terminal extracellular region of receptor-type tyrosine-protein phosphatases containing the protein-tyrosine phosphatase receptor IA-2 domain (PFAM: PF11548). PMID:21104147; PMID:17951542 RFX5_DNA_bdg SM01306 RFX5 DNA-binding domain RFX5 and RFXAP reveals molecular details associated with MHCII gene expression. RGS SM00315 Regulator of G protein signalling domain RGS family members are GTPase-activating proteins for heterotrimeric G-protein alpha-subunits. 
RHO SM00174 Rho (Ras homology) subfamily of Ras-like small GTPases Members of this subfamily of Ras-like small GTPases include Cdc42 and Rac, as well as Rho isoforms. RHOD SM00450 Rhodanese Homology Domain An alpha beta fold found duplicated in the Rhodanese protein. The cysteine-containing enzymatically active version of the domain is also found in the CDC25 class of protein phosphatases and a variety of proteins such as sulfide dehydrogenases and stress proteins such as Senescence specific protein 1 in plants, PspE and GlpE in bacteria and cyanide and arsenate resistance proteins. Inactive versions with a loss of the cysteine are also seen in Dual specificity phosphatases, ubiquitin hydrolases from yeast and in sulfuryltransferases. These are likely to play a role in protein interactions. RhoGAP SM00324 GTPase-activator protein for Rho-like GTPases GTPase activator proteins towards Rho/Rac/Cdc42-like small GTPases. Better domain limits and outliers. RhoGEF SM00325 Guanine nucleotide exchange factor for Rho/Rac/Cdc42-like GTPases Guanine nucleotide exchange factor for Rho/Rac/Cdc42-like GTPases Also called Dbl-homologous (DH) domain. It appears that PH domains invariably occur C-terminal to RhoGEF/DH domains. Improved coverage. Rho_N SM00959 Rho termination factor, N-terminal domain The Rho termination factor disengages newly transcribed RNA from its DNA template at certain, specific transcripts. It is thought that two copies of Rho bind to RNA and that Rho functions as a hexamer of protomers PUBMED:10230401. This domain is found N-terminal to the RNA binding domain. RIBOc SM00535 Ribonuclease III family Ribosomal_L14 SM01374 Ribosomal protein L14p/L23e Ribosomal_L15e SM01384 Ribosomal_L19e SM01416 Ribosomal_L2 SM01383 Ribosomal Proteins L2, RNA binding domain Ribosomal_L2_C SM01382 Ribosomal Proteins L2, C-terminal domain Ribosomal_L31e SM01380 Ribosomal_L32e SM01393 This family includes ribosomal protein L32 from eukaryotes and archaebacteria. 
Ribosomal_L40e SM01377 Bovine L40 has been identified as a secondary RNA binding protein (PMID:3129699). L40 is fused to a ubiquitin protein (PMID:7488009). Ribosomal_S10 SM01403 Ribosomal protein S10p/S20e This family includes small ribosomal subunit S10 from prokaryotes and S20 from eukaryotes. Ribosomal_S13_N SM01386 Ribosomal S13/S15 N-terminal domain This domain is found at the N-terminus of ribosomal S13 and S15 proteins. This domain is also identified as NUC021 (PMID:15112237). Ribosomal_S15 SM01387 Ribosomal_S19e SM01413 Ribosomal_S27 SM01402 Ribosomal protein S27a This family of ribosomal proteins consists mainly of the 40S ribosomal protein S27a which is synthesised as a C-terminal extension of ubiquitin (CEP). The S27a domain comprises the C-terminal half of the protein. The synthesis of ribosomal proteins as extensions of ubiquitin promotes their incorporation into nascent ribosomes by a transient metabolic stabilisation and is required for efficient ribosome biogenesis PMID:2538753. The ribosomal extension protein S27a contains a basic region that is proposed to form a zinc finger; its fusion gene is proposed as a mechanism to maintain a fixed ratio between ubiquitin, necessary for degrading proteins, and ribosomes, a source of proteins PMID:2538756. Ribosomal_S3Ae SM01397 Ribosomal S3Ae family Ribosomal_S4 SM01390 Ribosomal protein S4/S9 N-terminal domain This family includes small ribosomal subunit S9 from prokaryotes and S16 from metazoans. This domain is predicted to bind to ribosomal RNA (PMID:9707415). This domain is composed of four helices in the known structure. However the domain is discontinuous in sequence and the alignment for this family contains only the first three helices. Ribosomal_S6e SM01405 Ribosomal protein S6e RICIN SM00458 Ricin-type beta-trefoil Carbohydrate-binding domain formed from presumed gene triplication. 
RICTOR_M SM01307 Rapamycin-insensitive companion of mTOR, middle domain Rictor appears to serve as a scaffolding protein that is important for maintaining mTORC2 integrity. The mammalian target of rapamycin (mTOR) is a conserved Ser/Thr kinase that forms two functionally distinct complexes, mTORC1 and mTORC2, important for nutrient and growth-factor signalling. This region is the more conserved central section that may include several individual domains. Rictor can be inhibited in the short-term by rapamycin. RICTOR_N SM01308 Rapamycin-insensitive companion of mTOR, N-term Rictor appears to serve as a scaffolding protein that is important for maintaining mTORC2 integrity. The mammalian target of rapamycin (mTOR) is a conserved Ser/Thr kinase that forms two functionally distinct complexes, mTORC1 and mTORC2, important for nutrient and growth-factor signalling. This region is the N-terminal conserved section that may include several individual domains. Rictor can be inhibited in the short-term by rapamycin. RICTOR_phospho SM01309 Rapamycin-insensitive companion of mTOR, phosphorylation-site Rictor appears to serve as a scaffolding protein that is important for maintaining mTORC2 integrity. The mammalian target of rapamycin (mTOR) is a conserved Ser/Thr kinase that forms two functionally distinct complexes, mTORC1 and mTORC2, important for nutrient and growth-factor signalling. This short region is the phosphorylation site. Rictor interacts with 14-3-3 in a Thr1135-dependent manner. Rictor can be inhibited by short-term rapamycin treatment, showing that Thr1135 is an mTORC1-regulated site. RICTOR_V SM01310 Rapamycin-insensitive companion of mTOR, domain 5 Rictor appears to serve as a scaffolding protein that is important for maintaining mTORC2 integrity. The mammalian target of rapamycin (mTOR) is a conserved Ser/Thr kinase that forms two functionally distinct complexes, mTORC1 and mTORC2, important for nutrient and growth-factor signalling. 
These long eukaryotic proteins carry several well-conserved domains, and this is No. 5. RIIa SM00394 RIIalpha, Regulatory subunit portion of type II PKA R-subunit RIIalpha, Regulatory subunit portion of type II PKA R-subunit. Contains dimerisation interface and binding site for A-kinase-anchoring proteins (AKAPs). RING SM00184 Ring finger E3 ubiquitin-protein ligase activity is intrinsic to the RING domain of c-Cbl and is likely to be a general function of this domain; various RING fingers exhibit binding activity towards E2 ubiquitin-conjugating enzymes (Ubc's) RINGv SM00744 The RING-variant domain is a C4HC3 zinc-finger like motif found in a number of cellular and viral proteins. Some of these proteins have been shown both in vivo and in vitro to have ubiquitin E3 ligase activity. The RING-variant domain is reminiscent of both the RING and the PHD domains and may represent an evolutionary intermediate. To describe this domain the term PHD/LAP domain has been used in the past. Extended description: The RING-variant (RINGv) domain contains a C4HC3 zinc-finger-like motif similar to the PHD domain, while some of the spacing between the Cys/His residues follows a pattern somewhat closer to that found in the RING domain. The RINGv domain, similar to the RING, PHD and LIM domains, is thought to bind two zinc ions co-ordinated by the highly conserved Cys and His residues. 
RING variant domain: C-x(2)-C-x(10-45)-C-x(1)-C-x(7)-H-x(2)-C-x(11-25)-C-x(2)-C As opposed to a PHD: C-x(1-2)-C-x(7-13)-C-x(2-4)-C-x(4-5)-H-x(2)-C-x(10-21)-C-x(2)-C Classical RING domain: C-x(2)-C-x(9-39)-C-x(1-3)-H-x(2-3)-C-x(2)-C-x(4-48)-C-x(2)-C RIO SM00090 RIO-like kinase RitA SM01445 sub-family of tyrosine recombinases tyrosine recombinase sub-family involved in transposition, part of Recombinase in Trio (RIT) PMID:23628708 and https://doi.org/10.1101/542381 RitB SM01451 sub-family of tyrosine recombinases tyrosine recombinase sub-family involved in transposition, part of Recombinase in Trio (RIT) PMID:23628708 and https://doi.org/10.1101/542382 RitC SM01441 sub-family of tyrosine recombinases tyrosine recombinase sub-family involved in transposition, part of Recombinase in Trio (RIT) PMID:23628708 and https://doi.org/10.1101/542383 RL11 SM00649 Ribosomal protein L11/L12 RNA_pol_Rpb6 SM01409 RNA polymerase Rpb6 Rpb6 is an essential subunit in the eukaryotic polymerases Pol I, II and III. This family also contains the bacterial equivalent to Rpb6, the omega subunit. Rpb6 and omega are structurally conserved and both function in polymerase assembly PMID:11158566. RNAse_Pc SM00092 Pancreatic ribonuclease RNB SM00955 This domain is the catalytic domain of ribonuclease II. PUBMED:16806266 Robl_LC7 SM00960 Roadblock/LC7 domain This family includes proteins that are about 100 amino acids long and have been shown to be related PUBMED:11084347. Members of this family of proteins are associated with both flagellar outer arm dynein and Drosophila and rat brain cytoplasmic dynein. It is proposed that roadblock/LC7 family members may modulate specific dynein functions PUBMED:10402468. This family also includes Golgi-associated MP1 adapter protein and MglB from Myxococcus xanthus, a protein involved in gliding motility PUBMED:2464581. 
However the family also includes members from non-motile bacteria such as Streptomyces coelicolor, suggesting that the protein may play a structural or regulatory role. Romo1 SM01378 Reactive mitochondrial oxygen species modulator 1 This is a family of small, approximately 100 amino acid, proteins found from yeasts to humans. The majority of endogenous reactive oxygen species (ROS) in cells are produced by the mitochondrial respiratory chain. An increase or imbalance in ROS alters the intracellular redox homeostasis, triggers DNA damage, and may contribute to cancer development and progression (PMID:16842742). Members of this family are mitochondrial reactive oxygen species modulator 1 (Romo1) proteins that are responsible for increasing the level of ROS in cells. Increased Romo1 expression can have a number of other effects including: inducing premature senescence of cultured human fibroblasts (PMID:18313394, 18836179) and increased resistance to 5-fluorouracil (PMID:17537404). RPEL SM00707 Repeat in Drosophila CG10860, human KIAA0680 and C. elegans F26H9.2 RPOL4c SM00657 DNA-directed RNA-polymerase II subunit RPOL8c SM00658 RNA polymerase subunit 8 subunit of RNA polymerase I, II and III RPOL9 SM00661 RNA polymerase subunit 9 RPOLA_N SM00663 RNA polymerase I subunit A N-terminus RPOLCX SM00659 RNA polymerase subunit CX present in RNA polymerase I, II and III RPOLD SM00662 RNA polymerases D DNA-directed RNA polymerase subunit D and bacterial alpha chain RPOL_N SM01311 DNA-directed RNA polymerase N-terminal This is the N-terminal domain of DNA-directed RNA polymerase. This domain has a role in interaction with regions of upstream promoter DNA and the nascent RNA chain, leading to the processivity of the enzyme PMID:9670025. In order to make mRNA transcripts the RNA polymerase undergoes a transition from the initiation phase (which only makes short fragments of RNA) to an elongation phase. 
This domain undergoes a structural change in the transition from initiation to elongation phase. The structural change results in abolition of the promoter binding site, creation of a channel accommodating the heteroduplex in the active site and formation of an exit tunnel which the RNA transcript passes through after peeling off the heteroduplex PMID:12242451. RPR SM00582 domain present in proteins, which are involved in regulation of nuclear pre-mRNA RQC SM00956 This DNA-binding domain is found in the RecQ helicase among others and has a helix-turn-helix structure. The RQC domain, found only in RecQ family enzymes, is a high affinity G4 DNA binding domain PUBMED:16530788. RRM SM00360 RNA recognition motif RRM_1 SM00361 RNA recognition motif RRM_2 SM00362 RNA recognition motif RTC4 SM01312 RTC4-like domain This presumed domain is found in the RTC4 protein from yeasts. In Saccharomyces cerevisiae, Cdc13 binds telomeric DNA to recruit telomerase and to "cap" chromosome ends. RTC4 was identified in a screen to identify novel proteins and pathways that cap telomeres, or that respond to uncapped telomeres PMID:18845848. This domain is also found in proteins that contain a DNA-binding myb domain. Rtt106 SM01287 Histone chaperone Rttp106-like This family includes Rttp106, a histone chaperone involved in heterochromatin-mediated silencing PMID:16157874. This domain belongs to the Pleckstrin homology domain superfamily. RuBisCO_small SM00961 Ribulose bisphosphate carboxylase, small chain RuBisCO (ribulose-1,5-bisphosphate carboxylase/oxygenase) is a bifunctional enzyme that catalyses both the carboxylation and oxygenation of ribulose-1,5-bisphosphate (RuBP), thus fixing carbon dioxide as the first step of the Calvin cycle. RuBisCO is the major protein in the stroma of chloroplasts, and in higher plants exists as a complex of 8 large and 8 small subunits. The function of the small subunit is unknown PUBMED:3012537. 
While the large subunit is coded for by a single gene, the small subunit is coded for by several different genes, which are distributed in a tissue-specific manner. They are transcriptionally regulated by light receptor phytochrome PUBMED:3010233, which results in RuBisCO being more abundant during the day when it is required. RUN SM00593 domain involved in Ras-like GTPase signaling rve SM01477 DDE transposase retroviral integrase sub-family DDE domain containing integrases found in retroviruses and transposons PMID:7801124 rve_2 SM01494 DDE transposase retroviral integrase sub-family integrases found in retroviruses and transposons rve_3 SM01488 DDE transposase retroviral integrase sub-family integrases found in retroviruses and transposons RWD SM00591 domain in RING finger and WD repeat containing proteins and DEXDc-like helicases subfamily related to the UBCc domain S1 SM00316 Ribosomal protein S1-like RNA-binding domain S_100 SM01394 S-100/ICaBP type calcium binding domain S4 SM00363 S4 RNA-binding domain s48_45 SM00970 Sexual stage antigen s48/45 domain This family contains sexual stage s48/45 antigens from Plasmodium (approximately 450 residues long). These are surface proteins expressed by Plasmodium male and female gametes that have been shown to play a conserved and important role in fertilisation PUBMED:11163248. SAA SM00197 Serum amyloid A proteins Serum amyloid A proteins are induced during the acute-phase response. Secondary amyloidosis is characterised by the extracellular accumulation in tissues of SAA proteins. SAA proteins are apolipoproteins. SAF SM00858 This domain family includes a range of different proteins, such as antifreeze proteins, flagellar FlgA proteins, and CpaB pilus proteins. SAM SM00454 Sterile alpha motif. Widespread domain in signalling and nuclear proteins. 
In EPH-related tyrosine kinases, appears to mediate cell-cell initiated signal transduction via the binding of SH2-containing proteins to a conserved tyrosine that is phosphorylated. In many cases mediates homodimerisation. SAM_HD_adjacent SM01426 SAM homeodomain adjacent domain The SAM HD adjacent domain is a conserved motif found immediately downstream of the homeodomain in SAMSARA three amino acid loop extension (TALE) homeodomain proteins (PMID: 30644818). This domain has been identified in SAMSARA proteins from a broad range of brown algae. SAM_PNT SM00251 SAM / Pointed domain A subfamily of the SAM domain SAND SM00258 SAND domain SANT SM00717 SANT SWI3, ADA2, N-CoR and TFIIIB'' DNA-binding domains SAP SM00513 Putative DNA-binding (bihelical) motif predicted to be involved in chromosomal organisation SAPA SM00162 Saposin/surfactant protein-B A-type DOMAIN Present as four and three degenerate copies, respectively, in prosaposin and surfactant protein B. Single copies in acid sphingomyelinase, NK-lysin, amoebapores and granulysin. Putative phospholipid membrane binding domains. SapB SM00741 Saposin (B) Domains Present in multiple copies in prosaposin and in pulmonary surfactant-associated protein B. In plant aspartic proteinases, a saposin domain is circularly permuted. This causes the prediction algorithm to predict two such domains, where only one is truly present. SAR SM00178 Sar1p-like members of the Ras-family of small GTPases Yeast SAR1 is an essential gene required for transport of secretory proteins from the endoplasmic reticulum to the Golgi apparatus. SARA SM01422 Smad anchor for receptor activation (SARA) Smad proteins mediate transforming growth factor-beta (TGF-beta) signaling from the transmembrane serine-threonine receptor kinases to the nucleus PMID:10615055. SARA recruits Smad2 to the TGF-beta receptors for phosphorylation PMID:10615055. 
SATase_N SM00971 Serine acetyltransferase, N-terminal The N-terminal domain of serine acetyltransferase has a sequence that is conserved in plants PUBMED:7608200 and bacteria PUBMED:7608200. SCAN SM00431 leucine rich region SCP SM00198 SCP / Tpx-1 / Ag5 / PR-1 / Sc7 family of extracellular domains. Human glioma pathogenesis-related protein GliPR and the plant pathogenesis-related protein represent functional links between plant defense systems and human immune system. This family has no known function. SCPU SM00972 Spore Coat Protein U domain This domain is found in a bacterial family of spore coat proteins PUBMED:1904442 as well as a family of secreted pili proteins involved in motility and biofilm formation PUBMED:1904442. SCY SM00199 Intercrine alpha family (small cytokine C-X-C) (chemokine CXC). Family of cytokines involved in cell-specific chemotaxis, mediation of cell growth, and the inflammatory response. Sds3 SM01401 Sds3-like Repression of gene transcription is mediated by histone deacetylases containing repressor-co-repressor complexes, which are recruited to promoters of target genes via interactions with sequence-specific transcription factors. The co-repressor complex contains a core of at least seven proteins PMID:15451426. This family represents the conserved region found in Sds3, Dep1 and BRMS1-homologue p40 proteins. SEA SM00200 Domain found in sea urchin sperm protein, enterokinase, agrin Proposed function of regulating or binding carbohydrate sidechains. SEC14 SM00516 Domain in homologues of a S. cerevisiae phosphatidylinositol transfer protein (Sec14p) Domain in homologues of a S. cerevisiae phosphatidylinositol transfer protein (Sec14p) and in RhoGAPs, RhoGEFs and the RasGAP, neurofibromin (NF1). Lipid-binding domain. The SEC14 domain of Dbl is known to associate with G protein beta/gamma subunits. 
Sec3-PIP2_bind SM01313 Exocyst complex component SEC3 N-terminal PIP2 binding PH This is the N-terminal domain of fungal and eukaryotic Sec3 proteins. Sec3 is a component of the exocyst complex that is involved in the docking of exocytic vesicles with fusion sites on the plasma membrane. This N-terminal domain contains a cryptic pleckstrin homology (PH) fold, and all six positively charged lysine and arginine residues in the PH domain predicted to bind the PIP2 head group are conserved. The exocyst complex is essential for many exocytic events, by tethering vesicles at the plasma membrane for fusion. In fission yeast, polarised exocytosis for growth relies on the combined action of the exocyst at cell poles and myosin-driven transport along actin cables PMID:22768263. Sec63 SM00973 Sec63 Brl domain This domain was named after the yeast Sec63 (or NPL1) (also known as the Brl domain) protein in which it was found. This protein is required for assembly of functional endoplasmic reticulum translocons PUBMED:16368690, PUBMED:11023840. Other yeast proteins containing this domain include pre-mRNA splicing helicase BRR2, HFM1 protein and putative helicases. Sec7 SM00222 Sec7 domain Domain named after the S. cerevisiae SEC7 gene product, which is required for proper protein transport through the Golgi. The domain facilitates guanine nucleotide exchange on the small GTPases, ARFs (ADP ribosylation factors). SecA_DEAD SM00957 SecA DEAD-like domain SecA protein binds to the plasma membrane where it interacts with proOmpA to support translocation of proOmpA through the membrane. SecA protein achieves this translocation, in association with SecY protein, in an ATP dependent manner PUBMED:9644254 PUBMED:2542029. This domain represents the N-terminal ATP-dependent helicase domain, which is related to the PUBMED:12242434. SecA_PP_bind SM00958 SecA preprotein cross-linking domain The SecA ATPase is involved in the insertion and retraction of preproteins through the plasma membrane. 
This domain has been found to cross-link to preproteins, thought to indicate a role in preprotein binding. The pre-protein cross-linking domain is comprised of two sub domains that are inserted within the ATPase domain PUBMED:12242434. SEL1 SM00671 Sel1-like repeats. These represent a subfamily of TPR (tetratricopeptide repeat) sequences. Sema SM00630 semaphorin domain Semialdhyde_dh SM00859 Semialdehyde dehydrogenase, NAD binding domain The semialdehyde dehydrogenase family is found in N-acetyl-glutamine semialdehyde dehydrogenase (AgrC), which is involved in arginine biosynthesis, and aspartate-semialdehyde dehydrogenase, an enzyme involved in the biosynthesis of various amino acids from aspartate. This family is also found in yeast and fungal Arg5,6 protein, which is cleaved into the enzymes N-acetyl-gamma-glutamyl-phosphate reductase and acetylglutamate kinase. These are also involved in arginine biosynthesis. All proteins in this entry contain a NAD binding region of semialdehyde dehydrogenase. SEP SM00553 Domain present in Saccharomyces cerevisiae Shp1, Drosophila melanogaster eyes closed gene (eyc), and vertebrate p47. SER SM01439 resolvase domain containing family involved in site-specific recombination serine recombinase family contains resolvase domain and mediates DNA site-specific recombination via conserved serine residue PMID:26104451 ser_ce SM01437 serine recombinase conjugative transposon sub-family serine recombinase domain present in Integrative Conjugative Elements (ICE)/conjugative transposons e.g. Tn3 involved in site-specific recombination. This sub-family is part of "resolvase" family PMID:26104451 ser_lsr SM01436 large serine recombinase sub-family Large Serine Recombinases (LSR) involved in site-specific recombination of phages and Integrative Conjugative Elements (ICEs). 
The sub-family is part of "resolvase" family and is commonly accompanied by a recombinase domain PMID:26104451 SERPIN SM00093 SERine Proteinase INhibitors ser_tn SM01438 serine transposase sub-family serine recombinase domain present in IS elements and Transposons involved in site-specific recombination. This sub-family is part of "resolvase" family PMID:26104451 SET SM00317 SET (Su(var)3-9, Enhancer-of-zeste, Trithorax) domain Putative methyl transferase, based on outlier plant homologues SFM SM00500 Splicing Factor Motif, present in Prp18 and Pr04 SF_P SM00019 Pulmonary surfactant proteins Pulmonary surfactant associated proteins promote alveolar stability by lowering the surface tension at the air-liquid interface in the peripheral air spaces. SP-C, a component of surfactant, is a highly hydrophobic peptide of 35 amino acid residues which is processed from a larger precursor protein. SP-C is post-translationally modified by the covalent attachment of two palmitoyl groups on two adjacent cysteines SH2 SM00252 Src homology 2 domains Src homology 2 domains bind phosphotyrosine-containing polypeptides via 2 surface pockets. Specificity is provided via interaction with residues that are distinct from the phosphotyrosine. Only a single occurrence of a SH2 domain has been found in S. cerevisiae. SH3 SM00326 Src homology 3 domains Src homology 3 (SH3) domains bind to target proteins through sequences containing proline and hydrophobic amino acids. Pro-containing polypeptides may bind to SH3 domains in 2 different binding orientations. SH3b SM00287 Bacterial SH3 domain homologues ShKT SM00254 ShK toxin domain ShK toxin domain SHR3_chaperone SM00786 ER membrane protein SH3 This family of proteins are membrane localised chaperones that are required for correct plasma membrane localisation of amino acid permeases (AAPs) PUBMED:15623581. Shr3 prevents AAPs proteins from aggregating and assists in their correct folding. In the absence of Shr3, AAPs are retained in the ER. 
Skp1 SM00512 Found in Skp1 protein family Family of Skp1 (kinetochore protein required for cell cycle progression) and elongin C (subunit of RNA polymerase II transcription factor SIII) homologues. Sm SM00651 snRNP Sm proteins small nuclear ribonucleoprotein particles (snRNPs) involved in pre-mRNA splicing small_GTPase SM00010 Small GTPase of the Ras superfamily; ill-defined subfamily SMART predicts Ras-like small GTPases of the ARF, RAB, RAN, RAS, and SAR subfamilies. Others that could not be classified in this way are predicted to be members of the small GTPase superfamily without predictions of the subfamily. SMC_hinge SM00968 SMC proteins Flexible Hinge Domain This entry represents the hinge region of the SMC (Structural Maintenance of Chromosomes) family of proteins. The hinge region is responsible for formation of the DNA interacting dimer. It is also possible that the precise structure of it is an essential determinant of the specificity of the DNA-protein interaction PUBMED:12411491. SMI1_KNR4 SM00860 SMI1 / KNR4 family Proteins in this family are involved in the regulation of 1,3-beta-glucan synthase activity and cell-wall formation. SMR SM00463 Small MutS-related domain SnAC SM01314 Snf2-ATP coupling, chromatin remodelling complex This domain appears to play a crucial role in chromatin remodelling for yeast SWI/SNF. It binds histones. It is required for mobilising nucleosomes and lies within the catalytic subunit of the yeast SWI/SNF. It is found to be universally conserved PMID:21835776. SNc SM00318 Staphylococcal nuclease homologues SO SM00201 Somatomedin B -like domains Somatomedin-B is a peptide, proteolytically excised from vitronectin, that is a growth hormone-dependent serum factor with protease-inhibiting activity. 
SOCS SM00253 suppressors of cytokine signalling suppressors of cytokine signalling SOCS_box SM00969 The SOCS box acts as a bridge between specific substrate-binding domains and more generic proteins that comprise a large family of E3 ubiquitin protein ligases. Solute_trans_a SM01417 Organic solute transporter Ostalpha This family is a transmembrane organic solute transport protein. In vertebrates these proteins form a complex with Ostbeta, and function as bile transporters PMID:15563450. In plants they may transport brassinosteroid-like compounds and act as regulators of cell death PMID:20830211. Sorb SM00459 Sorbin homologous domain First found in the peptide hormone sorbin and later in the ponsin/ArgBP2/vinexin family of proteins. Spc7 SM00787 Spc7 kinetochore protein This domain is found in cell division proteins which are required for kinetochore-spindle association. Spc7_N SM01315 N-terminus of kinetochore NMS complex subunit Spc7 SPEC SM00150 Spectrin repeats SPK SM00583 domain in SET and PHD domain containing proteins and protein kinases Spo7_2_N SM01316 Sporulation protein family 7 Spo7_2 constitutes a different set of fungal and related species from those found in Spo7. This domain is found in general at the N-terminus. In many members the domain is associated with a Pleckstrin-homology - PH - domain. SPOB_ab SM01317 Sporulation initiation phospho-transferase B, C-terminal Sporulation initiation phospho-transferase B or Spo0B is part of a phospho-relay that initiates sporulation in Bacillus subtilis. Spo0B is a two-domain protein consisting of an N-terminal alpha-helical hairpin domain and a C-terminal alpha/beta domain, represented by this family. Two subunits of Spo0B dimerise by a parallel association of helical hairpins to form a novel four-helix bundle from which the active histidine - involved in the auto-phosphorylation - protrudes. 
In the phospho-relay, the signal-receptor histidine kinases are dephosphorylated by a common response regulator, Spo0F. Spo0B then takes phosphorylated Spo0F as substrate thereby mediating the transfer of a phosphoryl group to Spo0A, the ultimate transcription factor. SpoU_sub_bind SM00967 RNA 2'-O ribose methyltransferase substrate binding This domain is a RNA 2'-O ribose methyltransferase substrate binding domain. SpoVT_AbrB SM00966 SpoVT / AbrB like domain This domain is found in AbrB from Bacillus subtilis. The product of the abrB gene is an ambiactive repressor and activator of the transcription of genes expressed during the transition state between vegetative growth and the onset of stationary phase and sporulation PUBMED:2504584. AbrB is thought to interact directly with the transcription initiation regions of genes under its control PUBMED:8755877. AbrB contains a helix-turn-helix structure, but this domain ends before the helix-turn-helix begins PUBMED:1908787. The product of the B. subtilis gene spoVT is another member of this family and is also a transcriptional regulator PUBMED:8755877. DNA-binding activity in this AbrB homologue requires hexamerisation PUBMED:10978510. Another family member has been isolated from Sulfolobus solfataricus and has been identified as a homologue of bacterial repressor-like proteins. The Escherichia coli family member SohA or Prl1F appears to be bifunctional and is able to regulate its own expression as well as relieve the export block imposed by high-level synthesis of beta-galactosidase hybrid proteins PUBMED:2152898. SprT SM00731 SprT homologues. Predicted to have roles in transcription elongation. Contains a conserved HExxH motif, indicating a metalloprotease function. SPRY SM00449 Domain in SPla and the RYanodine Receptor. Domain of unknown function. Distant homologues are domains in butyrophilin/marenostrin/pyrin homologues. 
SPT16 SM01286 FACT complex subunit (SPT16/CDC68) Proteins in this family are subunits of the FACT complex. The FACT complex plays a role in transcription initiation and promotes binding of TATA-binding protein (TBP) to a TATA box in chromatin PMID:15987999. SPT2 SM00784 SPT2 chromatin protein This entry includes the Saccharomyces cerevisiae protein SPT2 which is a chromatin protein involved in transcriptional regulation PUBMED:15563464. Spt4 SM01389 Spt4/RpoE2 zinc finger This family consists of several eukaryotic transcription elongation Spt4 proteins as well as archaebacterial RpoE2 (PMID:8127719). Three transcription-elongation factors Spt4, Spt5, and Spt6 are conserved among eukaryotes and are essential for transcription via the modulation of chromatin structure. Spt4 and Spt5 are tightly associated in a complex, while the physical association of the Spt4-Spt5 complex with Spt6 is considerably weaker. It has been demonstrated that Spt4, Spt5, and Spt6 play roles in transcription elongation in both yeast and humans including a role in activation by Tat. It is known that Spt4, Spt5, and Spt6 are general transcription-elongation factors, controlling transcription both positively and negatively in important regulatory and developmental roles (PMID:11182892). RpoE2 is one of 13 subunits in the archaeal RNA polymerase. These proteins contain a C4-type zinc finger, and the structure has been solved (PMID:19000817). The structure reveals that Spt4-Spt5 binding is governed by an acid-dipole interaction between Spt5 and Spt4, and the complex binds to and travels along the elongating RNA polymerase. The Spt4-Spt5 complex is likely to be an ancient, core component of the transcription elongation machinery. SR SM00202 Scavenger receptor Cys-rich The sea urchin egg peptide speract contains 4 repeats of SR domains that contain 6 conserved cysteines. May bind bacterial antigens in the protein MARCO. SRA SM00466 SET and RING finger associated domain. 
Domain of unknown function in SET domain containing proteins and in Deinococcus radiodurans DRA1533. Domain in SET domain containing proteins and in Deinococcus radiodurans DRA1533. SRP54 SM00962 SRP54-type protein, GTPase domain This entry represents the GTPase domain of the 54 kDa SRP54 component, a GTP-binding protein that interacts with the signal sequence when it emerges from the ribosome. SRP54 of the signal recognition particle has a three-domain structure: an N-terminal helical bundle domain, a GTPase domain, and the M-domain that binds the 7S RNA and also binds the signal sequence. The extreme C-terminal region is glycine-rich and lower in complexity and poorly conserved between species. The GTPase domain is evolutionarily related to P-loop NTPase domains found in a variety of other proteins PUBMED:7518075. SRP54_N SM00963 SRP54-type protein, helical bundle domain This entry represents the N-terminal helical bundle domain of the 54 kDa SRP54 component, a GTP-binding protein that interacts with the signal sequence when it emerges from the ribosome. SRP54 of the signal recognition particle has a three-domain structure: an N-terminal helical bundle domain, a GTPase domain, and the M-domain that binds the 7S RNA and also binds the signal sequence. The extreme C-terminal region is glycine-rich and lower in complexity and poorly conserved between species. START SM00234 in StAR and phosphatidylcholine transfer protein putative lipid-binding domain in StAR and phosphatidylcholine transfer protein STAT_int SM00964 STAT protein, protein interaction domain STAT proteins (Signal Transducers and Activators of Transcription) are a family of transcription factors that are specifically activated to regulate gene transcription when cells encounter cytokines and growth factors. STAT proteins also include an SH2 domain. 
STE SM00424 STE like transcription factors STI SM00452 Soybean trypsin inhibitor (Kunitz) family of protease inhibitors STI1 SM00727 Heat shock chaperonin-binding motif. S_TKc SM00220 Serine/Threonine protein kinases, catalytic domain Phosphotransferases. Serine or threonine-specific kinase subfamily. S_TK_X SM00133 Extension to Ser/Thr-type protein kinases STN SM00965 Secretin and TonB N terminus short domain This is a short domain found at the N-terminus of the Secretins of the bacterial type II/III secretory system as well as the TonB-dependent receptor proteins. These proteins are involved in TonB-dependent active uptake of selective substrates. STYKc SM00221 Protein kinase; unclassified specificity. Phosphotransferases. The specificity of this class of kinases cannot be predicted. Possible dual-specificity Ser/Thr/Tyr kinase. SVWC SM01318 Single domain von Willebrand factor type C SVWC is a family of single-domain von Willebrand factor type C proteins from lower eukaryotes. The canonical pattern of most von Willebrand factor type C (VWC) domains is of ten cysteines; however, this family, largely but not exclusively of arthropod proteins, contains only eight. SVWC family proteins respond to environmental challenges, such as bacterial infection and nutritional status. They also are involved in anti-viral immunity, and all of these functions seem linked to SVWC expression being induced by Dicer2. SWAP SM00648 Suppressor-of-White-APricot splicing regulator domain present in regulators which are responsible for pre-mRNA splicing processes SWIB SM00151 SWI complex, BAF60b domains Sybindin SM01399 Sybindin-like family Sybindin is a physiological syndecan-2 ligand on dendritic spines, the small protrusions on the surface of dendrites that receive the vast majority of excitatory synapses PMID:11018053. 
SynN SM00503 Syntaxin N-terminal domain Three-helix domain that (in Sso1p) slows the rate of its reaction with the SNAP-25 homologue Sec9p T5orf172 SM00974 This entry represents the putative helicase A859L PUBMED:11897024. TAF SM00803 TATA box binding protein associated factor TAFs (TATA box binding protein associated factors) are part of the transcription initiation factor TFIID multimeric protein complex. TFIID is composed of the TATA box binding protein (TBP) and a number of TAFs. The TAFs provide binding sites for many different transcriptional activators and co-activators that modulate transcription initiation by Pol II. TAF proteins adopt a histone-like fold. TAFH SM00549 TAF homology Domain in Drosophila nervy, CBFA2T1, human TAF105, human TAF130, and Drosophila TAF110. Also known as nervy homology region 1 (NHR1). TAFII55_N SM01370 TAFII55 protein conserved region The general transcription factor, TFIID, consists of the TATA-binding protein (TBP) associated with a series of TBP-associated factors (TAFs) that together participate in the assembly of the transcription preinitiation complex. TAFII55 binds to TAFII250 and inhibits its acetyltransferase activity. The exact role of TAFII55 is currently unknown. The conserved region is situated towards the N-terminus of the protein (PMID:11592977). TAN SM01342 Telomere-length maintenance and DNA damage repair ATM is a large protein kinase, in humans, critical for responding to DNA double-strand breaks (DSBs). Tel1, the orthologue from budding yeast, also regulates responses to DSBs. Tel1 is important for maintaining viability and for phosphorylation of the DNA damage signal transducer kinase Rad53 (an orthologue of mammalian CHK2). In addition to functioning in the response to DSBs, numerous findings indicate that Tel1/ATM regulates telomeres. 
The overall domain structure of Tel1/ATM is shared by proteins of the phosphatidylinositol 3-kinase (PI3K)-related kinase (PIKK) family, but this family carries a unique and functionally important TAN sequence motif, near its N-terminus, LxxxKxxE/DRxxxL, which is conserved specifically in the Tel1/ATM subclass of the PIKKs. The TAN motif is essential for both telomere length maintenance and Tel1 action in response to DNA damage (PMID:18625723). It is classified as an EC:2.7.11.1. Tankyrase_bdg_C SM01319 Tankyrase binding protein C terminal domain This protein domain family is found at the C-terminal end of the Tankyrase binding protein in eukaryotes. The precise function of this protein is still unknown. However, it is known to interact with the enzyme tankyrase, a telomeric poly(ADP-ribose) polymerase, by binding to it. Tankyrin catalyses poly(ADP-ribose) chain formation onto proteins. More specifically, it binds to the ankyrin domain in tankyrase (PMID:11854288). The protein domain is approximately 170 amino acids in length and contains two conserved sequence motifs: FPG and LKA. TAP_C SM00804 C-terminal domain of vertebrate Tap protein The vertebrate Tap protein is a member of the NXF family of shuttling transport receptors for the nuclear export of mRNA. Its most C-terminal domain is important for binding to FG repeat-containing nuclear pore proteins (FG-nucleoporins) and is sufficient to mediate shuttling. This domain forms a compact four-helix fold related to that of a UBA domain. TarH SM00319 Homologues of the ligand binding domain of Tar Homologues of the ligand binding domain of the wild-type bacterial aspartate receptor, Tar. TBC SM00164 Domain in Tre-2, BUB2p, and Cdc16p. Probable Rab-GAPs. Widespread domain present in Gyp6 and Gyp7, thereby giving rise to the notion that it performs a GTP-activator activity on Rab-like GTPases. 
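As an illustration only, the TAN motif quoted in the TAN entry above (LxxxKxxE/DRxxxL) can be written as a regular expression. This is a minimal sketch, not part of the SMART profile: it assumes that "x" stands for any residue and that "E/D" means Glu or Asp at that one position, and the names TAN_MOTIF and find_tan_motif are hypothetical.

```python
import re

# Assumption: "x" = any residue, "E/D" = Glu or Asp at a single position.
# This is a literal reading of the textual motif, not the SMART HMM.
TAN_MOTIF = re.compile(r"L.{3}K.{2}[ED]R.{3}L")


def find_tan_motif(sequence: str) -> list[tuple[int, str]]:
    """Return (0-based start, matched text) for each non-overlapping match."""
    return [(m.start(), m.group()) for m in TAN_MOTIF.finditer(sequence.upper())]


if __name__ == "__main__":
    # Made-up toy sequence, used only to exercise the pattern.
    print(find_tan_motif("MAAALSTQKAAERGGGLPP"))
```

A plain pattern scan of this kind will report many chance hits in real proteomes; SMART itself detects domains with profile-based searches, so the regex is useful only as a quick sanity check.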
Tbf5 SM01395 Transcription factor TFIIH complex subunit Tfb5 This family is a component of the general transcription and DNA repair factor IIH. TFB5 has been shown to be required for efficient recruitment of TFIIH to a promoter PMID:15220919. TBOX SM00425 Domain first found in the mice T locus (Brachyury) protein TDU SM00711 Short repeats in human TONDU, fly vestigial and other proteins. Unknown function. TEA SM00426 TEA domain TECPR SM00706 Beta propeller repeats in Physarum polycephalum tectonins, Limulus lectin L-6 and animal hypothetical proteins. Telo_bind SM00976 Telomeric single stranded DNA binding POT1/CDC13 The telomere-binding protein forms a heterodimer in ciliates consisting of an alpha and a beta subunit. This complex may function as a protective cap for the single-stranded telomeric overhang. Alpha subunit consists of 3 structural domains, all with the same beta-barrel OB fold. Telomerase_RBD SM00975 Telomerase ribonucleoprotein complex - RNA binding domain Telomeres in most organisms are comprised of tandem simple sequence repeats PUBMED:9671704. The total length of telomeric repeat sequence at each chromosome end is determined in a balance of sequence loss and sequence addition PUBMED:9671704. One major influence on telomere length is the enzyme telomerase PUBMED:9671704. It is a reverse transcriptase that adds these simple sequence repeats to chromosome ends by copying a template sequence within the RNA component of the enzyme PUBMED:9671704. The RNA binding domain of telomerase - TRBD - is made up of twelve alpha helices and two short beta sheets PUBMED:17997966. How telomerase and associated regulatory factors physically interact and function with each other to maintain appropriate telomere length is poorly understood. It is known however that TRBD is involved in formation of the holoenzyme (which performs the telomere extension) in addition to recognition and binding of RNA PUBMED:17997966. 
Tet_JBP SM01333 Oxygenase domain of the 2OGFeDO superfamily A double-stranded beta helix (DSBH) fold domain of the 2-oxoglutarate (2OG)-Fe(II)-dependent dioxygenase (2OGFeDO) superfamily found in various eukaryotes, bacteria and bacteriophages (PMID:19411852). Members of this family catalyze nucleic acid modifications, such as thymidine hydroxylation during base J synthesis in kinetoplastids (PMID:20215442) and the conversion of 5-methylcytosine (5mC) to 5-hydroxymethylcytosine (5hmC) (PMID:19372391) or further oxidation to 5-formylcytosine (5fC) and 5-carboxylcytosine (5caC) (PMID:21817016). Metazoan TET proteins contain a cysteine-rich region inserted into the core of the DSBH fold. Vertebrate TET proteins are oncogenes that are mutated in various myeloid cancers (PMID:21057493). Fungal and algal versions of this family are linked to a predicted transposase and show lineage-specific expansions (PMID:19411852). TFIIA SM01371 Transcription factor IIA, alpha/beta subunit Transcription initiation factor IIA (TFIIA) is a heterotrimer, the three subunits being known as alpha, beta, and gamma, in order of molecular weight. This family represents the precursor that yields both the alpha and beta subunits. The TFIIA heterotrimer is an essential general transcription initiation factor for the expression of genes transcribed by RNA polymerase II. Together with TFIID, TFIIA binds to the promoter region; this is the first step in the formation of a pre-initiation complex (PIC). Binding of the rest of the transcription machinery follows this step (PMID:11089979). After initiation, the PIC does not completely dissociate from the promoter. Some components, including TFIIA, remain attached and re-initiate a subsequent round of transcription. 
TFIIE SM00531 Transcription initiation factor IIE TFS2M SM00510 Domain in the central regions of transcription elongation factor S-II (and elsewhere) TFS2N SM00509 Domain in the N-terminus of transcription elongation factor S-II (and elsewhere) TGc SM00460 Transglutaminase/protease-like homologues Transglutaminases are enzymes that establish covalent links between proteins. A subset of transglutaminase homologues appears to catalyse the reverse reaction, the hydrolysis of peptide bonds. Proteins with this domain are both extracellular and intracellular, and it is likely that the eukaryotic intracellular proteins are involved in signalling events. TGFB SM00204 Transforming growth factor-beta (TGF-beta) family Family members are active as disulphide-linked homo- or heterodimers. TGFB is a multifunctional peptide that controls proliferation, differentiation, and other functions in many cell types. THAP SM00980 The THAP domain is a putative DNA-binding domain (DBD) and probably also binds a zinc ion. It features the conserved C2CH architecture (consensus sequence: Cys - 2-4 residues - Cys - 35-50 residues - Cys - 2 residues - His). Other universal features include the location of the domain at the N-termini of proteins, its size of about 90 residues, a C-terminal AVPTIF box and several other conserved residues. Orthologues of the human THAP domain have been identified in other vertebrates and probably worms and flies, but not in other eukaryotes or any prokaryotes PUBMED:12575992. THEG SM00705 Repeats in THEG (testicular haploid expressed gene) and several fly proteins. Thiol-ester_cl SM01419 Alpha-macro-globulin thiol-ester bond-forming region This short, highly conserved region of proteinase-binding alpha-macro-globulins contains the cysteine and the glutamine of a thiol-ester bond that is cleaved at the moment of proteinase binding, and mediates the covalent binding of the alpha-macro-globulin to the proteinase. The GCGEQ motif is highly conserved. 
THN SM00205 Thaumatin family The thaumatin family gathers proteins related to plant pathogenesis. The thaumatin family includes very basic members with extracellular and vacuolar localization. Thaumatin itself is a potent sweet-tasting protein. Several members of this family display significant in vitro activity, inhibiting hyphal growth or spore germination of various fungi, probably by a membrane-permeabilizing mechanism. THUMP SM00981 The THUMP domain is named after thiouridine synthases, methylases and PSUSs PUBMED:11295541. The THUMP domain consists of about 110 amino acid residues. The structure of ThiI reveals that the THUMP domain has a fold unlike that of previously characterised RNA-binding domains PUBMED:16343540. It is predicted that this domain is an RNA-binding domain. The THUMP domain probably functions by delivering a variety of RNA modification enzymes to their targets PUBMED:11295541. THY SM00152 Thymosin beta actin-binding motif. Thymopoietin SM01261 Short protein of 49 amino acids isolated from bovine spleen cells PUBMED:7306506. Thymopoietins (TMPOs) are a group of ubiquitously expressed nuclear proteins. They are suggested to play an important role in nuclear envelope organisation and cell cycle control PUBMED:10430029. TIFY SM00979 This short possible domain is found in a variety of plant transcription factors that contain GATA domains as well as other motifs. Although previously known as the Zim domain, this is now called the tify domain after its most conserved amino acids. TIFY proteins can be further classified into two groups depending on the presence (group I) or absence (group II) of a C2C2-GATA domain. Functional annotation of these proteins is still poor, but several screens revealed a link between TIFY proteins of group II and jasmonic acid-related stress response. TilS_C SM00977 TilS substrate C-terminal domain This domain is found in the tRNA(Ile) lysidine synthetase (TilS) protein. 
Tim44 SM00978 Tim44 is an essential component of the machinery that mediates the translocation of nuclear-encoded proteins across the mitochondrial inner membrane PUBMED:10430866. Tim44 is thought to bind phospholipids of the mitochondrial inner membrane both by electrostatic interactions and by penetrating the polar head group region PUBMED:10430866. TIR SM00255 Toll - interleukin 1 - resistance TK SM00203 Tachykinin family Tachykinins are a group of biologically active peptides which excite neurons, evoke behavioral responses, are potent vasodilators and contract (directly or indirectly) many smooth muscles. These peptides are synthesized as longer precursors and then processed to peptides from ten to twelve residues long. TLC SM00724 TRAM, LAG1 and CLN8 homology domains. Protein domain with at least 5 transmembrane alpha-helices. Lag1p and Lac1p are essential for acyl-CoA-dependent ceramide synthesis, TRAM is a subunit of the translocon and the CLN8 gene is mutated in Northern epilepsy syndrome. The family may possess multiple functions such as lipid trafficking, metabolism, or sensing. Trh homologues possess additional homeobox domains. TLDc SM00584 domain in TBC and LysM domain containing proteins Tn7_Tnp_TnsA_N SM01434 PD-(D/E)XK nuclease transposase sub-family Tn7 transposase of the PD-(D/E)XK superfamily, involved in transposition PMID:22638584 TNF SM00207 Tumour necrosis factor family. Family of cytokines that form homotrimeric or heterotrimeric complexes. TNF mediates mature T-cell receptor-induced apoptosis through the p75 TNF receptor. TNFR SM00208 Tumor necrosis factor receptor / nerve growth factor receptor repeats. Repeats in growth factor receptors that are involved in growth factor binding. 
TNF/TNFR TnpA SM01442 sub-family of tyrosine recombinases tyrosine recombinase sub-family from TnpA transposase https://doi.org/10.1101/542383 TnpR SM01443 sub-family of tyrosine recombinases tyrosine recombinase sub-family that mediates DNA cleavage through an active site tyrosine residue https://doi.org/10.1101/542381 TOG SM01349 XMAP215/Dis1 proteins, such as Alp14 and XMAP215, increase microtubule dynamic polymerization rates by recruiting soluble αβ-tubulin via their conserved TOG domains to polymerizing microtubule plus ends. TOP1Ac SM00437 Bacterial DNA topoisomerase I DNA-binding domain Bacterial DNA topoisomerase I and III, Eukaryotic DNA topoisomerase III, reverse gyrase alpha subunit TOP1Bc SM00436 Bacterial DNA topoisomerase I ATP-binding domain Extension of TOPRIM in Bacterial DNA topoisomerase I and III, Eukaryotic DNA topoisomerase III, reverse gyrase beta subunit TOP2c SM00433 TopoisomeraseII Eukaryotic DNA topoisomerase II, GyrB, ParE TOP4c SM00434 DNA Topoisomerase IV Bacterial DNA topoisomerase IV, GyrA, ParC TOPEUc SM00435 DNA Topoisomerase I (eukaryota) DNA Topoisomerase I (eukaryota), DNA topoisomerase V, Vaccinia virus topoisomerase, Variola virus topoisomerase, Shope fibroma virus topoisomerase TOPRIM SM00493 topoisomerases, DnaG-type primases, OLD family nucleases and RecR proteins Tower SM01341 Members of this family adopt a secondary structure consisting of a pair of long, antiparallel alpha-helices (the stem) that support a three-helix bundle (3HB) at their end. The 3HB contains a helix-turn-helix motif and is similar to the DNA binding domains of the bacterial site-specific recombinases, and of eukaryotic Myb and homeodomain transcription factors. The Tower domain has an important role in the tumor suppressor function of BRCA2, and is essential for appropriate binding of BRCA2 to DNA (PMID:12228710). 
TPK_B1_binding SM00983 Thiamin pyrophosphokinase, vitamin B1 binding domain Thiamin pyrophosphokinase (TPK) catalyzes the transfer of a pyrophosphate group from ATP to vitamin B1 (thiamin) to form the coenzyme thiamin pyrophosphate (TPP). Thus, TPK is important for the formation of a coenzyme required for central metabolic functions. The structure of thiamin pyrophosphokinase suggests that the enzyme may operate by a mechanism of pyrophosphoryl transfer similar to those described for pyrophosphokinases functioning in nucleotide biosynthesis PUBMED:11435118. TPR SM00028 Tetratricopeptide repeats Repeats present in 4 or more copies in proteins. Contain a minimum of 34 amino acids each and self-associate via a "knobs and holes" mechanism. Transket_pyr SM00861 Transketolase, pyrimidine binding domain Transketolase (TK) catalyzes the reversible transfer of a two-carbon ketol unit from xylulose 5-phosphate to an aldose acceptor, such as ribose 5-phosphate, to form sedoheptulose 7-phosphate and glyceraldehyde 3-phosphate. This enzyme, together with transaldolase, provides a link between the glycolytic and pentose-phosphate pathways. TK requires thiamine pyrophosphate as a cofactor. In most sources where TK has been purified, it is a homodimer of approximately 70 kDa subunits. TK sequences from a variety of eukaryotic and prokaryotic sources show that the enzyme has been evolutionarily conserved. In the peroxisomes of the methylotrophic yeast Hansenula polymorpha, there is a highly related enzyme, dihydroxy-acetone synthase (DHAS) (also known as formaldehyde transketolase), which exhibits a very unusual specificity by including formaldehyde amongst its substrates. 
Transposase_20 SM01430 transposase sub-family domain found with DEDD_Tnp_IS110 domain and found in IS116, IS110 and IS902 Transposase_21 SM01478 DDE transposase sub-family transposase DDE domain belonging to Tnp2 family involved in transposition Transposase_31 SM01433 PD-(D/E)XK nuclease transposase sub-family putative YhgA-like transposase of the PD-(D/E)XK superfamily Transposase_mut SM01501 DDE transposase sub-family mutator family transposase Trans_reg_C SM00862 Transcriptional regulatory protein, C terminal This domain is almost always found associated with the response regulator receiver domain. It may play a role in DNA binding. TRASH SM00746 metallochaperone-like domain TRCF SM00982 This domain is found in proteins necessary for strand-specific repair in DNA such as TRCF in Escherichia coli. A lesion in the template strand blocks the RNA polymerase complex (RNAP). The RNAP-DNA-RNA complex is specifically recognised by the transcription-repair-coupling factor (TRCF) which releases RNAP and the truncated transcript. TR_FER SM00094 Transferrin tRNA_SAD SM00863 Threonyl and Alanyl tRNA synthetase second additional domain The catalytically active form of threonyl/alanyl tRNA synthetase is a dimer. Within the tRNA synthetase class II dimer, the bound tRNA interacts with both monomers making specific interactions with the catalytic domain, the C-terminal domain, and this SAD domain (the second additional domain). The second additional domain is comprised of a pair of perpendicularly orientated antiparallel beta sheets, of four and three strands, respectively, that surround a central alpha helix that forms the core of the domain. TRP_2 SM01420 Transient receptor ion channel II This domain is found in the transient receptor ion channel (Trp) family of proteins. There is strong evidence that Trp proteins are structural elements of calcium-ion entry channels activated by G protein-coupled receptors PMID:10051594. 
TRP_N SM01320 ML-like domain This domain might be involved in lipid binding. (by similarity) TR_THY SM00095 Transthyretin TrwC SM01472 HUH conjugative element mobility domain DNA strand transferase involved in conjugation PMID:16540117 Tryp_SPc SM00020 Trypsin-like serine protease Many of these are synthesised as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. A few, however, are active as single chain molecules, and others are inactive due to substitutions of the catalytic triad residues. t_SNARE SM00397 Helical region found in SNAREs All alpha-helical motifs that form twisted and parallel four-helix bundles in target soluble N-ethylmaleimide-sensitive factor (NSF) attachment protein (SNAP) receptor proteins. This motif is found in "Q-SNAREs". TSP1 SM00209 Thrombospondin type 1 repeats Type 1 repeats in thrombospondin-1 bind and activate TGF-beta. TSPc SM00245 tail specific protease tail specific protease TSPN SM00210 Thrombospondin N-terminal-like domains. Heparin-binding and cell adhesion domain of thrombospondin Tubulin SM00864 Tubulin/FtsZ family, GTPase domain This domain is found in all tubulin chains, as well as the bacterial FtsZ family of proteins. These proteins are involved in polymer formation. Tubulin is the major component of microtubules, while FtsZ is the polymer-forming protein of bacterial cell division; it is part of a ring in the middle of the dividing cell that is required for constriction of cell membrane and cell envelope to yield two daughter cells. FtsZ and tubulin are GTPases; this entry is the GTPase domain. FtsZ can polymerise into tubes, sheets, and rings in vitro and is ubiquitous in bacteria and archaea. Tubulin_C SM00865 Tubulin/FtsZ family, C-terminal domain This domain is found in the tubulin alpha, beta and gamma chains, as well as the bacterial FtsZ family of proteins. These proteins are GTPases and are involved in polymer formation. 
Tubulin is the major component of microtubules, while FtsZ is the polymer-forming protein of bacterial cell division; it is part of a ring in the middle of the dividing cell that is required for constriction of cell membrane and cell envelope to yield two daughter cells. FtsZ can polymerise into tubes, sheets, and rings in vitro and is ubiquitous in bacteria and archaea. This is the C-terminal domain. TUDOR SM00333 Tudor domain Domain of unknown function present in several RNA-binding proteins. 10 copies in the Drosophila Tudor protein. The initial proposal that the survival motor neuron gene product contains a Tudor domain is corroborated by more recent database search techniques such as PSI-BLAST (unpublished). TY SM00211 Thyroglobulin type I repeats. The N-terminal region of human thyroglobulin contains 11 type-1 repeats. TY repeats are proposed to be inhibitors of cysteine proteases and binding partners of heparin. TYR SM01461 family involved in site-specific recombination, contains conserved tyrosine residue involved in catalysis tyrosine recombinase family mediates DNA site-specific recombination through conserved tyrosine residue in phages, transposons, conjugative elements and cellular recombinases PMID:9288963, https://doi.org/10.1101/542381 TyrKc SM00219 Tyrosine kinase, catalytic domain Phosphotransferases. Tyrosine-specific kinase subfamily. UAS SM00594 UBA SM00165 Ubiquitin associated domain Present in Rad23, SNF1-like kinases. The newly-found UBA in p62 is known to bind ubiquitin. UBA_e1_C SM00985 Ubiquitin-activating enzyme e1 C-terminal domain This presumed domain found at the C terminus of Ubiquitin-activating enzyme e1 proteins is functionally uncharacterised. UBCc SM00212 Ubiquitin-conjugating enzyme E2, catalytic domain homologues Proteins destined for proteasome-mediated degradation may be ubiquitinated. Ubiquitination follows conjugation of ubiquitin to a conserved cysteine residue of UBC homologues. 
This pathway functions in regulating many fundamental processes required for cell viability. TSG101 is one of several UBC homologues that lack this active site cysteine. Ubox SM00504 Modified RING finger domain Modified RING finger domain, without the full complement of Zn2+-binding ligands. Probable involvement in E2-dependent ubiquitination. UBQ SM00213 Ubiquitin homologues Ubiquitin-mediated proteolysis is involved in the regulated turnover of proteins required for controlling cell cycle progression. UBX SM00166 Domain present in ubiquitin-regulatory proteins Present in FAF1 and Shp1p. uDENN SM00800 Domain always found upstream of DENN domain, found in a variety of signalling proteins The uDENN domain is part of the tripartite DENN domain. It is always found upstream of the DENN domain itself, which is found in a variety of signalling proteins involved in Rab-mediated processes or regulation of MAPK signalling pathways. The DENN domain is always flanked on both sides by more divergent domains, called uDENN (for upstream DENN) and dDENN (for downstream DENN). The function of the DENN domain remains unclear to date, although it appears to be a good candidate for a GTP/GDP exchange activity. UDG SM00986 Uracil DNA glycosylase superfamily UDPG_MGDP_dh_C SM00984 UDP binding domain The UDP-glucose/GDP-mannose dehydrogenases are a small group of enzymes which possess the ability to catalyse the NAD-dependent 2-fold oxidation of an alcohol to an acid without the release of an aldehyde intermediate PUBMED:2470755, PUBMED:9013585. UIM SM00726 Ubiquitin-interacting motif. Present in proteasome subunit S5a and other ubiquitin-associated proteins. UME SM00802 Domain in UVSB PI-3 kinase, MEI-41 and ESR-1 Characteristic domain in UVSB PI-3 kinase, MEI-41 and ESR-1. Found in nucleolar proteins. Associated with FAT, FATC, PI3_PI4_kinase modules. UreE_C SM00987 UreE urease accessory protein, C-terminal domain UreE is a urease accessory protein. 
Urease hydrolyses urea into ammonia and carbamic acid. The C-terminal region of members of this family contains a His-rich nickel-binding site. UreE_N SM00988 UreE urease accessory protein, N-terminal domain UreE is a urease accessory protein. Urease hydrolyses urea into ammonia and carbamic acid. UTG SM00096 Uteroglobin UTRA SM00866 The UbiC transcription regulator-associated (UTRA) domain is a conserved ligand-binding domain that has a similar fold to PUBMED:12757941. It is believed to modulate activity of bacterial transcription factors in response to binding small molecules. V4R SM00989 The V4R (vinyl 4 reductase) domain is a predicted small-molecule-binding domain that may bind to hydrocarbons. VHP SM00153 Villin headpiece domain VHS SM00288 Domain present in VPS-27, Hrs and STAM Unpublished observations. Domain of unknown function. Viral_Rep SM01469 HUH phage replication sub-family viral replication proteins VIT SM00609 Vault protein Inter-alpha-Trypsin domain VKc SM00756 Family of likely enzymes that includes the catalytic subunit of vitamin K epoxide reductase. Bacterial homologues are fused to members of the thioredoxin family of oxidoreductases. VPS10 SM00602 VPS9 SM00167 Domain present in VPS9 Domain present in yeast vacuolar sorting protein 9 and other proteins. VRR_NUC SM00990 This entry contains proteins with the VRR-NUC domain. It is associated with members of the PD-(D/E)XK nuclease superfamily, which include the type III restriction modification enzymes, for example StyLTI. VWA SM00327 von Willebrand factor (vWF) type A domain VWA domains in extracellular eukaryotic proteins mediate adhesion via metal ion-dependent adhesion sites (MIDAS). Intracellular VWA domains and homologues in prokaryotes have recently been identified. The proposed VWA domains in integrin beta subunits have recently been substantiated using sequence-based methods. 
VWC SM00214 von Willebrand factor (vWF) type C domain VWC_def SM00011 VWC_out SM00215 von Willebrand factor (vWF) type C domain VWD SM00216 von Willebrand factor (vWF) type D domain Von Willebrand factor contains several type D domains: D1 and D2 are present within the N-terminal propeptide whereas the remaining D domains are required for multimerisation. WAP SM00217 Four-disulfide core domains WD40 SM00320 WD40 repeats Note that these repeats are permuted with respect to the structural repeats (blades) of the beta propeller domain. WGR SM00773 Proposed nucleic acid binding domain This domain is named after its most conserved central motif. It is found in a variety of polyA polymerases as well as in molybdate metabolism regulators (e.g. in E. coli) and other proteins of unknown function. The domain is found in isolation in some proteins and is between 70 and 80 residues in length. It is proposed that it may be a nucleic acid binding domain. WH1 SM00461 WASP homology region 1 Region of the Wiskott-Aldrich syndrome protein (WASp) that contains point mutations in the majority of patients with WAS. Unknown function. Ena-like WH1 domains bind polyproline-containing peptides, and Homer contains a WH1 domain. WH2 SM00246 Wiskott Aldrich syndrome homology region 2 Wiskott Aldrich syndrome homology region 2 / actin-binding motif WHEP-TRS SM00991 A conserved domain of 46 amino acids, called WHEP-TRS, has been shown PUBMED:1756734 to exist in a number of higher eukaryote aminoacyl-transfer RNA synthetases. This domain is present one to six times in several enzymes. There are three copies in mammalian multifunctional aminoacyl-tRNA synthetase in a region that separates the N-terminal glutamyl-tRNA synthetase domain from the C-terminal prolyl-tRNA synthetase domain, and six copies in the intercatalytic region of the Drosophila enzyme. 
The domain is found at the N-terminal extremity of the mammalian tryptophanyl-tRNA synthetase and histidyl-tRNA synthetase, and the mammalian, insect, nematode and plant glycyl-tRNA synthetases PUBMED:8463296. This domain could contain a central alpha-helical region and may play a role in the association of tRNA-synthetases into multienzyme complexes. WHy SM00769 Water Stress and Hypersensitive response WIF SM00469 Wnt-inhibitory factor-1 like domain Occurs as extracellular domain in metazoan Ryk receptor tyrosine kinases. C. elegans Ryk is required for cell-cuticle recognition. WIF-1 binds to Wnt and inhibits its activity. WNT1 SM00097 found in Wnt-1 WR1 SM00289 Worm-specific repeat type 1 Worm-specific repeat type 1. Cysteine-rich domain apparently unique (so far) to C. elegans. Often appears with KU domains. About 3 dozen worm proteins contain this domain. WRKY SM00774 DNA binding domain The WRKY domain is a DNA binding domain found in one or two copies in a superfamily of plant transcription factors. These transcription factors are involved in the regulation of various physiological programs that are unique to plants, including pathogen defense, senescence and trichome development. The domain is a 60 amino acid region that is defined by the conserved amino acid sequence WRKYGQK at its N-terminal end, together with a novel zinc-finger-like motif. It binds specifically to the DNA sequence motif (T)(T)TGAC(C/T), which is known as the W box. The invariant TGAC core is essential for function and WRKY binding. WSC SM00321 present in yeast cell wall integrity and stress response component proteins Domain present in WSC proteins, polycystin and fungal exoglucanase WSN SM00453 Worm-specific (usually) N-terminal domain WW SM00456 Domain with 2 conserved Trp (W) residues Also known as the WWP or rsp5 domain. Binds proline-rich polypeptides. WWE SM00678 Domain in Deltex and TRIP12 homologues. Possibly involved in regulation of ubiquitin-mediated proteolysis. 
X8 SM00768 Possibly involved in carbohydrate binding The X8 domain, which may be involved in carbohydrate binding, is found in an Olive pollen antigen as well as at the C terminus of family 17 glycosyl hydrolases. It contains 6 conserved cysteine residues which presumably form three disulfide bridges. Xer SM01453 sub-family of tyrosine recombinases tyrosine recombinase sub-family responsible for chromosome dimer resolution in bacteria and archaea https://doi.org/10.1101/542381 XPGI SM00484 Xeroderma pigmentosum G I-region domain in nucleases XPGN SM00485 Xeroderma pigmentosum G N-region domain in nucleases XTALbg SM00247 Beta/gamma crystallins Beta/gamma crystallins Y1_Tnp SM01321 Transposase IS200 like Transposases are needed for efficient transposition of the insertion sequence or transposon DNA. This family includes transposases for IS200 from E. coli. YaeQ SM01322 This family consists of several hypothetical bacterial proteins of around 180 residues in length which are often known as YaeQ. YaeQ is homologous to RfaH, a specialised transcription elongation protein. YaeQ is known to compensate for loss of RfaH function (PMID:9604894). YajC SM01323 Preprotein translocase subunit YARHG SM01324 This presumed extracellular domain is about 70 amino acids in length. It is named YARHG after a conserved motif in the sequence. This domain is associated with peptidases and bacterial kinase proteins. Its molecular function is unknown. YccV-like SM00992 Hemimethylated DNA-binding protein YccV like YccV is a hemimethylated DNA binding protein which has been shown to regulate dnaA gene expression. The structure of one of the hypothetical proteins in this family has been solved and it forms a beta sheet structure with a terminating alpha helix. YceI SM00867 YceI-like domain E. coli YceI is a base-induced periplasmic protein. The recent structure of a member of this family shows that it binds to polyisoprenoid. 
The structure consists of an extended, eight-stranded, antiparallel beta-barrel that resembles the lipocalin fold. YL1_C SM00993 YL1 nuclear protein C-terminal domain This domain is found in proteins of the YL1 family. These proteins have been shown to be DNA-binding and may be transcription factors. This domain is found in proteins that are not YL1 proteins. YqgFc SM00732 Likely ribonuclease with RNase H fold. YqgF proteins are likely to function as an alternative to RuvC in most bacteria, and could be the principal Holliday junction resolvases in low-GC Gram-positive bacteria. In Spt6p orthologues, the catalytic residues are substituted, indicating that they lack enzymatic functions. Zalpha SM00550 Z-DNA-binding domain in adenosine deaminases. Helix-turn-helix-containing domain. Also known as Zab. Zds_C SM01327 Activator of mitotic machinery Cdc14 phosphatase activation C-term This region of the Zds1 protein is critical for sporulation and has also been shown to suppress the calcium sensitivity of Zds1 deletions (PMID:16322512). The C-terminal motif is common to both Zds1 and Zds2 proteins, both of which are putative interactors of Cdc55 and are required for the completion of mitotic exit and cytokinesis. They both contribute to timely Cdc14 activation during mitotic exit and are required downstream of separase to facilitate nucleolar Cdc14 (PMID:18762578). zf-3CxxC SM01328 Zinc-binding domain This is a family with several pairs of CxxC motifs possibly representing a multiple zinc-binding region. Only one pair of cysteines is associated with a highly conserved histidine residue. zf-AD SM00868 Zinc-finger associated domain (zf-AD) The zf-AD domain, also known as ZAD, forms an atypical treble-cleft-like zinc co-ordinating fold. The zf-AD domain is thought to be involved in mediating dimer formation, but does not bind to DNA. 
zf-C4_ClpX SM00994 ClpX C4-type zinc finger The ClpX heat shock protein of Escherichia coli is a member of the universally conserved Hsp100 family of proteins, and possesses a putative zinc finger motif of the C4 type. This presumed zinc binding domain is found at the N-terminus of the ClpX protein. ClpX is an ATPase which functions both as a substrate specificity component of the ClpXP protease and as a molecular chaperone. The molecular function of this domain is now known. zf-PARP SM01336 Poly(ADP-ribose) polymerase and DNA-Ligase Zn-finger region Poly(ADP-ribose) polymerase is an important regulatory component of the cellular response to DNA damage. The amino-terminal region of Poly(ADP-ribose) polymerase consists of two PARP-type zinc fingers. This region acts as a DNA nick sensor. ZipA_C SM00771 ZipA, C-terminal domain (FtsZ-binding) C-terminal domain of ZipA, a component of cell division in E. coli. It interacts with the FtsZ protein in one of the initial steps of septum formation. The structure of this domain is composed of three alpha-helices and a beta-sheet consisting of six antiparallel beta-strands. ZM SM00735 ZASP-like motif Short motif (26 amino acids) present in an alpha-actinin-binding protein, ZASP, and similar molecules. Zn_dep_PLPC SM00770 Zinc dependent phospholipase C (alpha toxin) This domain conveys a zinc dependent phospholipase C activity (EC 3.1.4.3). It is found in a monomeric phospholipase C of Bacillus cereus as well as in the alpha toxin of Clostridium perfringens and Clostridium bifermentans, which is involved in haemolysis and cell rupture. It is also found in a lecithinase of Listeria monocytogenes, which is involved in breaking the 2-membrane vacuoles that surround the bacterium. Structure information: PDB 1ca1. ZnF_A20 SM00259 A20-like zinc fingers A20- (an inhibitor of cell death)-like zinc fingers. The zinc finger mediates self-association in A20. These fingers also mediate IL-1-induced NF-kappaB activation. 
ZnF_AN1 SM00154 AN1-like Zinc finger Zinc finger at the C-terminus of An1, a ubiquitin-like protein in Xenopus laevis. ZnF_BED SM00614 BED zinc finger DNA-binding domain in chromatin-boundary-element-binding proteins and transposases ZnF_C2C2 SM00440 C2C2 Zinc finger Nucleic-acid-binding motif in transcriptional elongation factor TFIIS and RNA polymerases. ZnF_C2H2 SM00355 zinc finger ZnF_C2HC SM00343 zinc finger ZnF_C3H1 SM00356 zinc finger ZnF_C4 SM00399 C4 zinc finger in nuclear hormone receptors ZnF_CDGSH SM00704 CDGSH-type zinc finger. Function unknown. ZnF_CHCC SM00400 zinc finger ZnF_DBF SM00586 Zinc finger in DBF-like proteins ZnF_GATA SM00401 zinc finger binding to DNA consensus sequence [AT]GATA[AG] ZnF_NFX SM00438 Repressor of transcription ZnF_PMZ SM00575 plant mutator transposase zinc finger ZnF_Rad18 SM00734 Rad18-like CCHC zinc finger Yeast Rad18p functions with Rad5p in error-free post-replicative DNA repair. This zinc finger is likely to bind nucleic acids. ZnF_RBZ SM00547 Zinc finger domain Zinc finger domain in Ran-binding proteins (RanBPs), and other proteins. In RanBPs, this domain binds RanGDP. ZnF_TAZ SM00551 TAZ zinc finger, present in p300 and CBP ZnF_TTF SM00597 zinc finger in transposases and transcription factors ZnF_U1 SM00451 U1-like zinc finger Family of C2H2-type zinc fingers, present in matrin, U1 small nuclear ribonucleoprotein C and other RNA-binding proteins. ZnF_UBP SM00290 Ubiquitin Carboxyl-terminal Hydrolase-like zinc finger ZnF_UBR1 SM00396 Putative zinc finger in N-recognin, a recognition component of the N-end rule pathway Domain is involved in recognition of N-end rule substrates in yeast Ubr1p ZnF_ZZ SM00291 Zinc-binding domain, present in Dystrophin, CREB-binding protein. Putative zinc-binding domain present in dystrophin-like proteins, and CREB-binding protein/p300 homologues. The ZZ in dystrophin appears to bind calmodulin. 
A missense mutation of one of the conserved cysteines in dystrophin results in a patient with Duchenne muscular dystrophy [3]. ZnMc SM00235 Zinc-dependent metalloprotease Neutral zinc metallopeptidases. This alignment represents a subset of known subfamilies. Highest similarity occurs in the HExxH zinc-binding site/active site. Zn_pept SM00631 ZP SM00241 Zona pellucida (ZP) domain ZP proteins are responsible for sperm adhesion to the zona pellucida. ZP domains are also present in multidomain transmembrane proteins such as glycoprotein GP2, uromodulin and TGF-beta receptor type III (betaglycan). Zpr1 SM00709 Duplicated domain in the epidermal growth factor- and elongation factor-1alpha-binding protein Zpr1. Also present in archaeal proteins. ZU5 SM00218 Domain present in ZO-1 and Unc5-like netrin receptors Domain of unknown function.