The C-terminal domain (CTD) of the large subunit of RNA polymerase II serves as a platform for mRNA-processing factors and links gene transcription to mRNA capping, splicing and polyadenylation. CTD recognition depends on the phosphorylation state of the CTD, which varies during the course of transcription, and has also been linked to the isomerization state of the CTD's proline residues. Several RNA-processing factors recognise the CTD by means of a conserved CTD-interacting domain (CID). Factors with a CID include the serine/arginine-rich-like factors SCAF4 and SCAF8, Nrd1 (which is implicated in polyadenylation-independent RNA 3'-end formation) and Pcf11. Pcf11 is a conserved and essential subunit of the yeast cleavage factor 1A, which is required for 3'-RNA processing and transcription termination [ (PUBMED:15241417) (PUBMED:15665873) ].
The CID is a compact domain in which eight alpha-helices form a right-handed superhelix. The CID fold closely resembles that of VHS domains (IPR002014) and, except for the two amino-terminal helices, is related to that of armadillo-repeat proteins (IPR000225). Amino acid residues in the hydrophobic core of the domain are highly conserved across CIDs [ (PUBMED:15241417) (PUBMED:15665873) ].
There are 9580 RPR domains in 9578 proteins in SMART's nrdb database.
Literature (relevant references for this domain)
Primary literature is listed below; automatically derived secondary literature is also available.
Systematic identification of novel protein domain families associated with nuclear functions.
Genome Res. 2002; 12: 47-56
A systematic computational analysis of protein sequences containing known nuclear domains led to the identification of 28 novel domain families. This represents a 26% increase in the starting set of 107 known nuclear domain families used for the analysis. Most of the novel domains are present in all major eukaryotic lineages, but 3 are species specific. For about 500 of the 1200 proteins that contain these new domains, nuclear localization could be inferred, and for 700, additional features could be predicted. For example, we identified a new domain, likely to have a role downstream of the unfolded protein response; a nematode-specific signalling domain; and a widespread domain, likely to be a noncatalytic homolog of ubiquitin-conjugating enzymes.