Did you know that the first DNA genome to be fully sequenced was that of ΦX174, a single-stranded (ss) DNA bacteriophage? Did you know that the discovery of messenger RNA, the nature of the genetic code, and the function of ribosomes were all based on studies of the double-stranded (ds) DNA phage T4? Did you know that the design of the widely used pET protein expression system benefited from our understanding of another ds DNA phage, T7? Also worth mentioning is λ phage, a paradigm for studying the regulation of gene expression. It was the knowledge gained from these bacterial viruses that laid the foundations of molecular biology. What’s more, studies of some ds DNA animal viruses, such as mouse polyomavirus and simian virus 40, have helped us understand DNA replication and RNA and protein synthesis in eukaryotic systems, and they also provide a model system for studying tumorigenesis.
Interestingly, while about 95% of the known bacterial viruses have double-stranded DNA genomes, most plant viruses have positive single-stranded RNA genomes, and many fungal viruses have double-stranded RNA genomes. Animal viruses can have either DNA or RNA genomes.
The first two groups of the Baltimore classification system are DNA viruses. Group I viruses have ds DNA genomes and show the greatest diversity in genome size, ranging from ~5,000 base pairs in simian virus 40, which infects monkeys and humans, to ~2.5 million base pairs in the giant virus Pandoravirus salinus, which infects amoebas. In contrast, Group II ss DNA viruses have much smaller genomes (2,000 to 6,000 bases), and their DNA is usually circular, or linear with hairpin ends, so that no free ss DNA ends are exposed to nuclease degradation.
When a virus infects a host cell, its ultimate goal is to make many copies of its proteins and many copies of its genome so that new viral particles can be produced. For all DNA viruses, the process of making viral proteins and replicating the genome follows a particular sequence. During the initial stage of infection, only a few viral genes are expressed. These genes are called the early genes, and they are transcribed by the host DNA-dependent RNA polymerases (DdRP). Because DdRPs can only use ds DNA as a template, a virus with an ss DNA genome must have its ss DNA converted to a ds DNA form before any transcription can happen. This conversion is carried out by the host DNA-dependent DNA polymerase (DdDP) and other accessory proteins.
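For readers who like to see logic written out, the rule in the last few sentences can be sketched in a few lines of Python. This is only an illustration: the function name `steps_before_early_transcription` and the step descriptions are made up for this post, and the sketch ignores the accessory proteins and everything else a real cell does.

```python
def steps_before_early_transcription(genome: str) -> list[str]:
    """List the host-driven steps up to early-gene transcription.

    `genome` is "dsDNA" (Baltimore Group I) or "ssDNA" (Group II).
    The step strings are illustrative labels, not a formal nomenclature.
    """
    if genome not in ("dsDNA", "ssDNA"):
        raise ValueError("this sketch only covers DNA viruses")

    steps = []
    if genome == "ssDNA":
        # DdRP needs a ds DNA template, so the host DdDP must act first.
        steps.append("host DdDP converts ss DNA to ds DNA")
    steps.append("host DdRP transcribes early genes")
    return steps


for genome in ("dsDNA", "ssDNA"):
    print(genome, "->", steps_before_early_transcription(genome))
```

The only branch in the function is the extra conversion step for ss DNA genomes, which is exactly the difference between the two groups at this stage of infection.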
Early genes code for proteins that are necessary for viral DNA synthesis and regulation of viral gene expression. For example, when the T7 phage infects E. coli and injects its DNA into the cell, E. coli RNA polymerase will recognize the promoter region of the early genes and turn on their expression. The early gene products include the T7 RNA polymerase and a protein kinase that specifically inhibits E. coli RNA polymerase activity. These two early proteins ensure that host gene expression is replaced with viral gene expression. T7 RNA polymerase then turns on the expression of middle genes that are required for T7 DNA replication. Let’s use mouse polyomavirus as another example. Because the host is an animal cell, the viral DNA needs to be transported into the nucleus of the cell, where RNA polymerase II will make the initial transcript that is then spliced into four mRNAs for making the four early proteins (called T antigens). These T antigens interact with the cellular signaling pathways to bring the resting cell (G0 phase) into the DNA synthesis (S) phase of the cell cycle. They also bind to the viral DNA replication origin, unwind the ds DNA, and help with the assembly of the cellular DNA synthesis machinery for viral DNA replication.
Genes that code for viral structural proteins are called late genes because their expression is only turned on when viral DNA synthesis starts. This ensures that copies of the viral genome and the structural proteins are produced in a coordinated fashion. Some of the late gene products also inhibit early gene expression. Therefore, early gene products stimulate late gene expression, and late gene products, in turn, inhibit early genes. DNA viruses present us with an exquisite gene regulatory cascade!
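To make the feedback loop concrete, here is a toy on/off model of the cascade in Python. It is a deliberately crude sketch, not a model of any particular virus: the name `simulate_cascade`, the three Boolean switches, and the one-step delays are all assumptions made for illustration, as is the simplification that DNA replication stays on once it has started.

```python
def simulate_cascade(steps: int = 4) -> list[dict[str, bool]]:
    """Toy Boolean model of the early/late gene cascade.

    Assumed rules (illustrative only):
      - early genes are on at infection and are shut off by late products
      - early products switch on viral DNA replication
      - late genes switch on once DNA replication has started
    """
    state = {"early": True, "replication": False, "late": False}
    history = [dict(state)]
    for _ in range(steps):
        state = {
            # Late gene products inhibit early gene expression.
            "early": state["early"] and not state["late"],
            # Early gene products are needed to start DNA synthesis.
            "replication": state["replication"] or state["early"],
            # Late genes turn on only after DNA synthesis has begun.
            "late": state["late"] or state["replication"],
        }
        history.append(dict(state))
    return history


for t, s in enumerate(simulate_cascade()):
    print(t, s)
```

Stepping through the output, early genes are on first, replication follows, late genes come on next, and only then do the early genes switch off, which is the order described above.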
Next time let’s talk about RNA viruses (To Be Continued).