Baltimore classification is a system used to classify viruses by their routes of transferring genetic information from the genome to messenger RNA (mRNA). Seven Baltimore groups, or classes, exist and are numbered in Roman numerals from I to VII. Groups are defined by whether the viral genome is made of deoxyribonucleic acid (DNA) or ribonucleic acid (RNA), whether the genome is single- or double-stranded, whether a single-stranded RNA genome is positive-sense (+) or negative-sense (–), and whether the virus makes DNA from RNA (reverse transcription (RT)). Viruses within Baltimore groups typically have the same replication method, but other characteristics such as virion structure are not directly related to Baltimore classification.
The seven Baltimore groups are for double-stranded DNA (dsDNA) viruses, single-stranded DNA (ssDNA) viruses, double-stranded RNA (dsRNA) viruses, positive-sense single-stranded RNA (+ssRNA) viruses, negative-sense single-stranded RNA (–ssRNA) viruses, ssRNA viruses that have a DNA intermediate in their life cycle (ssRNA-RT), and dsDNA viruses that have an RNA intermediate in their life cycle (dsDNA-RT). Only one class exists for ssDNA viruses because their genomes are converted to dsDNA before transcription regardless of sense. Some viruses belong to more than one Baltimore group, such as DNA viruses that have either dsDNA or ssDNA as their genome.
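The grouping rules above can be sketched as a small lookup function. This is only an illustrative sketch: the function name `baltimore_group` and its parameters are hypothetical, and the Roman numerals are assumed to follow the order in which the groups are listed above (dsDNA as I through dsDNA-RT as VII).

```python
from typing import Optional


def baltimore_group(nucleic_acid: str, strands: int,
                    sense: Optional[str] = None,
                    reverse_transcribing: bool = False) -> str:
    """Return the Baltimore group (Roman numeral) for a genome description.

    Hypothetical helper for illustration only.
    nucleic_acid: "DNA" or "RNA"
    strands: 1 for single-stranded, 2 for double-stranded
    sense: "+" or "-"; only consulted for ssRNA genomes without reverse transcription
    reverse_transcribing: True if the life cycle includes reverse transcription
    """
    if nucleic_acid not in ("DNA", "RNA") or strands not in (1, 2):
        raise ValueError("unsupported genome description")

    if nucleic_acid == "DNA":
        if strands == 2:
            # dsDNA with an RNA intermediate is group VII, otherwise group I
            return "VII" if reverse_transcribing else "I"
        if reverse_transcribing:
            raise ValueError("no Baltimore group covers ssDNA with reverse transcription")
        # ssDNA: sense is ignored because the genome becomes dsDNA before transcription
        return "II"

    # RNA genomes
    if strands == 2:
        if reverse_transcribing:
            raise ValueError("no Baltimore group covers dsRNA with reverse transcription")
        return "III"
    if reverse_transcribing:
        return "VI"  # ssRNA with a DNA intermediate
    if sense == "+":
        return "IV"
    if sense == "-":
        return "V"
    raise ValueError("ssRNA genomes need a sense of '+' or '-'")
```

For example, `baltimore_group("RNA", 1, sense="-")` returns `"V"`, while an ssDNA genome returns `"II"` whatever its sense, mirroring the single class for ssDNA viruses.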
Many virus characteristics do not define which Baltimore group a virus belongs to but do correlate with specific Baltimore groups. These include the use of RNA editing and alternative splicing, whether the virus's genome is segmented, the size and structure of the virus's genome, the host range of viruses, whether the virus packages replication and transcription machinery into virions, and unorthodox methods of translating mRNA into proteins. Furthermore, while Baltimore groups were not established based on evolutionary relationships, research in the 21st century has found that certain groups, such as dsRNA, +ssRNA, and many –ssRNA viruses, share common ancestry.
Baltimore classification was created in 1971 by virologist David Baltimore and initially only included the first six groups. It was later expanded to include group VII after the discovery of dsDNA-RT viruses. Since then, it has become common among virologists to use Baltimore classification alongside virus taxonomy due to its utility. In 2018 and 2019, Baltimore classification was partially integrated into virus taxonomy based on evidence that certain groups were descended from common ancestors. Various taxa now correspond to specific Baltimore groups. An extension of Baltimore classification has been proposed by virologist Vadim Agol to encompass all possible routes of genetic information transmission.
Baltimore classification is chiefly based on the path toward transcription of the viral genome, and viruses within each group usually share the manner by which the mRNA synthesis occurs. While not the direct focus of Baltimore classification, groups are organized in such a manner that viruses in each group also typically have the same mechanisms of replicating the viral genome. Structural characteristics of the extracellular virus particle, called a virion, such as the shape of the viral capsid, which stores the genome, and the presence of a viral envelope, a lipid membrane that surrounds the capsid, have no direct relation to Baltimore groups, nor do the groups necessarily show genetic relation based on evolutionary history.
dsDNA viruses make use of several mechanisms to replicate their genome. A widely used method is bidirectional replication, in which two replication forks are established at a replication origin site and move in opposite directions on a DNA molecule. A rolling circle mechanism that produces linear strands while progressing in a loop around a circular genome is also common. Many dsDNA viruses use a strand displacement method whereby one strand is synthesized from a template strand, and a complementary strand is then synthesized from the previously synthesized strand to form a dsDNA genome. Lastly, some dsDNA viruses are replicated as part of a process called replicative transposition, whereby a viral genome that is integrated into a host cell's genome is replicated to another part of the host cell's genome.
dsDNA viruses can be divided informally into those that replicate in the cell nucleus, and as such are relatively dependent on host cell machinery for transcription and replication, and those that replicate in the cytoplasm, in which case they have obtained their own means of transcription and replication. dsDNA viruses are also sometimes divided between tailed dsDNA viruses, which refers to members of the realm Duplodnaviria, specifically the head-tail viruses of the class Caudoviricetes, and tailless or non-tailed (icosahedral) dsDNA viruses, which refers to viruses in the realms Singelaviria and Varidnaviria.
dsDNA viruses are classified into five realms and include many taxa that are unassigned to a realm:
Most ssDNA viruses contain circular genomes that are replicated by rolling circle replication (RCR). ssDNA RCR is initiated by an endonuclease enzyme that bonds to and cleaves the positive-sense strand, which allows a DNA polymerase to use the negative-sense strand as a template for replication. Replication progresses in a loop around the genome by extending the 3′-end ("three prime end") of the positive-sense strand, which displaces the prior positive-sense strand. The endonuclease then cleaves the positive-sense strand again to create a standalone genome that is joined (ligated) into a circular loop. The new ssDNA genome may be packaged into virions or replicated by a DNA polymerase to create a double-stranded form for transcription or additional rounds of replication.
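The RCR cycle described above can be illustrated with a toy model. This is a sketch under simplifying assumptions, not a biochemical simulation: the circular genome is written as a linear string that starts at the endonuclease cleavage site, and the function names `reverse_complement` and `rolling_circle_replication` are hypothetical.

```python
from typing import List

# Complement table for DNA bases
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the complementary strand, written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def rolling_circle_replication(plus_strand: str, rounds: int) -> List[str]:
    """Toy model of ssDNA rolling circle replication.

    plus_strand: positive-sense genome, linearized at the cleavage site (assumption)
    rounds: number of times the polymerase travels around the template
    Returns the positive-sense strands released over all rounds.
    """
    if rounds < 1:
        raise ValueError("rounds must be at least 1")

    # The negative-sense strand serves as the template in every round
    minus_template = reverse_complement(plus_strand)

    released = []
    current_plus = plus_strand
    for _ in range(rounds):
        # The polymerase extends the 3'-end once around the template,
        # synthesizing a new positive-sense strand
        new_plus = reverse_complement(minus_template)
        # The displaced strand is cleaved off and ligated into a standalone genome
        released.append(current_plus)
        current_plus = new_plus
    return released
```

Each round releases one copy of the positive-sense genome, so `rolling_circle_replication("ATGC", 3)` yields three strands identical to the input.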
Parvoviruses and Bidensovirus have linear ssDNA genomes that are replicated by rolling hairpin replication (RHR), which is similar to RCR. Their genomes have stem-loops at each end that repeatedly unfold and refold during replication to change the direction of DNA synthesis, moving it back and forth along the linear genome, which produces numerous copies of the genome in a continuous process. Individual genomes are then excised from this molecule by the endonuclease.
Nearly all ssDNA viruses have positive-sense genomes, but a few exceptions and peculiarities exist. Anelloviridae are the only ssDNA viruses that have negative-sense genomes. Parvoviruses may package either the positive- or negative-sense strand into capsids. Lastly, bidnaviruses may package both the positive- and negative-sense strands of their bipartite genome. In any case, the sense of ssDNA viruses, unlike that of ssRNA viruses, is not sufficient to separate ssDNA viruses into two Baltimore groups since all ssDNA viral genomes are converted to dsDNA forms before transcription and replication.
ssDNA viruses are classified into two realms and include a few families that are unassigned to realms:
dsRNA is not a molecule made by cells, so eukaryotes have evolved antiviral systems to detect and inactivate viral dsRNA. To counter this, many dsRNA viruses replicate their genomes inside of capsids, thereby avoiding detection inside of the host cell's cytoplasm. Positive-sense strands are then forced out from the capsid to be translated or translocated from the mature capsid to a progeny capsid.
dsRNA viruses are classified into two phyla within the kingdom Orthornavirae, realm Riboviria:
Many +ssRNA viruses are able to have only a portion of their genome transcribed. Typically, subgenomic RNA (sgRNA) strands are used for the translation of structural and movement proteins needed during intermediate and late stages of infection. sgRNA transcription may occur by commencing RNA synthesis within the genome rather than from the 5′-end ("five prime end"), by stopping RNA synthesis at specific sequences in the genome, or, as a part of both aforementioned methods, by synthesizing leader sequences from viral RNA that are then attached to sgRNA strands. During infection, the viral RdRp is always translated directly from the genome first because replication, performed by the RdRp, is required for sgRNA synthesis.
Because the process of replicating the viral genome produces intermediate dsRNA molecules, +ssRNA viruses can be targeted by the host cell's immune system. To avoid detection, +ssRNA viruses replicate in membrane-associated vesicles that are used as replication factories. From there, only +ssRNA strands enter the main cytoplasmic area of the cell. These strands may be used as mRNA or as progeny genomes.
+ssRNA viruses can be divided informally into those that have polycistronic mRNA, which encodes a polyprotein that is cleaved to form multiple mature proteins, and those that undergo multiple rounds of translation of the genome or subgenomic mRNAs to express proteins. +ssRNA viruses are classified into three phyla in the kingdom Orthornavirae, realm Riboviria:
The second manner is similar, but instead of synthesizing a cap, the RdRp may use its endonuclease activity to snatch a short sequence of nucleotides from host cell mRNA, a process called cap snatching, and use it as the 5′ cap of viral mRNA. Genomic –ssRNA is replicated from the positive-sense antigenome in a manner similar to transcription, except in reverse, using the antigenome as a template for the genome. The RdRp complex moves from the 3′-end to the 5′-end of the antigenome and ignores all transcription signals when synthesizing genomic –ssRNA.
Various –ssRNA viruses use special mechanisms for transcription. The way of polyadenylating the end of an mRNA sequence may be through polymerase stuttering, during which the RdRp transcribes an adenine from uracil and then moves back in the RNA sequence to transcribe it again, continuing this process until hundreds of adenines have been added to the 3′-end of the mRNA. Some –ssRNA viruses are ambisense, as both the positive- and negative-sense strands separately encode viral proteins. These viruses produce one mRNA strand from the genome and one from a complementary strand.
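Polymerase stuttering can be sketched in a few lines. The following is a simplified illustration, assuming the template is given in the order the RdRp reads it (3′ to 5′) and that stuttering happens at a terminal uracil; the function name `transcribe_with_stutter` and the parameter `extra_adenines` are hypothetical.

```python
# Base-pairing rules for RNA
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def transcribe_with_stutter(template_3_to_5: str, extra_adenines: int) -> str:
    """Toy model of mRNA synthesis with polymerase stuttering.

    template_3_to_5: negative-sense template, written in the order it is read
    extra_adenines: how many times the polymerase slips back onto the last uracil
    Returns the mRNA written 5'->3'.
    """
    if extra_adenines < 0:
        raise ValueError("extra_adenines must be non-negative")
    if any(base not in RNA_COMPLEMENT for base in template_3_to_5):
        raise ValueError("template must contain only A, C, G, U")

    # Ordinary transcription: one complementary base per template base
    mrna = [RNA_COMPLEMENT[base] for base in template_3_to_5]

    # Stuttering: if the template ends in uracil, the polymerase moves back
    # and transcribes that same uracil again, adding one adenine each time
    if template_3_to_5.endswith("U"):
        last = len(template_3_to_5) - 1
        for _ in range(extra_adenines):
            mrna.append(RNA_COMPLEMENT[template_3_to_5[last]])

    return "".join(mrna)
```

With the template `"UACGU"` and five stutter steps, the transcript is `"AUGCA"` followed by five extra adenines; a template that does not end in uracil is transcribed without a tail.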
–ssRNA viruses in Negarnaviricota can be divided informally into those that have non-segmented and segmented genomes. Non-segmented –ssRNA viruses replicate in the cytoplasm, and segmented –ssRNA viruses replicate in the nucleus. For segmented viruses, the RdRp transcribes one monocistronic mRNA strand from each segment of the genome. This distinction is closely followed within Negarnaviricota, as viruses in the subphylum Haploviricotina usually have non-segmented genomes, and viruses in the subphylum Polyploviricotina have segmented genomes. Moreover, –ssRNA viruses that synthesize a cap structure on viral mRNA are assigned to Haploviricotina, whereas –ssRNA viruses that snatch caps from host mRNA belong to Polyploviricotina.
The second lineage of –ssRNA viruses is the realm Ribozyviria, which includes Hepatitis D virus (HDV) and its relatives. Ribozyvirians have covalently closed circular –ssRNA genomes that are covered in nucleocapsid proteins to form a ribonucleoprotein (RNP) complex. After entering a cell, the RNP complex migrates from the cytosol to the nucleus, where the genome is replicated through RCR by a host RNA polymerase II enzyme. This process creates a long molecule, called a concatemer, that contains a series of positive-sense antigenomic strands. Ribozymes encoded in this antigenome catalyze cleavage of the concatemer to form individual strands that are either translated or ligated for replication through RCR to produce concatemers of negative-sense genomic strands. Ribozymes encoded in the negative-sense strands then catalyze cleavage of the negative-sense concatemer to produce individual genomic –ssRNA strands.
Lastly, there is a group of –ssRNA viruses assigned to the tentative phylum Arctiviricota in the kingdom Orthornavirae. Arctiviricots inhabit the Arctic Ocean and are believed to represent a separate –ssRNA lineage in Orthornavirae from Negarnaviricota. Their mechanisms of replication and transcription have not been described. In summary, –ssRNA viruses belong to the following taxa:
ssRNA-RT viruses are all included in the class Revtraviricetes, the sole class in the kingdom Pararnavirae, realm Riboviria. Excluding the family Caulimoviridae, which belongs to group VII, all members of the Revtraviricetes order Ortervirales are ssRNA-RT viruses. ssRNA-RT viruses are sometimes called retroviruses, a term shared with members of the ssRNA-RT family Retroviridae.
dsDNA-RT viruses are, like ssRNA-RT viruses, all included in the class Revtraviricetes. Two families of dsDNA-RT viruses are recognized: Caulimoviridae, which belongs to the order Ortervirales, and Hepadnaviridae, which is the sole family in the order Blubervirales. The provisional family Nudnaviridae is considered to be a sister family to hepadnavirids. dsDNA-RT viruses are often called pararetroviruses.
Alternative splicing differs from RNA editing in that it does not change the mRNA sequence but instead changes the coding capacity of an mRNA sequence through the use of alternative splice sites. The two processes otherwise have the same result: multiple proteins are expressed from a single gene.
By host, a large majority of prokaryotic viruses are dsDNA viruses, but a significant minority are ssDNA and +ssRNA viruses. There are a relatively small number of prokaryotic dsRNA viruses and no prokaryotic –ssRNA or RT viruses. Eukaryotic viruses, in contrast, are predominantly RNA viruses, though eukaryotic DNA viruses are common. Well-characterized eukaryotic viromes contain mostly +ssRNA viruses and, in some lineages such as fungi, dsRNA viruses. ssRNA-RT viruses are also common in eukaryotes, especially in animals.
Biological factors influence host range. For example, dsDNA viruses do not infect plants because large dsDNA molecules are unable to pass through plasmodesmata, intercellular channels that connect plant cells. The dominance of DNA viruses in prokaryotes may be because they outcompete RNA viruses. In eukaryotic cells, however, the nucleus is a barrier that requires adaptation by DNA viruses. They either have to evolve means to enter the nucleus to replicate or obtain their own replication and transcription machinery to replicate in virus factories in the cytosol. In contrast, the endomembranes of eukaryotic cells appear to be a beneficial environment for RNA virus replication.
dsDNA viruses encode a broad range of proteins involved in replication and transcription. In some cases, they encode nearly complete systems that grant the virus partial autonomy from cells during genome expression and replication. Most ssDNA viruses encode an endonuclease that initiates RCR or RHR while relying on host cell machinery for the rest of replication and transcription. The endonuclease has to be encoded by these viruses because they use a DNA replication method not normally used by cells. Anelloviruses and bidnaviruses are the exceptions: anelloviruses encode proteins that are not homologous to known proteins, and bidnaviruses encode a protein-primed DNA polymerase.
RNA replication and reverse transcription are usually discouraged by cells, which necessitates that all RNA and RT viruses encode their own RNA-dependent polymerase. Satellite viruses, such as the viruses of Ribozyviria, are the only exception because they depend on other viruses for replication. Almost all RNA and RT viruses incorporate their RNA-dependent polymerase into the virion because the enzyme is required to synthesize viral mRNA in infected cells. The exceptions are +ssRNA viruses and caulimoviruses, which are dsDNA-RT viruses. +ssRNA viruses do not do so because their genomes function as mRNA and are translated upon cell entry. For caulimoviruses, the host enzyme RNA polymerase II transcribes the genome.
Most ssDNA viruses likely originate from plasmids that, on multiple occasions, recombined with other genomes to obtain the structural proteins needed to form virions. The evolutionary history of dsDNA viruses is the most complex, as they appear to have emerged independently on numerous occasions. Two major lineages of dsDNA viruses exist: the realm Duplodnaviria and the realm Varidnaviria, the latter of which also contains ssDNA viruses that are descended from dsDNA viruses. The opposite is true in the realm Monodnaviria, which contains dsDNA viruses descended from ssDNA viruses. There are also two minor realms, Adnaviria and Singelaviria, that exclusively contain dsDNA viruses. Lastly, there are dsDNA virus families unassigned to higher taxa that are distinct from existing realms and likely constitute small realms.
Of the replication-expression systems used by viruses, only Baltimore group I (dsDNA) is used by cells. The other groups may be remnants of the primordial stage of life before the emergence of modern-like cells, during which the dsDNA system used by extant cells had not yet become uniform. The ancestors of RNA viruses in particular may have emerged during the time of the RNA world. Although virus realms are evolutionarily independent from each other, the replicative proteins encoded by viruses in the four major realms (Duplodnaviria, Monodnaviria, Riboviria, and Varidnaviria) are built on the core RNA recognition motif, one of the most common nucleic acid-binding protein domains in nature. Therefore, the replication-expression cycles most likely diversified before the separation of large dsDNA replicators, which became the ancestors of cellular life, from other types of replicators, which became selfish genetic elements and gave rise to viruses.
Over time, the belief that Baltimore groups were monophyletic spread among virologists. This was reflected in taxonomies published by the ICTV and the National Center for Biotechnology Information, which for decades placed the Baltimore groups as informal higher ranks above official taxonomic ranks. From 1991 to 2017, virus taxonomy used a five-rank system ranging from order to species, with Baltimore classification used in conjunction. Outside of official taxonomy, supergroups of viruses joining different taxa were created over time based on increasing evidence of deeper evolutionary relations. The advancement of sequencing methods in the 21st century in particular made it possible to study virus evolution and diversity in greater detail. This enabled virologists to better understand the relationships between Baltimore groups and the evolutionary history of viruses. Consequently, in 2016, the ICTV began to consider establishing ranks higher than order as well as how the Baltimore groups would be treated among higher taxa.
In two votes in 2018 and 2019, the ICTV established a 15-rank system ranging from realm to species. As part of this, the Baltimore groups for ssDNA, dsRNA, +ssRNA, –ssRNA, and RT viruses were incorporated into formal taxa. In 2019, the realm Riboviria was established and initially included all dsRNA, +ssRNA, and –ssRNA viruses. A year later, Riboviria was expanded to also include RT viruses. Within the realm, RT viruses are included in the kingdom Pararnavirae and the three other Baltimore groups in the kingdom Orthornavirae as defining traits of the kingdom's phyla. While –ssRNA viruses of Ribozyviria were initially classified in Riboviria, this was a clerical error that was fixed in 2020. A year later, Ribozyviria was established for HDV and its relatives. For ssDNA viruses, the realm Monodnaviria was established in 2020 to accommodate almost all ssDNA viruses, as well as dsDNA viruses descended from them.
In 1974, virologist Vadim Agol proposed an extension of Baltimore classification to encompass all possible means of genetic information transmission and describe the hierarchical routes of information transmission, including both expression and replication, rather than solely mRNA synthesis. In the expanded system, there are 35 classes, 17 superclasses, and six types of genetic information transfer. The system was revisited in 2021 by Eugene Koonin et al. in light of discoveries made since the 1970s. Known viruses occupy 13 classes, one of which is shared with cells, seven superclasses, and three types. A fourteenth class is occupied by F-plasmid. Ambisense viruses occupy two classes simultaneously, though separate classes could be made for them. Most unoccupied classes are of DNA-RNA hybrids, which appear to be disfavored by evolution since it may be advantageous to convert such molecules to dsDNA, the molecule most suitable for genome replication. According to Koonin et al., viruses that belong to the unoccupied classes are unlikely to be discovered unless they are rare in nature.