MicrobiologyBytes: Virology: Retroviruses. Updated: April 8, 2009




Retroviruses have received much attention in recent years (even before the discovery of the first human retrovirus in 1981), but they have a long history:
  • 1908: Ellerman and Bang, searching for an infectious cause (bacterium) for leukaemia, studied leukaemia in chickens and succeeded in transferring the disease from one to another by cell-free tissue filtrates (Ellerman, C., and O. Bang. Centralbl. Bakteriol. 46: 595–609).
  • 1909: Paul Ehrlich proposed his theory of 'immune surveillance' (Ehrlich, P. Über den jetzigen stand der karzinomforschung. Ned. Tijdschr. Geneeskd. 5, 273–290):
    • Tumour cells frequently emerge in the organism
    • They are rapidly eliminated by the immune system
  • 1910/1: Peyton Rous transmitted solid tumours of chickens by transplanting tissue, but also isolated the infectious agent (Rous Sarcoma Virus: Rous, P. J. Exp. Med. 12:696–705; Rous, P. J. Exp. Med. 13:397–411).
    This discovery was followed by many other examples of acutely transforming retroviruses, together with the structural characterization of the viruses involved.
  • 1960s: Howard Temin knew that retrovirus genomes were composed of RNA and observed that replication was inhibited by actinomycin D (which inhibits DNA-dependent RNA synthesis); he therefore proposed the concept of reverse transcription (Nobel prize awarded to Baltimore and Temin, 1975).
  • 1969: Huebner and Todaro proposed the viral oncogene hypothesis - the transmission of viral and oncogenic information as genetic elements (rather than as a pathogenic response to a virus) - explains the vertical (germ line) transmission of 'cancers', first observed by Gross, 1951.
  • 1970s: Richard Nixon's 'war on cancer' (post Kennedy space programme - the race to the moon) - failed to find any retroviral agents which cause human cancer (many false alarms - but did pump a lot of money into biomedical research).
  • 1981: Human T-cell leukaemia virus discovered, the first pathogenic human retrovirus.
  • 1983: Human immunodeficiency virus discovered.

Most of the retroviruses we currently know (many!) infect vertebrates, but as a group, they have been identified in virtually all organisms including invertebrates - an evolutionarily successful design!


Group VI: RNA Reverse Transcribing Viruses



Type Species:
  • Avian leukosis virus
  • Mouse mammary tumor virus
  • Murine leukemia virus
  • Bovine leukemia virus
  • Walleye dermal sarcoma virus
  • Human immunodeficiency virus 1
  • Chimpanzee foamy virus
  • Saccharomyces cerevisiae Ty3 virus
  • Drosophila melanogaster gypsy virus
  • Saccharomyces cerevisiae Ty1 virus
  • Drosophila melanogaster copia virus

Historically, retroviruses were divided into groups based on their morphology in negatively-stained E.M. pictures:

By and large, molecular genetic studies have borne out these morphologic differences, but have also largely replaced them: most comparisons are now made on the basis of sequence conservation.

Retrovirus Structure:

There is considerable diversity between various types of retrovirus; the following is a generalized description of the particle. There is a universal nomenclature for retrovirus proteins:
| Name | Protein | Function |
|------|---------|----------|
| MA | Matrix | Matrix protein (gag gene); lines envelope |
| CA | Capsid | Capsid protein (gag gene); protects the core; most abundant protein in virus particle |
| NC | Nucleocapsid | Capsid protein (gag gene); protects the genome; forms the core |
| PR | Protease | Essential for gag protein cleavage during maturation |
| RT | Reverse transcriptase | Reverse transcribes the RNA genome; also has RNAseH activity |
| IN | Integrase | Encoded by the pol gene; needed for integration of the provirus |
| SU | Surface glycoprotein | The outer envelope glycoprotein; major virus antigen |
| TM | Transmembrane protein | The inner component of the mature envelope glycoprotein |

All the above proteins are essential for replication; some retroviruses also encode additional essential and non-essential proteins.

Retrovirus particle

Retroviruses have enveloped particles, somewhat variable in size/shape but ~100nm diameter. The envelope carries a virus-encoded glycoprotein, which forms spikes in the membrane. There are certain structural/functional similarities between the envelope glycoprotein and the influenza haemagglutinin (N.B: NO SEQUENCE SIMILARITIES). The mature protein is cleaved into 2 polypeptides, the outer surface glycoprotein (SU) and the inner transmembrane protein (TM):

Envelope glycoprotein

Inside the membrane is the matrix (MA) protein, which is rather amorphous. This largely obscures the capsid (CA), which is believed to be icosahedral. CA is the most abundant protein in the particle (~33% total weight). Inside the capsid is the core = RNA genome+NC protein+RT+IN. This is usually a conical, electron-dense structure clearly visible in negatively stained E.M. pictures (matrix and capsid appear amorphous).
Turner B.G., Summers M.F. (1999) Structural Biology of HIV. J.Mol.Biol. 285: 1-32.


All retrovirus genomes consist of two molecules of RNA, which are s/s, (+)sense and have 5' cap and 3' poly-(A) (equivalent to mRNA). These vary in size from ~8-11kb. Retrovirus genomes have 4 unique features:
  1. They are the only viruses which are truly diploid.
  2. They are the only RNA viruses whose genome is produced by cellular transcriptional machinery (without any participation by a virus-encoded polymerase).
  3. They are the only viruses whose genome requires a specific cellular RNA (tRNA) for replication.
  4. They are the only (+)sense RNA viruses whose genome does not serve directly as mRNA immediately after infection.
These two molecules are physically linked as a dimer by hydrogen bonds (co-sediment). In addition, there is a 3rd type of nucleic acid present in all particles, a specific type of tRNA (usually trp, pro or lys) - required for replication (below).
Gene order in all retroviruses is invariant:
5' - gag - pol - env - 3'

Some retroviruses have additional genes:

Retrovirus genes

Sequence features of retrovirus genomes:

Genome organization
  • R Region: A short (18-250nt) sequence which forms a direct repeat at both ends of the genome, which is therefore 'terminally redundant'.
  • U5: A unique, non-coding region of 75-250nt which is the first part of the genome to be reverse transcribed, forming the 3' end of the provirus genome (below).
  • Primer Binding Site: 18nt complementary to the 3' end of the specific tRNA primer used by the virus to begin reverse transcription.
  • Leader: A relatively long (90-500nt) non-translated region downstream of the transcription start site and therefore present at the 5' end of all virus mRNAs.
  • Polypurine Tract: A short (~10nt) run of A/G residues responsible for initiating (+)strand synthesis during reverse transcription.
  • U3: A unique non-coding region of 200-1,200nt which forms the 5' end of the provirus after reverse transcription; contains the promoter elements responsible for transcription of the provirus.
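The layout described above can be sketched as a small data structure. This is only an illustration: the 5' to 3' order of the regions follows these notes, but every sequence below is an invented placeholder (far shorter than the real regions, apart from the 18nt primer binding site), and the helper names are hypothetical.

```python
# Toy model of a retrovirus genomic RNA (vRNA), listed 5' -> 3'.
# ASSUMPTION: all sequences are made-up placeholders, not real virus sequences.
REGIONS = [
    ("R", "GCGCGC"),
    ("U5", "UUCCUUCC"),
    ("PBS", "ACGUACGUACGUACGUAC"),  # 18 nt, matching the length given in the text
    ("leader", "CUCUCUCUCU"),
    ("gag", "AUGGGAGGA"),
    ("pol", "CCUCAGAUC"),
    ("env", "AUGAGAGUG"),
    ("PPT", "AAGGAGAGGA"),          # ~10 nt run of A/G residues
    ("U3", "UGUGACACUGUG"),
    ("R", "GCGCGC"),                # same sequence as the 5' R region
]


def assemble(regions):
    """Join the region sequences into one 5' -> 3' string."""
    return "".join(seq for _, seq in regions)


def region_spans(regions):
    """Map each region name to a list of half-open (start, end) coordinates."""
    spans = {}
    pos = 0
    for name, seq in regions:
        spans.setdefault(name, []).append((pos, pos + len(seq)))
        pos += len(seq)
    return spans


genome = assemble(REGIONS)
spans = region_spans(REGIONS)

# 'Terminally redundant': the same R sequence sits at both ends of the genome.
r_len = len(REGIONS[0][1])
is_terminally_redundant = genome[:r_len] == genome[-r_len:]

# The polypurine tract contains only A and G.
ppt_start, ppt_end = spans["PPT"][0]
ppt_is_purine_only = set(genome[ppt_start:ppt_end]) <= {"A", "G"}
```

Both checks come out true for this toy genome, and `spans["R"]` holds two entries, one for each copy of the repeat.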





To initiate the infection, the SU envelope glycoprotein binds to a specific receptor on the surface of the host target cell. The specificity of this interaction does much to determine the cell-tropism and pathogenesis of different retroviruses, or even different isolates of the same virus (e.g. HIV). Murine retroviruses (MLVs) are sub-divided on the basis of receptor-determined host species specificity:

Interference between an exogenous virus and an endogenous virus of the same receptor specificity results in 'interference groups' of viruses (e.g. ALVs). In recent years, a number of different retrovirus receptor molecules have been identified: Sommerfelt M.A. (1999) Retrovirus receptors. J.Gen.Virol. 80:3049-3064. See also: Restriction factors: a defense against retroviral infection Bieniasz P.D. (2003) Trends in Microbiology 11: 286-291.


Retrovirus replication

It is probable that receptor binding results in conformational changes in the glycoprotein spike, revealing the (previously masked) fusion domain in the TM protein and resulting in fusion of the virus envelope with the cell membrane. Penetration and uncoating are poorly understood, but it is now known that uncoating is only partial, resulting eventually in a core (nucleocapsid) particle within the cytoplasm. Reverse transcription occurs inside the ordered structure of this core particle - with the reactants (RT + RNA + nucleotides) free in solution, reverse transcription is initiated but cannot be completed, and aborts soon after.

Reverse Transcription:

Reverse transcription


The d/s DNA product formed by this reaction is known as the provirus (c.f. 'prophage'). The vRNA carries only R-U5 at its 5' end and U3-R at its 3' end, whereas the provirus has a complete U3-R-U5 unit at each end, so it is longer than the vRNA by one U3 and one U5. These direct repeats at each end of the provirus genome are known as the long terminal repeats (LTRs). Three forms of provirus DNA are found in all infected cells:

Provirus DNA

It is not clear how these are related to one another, but the circles probably form by intracellular ligation. The linear and 2-LTR circle forms are infectious (unlike the (+)sense vRNA!). Reverse transcription occurs in the cytoplasm, after which the provirus DNA migrates into the nucleus.
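The relationship between the vRNA and the linear provirus can be sketched using region names and lengths alone. The lengths below are hypothetical values picked from within the ranges quoted earlier in these notes, "internal" is a stand-in for everything between U5 and U3, and the function names are invented for illustration. Counting lengths in this toy layout, the extra material in the provirus amounts to one U3 plus one U5, because R is already present at both ends of the vRNA.

```python
# ASSUMPTION: illustrative lengths in nucleotides, not measurements of any real virus.
LENGTHS = {"R": 100, "U5": 80, "internal": 8000, "U3": 450}

# vRNA layout, 5' -> 3': R-U5 at one end, U3-R at the other.
VRNA = ["R", "U5", "internal", "U3", "R"]


def provirus_layout(vrna):
    """Return the provirus layout: a full U3-R-U5 LTR at each end."""
    if vrna[:2] != ["R", "U5"] or vrna[-2:] != ["U3", "R"]:
        raise ValueError("expected a layout of the form R, U5, ..., U3, R")
    ltr = ["U3", "R", "U5"]
    return ltr + vrna[2:-2] + ltr


def total_length(layout, lengths):
    """Sum the lengths of the named regions in a layout."""
    return sum(lengths[name] for name in layout)


PROVIRUS = provirus_layout(VRNA)
extra = total_length(PROVIRUS, LENGTHS) - total_length(VRNA, LENGTHS)
```

Here `PROVIRUS` is `U3, R, U5, internal, U3, R, U5`, and `extra` equals `LENGTHS["U3"] + LENGTHS["U5"]`.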


Integration is catalysed by the IN polypeptide (part of the RTase complex). It is a highly specific reaction with respect to the provirus, but random with respect to host cell DNA. Formerly, it was thought that the 2-LTR circle was the substrate for integration, but it is now believed that the linear form (probably the direct product of reverse transcription) is the actual substrate used.


The ends of the LTRs consist of inverted repeats of 4-6 bp. These are brought together to form a cleavage site for IN and are cleaved to form a staggered cut. This molecule is then inserted into the host cell DNA. The net result of the integration process is that:

  1. The integrated provirus contains 1 or 2 fewer bases at the end of each LTR
  2. The ends of the integrated LTRs always have the same sequence: 5' - TG...CA - 3'
  3. 4-6 bp of host cell DNA flanking the integrated provirus are duplicated.
These observations can be explained by a model where a staggered cut (5' overhang) is introduced into both the ends of the LTRs and the host cell DNA, followed by joining of the cut ends and repair of the free 3' ends. Once integrated, the provirus is present for the lifetime of the cell (think about germ-line integration). There is no specific mechanism for excision of the provirus (c.f. lambda), and the infected cell cannot be 'cured'.
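The three outcomes listed above can be reproduced with a deliberately simplified string model. It treats DNA as a single strand, so the staggered cut and repair are represented only by their net effect (the target site appearing on both sides of the insert). The sequences, the `integrate` function and its parameters are all hypothetical.

```python
def integrate(provirus, host, site, trim=2, dup=4):
    """Toy single-strand model of provirus integration.

    trim: bases removed from each end of the linear provirus.
    dup:  length of host target sequence duplicated on either side.
    """
    if trim < 0 or dup < 0 or len(provirus) <= 2 * trim:
        raise ValueError("provirus too short for the requested trim")
    if site < 0 or site + dup > len(host):
        raise ValueError("target site lies outside the host sequence")
    # Each LTR end loses `trim` bases before joining.
    processed = provirus[trim:len(provirus) - trim]
    # Net effect of the staggered host cut plus repair: host[site:site+dup]
    # ends up immediately before and immediately after the insert.
    return host[:site + dup] + processed + host[site:]


# ASSUMPTION: invented sequences; the ends form a short inverted repeat.
provirus = "AATGCCGGAATTCCGGCATT"
host = "GATTACAGGCTTAACCGT"
site = 6

integrated = integrate(provirus, host, site)
target = host[site:site + 4]
processed = provirus[2:-2]
```

In this example the trimmed provirus begins with TG and ends with CA, and `target` flanks it on both sides in `integrated`.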

Gene expression:

Retroviruses use the cellular transcriptional machinery for expression (although a few encode additional transcriptional and post-transcriptional regulatory factors - HTLV and HIV). Therefore they are expressed like cellular genes. To compress maximal information into a small genome, they make use of a number of 'tricks', such as splicing and ribosomal frameshifting.
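Ribosomal frameshifting can be illustrated with a toy reading-frame model. The sketch below assumes a -1 shift, in which the ribosome slips back one nucleotide and continues in a new frame, so a single mRNA yields a different set of downstream codons. The sequence, the slip position and the function names are invented for illustration, and the model ignores the sequence and structural signals that control frameshifting in real viruses.

```python
def codons(seq, start=0):
    """Split seq into complete triplets, beginning at index `start`."""
    return [seq[i:i + 3] for i in range(start, len(seq) - 2, 3)]


def read_with_minus1_shift(mrna, slip_after):
    """Read frame 0 up to `slip_after`, then step back 1 nt and carry on.

    slip_after must be a multiple of 3 (the end of a frame-0 codon).
    """
    if slip_after % 3 != 0 or not 3 <= slip_after <= len(mrna):
        raise ValueError("slip_after must mark the end of a frame-0 codon")
    upstream = codons(mrna[:slip_after])
    # The last nucleotide of the upstream codon is read a second time.
    downstream = codons(mrna, slip_after - 1)
    return upstream + downstream


# ASSUMPTION: made-up sequence, not taken from any real virus.
mrna = "AUGCCCUUUAGGGAAACC"
in_frame = codons(mrna)
shifted = read_with_minus1_shift(mrna, 9)
```

The first three codons are identical in `in_frame` and `shifted`; after the slip point the two readings diverge.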

LTR Structure:

The U3 region of the LTR contains the promoter elements responsible for the initiation of transcription:

Promoter elements

In recent years, various LTRs have been intensively studied and dissected by molecular techniques, including:



Splicing is regulated by the cellular apparatus, which interacts with cis-acting sequences present in the mRNA. The proteins encoded by the gag, pol and pro (see below) genes are expressed from a full length genomic RNA (= vRNA). The env protein is expressed from a spliced mRNA. In more complex retroviruses, e.g. HTLV and the Lentiviruses, multiply spliced mRNAs are also produced. The pattern of splicing in HIV is very complex!
Expression of the protease gene: pro overlaps gag and/or pol, but is still expressed from the same full-length mRNA. Different viruses have a variety of post-transcriptional strategies to do this:



The genome is packaged as the particle buds out through the membrane. With both types, maturation occurs after the particle has budded, by cleavage events catalysed by the protease. Considerable structural changes occur during this process, resulting in the smooth gag shell of the immature particle being completely rearranged and leading to the condensation of the core visible in mature particles. N.B. - some types of retrovirus, notably Lentiviruses, are capable of infecting cells by direct cell-to-cell contact, without the formation of infectious extracellular particles.

Freed EO. (2004) HIV-1 and the host cell: an intimate association. Trends Microbiol. 12: 170-177.


A negatively stained electron micrograph of HIV (C-type) particles.

Retrovirus Genetics:

The genetics of retroviruses are complex:

Retrotransposons - endogenous retrovirus-like genetic elements:

Retrovirus-like elements

Much of the human genome consists of interspersed repetitive DNA sequences. The origin of these sequences seems to have been retrotransposition in the germ line, generated by:

~11% of the mammalian genome is composed of retrovirus-like retrotransposons: "transposable elements in which transposition involves a process of reverse transcription with an RNA intermediate similar to that of a retrovirus". Compare this with only ~2.5% of the human genome which encodes unique (non-repeated) genes!
Alu and L1 are the major families of human interspersed repeated DNA, amounting to 10-15% of the genome. Another type of repetitive DNA element consists of retrovirus-like elements (RLEs), or human endogenous retroviruses (HERVs), representing about 7% of the human genome. Their structure closely resembles that of retroviruses, carrying internal sequences with homology to gag, pol, and sometimes env open reading frames flanked by long terminal repeats. Similar sequences occur in all organisms, from yeast to vertebrates. Bannert N, Kurth R. Retroelements and the human genome: new perspectives on an old relation. Proc Natl Acad Sci USA. 101: 14572-14579, 2004.

Phoenix from the ashes: The 5 million year old virus

Where do all these retroviruses come from?

So what have retroviruses ever done for us?

Mi, S. et al.(2000) Syncytin is a captive retroviral envelope protein involved in human placental morphogenesis Nature 403: 785-9.



The study of retroviral pathogenesis has concentrated on oncogenesis and, more recently, AIDS, but retroviruses also cause a variety of haematopoietic and neurological conditions, including:

It was recently reported that an ancient retrotransposon insertion is the cause of Fukuyama-type muscular dystrophy, one of the commonest autosomal recessive disorders in Japan (Kobayashi,K. et al, (1998) Nature 394: 388-392). To date this is the only known instance of insertional mutagenesis of the human genome caused by this type of element, but other examples look certain to be discovered in future.

Retroviruses are under active development as vectors for gene therapy.


© MicrobiologyBytes 2009.