PHILADELPHIA (July 29, 2010) – Over millions of years of evolution, retroviruses, which insert their genetic material into the host genome as part of their replication, have left behind numerous bits of their genetic material in vertebrate genomes. Now, in an unexpected discovery, a team of researchers reports that human and other vertebrate genomes also contain many ancient sequences from two deadly virus families.
Because neither virus family inserts its genetic material into the host genome during replication, as retroviruses do, the discovery was unanticipated. The conservation of some of these sequences over evolutionary time, however, suggests that they give the host a selective advantage, perhaps protecting it from future infections by viruses from those families.
"This was a surprise for us," says Anna Marie Skalka, PhD, Director Emerita of the Institute for Cancer Research at Fox Chase Cancer Center and corresponding author on the study to be published in the July 29 issue of PLoS Pathogens. "It says that the source of our genetic material is considerably wider than we thought. It includes our own genes and unexpected viral genes as well. It is extraordinary."
The team, which included lead author Vladimir A. Belyi, PhD, and co-author Arnold J. Levine, PhD, both at the Institute for Advanced Study in Princeton, compared 5,666 viral genes from all known non-retroviral families with single-stranded RNA genomes to the genomes of 48 vertebrate species, including humans. In doing so, they uncovered 80 separate viral sequence integrations in 19 different vertebrate species. Remarkably, nearly all of the viral sequences come from ancient relatives of just two viral families, the Ebola/Marburgviruses and the Bornaviruses, which include deadly pathogens that cause hemorrhagic fevers and neurological disease, respectively.
The findings were also unexpected because there was no obvious mechanism to account for integration of these viral genes into the host genome.
"These viruses are RNA viruses," Skalka says. "They replicate their RNA and are not known to make any DNA. And they have no known mechanism for getting their genetic material integrated into the DNA of the host genome. Indeed, some of them don't even enter the nucleus when they replicate."
It is remarkable that the sequences, some of which may have been integrated into the genomes more than 40 million years ago, have been largely conserved and contain open reading frames, DNA sequences potentially able to encode a protein. The maintenance of open reading frames over such a long time suggests that the sequences provide some active benefit to the host, such as protection during subsequent exposures to related viruses. "In a way, one might even think of these integrations as genomic vaccinations," says Skalka, who initiated the study while on sabbatical at the Institute for Advanced Study.
Demonstrating conclusively that the viral sequences have some biological function will take additional work. However, the team has noted that expression of some of these viral open reading frames has been detected in human tissues, which supports the possibility that they are biologically active in host species.
Skalka, whose main line of work focuses on retroviral integration and expression in host genomes, says this and other recent discoveries show there is a lot more movement in our genome than was previously thought. The team's PLoS Pathogens paper shows that integration of the ancient viral sequences was probably mediated by movable elements called LINEs, which are abundant in mammalian genomes. "It goes back to the work of Nobelist Barbara McClintock and her vision of the genome as being very flexible, due to the activity of transposable elements, which she first discovered in plant genomes," Skalka says.