Our cells’ nuclei aren’t exactly what you’d call calm, quiet places. They’re more like busy city squares, filled with a constant bustle of activity: DNA folds and unfolds, proteins zip in and out to read genes and tag histones, and whole chromosomes duplicate themselves while the cell preps for its next round of division.
Now add one more ingredient to this mix: genes that don’t sit still. Our genome is full of what are called transposons, the remnants of ancient viruses that embedded themselves in our DNA over evolutionary time. Transposons pretty much do just one thing: copy and insert themselves all over the genome, cutting in on other genes like suitors at a debutantes’ ball.
You might think that having pieces of DNA randomly jumping into and out of genes wouldn’t be a very good thing. And you’d be right: members of Boston Children’s Hospital’s Informatics Program (CHIP) recently reported in Science the first evidence that transposons may directly contribute to the development of some cancers. But the story isn’t that simple.
“Half of the human genome is made up of transposons,” says Peter Kharchenko, PhD, a CHIP researcher. “Most of them are no longer functional, in that they’ve lost the ability to copy themselves into new locations. But there are plenty within the genome that are still active.”
Some transposons have played crucial roles in making our cells what they are today—evolving into the very mechanisms that help control expression of our genes. At the same time, our cells have also come up with ways of making sure that transposons don’t start randomly jumping all over the genome.
But there are windows of opportunity for transposons to cause trouble. “Transposons can cause problems in early embryonic development,” says Peter Park, PhD, another CHIP member and Kharchenko’s former mentor. “It’s long been tempting to think that they might help drive the development of cancer cells.
“But because cancer is, from a genomic standpoint, a rare event,” he adds, “we’ve never had enough high quality cancer sequence data to dive in and fish them out.”
Until now, that is. The Park lab participates in The Cancer Genome Atlas (TCGA) network, a National Cancer Institute-supported effort to catalog the driving mutations of several major cancers. Through TCGA and previously published studies, Park and Kharchenko accessed sequence data from 43 tumors representing five kinds of cancer (prostate, colon, ovarian, brain and blood) and compared those data with normal sequences from the same patients.
To carry out their analysis, Eunjung Lee, PhD, a postdoctoral fellow in the Park lab, developed a computational tool that would sift through all of that sequence data, 30 terabytes’ worth (that’s a 30 followed by 12 zeroes, counted in bytes), and predict where any transposons may have inserted themselves.
They then turned to experimentalist colleagues, Rebecca Iskow, PhD, and others in the lab of Brigham and Women’s Hospital’s Charles Lee, PhD, to verify whether the transposons the computer predicted actually existed at those locations in the genomes studied.
“We were pleasantly surprised,” says Park. “Our colleagues confirmed 97 percent of our predicted insertion sites.”
And those sites taught them a lot about what transposons are capable of when it comes to cancer. “For the most part, the transposons didn’t disrupt any genes directly, but instead broke parts of the genome that likely control how certain genes are expressed, especially genes that tend to be mutated in tumors in general,” Kharchenko explains. “These were functional disruptions, rather than sequence disruptions.”
The findings add further support to the idea that transposons play an active role in cancer development. “This is one aspect of cancer genomics that people haven’t worked on much,” says Park, “but which we think could be causative in at least a percentage of cases.
“Prior to this work, just 12 transposon insertions specific to cancers had been found in the literature,” he notes. “We found another 200 transposon insertions just among the 43 tumor genomes we studied, primarily in prostate, colon and ovarian cancer tumors.”
Both Park and Kharchenko believe that, as sequence data become available for more tumors, our knowledge of cancer’s dirty tricks will grow rapidly.
“I think the genome holds lots of surprises when it comes to cancer and other diseases,” says Park. “There are hundreds of terabytes, maybe even petabytes, of tumor genomic data available to computational biologists like us right now, and we want to analyze it all.
“We’re mainly limited by analytical methods, computing power and disk space,” he continues. “We are working hard to develop new methods—and buying more computers and disks.”