NYTimes: BGI, based in China, is the world’s largest genomics research institute, with 167 DNA sequencers producing the equivalent of 2,000 human genomes a day.
BGI churns out so much data that it often cannot transmit its results to clients or collaborators over the Internet or other communications lines because that would take weeks. Instead, it sends computer disks containing the data, via FedEx.
“It sounds like an analog solution in a digital age,” conceded Sifei He, the head of cloud computing for BGI, formerly known as the Beijing Genomics Institute. But for now, he said, there is no better way.
The field of genomics is caught in a data deluge. DNA sequencing is becoming faster and cheaper at a pace far outstripping Moore’s law, which describes the rate at which computing gets faster and cheaper.
The result is that the ability to determine DNA sequences is starting to outrun the ability of researchers to store, transmit and especially to analyze the data.
... The cost of sequencing a human genome — all three billion bases of DNA in a set of human chromosomes — plunged to $10,500 last July from $8.9 million in July 2007, according to the National Human Genome Research Institute.
That is a decline by a factor of more than 800 over four years. By contrast, computing costs would have dropped by perhaps a factor of four in that time span.
The lower cost, along with increasing speed, has led to a huge increase in how much sequencing data is being produced. World capacity is now 13 quadrillion DNA bases a year, an amount that would fill a stack of DVDs two miles high, according to Michael Schatz, assistant professor of quantitative biology at the Cold Spring Harbor Laboratory on Long Island.
There will probably be 30,000 human genomes sequenced by the end of this year, up from a handful a few years ago, according to the journal Nature. And that number will rise to millions in a few years.
In a few cases, human genomes are being sequenced to help diagnose mysterious rare diseases and treat patients. But most are being sequenced as part of studies. The federally financed Cancer Genome Atlas, for instance, is sequencing the genomes of thousands of tumors and of healthy tissue from the same people, looking for genetic causes of cancer. ...
Here's a slide I sometimes use in talks.