Leaving the Napster age, entering the genomic data sharing economy

Genomic sharing. Let’s compare our state of sharing genomic data to the music industry. We are at a point where we share large datasets between genome centers, occasionally shipping hard drives or using fast upload tools. Still, we are copying and transferring terabytes of data from A to B, making copies of datasets locally in order to have them for our local analysis. A BAM file is only useful if it is present on our server. It is like downloading all the Led Zeppelin albums in the early 2000s through sharing platforms. We are currently in the Napster age of genomic data, amazed at how quickly these large datasets clog our server space. Let us revisit what happened to Napster and why our current way of thinking will soon be outdated in what will become the genomic sharing economy.

Napster peaked in 2001. Image modified from public domain Wikipedia image https://commons.wikimedia.org/wiki/File:Napster_Unique_Users.svg

Napster, iTunes, Spotify. Let us quickly review what happened to Napster. This peer-to-peer music platform was eventually brought down through legal difficulties, but this is only part of the story. In fact, Napster, having demonstrated that there is an audience for sharing music online, was also overtaken by other developments. If Napster re-opened today, it would not find anybody willing to buy into the concept of downloading large files. Why download when you can stream over the internet, and when streaming gets easier and less expensive with every new player entering the market? The same holds true of Apple iTunes. Do you think that anybody will buy songs in the future? Translate this to genomic data sharing. There will be genomic streaming services, and the BAM file, the large exome/genome file that contains all the information, will go the way of the MP3. In five years, 90% of all users of genomic data won’t even remember what a BAM file is.

The genomic sharing economy. Airbnb and Uber are examples of what is called the sharing economy. The basic concept of the sharing economy is that you do not need to own the basic product any longer; you simply profit from providing a platform that connects demand and supply for a given product. While it sounds like a sub-par way for businesses to survive if they do not offer or own the actual product, it turned out to be a goldmine. By focusing on the connectivity aspects, these platforms grow and prosper much faster than anyone could ever imagine – connecting ten drivers turns out to be more valuable than owning a single cab. This way, we have arrived at a new normal where the world’s largest hotel chain does not own any buildings and the world’s largest cab company does not own a single car.

Freeing genomic data. Generating genomic data, including whole genome sequences, is becoming cheaper, and we are already at the point where the costs of obtaining and providing biosamples are higher than the cost of whole-genome sequencing. This trend will continue up to the point where genome sequencing is no longer a major issue. It will be something that can be done and will always cost a little, but it will be a small investment. Let us translate the Airbnb and Uber example to genome science. Will there be a situation where generating and possessing genomic data will be much less important than providing the platform to share this data by connecting centers, individuals, and companies? A situation where traditional genome centers will suddenly only play a minor role once data is freed into a data sharing platform? Looking back at the history of various industries over the last five years, this is likely. The question is simply who will be the first to provide a platform that satisfies this need.

Genomics, really? Will genomics really be so big that it is reasonable to compare it to Spotify, Airbnb or Uber? Isn’t genomics a scientific field for a few experts rather than for a broader audience? Let’s make a comparison with the personal computer. When computers were first invented, only a few highly trained engineers could operate these machines, which are easily outmatched by a five-year-old smartphone. However, with improving technology, price reduction, and novel features, these machines made it onto everybody’s desk and, as smartphones, into everybody’s palm. Genomics is interesting, not just for me as a clinician/scientist, but also for most people if they are given the chance to see their own data. This is a phenomenon that 23andMe is currently profiting from. Once there are platforms to introduce genomics to a larger audience, the interest will soar and the floodgates will open. And the resulting novel genomic sharing economy will be different from everything we had before.

Ingo Helbig is a child neurologist and epilepsy genetics researcher working at the Children’s Hospital of Philadelphia (CHOP), USA.
