Genomics In The Cloud - Driving Efficiencies To Advance Scientific Progress

Published: 24 February 2023
Subject matter: Technology, New Technology
Author: Kainos

The growth of genomics sequencing

The last 20 years have seen tremendous growth in human genomics. To put things in perspective, the Human Genome Project began in 1990 as an international scientific research effort with the goal of producing the first complete human genome sequence.

The project took 10 years to produce a first working draft of the genome and 13 years to reach completion. Today, a human genome can be sequenced in under 24 hours [1].

Due to these incredible advances in technology, millions of genomes have been sequenced so far, making it easier to study diseases associated with mutations in a single gene.

But as genomic sequencing analysis becomes more accessible, the amount of data being produced is expanding rapidly, with the volume of raw genomic data generated worldwide doubling every seven months [2].
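To make that doubling rate concrete, here is a minimal back-of-the-envelope sketch in Python. It assumes the seven-month doubling time cited above holds steady, a strong assumption for any exponential trend, and simply converts it into annual and five-year growth multipliers.

```python
# Back-of-the-envelope projection. Assumes the cited seven-month doubling
# time [2] holds steady, a strong assumption for any exponential trend.

DOUBLING_MONTHS = 7

def growth_factor(months: float) -> float:
    """Multiplicative growth in raw genomic data output after `months`."""
    return 2 ** (months / DOUBLING_MONTHS)

print(f"After 1 year:  x{growth_factor(12):.1f}")  # ~x3.3 per year
print(f"After 5 years: x{growth_factor(60):.0f}")  # ~x380 over five years
```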

By 2025 it is estimated that:

Between 100M and 2B genomes will be sequenced [3]

Between 2 and 40 exabytes of storage capacity will be required to store the world's human genomic data [4] (a rough sanity check of these figures follows below)
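As a rough sanity check, the two estimates above can be divided against each other to see what storage per genome they imply. The sketch below does only that arithmetic; the per-genome reference sizes in the closing comments are commonly cited approximations, not figures from this article.

```python
# Hypothetical sanity check: what per-genome storage do the 2025 estimates
# imply? (EB = exabyte = 1e18 bytes; GB = gigabyte = 1e9 bytes.)

EB, GB = 1e18, 1e9

genome_counts = {"100M genomes": 100e6, "2B genomes": 2e9}  # from [3]
storage_totals = {"2 EB": 2 * EB, "40 EB": 40 * EB}         # from [4]

for s_label, s in storage_totals.items():
    for g_label, g in genome_counts.items():
        print(f"{s_label} / {g_label} -> {s / g / GB:,.0f} GB per genome")

# For context (rough, commonly cited figures, not from this article):
# compressed variant calls (VCF) run to ~1 GB, compressed aligned reads
# (CRAM) to tens of GB, and raw sequencer output to hundreds of GB.
```

The implied range, roughly 1 GB to 400 GB per genome, brackets the difference between keeping only final variant calls and retaining raw sequencer output, which helps explain why the storage estimates span more than an order of magnitude.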

This data, combined with ever-growing amounts of single-cell and functional genomics data, digital medical records, and other critical biomedical data, has the potential to substantially enhance our understanding of the fundamental processes of healthy life and revolutionise the treatment of disease. But this potential doesn't come without challenges.

Data challenges - overcoming research bottlenecks

Genomics data output is increasing all the time. But while these ever-larger scientific datasets may be a goldmine for discovery, analysing them within on-premise legacy environments - which many life-science organisations still use - has become a bottleneck in genomics research. Massive processing power and scalability are required, presenting a challenge to organisations of all sizes whose storage systems are built on outdated on-premise infrastructure.

For example, to gain insights from large archived datasets, a researcher must secure sufficient storage space, perform large, time-consuming downloads, and then re-analyse the data from scratch in a compute-intensive process.

Many labs aren't equipped for this, so valuable data goes unused. In addition, the velocity and volume of genomic data continue to rise as sequencing costs fall and adoption broadens. Eventually, single organisations may struggle to independently manage, sequence, process and analyse a given dataset, let alone extract every insight it contains.
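To see why the download step alone is a bottleneck, consider a hypothetical re-analysis of a modest cohort. The cohort size (1,000 genomes), the per-genome raw data size (around 100 GB) and the link speeds below are illustrative assumptions, not figures from the article:

```python
# Illustrative transfer-time estimate for the download step described above.
# Cohort size (1,000 genomes), per-genome size (~100 GB of raw data) and
# link speeds are assumptions chosen to show the order of magnitude.

TB = 1e12

cohort_bytes = 1_000 * 100e9  # 1,000 genomes x ~100 GB each = ~100 TB

for gbps in (1, 10):
    seconds = cohort_bytes * 8 / (gbps * 1e9)  # bytes -> bits, then / line rate
    print(f"{cohort_bytes / TB:.0f} TB at {gbps} Gbps: ~{seconds / 86400:.1f} days")
```

Even before any compute starts, a lab on a 1 Gbps connection would spend over a week simply moving the data, friction that grows in step with the data volumes described above.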

Instead, we may see smaller, agile...
