Edward Holmes was in Australia on a Saturday morning in early January 2020, talking on the phone with a Chinese scientist named Yong-Zhen Zhang who had just sequenced the genome of a novel pathogen that was infecting people in Wuhan.
The two men – old friends – debated the results.
“I knew we were looking at a respiratory virus,” recalls Holmes, a virologist and professor at the University of Sydney. He also knew it looked dangerous.
Could he share the genetic code publicly? Holmes asked. Zhang was in China, on an airplane waiting for takeoff. He wanted to think it over for a minute.
“OK,” Zhang said at last. Holmes posted the sequence on a website called Virological.org; then he linked to it on Twitter.
Holmes knew researchers around the world would instantly start unwinding the pathogen’s code to try to find ways to defeat it.
From the moment the virus genome was first posted by Holmes, if you looked, you could find a genetic component in almost every aspect of our public health responses to SARS-CoV-2.
It’s typically the case, for instance, that a pharmaceutical company needs samples of a virus to create a vaccine. But once the sequence was in the public realm, Moderna, an obscure biotech company in the US, immediately began working with the US National Institutes of Health on a plan.
“They never had the virus on site at all; they really just used the sequence, and they viewed it as a software problem,” Francis deSouza, the chief executive of Illumina, which makes the sequencer that Zhang used, told me with some amazement last summer, six months before the Moderna vaccine received an emergency-use authorisation by the US Food and Drug Administration.
The virus’s code also set the testing industry into motion. Only by analysing characteristic aspects of the virus’s genetic sequence could scientists create kits for the devices known as PCR machines, which for decades have used genetic information to formulate fast diagnostic tests.
In the meantime, sequencing was put to use to track viral mutations – beginning with studies published in February 2020 demonstrating that the virus was spreading in the US.
This kind of work falls within the realm of genomic epidemiology, or “gen epi”, as those in the field tend to call it. Many of the insights date to the mid-1990s and a group of researchers in Oxford, England, Holmes among them.
They perceived that following evolutionary changes in viruses that gain lasting mutations every 10 days (like the flu) or every 20 days (like Ebola) was inherently similar to – and, as we now know, inherently more useful than – following them in animals, where evolution might occur over a million years.
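To make the contrast in timescales concrete, here is a minimal sketch that treats the intervals quoted above as a steady clock. That is a simplifying assumption, since real mutations accumulate unevenly; the function name and the one-year window are illustrative choices, not anything the researchers used.

```python
def expected_mutations(days_elapsed: float, days_per_mutation: float) -> float:
    """Rough count of lasting mutations, assuming a constant rate (a simplification)."""
    if days_per_mutation <= 0:
        raise ValueError("days_per_mutation must be positive")
    return days_elapsed / days_per_mutation


# Intervals as quoted in the article; the 365-day window is an illustrative choice.
INTERVALS = {"flu": 10, "Ebola": 20}

for virus, interval in INTERVALS.items():
    count = expected_mutations(365, interval)
    print(f"{virus}: roughly {count:.1f} lasting mutations in a year")
```

On these figures a virus changes measurably within a single outbreak, which is what makes its spread traceable while it is still happening.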
It wasn’t until the late 2000s that drastic improvements in genetic-sequencing machines, aided by huge leaps in computing power, allowed researchers to more easily and quickly read the complete genetic codes of viruses, as well as the genetic blueprint for humans, animals, plants and microbes.
In the sphere of public health, one of the first big breakthroughs enabled by faster genomic sequencing came in 2014, when a team at the Broad Institute of MIT and Harvard began sequencing samples of the Ebola virus from infected victims during an outbreak in Africa.
The work showed that, by contrasting genetic codes, hidden pathways of transmission could be identified and interrupted, with the potential for slowing (or even stopping) the spread of infection. It was one of the first real-world uses of what has come to be called genetic surveillance.
The advent of commercial genome sequencing has recently, and credibly, been compared to the invention of the microscope. Already, as the Harvard geneticist George Church said, “sequencing is 10 million times cheaper and 100,000 times higher quality than it was just a few years ago”.
If a new technological paradigm is arriving, bringing with it a future in which we constantly monitor the genetics of our bodies and everything around us, these sequencers – easy, quick, ubiquitous – are the machines taking us into that realm. And unexpectedly, Covid-19 has proved to be the catalyst.
Illumina achieved the $1,000 genome in 2014. Last summer, the company announced its NovaSeq 6000 could sequence a whole human genome for $600; at the time, deSouza, Illumina’s chief executive, said his company’s path to a $100 genome would not entail a breakthrough, just incremental technical improvements.
Several of Illumina’s competitors – including BGI, a Chinese genomics company – have indicated they will also soon achieve a $100 genome. Those in the industry whom I spoke with predicted that it may be only a year or two away.
In healthcare, the prospect of a cheap whole-genome test, perhaps from birth, suggests a significant step closer to the realisation of personalised medicines and lifestyle plans, tailored to our genetic strengths and vulnerabilities.
“When that happens, that’s probably going to be the most powerful and valuable clinical test you could have, because it’s a lifetime record,” Tom Maniatis, the head of the New York Genome Centre, said.
Your complete genome doesn’t change over the course of your life, so it needs to be sequenced only once. And Maniatis imagines that as new information is accumulated through clinical studies, your physician, armed with new research results, could revisit your genome and discover, say, when you’re 35, that you have a mutation that’s going to be a problem when you’re 50.
“Really, that is not science fiction,” he said. “That is, I’m personally certain, going to happen.”
In some respects, it has begun already, even amid a public health crisis. In January, the New York Genome Centre began a partnership with Weill-Cornell and New York-Presbyterian hospitals to conduct whole-genome sequences on thousands of patients.
Olivier Elemento, a doctor who leads the initiative at Weill-Cornell, said the goal was to see how a whole-genome sequence – not merely the identification of a few genetic traits – could inform diagnosis and treatment.
What is the best medication based on a patient’s genome? What is the ideal dosage?
“We’re trying to address a very important question that’s never been answered at this scale,” Elemento explained.
“What is the utility of whole-genome sequencing?”
He said he believed that within one or two years, the study would lead to an answer.
Some of the grandest hopes for sequencing have arisen from the notion that our genes are deterministic – and that by understanding our DNA’s code, we might limn our destiny.
But the past 20 years have demonstrated that inherited genes are just one aspect of a confounding system that’s not easily interpreted. In the meantime, scientists have come to realise something else: a complex overlay of environmental and lifestyle factors, as well as our microbiomes, appears to have interconnected effects on health, development and behaviour.
And yet, in the course of the past year, some of the extraordinary hopes for genomic sequencing did come true, but for an unexpected reason. During the summer and autumn, I spoke frequently with executives at Illumina, as well as its competitor in Britain, Oxford Nanopore.
It was clear the pandemic had meant a startling interruption in their business, but at each company the top executives perceived the situation as an opportunity – the first pandemic in history in which genomic sequencing would inform our decisions and actions in real time.
From the start, the gen-epi community understood the SARS-CoV-2 virus would form new variants every few weeks as it reproduced and spread; it soon became clear that it could develop one or more alterations (or mutations) at a time.
Because of this insight, on January 19, 2020, just over a week after the virus's code was released to the world, scientists could look at 12 complete virus genomes shared from China and conclude that the fact that they were nearly identical meant that those 12 people had been infected around the same time and were almost certainly infecting one another.
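The comparison behind that conclusion can be sketched as a count of the positions at which aligned genomes differ. This is only an illustration of the idea, not the scientists' actual pipeline: the sequences below are invented 20-letter fragments (the real SARS-CoV-2 genome is roughly 30,000 letters long), and the function names are hypothetical.

```python
from itertools import combinations


def count_differences(seq_a: str, seq_b: str) -> int:
    """Number of positions at which two aligned, equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(a != b for a, b in zip(seq_a, seq_b))


def pairwise_differences(genomes: dict[str, str]) -> dict[tuple[str, str], int]:
    """Difference count for every pair of named genomes."""
    return {
        (name_a, name_b): count_differences(genomes[name_a], genomes[name_b])
        for name_a, name_b in combinations(sorted(genomes), 2)
    }


# Toy fragments, invented for illustration only.
samples = {
    "patient_1": "ATGGCTAGCTAGGATCCTAA",
    "patient_2": "ATGGCTAGCTAGGATCCTAA",
    "patient_3": "ATGGCTAGCTAGGATCCTAG",
}

for pair, diffs in pairwise_differences(samples).items():
    print(pair, diffs)
```

Counts at or near zero across every pair, as in this toy set, are the pattern that points to infections acquired around the same time from a common chain of transmission.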
“That was something where the genomic epidemiology could help us to say, loudly, that human transmission was rampant, when it wasn’t really being acknowledged as it should have been,” Trevor Bedford, a scientist at the Fred Hutchinson Cancer Research Center, said.
When Bedford’s lab began studying viral genomes in Seattle, he could go a step further. By late February, he concluded that new cases he was seeing were not just being imported to the US from China.
Based on observations of local mutations – two strains found six weeks apart looked too similar to be a coincidence – community transmission was happening here.
On February 29, Bedford put up a Twitter post that noted, chillingly, “I believe we’re facing an already substantial outbreak in Washington State that was not detected until now.” His proof was in the code.
Bedford’s lab was one of many around the world that began tracking the virus’s evolution and sharing it in global databases. In the meantime, gen-epi researchers used sequencing for local experiments too.
In the spring of 2020, a team of British scientists compared virus sequences sampled from ill patients at a single hospital to see if their infections came from one another or from elsewhere.
“We were able to generate data that were useful in real time,” Esteé Torok, an academic physician at the University of Cambridge who helped lead the research, said. “And in an ideal world, you could do that every day.”
In other words, sequencing had advanced from a few years ago, when scientists might publish papers a year after an outbreak, to the point that genetic epidemiologists could compare mutations in a specific location in order to be able to raise alarms – We have community spread! Patients on Floor 3 are transmitting to Floor 5! – and act immediately.
To watch the pandemic unfold from the perspective of those working in the field of genomics was to see both the astounding power of new sequencing tools and the catastrophic failure of the American public health system to take full advantage of them.
At the end of July, the US National Academy of Sciences released a report noting that advances in genomic sequencing could enable our ability “to break or delay virus transmission to reduce morbidity and mortality”.
And yet the report scathingly noted that sequencing endeavours for the coronavirus were “patchy, typically passive, reactive, uncoordinated and underfunded”.
Researchers were similarly worried that our sequencing efforts to track the pathways of infection – unlike more serious and government-supported efforts in Britain or Australia – were flailing.
One of the Biden administration’s approaches to slowing the pandemic has been to invest $200m in sequencing virus samples from those who test positive.
With the recent approval of the $1.9trn American Rescue Plan, a further $1.75bn will be allocated to the Centres for Disease Control and Prevention (CDC) to support genomic sequencing and disease surveillance.
In late January, the CDC began disbursing money to public-health laboratories around the US to bolster the sequencing work already being done at academic labs. But the effort was starting from a low baseline.
One calculation noted that the United States had ranked 38th globally in terms of employing sequencing during the pandemic; as of mid-February, the US was still trying to catch up to many European and Asian countries.
And it therefore couldn’t be said that new or dangerous variants weren’t landing on our shores or emerging here afresh. What could be said is that we were unable to know.
While countless genomics companies have already sprung up, for now just four companies run most of the sequencing analyses in the world. These are Illumina and Pacific Biosciences, based in the United States; Oxford Nanopore Technologies, based in Britain; and China’s BGI Group.
According to the US Federal Trade Commission, Illumina controls roughly 90% of the market for sequencing machines in the US, and by the company’s own assessment, it compiles 80% of the genomic information that exists in the world in a given year.
In late September, Illumina announced it intended to acquire, for $8bn, a biotech company called Grail, which has created a genomic test that runs on an Illumina sequencer and that an early study suggests can successfully detect more than 50 types of cancers from a small sample of blood.
On a recent corporate earnings call, deSouza called Grail and early cancer detection “by far the largest clinical application of genomics we’re likely to see over the next decade or two”.
For now, the industry will seek other areas of growth. In many respects, a genetic sequencer is over-engineered for the task of simply testing for a virus. A PCR machine is faster, cheaper and less complex.
And yet there are potential advantages to the sequencer. Illumina eventually won emergency approval from the US Food and Drug Administration for a diagnostic test for the NovaSeq that can run about 3,000 swab samples, simultaneously, over the course of 12 hours.
Run twice a day, two hundred NovaSeqs could process more than a million samples.
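The arithmetic behind that figure can be checked in a few lines. This sketch assumes back-to-back 12-hour runs with no turnaround time between them, which the article does not spell out; the constant and function names are illustrative.

```python
SAMPLES_PER_RUN = 3_000  # from the article: about 3,000 swab samples per run
HOURS_PER_RUN = 12       # from the article
MACHINES = 200           # from the article


def daily_capacity(machines: int, samples_per_run: int, hours_per_run: float) -> int:
    """Samples per 24 hours, assuming runs are back to back (an assumption)."""
    runs_per_day = int(24 // hours_per_run)
    return machines * samples_per_run * runs_per_day


print(daily_capacity(MACHINES, SAMPLES_PER_RUN, HOURS_PER_RUN))  # prints 1200000
```

A single 12-hour run across all 200 machines would cover 600,000 samples, so the million mark is reached only on the second run of the day.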
In addition to this immense capacity, it’s viable to test for the virus and sequence the virus at the same time: an analysis run on a sequencer could inform patients whether they have the virus, and the anonymised sequencing data on positive samples could give public-health agencies a huge amount of epidemiology data for use in tracking variants.
“I can envision a world where diagnosis and sequencing are kind of one and the same,” Bronwyn MacInnis, who directs pathogen genomic surveillance at the Broad Institute, said.
“We’re not there yet, but we’re not a million miles off, either.”
• Adapted from an article that originally appeared in the New York Times Magazine