Image courtesy of the Transboundary and Emerging Diseases journal

Key patient insights the missing link in understanding COVID-19 and its mutations

Peer-reviewed: This work was reviewed and scrutinised by relevant independent experts.

Systematic review: This type of study is a structured approach to reviewing all the evidence to answer a specific question. It can include a meta-analysis, which is a statistical method of combining the data from multiple studies to get an overall result.

People: This is a study based on research using people.

A new study led by Australia’s national science agency, CSIRO, has found that 95.5 per cent of current entries in GISAID, the world’s largest novel coronavirus genome database, do not contain relevant patient information, a critical piece of the puzzle for understanding the virus and how it is evolving.

Journal/conference: Transboundary and Emerging Diseases

Link to research (DOI): 10.1111/tbed.13892

Organisation/s: CSIRO

Funder: N/A

Media release

From: CSIRO

A new study led by Australia’s national science agency, CSIRO, has found that 95.5 per cent of current entries in GISAID, the world’s largest novel coronavirus genome database, do not contain relevant patient information, a critical piece of the puzzle for understanding the virus and how it is evolving.

The researchers have used this finding to develop a standardised data collection template that can be implemented on repositories like GISAID. The template does not identify the patient, and it makes it easier for clinical teams treating patients to share more of their knowledge.

This enables the scientific community to access important information, including symptoms, vaccine status and travel history, and in doing so build a more complete picture of the impact of COVID-19 on each patient.

SARS-CoV-2, the virus that causes COVID-19, is one of the most sequenced viruses in history, with over 200,000 sequences on GISAID as of 16 November 2020.

The last 100,000 sequences of the virus were uploaded in the past two months, a global record.

The study, a collaboration with GISAID and other academic partners, proposes a standardised data collection method to help scientists and clinicians around the world gather and share vital information in the fight against COVID-19.

CSIRO researcher and senior author of the paper Dr S.S. Vasan said it is critical to collect the ‘patient journey’ in as much detail as possible to understand the impact of virus evolution on the disease and its consequences.

“We urgently need de-identified patient data associated with these virus genome sequences in order to decipher whether disease outcomes are due to a mutation, or multiple mutations, in the virus or host factors such as age, gender and co-morbidities,” Dr Vasan said.

“It’s very likely this information is known to the clinical teams who treated the patient but does not make its way to public repositories such as GISAID, due to the number of steps involved.”

Recognising this need for clinical data, GISAID has made ‘patient status’ a compulsory field for uploading virus sequences since 27 April 2020.

However, the study showed a lack of digital infrastructure for collecting clinical information has hampered progress.

It also identified the need for a standardised vocabulary and mechanism for linking in with health systems as key factors for capturing the necessary information.

Lead author and CSIRO researcher Dr Denis Bauer said with the adoption of the study’s proposed data collection template, future sequences shared through the GISAID initiative could contain more meaningful de-identified patient information.

“We have identified steps in the clinical health data acquisition cycle and workflows that likely have the biggest impact in the data-driven understanding of this virus,” Dr Bauer said.

“Following the ‘Fast Healthcare Interoperability Resources’ implementation guide, we have introduced an ontology-based standard questionnaire consistent with the World Health Organization’s recommendations.”

Barwon Health’s Director of Infectious Diseases Professor Eugene Athan welcomed the new data collection template.

“Barwon Health is leading a study on the long-term biological, physiological and psychological effects of COVID-19, in partnership with CSIRO and Deakin University, and we intend to implement this mechanism for our data collection and reporting,” Prof Athan said.

“Having a simplified and standardised approach to sharing relevant patient information alongside genome sequences will enable critical research into COVID-19 and comparisons between different studies and population sets.

“I encourage clinicians and scientists around the world to share, wherever possible, de-identified patient information and clinical outcomes using this template to support ongoing research efforts.”

The paper 'Interoperable medical data: the missing link for understanding COVID‐19' was published in the Transboundary and Emerging Diseases journal.

News for:

Australia

Multimedia:

  • 95.5% of entries do not contain relevant patient information

    A word cloud showing the most commonly used terms to describe patient outcomes uploaded with SARS-CoV-2 genome sequences.

    File size: 1020.4 KB

    Attribution: CSIRO

    Permission category: © - Only use with this story

    Last modified: 17 Nov 2020 12:06am

  • An example of the standardised template developed by the researchers.

    File size: 447.3 KB

    Attribution: CSIRO

    Permission category: Free to share (must credit)

    Last modified: 18 Nov 2020 12:16am
