Highlighting key advances in Glycoproteomics software

Publicly released:
Australia; NSW

Glycoproteome profiling (glycoproteomics) is a powerful yet challenging research tool in the proteomics space. A new study conducted through the HUPO – Human Glycoproteomics Initiative evaluated the performance of informatics solutions for system-wide glycopeptide analysis whilst also seeking to identify key search variables that may guide future software developments and assist informatics decision-making in the field.

Media release

From: Macquarie University

Glycoproteome profiling (glycoproteomics) is a powerful yet challenging research tool in the proteomics space. The information-dense spectral data generated from mass spectrometry (MS) analysis of complex glycopeptide mixtures require sophisticated informatics pipelines for structural determination. Diverse software tools that streamline the spectral annotation and glycopeptide identification process have appeared, but their relative performance remains untested.

A new study conducted through the HUPO – Human Glycoproteomics Initiative evaluated the performance of informatics solutions for system-wide glycopeptide analysis whilst also seeking to identify key search variables that may guide future software developments and assist informatics decision-making in the field.

Researchers from Macquarie University led this global standardisation study involving 54 scientists across 35 institutions in 11 countries from five continents. In total, 22 teams completed the challenge, which was based on shared MS data of N- and O-glycopeptides from human serum. Both developers and experienced users of glycoproteomics software participated in the study.

“Over the course of this extensive community-wide study, several high-performance glycoproteomics informatics solutions were identified which will have a significant impact in the field of glycoproteomics,” says first author Dr Rebeca Kawahara, a Cancer Institute NSW Early Career Research Fellow at Macquarie University.

By performing a systematic, comprehensive and unbiased comparison of the relative performance of the informatics solutions available to the community, the study, published this week in Nature Methods, identified for the first time the strengths and weaknesses of the existing software for glycopeptide data analysis. Importantly, data from the study enabled the researchers to unpick search parameters of particular importance for precision glycopeptide data analysis.

Although all teams analysed the same MS data, the identified glycopeptides varied dramatically between them, as illustrated by the wide range of N-glycopeptides (49-2,122) and O-glycopeptides (5-578) reported by the participants.

“Currently, the annotation process of glycopeptide MS/MS data is subject to high levels of error. This is due to the challenging nature of correctly assigning the glycan composition, modification site(s) and peptide carrier. As a result, glycopeptides are frequently misidentified or suffer from ambiguous annotation,” says co-author Anastasia Chernykh, a PhD student in the Analytical Glycoimmunology Group.

Despite the discordant reporting, high-confidence lists of glycopeptides commonly reported by the teams could be generated from the standardised reports. These consensus glycopeptides form an important reference for future studies of the human serum glycoproteome and have therefore been made publicly available.
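As a rough illustration of how a consensus list can be drawn from discordant team reports, the minimal Python sketch below keeps only the glycopeptides reported by at least a chosen fraction of teams. The function name, the 50% threshold, the team labels and the placeholder identifiers are all assumptions made for this example; the release does not describe the criteria the study actually used.

```python
from collections import Counter


def consensus_glycopeptides(team_reports, min_fraction=0.5):
    """Return the glycopeptides reported by at least `min_fraction` of teams.

    `team_reports` maps a team label to the glycopeptides that team reported.
    The 0.5 default is an illustrative assumption, not the study's criterion.
    """
    if not team_reports:
        return set()
    counts = Counter()
    for report in team_reports.values():
        # Count each glycopeptide once per team, even if listed repeatedly.
        counts.update(set(report))
    threshold = min_fraction * len(team_reports)
    return {gp for gp, n in counts.items() if n >= threshold}


# Hypothetical reports: each entry is a (peptide, glycan) placeholder pair.
team_reports = {
    "team_1": [("PEP_A", "GLY_1"), ("PEP_B", "GLY_2")],
    "team_2": [("PEP_A", "GLY_1"), ("PEP_C", "GLY_3")],
    "team_3": [("PEP_A", "GLY_1"), ("PEP_B", "GLY_2"), ("PEP_B", "GLY_2")],
}

consensus = consensus_glycopeptides(team_reports, min_fraction=0.5)
print(sorted(consensus))
```

Raising `min_fraction` trades coverage for confidence: a stricter threshold keeps fewer glycopeptides, but each one is supported by more independent teams.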

Amongst the high-performance glycoproteomics informatics solutions were both well-established and recently developed software from academic and commercial sources. Even amongst the high-performing software, the researchers found different performance profiles for N- and O-glycopeptide data analysis, highlighting that there is still no single universal informatics solution available to the field.

The popular Byonic search engine was used by no fewer than half of the teams. Exploring how the different search strategies affected the glycoproteomics data output by this software led to recommendations for improved “high coverage” and “high accuracy” search strategies that will benefit its users.

“This study will immediately benefit researchers in the field; findings from our comprehensive evaluation of software for glycopeptide data analysis will help users and developers to progress by pointing out areas of strengths and highlighting areas where improvements are warranted,” says senior author Dr Morten Thaysen-Andersen, Group Leader of the Analytical Glycoimmunology Group and ARC Future Fellow at Macquarie University.

“While informatics challenges undoubtedly still exist in glycoproteomics, our study interestingly highlights that several high performing and promising computational tools are already available to the wider community. The future appears bright for the burgeoning field of glycoproteomics,” says Dr Kawahara.

Reference:
Kawahara R, Chernykh A, Alagesan K, Bern M, Cao W, Chalkley RJ, Cheng K, Choo MS, Edwards N, Goldman R, Hoffmann M, Hu Y, Huang Y, Kim JY, Kletter D, Liquet B, Liu M, Mechref Y, Meng B, Neelamegham S, Nguyen-Khuong T, Nilsson J, Pap A, Park GW, Parker BL, Pegg CL, Penninger JM, Phung TK, Pioch M, Rapp E, Sakalli E, Sanda M, Schulz BL, Scott NE, Sofronov G, Stadlmann J, Vakhrushev SY, Woo CM, Wu H-Y, Yang P, Ying W, Zhang H, Zhang Y, Zhao J, Zaia J, Haslam SM, Palmisano G, Yoo JS, Larson G, Khoo K-H, Medzihradszky KF, Kolarich D, Packer NH, and Thaysen-Andersen M. Community Evaluation of Glycoproteomics Informatics Solutions Reveals High-Performance Search Strategies of Serum Glycopeptide Data. Nature Methods, Accepted 22/09/2021, ahead of print, 2021. https://doi.org/10.1038/s41592-021-01309-x

Journal/conference: Nature Methods
Research: Paper
Organisation/s: Macquarie University, Griffith University, The University of Melbourne
Funder: N/A