Development of a scoring parameter to characterize data quality of centroids in high-resolution mass spectra

Research output: Journal contributions › Journal articles › Research › peer-review

Authors

High-resolution mass spectrometry is widely used in many research fields because it allows accurate mass determination. In this context, it is standard practice to reduce high-resolution profile-mode mass spectra to centroided data, on which many data processing routines rely for further evaluation. However, these approaches do not preserve information on the quality of the peak profile, so the reliability of the results is almost impossible to describe. We overcome this limitation by developing a new statistical parameter called the data quality score (DQS). To calculate the DQS, we performed a fast and robust regression analysis of the individual high-resolution peak profiles and used error propagation to estimate the uncertainties of the regression coefficients. We successfully validated the new algorithm against the vendor-specific algorithm implemented in ProteoWizard's msConvert. Moreover, we show that the DQS is a sum parameter associated with centroid accuracy and precision. We also demonstrate the benefit of the new algorithm in nontarget screening, where the DQS prioritizes signals that are not influenced by non-resolved isobaric ions or isotopic fine structures. The algorithm is implemented in the Python, R, and Julia programming languages and supports multi- and cross-platform downstream data handling.
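The idea of regressing a peak profile and propagating the coefficient uncertainties to the centroid can be illustrated with a minimal sketch. It is not the published algorithm: it assumes a Gaussian peak shape, an ordinary least-squares fit of a parabola to the log-transformed intensities, and first-order (delta-method) error propagation. The function name `fit_centroid_with_uncertainty` is hypothetical, and the DQS formula itself is not reproduced here because the abstract does not state it.

```python
import numpy as np


def fit_centroid_with_uncertainty(mz, intensity):
    """Estimate a peak centroid and its standard uncertainty.

    Assumption (not taken from the paper): the profile is Gaussian, so
    ln(intensity) is a parabola in m/z and can be fitted by linear
    least squares.
    """
    mz = np.asarray(mz, dtype=float)
    intensity = np.asarray(intensity, dtype=float)
    if mz.size < 4:
        raise ValueError("need at least 4 points for 3 coefficients")
    if np.any(intensity <= 0):
        raise ValueError("intensities must be positive for the log transform")

    # Center and scale m/z so the design matrix is well conditioned.
    center = mz.mean()
    scale = np.max(np.abs(mz - center))
    x = (mz - center) / scale

    # Model: ln(I) = b0 + b1*x + b2*x^2
    X = np.column_stack([np.ones_like(x), x, x**2])
    y = np.log(intensity)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    b1, b2 = beta[1], beta[2]
    if b2 >= 0:
        raise ValueError("fitted parabola has no maximum")

    # Covariance of the coefficients from the residual variance.
    resid = y - X @ beta
    s2 = (resid @ resid) / (x.size - 3)
    cov = s2 * np.linalg.inv(X.T @ X)

    # Vertex of the parabola, in scaled coordinates: x0 = -b1 / (2*b2)
    x0 = -b1 / (2.0 * b2)

    # First-order error propagation: var(x0) = J cov J^T
    jac = np.array([0.0, -1.0 / (2.0 * b2), b1 / (2.0 * b2**2)])
    var_x0 = max(float(jac @ cov @ jac), 0.0)

    centroid = center + scale * x0
    uncertainty = scale * np.sqrt(var_x0)
    return centroid, uncertainty


# Usage with a synthetic, slightly noisy profile (illustrative values only).
rng = np.random.default_rng(0)
mz_axis = np.linspace(199.99, 200.01, 11)
profile = 1e5 * np.exp(-(mz_axis - 200.001) ** 2 / (2 * 0.004**2))
profile *= np.exp(rng.normal(0.0, 0.01, mz_axis.size))  # multiplicative noise
print(fit_centroid_with_uncertainty(mz_axis, profile))
```

A small propagated uncertainty indicates a profile that is well described by the assumed model, whereas a distorted profile, for example one containing an unresolved isobaric ion, would be expected to give a larger uncertainty.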

Original language: English
Journal: Analytical and Bioanalytical Chemistry
Volume: 414
Issue number: 22
Pages (from-to): 6635-6645
Number of pages: 11
ISSN: 1618-2642
DOIs
Publication status: Published - 09.2022
Externally published: Yes

Bibliographical note

Publisher Copyright:
© 2022, The Author(s).

    Research areas

  • Centroiding, Data processing, Data quality, HRMS
  • Chemistry
