y-Randomization and its variants in QSPR/QSAR

Publication: Contribution to journal, journal article, research, peer-reviewed

Authors

y-Randomization is a tool used in the validation of QSPR/QSAR models, whereby the performance of the original model in data description (r²) is compared to that of models built for a permuted (randomly shuffled) response, based on the original descriptor pool and the original model building procedure. We compared y-randomization and several variants thereof, using original response, permuted response, or random number pseudoresponse, and original descriptors or random number pseudodescriptors, in the typical setting of multilinear regression (MLR) with descriptor selection. For each combination of number of observations (compounds), number of descriptors in the final model, and number of descriptors in the pool to select from, computer experiments using the same descriptor selection method result in two different mean highest random r² values. A lower one is produced by y-randomization or a variant likewise based on the original descriptors, while a higher one is obtained from variants that use random number pseudodescriptors. The difference is due to the intercorrelation of real descriptors in the pool. We propose to compare an original model's r² to both of these values whenever possible. The meaning of the three possible outcomes of such a double test is discussed. Often y-randomization is not available to a potential user of a model because the values of all descriptors in the pool for all compounds have not been published. In such cases, random number experiments as proposed here are still possible. The test was applied to several recently published MLR QSAR equations, and cases of failure were identified. Some progress is also reported toward the aim of obtaining the mean highest r² of random pseudomodels by calculation rather than by tedious multiple simulations on random number variables.
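The double test described in the abstract can be illustrated with a minimal sketch. It is not the authors' procedure: the function names (`r2_mlr`, `forward_select_r2`, `mean_highest_random_r2`) are hypothetical, greedy forward selection stands in for whatever descriptor selection method a given model used, and the data are synthetic. The sketch estimates the mean highest random r² once by permuting the response against the original descriptors (y-randomization) and once by replacing the descriptors with random number pseudodescriptors.

```python
import numpy as np

def r2_mlr(X, y):
    """r^2 of an ordinary least-squares fit of y on X (with intercept)."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - (resid @ resid) / ss_tot

def forward_select_r2(X, y, m):
    """Greedy forward selection of m descriptors (an assumed stand-in
    for the model's real selection method); returns the r^2 reached."""
    chosen, best = [], 0.0
    for _ in range(m):
        candidates = [j for j in range(X.shape[1]) if j not in chosen]
        scores = [r2_mlr(X[:, chosen + [j]], y) for j in candidates]
        k = int(np.argmax(scores))
        chosen.append(candidates[k])
        best = scores[k]
    return best

def mean_highest_random_r2(X, y, m, n_trials, rng, mode):
    """Mean of the highest r^2 found in n_trials random experiments.

    mode="permute_y": shuffled response, original descriptors.
    mode="random_X":  original response, random pseudodescriptors.
    """
    values = []
    for _ in range(n_trials):
        if mode == "permute_y":
            values.append(forward_select_r2(X, rng.permutation(y), m))
        elif mode == "random_X":
            values.append(forward_select_r2(rng.random(X.shape), y, m))
        else:
            raise ValueError(f"unknown mode: {mode}")
    return float(np.mean(values))

# Synthetic illustration: 30 compounds, pool of 10 intercorrelated
# descriptors (built from 3 latent factors), 2 descriptors per model.
rng = np.random.default_rng(0)
latent = rng.random((30, 3))
X = latent @ rng.random((3, 10)) + 0.1 * rng.random((30, 10))
y = rng.random(30)

print("y-randomization:  ", mean_highest_random_r2(X, y, 2, 50, rng, "permute_y"))
print("pseudodescriptors:", mean_highest_random_r2(X, y, 2, 50, rng, "random_X"))
```

An original model's r² would then be compared with both printed values. When the descriptor values of the pool are unpublished, only the `random_X` variant can be run, since it needs just the number of compounds, the pool size, and the number of descriptors in the final model.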

Translated title: y-Randomisierung in QSPR/QSAR
Original language: English
Journal: Journal of Chemical Information and Modeling
Volume: 47
Issue number: 6
Pages (from - to): 2345-2357
Number of pages: 13
ISSN: 1549-9596
Publication status: Published - November 2007
Externally published: Yes
