y-Randomization and its variants in QSPR/QSAR

Research output: Journal contributions › Journal articles › Research › peer-review

y-Randomization is a tool used in the validation of QSPR/QSAR models, whereby the performance of the original model in data description (r²) is compared to that of models built for a permuted (randomly shuffled) response, based on the original descriptor pool and the original model building procedure. We compared y-randomization and several variants thereof, using original response, permuted response, or random number pseudoresponse, and original descriptors or random number pseudodescriptors, in the typical setting of multilinear regression (MLR) with descriptor selection. For each combination of number of observations (compounds), number of descriptors in the final model, and number of descriptors in the pool to select from, computer experiments using the same descriptor selection method result in two different mean highest random r² values. A lower one is produced by y-randomization or a variant likewise based on the original descriptors, while a higher one is obtained from variants that use random number pseudodescriptors. The difference is due to the intercorrelation of real descriptors in the pool. We propose to compare an original model's r² to both of these whenever possible. The meaning of the three possible outcomes of such a double test is discussed. Often y-randomization is not available to a potential user of a model, because the values of all descriptors in the pool are not published for all compounds. In such cases random number experiments as proposed here are still possible. The test was applied to several recently published MLR QSAR equations, and cases of failure were identified. Some progress is also reported toward the aim of obtaining the mean highest r² of random pseudomodels by calculation rather than by tedious multiple simulations on random number variables.
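The two kinds of random experiment described above can be sketched in a few lines of Python. This is a minimal illustration and not the authors' procedure: the abstract does not name the descriptor selection method, so an exhaustive best-subset search stands in for it, and the function names (`r2_mlr`, `best_subset_r2`, `mean_highest_random_r2`) and the `mode` labels are hypothetical. Mode `"permute_y"` is y-randomization proper (original descriptors, shuffled response); mode `"random_X"` keeps the response and replaces the pool with random number pseudodescriptors.

```python
from itertools import combinations

import numpy as np


def r2_mlr(X, y):
    """r^2 of an ordinary least-squares fit of y on the columns of X, with intercept."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - (resid @ resid) / ss_tot


def best_subset_r2(X, y, k):
    """Highest r^2 over all k-descriptor subsets of the pool.

    Assumption: exhaustive search is used here only as a simple stand-in
    for whatever descriptor selection method built the original model.
    """
    pool = range(X.shape[1])
    return max(r2_mlr(X[:, list(cols)], y) for cols in combinations(pool, k))


def mean_highest_random_r2(X, y, k, n_trials=50, mode="permute_y", seed=0):
    """Mean of the highest r^2 found in n_trials random experiments."""
    rng = np.random.default_rng(seed)
    highest = []
    for _ in range(n_trials):
        if mode == "permute_y":
            # y-randomization: original descriptors, shuffled response
            X_trial, y_trial = X, rng.permutation(y)
        elif mode == "random_X":
            # variant: original response, random number pseudodescriptors
            X_trial, y_trial = rng.standard_normal(X.shape), y
        else:
            raise ValueError(f"unknown mode: {mode!r}")
        highest.append(best_subset_r2(X_trial, y_trial, k))
    return float(np.mean(highest))
```

In the double test proposed above, the original model's r² would be compared with both `mean_highest_random_r2(X, y, k, mode="permute_y")` and `mean_highest_random_r2(X, y, k, mode="random_X")`. When the descriptor values are not published, only the second call is possible, with a placeholder array of the right shape passed as `X`. Exhaustive search becomes expensive for large pools, so a real application would substitute the selection method actually used for the model under test.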

Translated title of the contribution: y-Randomisierung in QSPR/QSAR
Original language: English
Journal: Journal of Chemical Information and Modeling
Volume: 47
Issue number: 6
Pages (from-to): 2345-2357
Number of pages: 13
ISSN: 1549-9596
Publication status: Published - 11.2007
Externally published: Yes
