How Much Tracking Is Necessary? - The Learning Curve in Bayesian User Journey Analysis

Research output: Contributions to collected editions/works › Article in conference proceedings › Research › peer-review

Extracting value from big data is one of today's business challenges. In online marketing, for instance, advertisers use high-volume clickstream data to increase the efficiency of their campaigns. To avoid collecting, storing, and processing irrelevant data, it is crucial to determine how much data must be analyzed to achieve acceptable model performance. We propose a general procedure that employs the learning curve sampling method to determine the optimal sample size with respect to cost/benefit considerations. In two case studies, we apply the procedure to model users' click behavior based on clickstream data and offline channel data. We observe that predictive accuracy saturates as the sample size increases and thus demonstrate that advertisers only have to analyze a very small subset of the full dataset to obtain acceptable predictive accuracy and to optimize profits from advertising activities. In both case studies, a random intercept logistic model outperforms a non-hierarchical model in terms of predictive accuracy. Given the high infrastructure costs and users' growing awareness of tracking activities, our results have managerial implications for companies in the online marketing field.
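The learning curve sampling idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the data are synthetic, the model is a plain non-hierarchical logistic regression rather than the Bayesian random intercept model, and the stopping rule (`min_gain`) is a hypothetical stand-in for the paper's cost/benefit criterion. All function names and parameter values are assumptions made for illustration.

```python
import math
import random


def sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def make_data(n, rng):
    # Hypothetical synthetic "click" data: two features, binary outcome.
    rows = []
    for _ in range(n):
        x1, x2 = rng.gauss(0, 1), rng.gauss(0, 1)
        p = sigmoid(-0.5 + 2.0 * x1 - 1.5 * x2)
        rows.append(((x1, x2), 1 if rng.random() < p else 0))
    return rows


def fit_logistic(rows, epochs=200, lr=0.5):
    # Plain (non-hierarchical) logistic regression via batch gradient descent.
    b, w1, w2 = 0.0, 0.0, 0.0
    n = len(rows)
    for _ in range(epochs):
        gb = g1 = g2 = 0.0
        for (x1, x2), y in rows:
            err = sigmoid(b + w1 * x1 + w2 * x2) - y
            gb += err
            g1 += err * x1
            g2 += err * x2
        b -= lr * gb / n
        w1 -= lr * g1 / n
        w2 -= lr * g2 / n
    return b, w1, w2


def accuracy(params, rows):
    # Share of holdout rows whose predicted class matches the label.
    b, w1, w2 = params
    hits = sum(
        (sigmoid(b + w1 * x1 + w2 * x2) >= 0.5) == (y == 1)
        for (x1, x2), y in rows
    )
    return hits / len(rows)


def learning_curve(train, holdout, sizes, min_gain=0.005):
    # Fit on nested samples of growing size; stop once the accuracy gain
    # over the previous size falls below min_gain (assumed stopping rule).
    curve = []
    for n in sizes:
        acc = accuracy(fit_logistic(train[:n]), holdout)
        curve.append((n, acc))
        if len(curve) > 1 and acc - curve[-2][1] < min_gain:
            break
    return curve


rng = random.Random(0)
train = make_data(3200, rng)
holdout = make_data(1000, rng)
sizes = [50, 100, 200, 400, 800, 1600, 3200]
curve = learning_curve(train, holdout, sizes)
chosen_size, chosen_acc = curve[-1]
for n, acc in curve:
    print(f"n={n:5d}  holdout accuracy={acc:.3f}")
```

The geometric size schedule keeps the number of model fits small while still covering several orders of magnitude. In a real setting, the fixed `min_gain` threshold would be replaced by a comparison of the marginal benefit of extra accuracy against the marginal cost of collecting and processing more data.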
Original language: English
Title of host publication: Proceedings of the Twenty-Third European Conference on Information Systems
Number of pages: 13
Publisher: AIS eLibrary
Publication date: 29.05.2015
ISBN (print): 978-3-00-050284-2
Publication status: Published - 29.05.2015
Event: 23rd European Conference on Information Systems - ECIS 2015 - Münster, Germany
Duration: 26.05.2015 - 29.05.2015
Conference number: 23
https://www.ercis.org/
http://www.ecis2015.eu/
