How Much Tracking Is Necessary? - The Learning Curve in Bayesian User Journey Analysis

Research output: Contributions to collected editions/works › Article in conference proceedings › Research › peer-review

Authors

Extracting value from big data is one of today’s business challenges. In online marketing, for instance, advertisers use high-volume clickstream data to increase the efficiency of their campaigns. To avoid collecting, storing, and processing irrelevant data, it is crucial to determine how much data must be analyzed to achieve acceptable model performance. We propose a general procedure that employs the learning curve sampling method to determine the optimal sample size with respect to cost/benefit considerations. In two case studies, we apply the procedure to model users' click behavior based on clickstream data and offline channel data. We observe saturation effects in predictive accuracy as the sample size increases and thus demonstrate that advertisers only have to analyze a very small subset of the full dataset to obtain acceptable predictive accuracy and to optimize profits from advertising activities. In both case studies, a random intercept logistic model outperforms a non-hierarchical model in terms of predictive accuracy. Given the high infrastructure costs and users' growing awareness of tracking activities, our results have managerial implications for companies in the online marketing field.
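
The learning curve sampling idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it uses a plain (non-hierarchical, non-Bayesian) one-feature logistic model fitted by gradient descent, synthetic click data, and a simple stopping rule that compares the value of the accuracy gained with the cost of the extra rows. All names, the data generator, and the cost and value figures are hypothetical.

```python
import math
import random


def make_data(n, rng):
    """Synthetic click data (hypothetical): one feature x, binary click label y."""
    rows = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        p = 1.0 / (1.0 + math.exp(-(1.5 * x - 0.5)))
        rows.append((x, 1 if rng.random() < p else 0))
    return rows


def fit_logistic(rows, epochs=100, lr=0.5):
    """Batch gradient descent for a one-feature logistic model; returns (w, b)."""
    w, b = 0.0, 0.0
    n = len(rows)
    for _ in range(epochs):
        gw = gb = 0.0
        for x, y in rows:
            err = 1.0 / (1.0 + math.exp(-(w * x + b))) - y
            gw += err * x
            gb += err
        w -= lr * gw / n
        b -= lr * gb / n
    return w, b


def accuracy(model, rows):
    """Share of rows whose label matches the 0.5-threshold prediction."""
    w, b = model
    hits = sum(1 for x, y in rows if (w * x + b > 0) == (y == 1))
    return hits / len(rows)


def learning_curve(train, test, sizes):
    """Fit on nested prefixes of the training data; record holdout accuracy."""
    return [(n, accuracy(fit_logistic(train[:n]), test)) for n in sizes]


def pick_sample_size(curve, value_per_point, cost_per_row):
    """Return the first size at which the next step's benefit no longer
    exceeds its cost (both in the same hypothetical currency unit)."""
    for (n0, a0), (n1, a1) in zip(curve, curve[1:]):
        benefit = (a1 - a0) * 100 * value_per_point  # value of added accuracy points
        cost = (n1 - n0) * cost_per_row              # cost of the extra rows
        if benefit <= cost:
            return n0
    return curve[-1][0]


rng = random.Random(0)
train = make_data(1600, rng)   # rows are i.i.d., so prefixes are random samples
test = make_data(1000, rng)
sizes = [50, 100, 200, 400, 800, 1600]

curve = learning_curve(train, test, sizes)
chosen = pick_sample_size(curve, value_per_point=10.0, cost_per_row=0.05)

for n, acc in curve:
    print(n, round(acc, 3))
print("chosen sample size:", chosen)
```

The geometric size schedule and the stopping rule are design choices made for this sketch; a real analysis would substitute its own model, holdout metric, and cost estimates.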
Original language: English
Title of host publication: Proceedings of the Twenty-Third European Conference on Information Systems
Number of pages: 13
Publisher: AIS eLibrary
Publication date: 29.05.2015
ISBN (print): 978-3-00-050284-2
Publication status: Published - 29.05.2015
Event: 23rd European Conference on Information Systems - ECIS 2015 - Münster, Germany
Duration: 26.05.2015 - 29.05.2015
Conference number: 23
https://www.ercis.org/
http://www.ecis2015.eu/
