How Much Tracking Is Necessary? - The Learning Curve in Bayesian User Journey Analysis

Research output: Contributions to collected editions/works › Article in conference proceedings › Research › peer-review

Standard

How Much Tracking Is Necessary? - The Learning Curve in Bayesian User Journey Analysis. / Stange, Martin; Funk, Burkhardt.
Proceedings of the Twenty-Third European Conference on Information Systems. AIS eLibrary, 2015.


Harvard

Stange, M & Funk, B 2015, How Much Tracking Is Necessary? - The Learning Curve in Bayesian User Journey Analysis. in Proceedings of the Twenty-Third European Conference on Information Systems. AIS eLibrary, 23rd European Conference on Information Systems - ECIS 2015, Münster, Germany, 26.05.15. https://doi.org/10.18151/7217484

APA

Stange, M., & Funk, B. (2015). How Much Tracking Is Necessary? - The Learning Curve in Bayesian User Journey Analysis. In Proceedings of the Twenty-Third European Conference on Information Systems. AIS eLibrary. https://doi.org/10.18151/7217484

Vancouver

Stange M, Funk B. How Much Tracking Is Necessary? - The Learning Curve in Bayesian User Journey Analysis. In: Proceedings of the Twenty-Third European Conference on Information Systems. AIS eLibrary; 2015. doi: 10.18151/7217484

Bibtex

@inbook{02f02f601bbf4d3c855ca7f8227751ad,
title = "How Much Tracking Is Necessary? - The Learning Curve in Bayesian User Journey Analysis",
abstract = "Extracting value from big data is one of today{\textquoteright}s business challenges. In online marketing, for instance, advertisers use high volume clickstream data to increase the efficiency of their campaigns. To prevent collecting, storing, and processing of irrelevant data, it is crucial to determine how much data to analyze to achieve acceptable model performance. We propose a general procedure that employs the learning curve sampling method to determine the optimal sample size with respect to cost/benefit considerations. Applied in two case studies, we model the users' click behavior based on clickstream data and offline channel data. We observe saturation effects of the predictive accuracy when the sample size is increased and, thus, demonstrate that advertisers only have to analyze a very small subset of the full dataset to obtain an acceptable predictive accuracy and to optimize profits from advertising activities. In both case studies we observe that a random intercept logistic model outperforms a non-hierarchical model in terms of predictive accuracy. Given the high infrastructure costs and the users' growing awareness for tracking activities, our results have managerial implications for companies in the online marketing field. ",
keywords = "Business informatics, Big Data, Online Marketing, User Journey Analysis, Learning Curve, Bayesian Models",
author = "Martin Stange and Burkhardt Funk",
year = "2015",
month = may,
day = "29",
doi = "10.18151/7217484",
language = "English",
isbn = "978-3-00-050284-2",
booktitle = "Proceedings of the Twenty-Third European Conference on Information Systems",
publisher = "AIS eLibrary",
address = "United States",
note = "23rd European Conference on Information Systems - ECIS 2015, ECIS conference 2015 ; Conference date: 26-05-2015 Through 29-05-2015",
url = "https://www.ercis.org/, http://www.ecis2015.eu/",
}

RIS

TY - CHAP

T1 - How Much Tracking Is Necessary? - The Learning Curve in Bayesian User Journey Analysis

AU - Stange, Martin

AU - Funk, Burkhardt

N1 - Conference code: 23

PY - 2015/5/29

Y1 - 2015/5/29

N2 - Extracting value from big data is one of today’s business challenges. In online marketing, for instance, advertisers use high volume clickstream data to increase the efficiency of their campaigns. To prevent collecting, storing, and processing of irrelevant data, it is crucial to determine how much data to analyze to achieve acceptable model performance. We propose a general procedure that employs the learning curve sampling method to determine the optimal sample size with respect to cost/benefit considerations. Applied in two case studies, we model the users' click behavior based on clickstream data and offline channel data. We observe saturation effects of the predictive accuracy when the sample size is increased and, thus, demonstrate that advertisers only have to analyze a very small subset of the full dataset to obtain an acceptable predictive accuracy and to optimize profits from advertising activities. In both case studies we observe that a random intercept logistic model outperforms a non-hierarchical model in terms of predictive accuracy. Given the high infrastructure costs and the users' growing awareness for tracking activities, our results have managerial implications for companies in the online marketing field.

AB - Extracting value from big data is one of today’s business challenges. In online marketing, for instance, advertisers use high volume clickstream data to increase the efficiency of their campaigns. To prevent collecting, storing, and processing of irrelevant data, it is crucial to determine how much data to analyze to achieve acceptable model performance. We propose a general procedure that employs the learning curve sampling method to determine the optimal sample size with respect to cost/benefit considerations. Applied in two case studies, we model the users' click behavior based on clickstream data and offline channel data. We observe saturation effects of the predictive accuracy when the sample size is increased and, thus, demonstrate that advertisers only have to analyze a very small subset of the full dataset to obtain an acceptable predictive accuracy and to optimize profits from advertising activities. In both case studies we observe that a random intercept logistic model outperforms a non-hierarchical model in terms of predictive accuracy. Given the high infrastructure costs and the users' growing awareness for tracking activities, our results have managerial implications for companies in the online marketing field.

KW - Business informatics

KW - Big Data

KW - Online Marketing

KW - User Journey Analysis

KW - Learning Curve

KW - Bayesian Models

U2 - 10.18151/7217484

DO - 10.18151/7217484

M3 - Article in conference proceedings

SN - 978-3-00-050284-2

BT - Proceedings of the Twenty-Third European Conference on Information Systems

PB - AIS eLibrary

T2 - 23rd European Conference on Information Systems - ECIS 2015

Y2 - 26 May 2015 through 29 May 2015

ER -

