How Much Tracking Is Necessary? - The Learning Curve in Bayesian User Journey Analysis

Research output: Contributions to collected editions/works; Article in conference proceedings; Research; peer-review

Standard

How Much Tracking Is Necessary? - The Learning Curve in Bayesian User Journey Analysis. / Stange, Martin; Funk, Burkhardt.
Proceedings of the Twenty-Third European Conference on Information Systems. AIS eLibrary, 2015.

Harvard

Stange, M & Funk, B 2015, How Much Tracking Is Necessary? - The Learning Curve in Bayesian User Journey Analysis. in Proceedings of the Twenty-Third European Conference on Information Systems. AIS eLibrary, 23rd European Conference on Information Systems - ECIS 2015, Münster, Germany, 26.05.15. https://doi.org/10.18151/7217484

APA

Stange, M., & Funk, B. (2015). How Much Tracking Is Necessary? - The Learning Curve in Bayesian User Journey Analysis. In Proceedings of the Twenty-Third European Conference on Information Systems. AIS eLibrary. https://doi.org/10.18151/7217484

Vancouver

Stange M, Funk B. How Much Tracking Is Necessary? - The Learning Curve in Bayesian User Journey Analysis. In Proceedings of the Twenty-Third European Conference on Information Systems. AIS eLibrary. 2015. doi: 10.18151/7217484

Bibtex

@inbook{02f02f601bbf4d3c855ca7f8227751ad,
title = "How Much Tracking Is Necessary? - The Learning Curve in Bayesian User Journey Analysis",
abstract = "Extracting value from big data is one of today{\textquoteright}s business challenges. In online marketing, for instance, advertisers use high volume clickstream data to increase the efficiency of their campaigns. To prevent collecting, storing, and processing of irrelevant data, it is crucial to determine how much data to analyze to achieve acceptable model performance. We propose a general procedure that employs the learning curve sampling method to determine the optimal sample size with respect to cost/benefit considerations. Applied in two case studies, we model the users' click behavior based on clickstream data and offline channel data. We observe saturation effects of the predictive accuracy when the sample size is increased and, thus, demonstrate that advertisers only have to analyze a very small subset of the full dataset to obtain an acceptable predictive accuracy and to optimize profits from advertising activities. In both case studies we observe that a random intercept logistic model outperforms a non-hierarchical model in terms of predictive accuracy. Given the high infrastructure costs and the users' growing awareness for tracking activities, our results have managerial implications for companies in the online marketing field. ",
keywords = "Business informatics, Big Data, Online Marketing, User Journey Analysis, Learning Curve, Bayesian Models",
author = "Martin Stange and Burkhardt Funk",
year = "2015",
month = may,
day = "29",
doi = "10.18151/7217484",
language = "English",
isbn = "978-3-00-050284-2",
booktitle = "Proceedings of the Twenty-Third European Conference on Information Systems",
publisher = "AIS eLibrary",
address = "United States",
note = "23rd European Conference on Information Systems - ECIS 2015, ECIS conference 2015 ; Conference date: 26-05-2015 Through 29-05-2015",
url = "https://www.ercis.org/, http://www.ecis2015.eu/",
}

RIS

TY - CHAP

T1 - How Much Tracking Is Necessary? - The Learning Curve in Bayesian User Journey Analysis

AU - Stange, Martin

AU - Funk, Burkhardt

N1 - Conference code: 23

PY - 2015/5/29

Y1 - 2015/5/29

AB - Extracting value from big data is one of today’s business challenges. In online marketing, for instance, advertisers use high volume clickstream data to increase the efficiency of their campaigns. To prevent collecting, storing, and processing of irrelevant data, it is crucial to determine how much data to analyze to achieve acceptable model performance. We propose a general procedure that employs the learning curve sampling method to determine the optimal sample size with respect to cost/benefit considerations. Applied in two case studies, we model the users' click behavior based on clickstream data and offline channel data. We observe saturation effects of the predictive accuracy when the sample size is increased and, thus, demonstrate that advertisers only have to analyze a very small subset of the full dataset to obtain an acceptable predictive accuracy and to optimize profits from advertising activities. In both case studies we observe that a random intercept logistic model outperforms a non-hierarchical model in terms of predictive accuracy. Given the high infrastructure costs and the users' growing awareness for tracking activities, our results have managerial implications for companies in the online marketing field.

KW - Business informatics

KW - Big Data

KW - Online Marketing

KW - User Journey Analysis

KW - Learning Curve

KW - Bayesian Models

U2 - 10.18151/7217484

DO - 10.18151/7217484

M3 - Article in conference proceedings

SN - 978-3-00-050284-2

BT - Proceedings of the Twenty-Third European Conference on Information Systems

PB - AIS eLibrary

T2 - 23rd European Conference on Information Systems - ECIS 2015

Y2 - 26 May 2015 through 29 May 2015

ER -
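
Illustration

The abstract describes learning curve sampling: measure predictive accuracy at increasing sample sizes, observe where it saturates, and pick the size that balances benefit against data cost. The following is a minimal sketch of that idea only, not the authors' procedure or models. The power-law accuracy curve, the value and cost figures, and all function names are hypothetical assumptions; in practice `evaluate` would fit a model on a sample of size n and score it on held-out data.

```python
from typing import Callable, List, Tuple


def synthetic_accuracy(n: int) -> float:
    """Hypothetical saturating learning curve (power law); not from the paper."""
    return 0.80 - 0.30 * n ** -0.5


def learning_curve(
    sizes: List[int], evaluate: Callable[[int], float]
) -> List[Tuple[int, float]]:
    """Evaluate predictive accuracy at each candidate sample size."""
    return [(n, evaluate(n)) for n in sizes]


def choose_sample_size(
    curve: List[Tuple[int, float]],
    value_per_accuracy: float,
    cost_per_record: float,
) -> int:
    """Return the sample size on the curve with the highest net benefit.

    Net benefit is assumed to be linear: value of accuracy minus data cost.
    """
    best_n, best_net = curve[0][0], float("-inf")
    for n, acc in curve:
        net = value_per_accuracy * acc - cost_per_record * n
        if net > best_net:
            best_n, best_net = n, net
    return best_n


# Doubling schedule of sample sizes: 100, 200, ..., 12800 (assumed values).
sizes = [100 * 2 ** k for k in range(8)]
curve = learning_curve(sizes, synthetic_accuracy)
best = choose_sample_size(curve, value_per_accuracy=1000.0, cost_per_record=0.001)
print(best)
```

With these made-up numbers the chosen size lies well below the largest sample on the schedule, which mirrors the saturation effect the abstract reports: beyond some point, additional records cost more than the accuracy they add.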
