How Big Does Big Data Need to Be?
Research output: Contribution to collected editions/anthologies › Research › peer-reviewed
Collecting and storing as much data as possible is common practice in many companies these days. To reduce the cost of collecting and storing data that is not relevant, it is important to define which analytical questions are to be answered and how much data is needed to answer them. In this chapter, a process for determining an optimal sample size is proposed. Based on benefit/cost considerations, the authors show how to find the sample size that maximizes the utility of predictive analytics. Applying the proposed process to a case study shows that only a very small fraction of the available data set is needed to make accurate predictions.
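The core idea of the chapter, trading off the benefit of higher predictive accuracy against the cost of acquiring and storing data, can be sketched in a few lines. The following Python snippet is a minimal illustration and not the authors' implementation: it assumes a power-law learning curve and a linear data cost, and all names and parameter values (`accuracy`, `utility`, `a`, `b`, `c`, `benefit_per_point`, `cost_per_record`) are made up for the example.

```python
import numpy as np

def accuracy(n, a=0.95, b=0.8, c=0.5):
    """Assumed power-law learning curve: accuracy rises with sample size n,
    saturating toward the asymptote a. Parameters are illustrative only."""
    return a - b * n ** (-c)

def utility(n, benefit_per_point=10_000.0, cost_per_record=0.01):
    """Utility = monetary benefit of predictive accuracy minus linear data
    cost. Both rates are placeholder values, not figures from the chapter."""
    return benefit_per_point * accuracy(n) - cost_per_record * n

# Grid-search candidate sample sizes for the utility maximum.
sizes = np.arange(100, 1_000_000, 100)
best = sizes[np.argmax(utility(sizes))]
print(f"sample size maximizing utility: {best}")
print(f"accuracy at that size: {accuracy(best):.4f}")
```

With these illustrative numbers the grid search lands near n ≈ 5,400 records out of up to a million available, which mirrors the chapter's qualitative finding that a small fraction of the data can suffice once diminishing returns in accuracy meet rising data costs.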
Original language | English
---|---
Title of host publication | Enterprise Big Data Engineering, Analytics, and Management
Editors | Martin Atzmueller, Samia Oussena, Thomas Roth-Berghofer
Number of pages | 12
Place of publication | Hershey
Publisher | Business Science Reference
Publication date | 06.2016
Pages | 1-12
ISBN (print) | 9781522502937
ISBN (electronic) | 9781522502944
DOIs |
Publication status | Published - 06.2016
- Business informatics: Big Data, Predictive Analytics, Learning Curve