How Big Does Big Data Need to Be?

Research output: Contributions to collected editions/works; Contributions to collected editions/anthologies; Research; peer-review

Authors

Collecting and storing as much data as possible is common practice in many companies these days. To avoid the cost of collecting and storing data that is not relevant, it is important to define which analytical questions are to be answered and how much data is needed to answer them. In this chapter, a process for determining an optimal sample size is proposed. Based on benefit/cost considerations, the authors show how to find the sample size that maximizes the utility of predictive analytics. Applying the proposed process to a case study shows that only a very small fraction of the available data set is needed to make accurate predictions.
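The benefit/cost idea in the abstract can be illustrated with a small sketch. This is not the authors' process, which the abstract does not spell out. It assumes a hypothetical inverse-power-law learning curve (accuracy rises toward a ceiling as the sample grows) and a hypothetical linear cost per record. The function names and every number below are made up for illustration.

```python
def learning_curve_accuracy(n, a_max=0.90, b=0.40, c=0.5):
    """Hypothetical learning curve: accuracy approaches a_max as n grows.

    The inverse power law a_max - b * n**(-c) is an assumption for this
    sketch, not a formula taken from the chapter.
    """
    return a_max - b * n ** (-c)


def utility(n, value_of_accuracy=100_000.0, cost_per_record=0.01):
    """Net utility: monetary benefit of accuracy minus data cost.

    Both value_of_accuracy and cost_per_record are illustrative assumptions.
    """
    benefit = value_of_accuracy * learning_curve_accuracy(n)
    cost = cost_per_record * n  # collecting and storing n records
    return benefit - cost


def best_sample_size(candidates):
    """Return the candidate sample size with the highest net utility."""
    return max(candidates, key=utility)


# Candidate sample sizes from 100 to 10 million records.
candidates = [10 ** k for k in range(2, 8)]
n_star = best_sample_size(candidates)
print(n_star, round(utility(n_star), 2))
```

With these assumed parameters, the extra accuracy gained beyond roughly ten thousand records no longer pays for the extra data, so the search settles on a sample far smaller than the largest candidate. In practice the learning curve would be estimated from the data rather than assumed.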
Original language: English
Title of host publication: Enterprise Big Data Engineering, Analytics, and Management
Editors: Martin Atzmueller, Samia Oussena, Thomas Roth-Berghofer
Number of pages: 12
Place of Publication: Hershey
Publisher: Business Science Reference
Publication date: 06.2016
Pages: 1-12
ISBN (print): 9781522502937
ISBN (electronic): 9781522502944
DOIs
Publication status: Published - 06.2016
