FaST: A linear time stack trace alignment heuristic for crash report deduplication

Research output: Contributions to collected editions/works > Article in conference proceedings > Research > peer-review

Authors

In software projects, applications are often monitored by systems that automatically identify crashes, collect their information into reports, and submit them to developers. Especially in popular applications, such systems tend to generate a large number of crash reports, a significant portion of which are duplicates. Because of this high submission volume, crash report deduplication is in practice supported by automatic systems for which efficiency is a critical constraint. In this paper, we focus on improving deduplication system throughput by speeding up stack trace comparison. In contrast to state-of-the-art techniques, we propose FaST, a novel sequence alignment method that computes the similarity score between two stack traces in linear time. Our method independently aligns identical frames in two stack traces by means of a simple alignment heuristic. We evaluate FaST and five competing methods on four datasets from open-source projects using ranking and binary metrics. Despite its simplicity, FaST consistently achieves state-of-the-art performance on all metrics considered. Moreover, our experiments confirm that FaST is substantially more efficient than methods based on optimal sequence alignment.
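As a rough illustration of what aligning identical frames in linear time could look like, the sketch below buckets frame positions by name in a hash map and pairs the k-th occurrence of a frame in one trace with its k-th occurrence in the other. It is not the algorithm from the paper: the function names, the position-offset score, and the normalisation are all assumptions chosen for this example only.

```python
from collections import defaultdict


def frame_positions(trace):
    """Map each frame name to the list of positions where it occurs."""
    positions = defaultdict(list)
    for index, frame in enumerate(trace):
        positions[frame].append(index)
    return positions


def similarity(query, candidate):
    """Illustrative similarity in [0, 1]; the scoring is hypothetical."""
    if not query and not candidate:
        return 1.0
    q_pos = frame_positions(query)
    c_pos = frame_positions(candidate)
    matched = 0.0
    for frame, q_list in q_pos.items():
        c_list = c_pos.get(frame, [])
        # Pair the k-th occurrence in one trace with the k-th in the other,
        # independently of how any other frame is aligned.
        for qi, ci in zip(q_list, c_list):
            # Assumed score: 1 for equal positions, decaying with the offset.
            matched += 1.0 / (1.0 + abs(qi - ci))
    # Assumed normalisation: unmatched frames lower the score.
    return matched / max(len(query), len(candidate))
```

Each frame is visited a constant number of times, so the expected running time is linear in the combined length of the two traces, whereas optimal sequence alignment by dynamic programming takes time proportional to the product of the two lengths.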

Original language: English
Title of host publication: The 2022 Mining Software Repositories Conference : MSR 2022, Proceedings; 18-20 May 2022, Virtual; 23-24 May 2022, Pittsburgh, Pennsylvania
Number of pages: 12
Place of Publication: New York
Publisher: Institute of Electrical and Electronics Engineers Inc.
Publication date: 23.05.2022
Pages: 549-560
ISBN (print): 9781665452106
ISBN (electronic): 978-1-4503-9303-4
Publication status: Published - 23.05.2022
Event: 19th International Conference on Mining Software Repositories - MSR 2022 - Pittsburgh, United States
Duration: 23.05.2022 - 24.05.2022
Conference number: 19
https://conf.researchr.org/home/msr-2022

Bibliographical note

Title of the print edition: 2022 IEEE/ACM 19th International Conference on Mining Software Repositories (MSR 2022)

Funding Information:
We would like to gratefully acknowledge the Natural Sciences and Engineering Research Council of Canada (NSERC), Ericsson, Ciena, and EfficiOS for funding this project. Moreover, this research was enabled in part by the support provided by WestGrid (https://www.westgrid.ca/) and Compute Canada (www.computecanada.ca).

Publisher Copyright:
© 2022 ACM.

Research areas

  • Automatic Crash Reporting, Crash Report Deduplication, Duplicate Crash Report, Duplicate Crash Report Detection, Stack Trace Similarity
  • Business informatics
