FaST: A linear time stack trace alignment heuristic for crash report deduplication

Research output: Contributions to collected editions/works › Article in conference proceedings › Research › peer-review

Authors

In software projects, applications are often monitored by systems that automatically identify crashes, collect their information into reports, and submit them to developers. Especially for popular applications, such systems tend to generate a large number of crash reports, a significant portion of which are duplicates. Because of this high submission volume, crash report deduplication in practice relies on automatic systems, for which efficiency is a critical constraint. In this paper, we focus on improving the throughput of deduplication systems by speeding up stack trace comparison. In contrast to state-of-the-art techniques, we propose FaST, a novel sequence alignment method that computes the similarity score between two stack traces in linear time. Our method independently aligns identical frames in two stack traces by means of a simple alignment heuristic. We evaluate FaST and five competing methods on four datasets from open-source projects using ranking and binary metrics. Despite its simplicity, FaST consistently achieves state-of-the-art performance on all metrics considered. Moreover, our experiments confirm that FaST is substantially more efficient than methods based on optimal sequence alignment.
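As a rough illustration of what "aligning identical frames in linear time" can look like, the sketch below pairs the k-th occurrence of a frame in one trace with its k-th occurrence in the other, using a hash map of positions. This is not the FaST implementation: the function names, the pairing rule, and the Dice-style score are assumptions made for illustration, and the abstract does not specify how FaST weights or scores matched frames.

```python
from collections import defaultdict, deque


def align_identical_frames(trace_a, trace_b):
    """Pair identical frames of two stack traces in one pass over each.

    Assumption (not from the paper): the k-th occurrence of a frame in
    trace_a is paired with its k-th occurrence in trace_b.
    Returns a list of (index_in_a, index_in_b) pairs.
    """
    # Map each frame name to the queue of positions where it occurs in trace_a.
    positions = defaultdict(deque)
    for i, frame in enumerate(trace_a):
        positions[frame].append(i)

    pairs = []
    for j, frame in enumerate(trace_b):
        queue = positions.get(frame)  # .get avoids creating empty entries
        if queue:  # an unmatched occurrence of this frame remains in trace_a
            pairs.append((queue.popleft(), j))
    return pairs


def toy_similarity(trace_a, trace_b):
    """Hypothetical score: share of frames that found an identical partner."""
    if not trace_a and not trace_b:
        return 1.0
    matched = len(align_identical_frames(trace_a, trace_b))
    return 2 * matched / (len(trace_a) + len(trace_b))


if __name__ == "__main__":
    a = ["main", "parse", "read", "read"]
    b = ["main", "read", "read", "crash"]
    print(align_identical_frames(a, b))
    print(toy_similarity(a, b))
```

Each frame is visited once and every hash-map operation takes expected constant time, so the work grows with the combined length of the two traces. Optimal alignment by dynamic programming, by contrast, fills a table whose size is the product of the two lengths.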

Original language: English
Title of host publication: The 2022 Mining Software Repositories Conference: MSR 2022, Proceedings; 18-20 May 2022, Virtual; 23-24 May 2022, Pittsburgh, Pennsylvania
Number of pages: 12
Place of Publication: New York
Publisher: Institute of Electrical and Electronics Engineers Inc.
Publication date: 17.10.2022
Pages: 549-560
ISBN (print): 9781665452106
ISBN (electronic): 978-1-4503-9303-4
DOIs
Publication status: Published - 17.10.2022
Event: 19th International Conference on Mining Software Repositories - MSR 2022 - Pittsburgh, United States
Duration: 23.05.2022 – 24.05.2022
Conference number: 19
https://conf.researchr.org/home/msr-2022

Bibliographical note

Title of the print edition: 2022 IEEE/ACM 19th International Conference on Mining Software Repositories (MSR 2022)

Funding Information:
We gratefully acknowledge the Natural Sciences and Engineering Research Council of Canada (NSERC), Ericsson, Ciena, and EfficiOS for funding this project. Moreover, this research was enabled in part by the support provided by WestGrid (https://www.westgrid.ca/) and Compute Canada (www.computecanada.ca).

Publisher Copyright:
© 2022 ACM.

    Research areas

  • Automatic Crash Reporting, Crash Report Deduplication, Duplicate Crash Report, Duplicate Crash Report Detection, Stack Trace Similarity
  • Business informatics
