FaST: A linear time stack trace alignment heuristic for crash report deduplication

Research output: Contributions to collected editions/works > Article in conference proceedings > Research > peer-review

Authors

In software projects, applications are often monitored by systems that automatically identify crashes, collect their information into reports, and submit them to developers. Especially for popular applications, such systems tend to generate a large number of crash reports, a significant portion of which are duplicates. Because of this high submission volume, crash report deduplication in practice relies on automatic systems whose efficiency is a critical constraint. In this paper, we focus on improving the throughput of deduplication systems by speeding up stack trace comparison. In contrast to state-of-the-art techniques, we propose FaST, a novel sequence alignment method that computes the similarity score between two stack traces in linear time. Our method independently aligns identical frames in two stack traces by means of a simple alignment heuristic. We evaluate FaST and five competing methods on four datasets from open-source projects using ranking and binary metrics. Despite its simplicity, FaST consistently achieves state-of-the-art performance on all metrics considered. Moreover, our experiments confirm that FaST is substantially more efficient than methods based on optimal sequence alignment.
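To make the idea of aligning identical frames in linear time more concrete, here is a minimal Python sketch. It is not the paper's algorithm or scoring formula, which this record does not spell out. The function names, the position-based weight, the rule that pairs the k-th occurrence of a frame in one trace with the k-th occurrence in the other, and the Jaccard-style normalization are all assumptions made for illustration.

```python
from collections import defaultdict


def frame_weight(position: int) -> float:
    """Hypothetical weight: frames nearer the top of the stack count more."""
    return 1.0 / (1.0 + position)


def aligned_similarity(query: list[str], candidate: list[str]) -> float:
    """Score two stack traces in expected O(len(query) + len(candidate)) time.

    Assumption: a frame is only ever matched with an identical frame, and the
    k-th occurrence of a frame in one trace pairs with the k-th occurrence in
    the other. This is an illustrative heuristic, not the published method.
    """
    # Index the candidate: frame name -> positions where it occurs, in order.
    positions = defaultdict(list)
    for j, frame in enumerate(candidate):
        positions[frame].append(j)

    used = defaultdict(int)  # occurrences of each frame already paired
    matched = 0.0            # total weight of aligned frame pairs

    # Single pass over the query; each frame is aligned independently.
    for i, frame in enumerate(query):
        k = used[frame]
        occurrences = positions.get(frame, ())
        if k < len(occurrences):
            j = occurrences[k]
            used[frame] = k + 1
            # A pair is worth the smaller of its two position weights, so
            # matches that sit far apart in the stacks contribute less.
            matched += min(frame_weight(i), frame_weight(j))

    total_query = sum(frame_weight(i) for i in range(len(query)))
    total_candidate = sum(frame_weight(j) for j in range(len(candidate)))
    union = total_query + total_candidate - matched
    return matched / union if union > 0 else 0.0


if __name__ == "__main__":
    a = ["app.main", "io.read", "libc.memcpy"]
    b = ["app.main", "net.recv", "libc.memcpy"]
    print(aligned_similarity(a, b))
```

Because every frame is looked up in a hash map and paired at most once, the work grows with the combined length of the two traces. Optimal sequence alignment, by contrast, typically fills a table whose size is the product of the two lengths.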

Original language: English
Title of host publication: The 2022 Mining Software Repositories Conference: MSR 2022, Proceedings; 18-20 May 2022, Virtual; 23-24 May 2022, Pittsburgh, Pennsylvania
Number of pages: 12
Place of Publication: New York
Publisher: Institute of Electrical and Electronics Engineers Inc.
Publication date: 17.10.2022
Pages: 549-560
ISBN (print): 9781665452106
ISBN (electronic): 978-1-4503-9303-4
Publication status: Published - 17.10.2022
Event: 19th International Conference on Mining Software Repositories - MSR 2022 - Pittsburgh, United States
Duration: 23.05.2022 to 24.05.2022
Conference number: 19
https://conf.researchr.org/home/msr-2022

Bibliographical note

Title of the print edition: 2022 IEEE/ACM 19th International Conference on Mining Software Repositories (MSR 2022)

Funding Information:
We would like to gratefully acknowledge the Natural Sciences and Engineering Research Council of Canada (NSERC), Ericsson, Ciena, and EfficiOS for funding this project. Moreover, this research was enabled in part by the support provided by WestGrid (https://www.westgrid.ca/) and Compute Canada (www.computecanada.ca).

Publisher Copyright:
© 2022 ACM.

    Research areas

  • Automatic Crash Reporting, Crash Report Deduplication, Duplicate Crash Report, Duplicate Crash Report Detection, Stack Trace Similarity
  • Business informatics
