FaST: A linear time stack trace alignment heuristic for crash report deduplication

Publication: Contributions to collected editions; Articles in conference proceedings; Research; peer-reviewed

In software projects, applications are often monitored by systems that automatically identify crashes, collect their information into reports, and submit them to developers. Especially for popular applications, such systems tend to generate a large number of crash reports, a significant portion of which are duplicates. Because of this high submission volume, crash report deduplication is in practice supported by automatic systems whose efficiency is a critical constraint. In this paper, we focus on improving the throughput of deduplication systems by speeding up stack trace comparison. In contrast to state-of-the-art techniques, we propose FaST, a novel sequence alignment method that computes the similarity score between two stack traces in linear time. Our method independently aligns identical frames in two stack traces by means of a simple alignment heuristic. We evaluate FaST and five competing methods on four datasets from open-source projects using ranking and binary metrics. Despite its simplicity, FaST consistently achieves state-of-the-art performance on all metrics considered. Moreover, our experiments confirm that FaST is substantially more efficient than methods based on optimal sequence alignment.
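The abstract does not spell out the alignment heuristic, so the following is only a minimal sketch of the general idea it describes: identical frames are matched independently of one another, with hash lookups, so that the score is computed in a single pass over each trace. The pairing rule (k-th occurrence with k-th occurrence), the `frame_weight` function, the position penalty, and the normalization are all illustrative assumptions and not the scoring defined in the paper.

```python
from __future__ import annotations

from collections import defaultdict


def frame_weight(position: int) -> float:
    """Hypothetical weight: frames nearer the top of the stack count more."""
    return 1.0 / (1.0 + position)


def similarity(trace_a: list[str], trace_b: list[str]) -> float:
    """Illustrative linear-time similarity between two stack traces.

    NOT the published FaST scoring; a simplified stand-in for the idea of
    aligning identical frames independently of each other.
    """
    total = sum(frame_weight(i) for i in range(len(trace_a)))
    total += sum(frame_weight(j) for j in range(len(trace_b)))
    if total == 0.0:
        return 1.0  # assumption: two empty traces are treated as identical

    # Positions of every frame in trace_b, in order of appearance.
    positions_b: dict[str, list[int]] = defaultdict(list)
    for j, frame in enumerate(trace_b):
        positions_b[frame].append(j)

    # For each frame name, how many of its occurrences in trace_b are used.
    used: dict[str, int] = defaultdict(int)

    matched = 0.0
    for i, frame in enumerate(trace_a):
        slots = positions_b.get(frame, [])
        k = used[frame]
        if k < len(slots):
            j = slots[k]  # pair the k-th occurrence in A with the k-th in B
            used[frame] = k + 1
            # Reward the match, discounted by the distance between positions.
            matched += (frame_weight(i) + frame_weight(j)) / (1 + abs(i - j))

    return matched / total
```

Under these assumptions the score lies between 0 and 1: identical traces score 1, traces with no shared frame score 0, and a trace shifted by one extra frame scores somewhere in between. Each frame is visited a constant number of times, so the expected running time is proportional to the combined length of the two traces, whereas optimal sequence alignment takes time proportional to the product of their lengths.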

Original language: English
Title: The 2022 Mining Software Repositories Conference : MSR 2022, Proceedings; 18-20 May 2022, Virtual; 23-24 May 2022, Pittsburgh, Pennsylvania
Number of pages: 12
Place of publication: New York
Publisher: Institute of Electrical and Electronics Engineers Inc.
Publication date: 17.10.2022
Pages: 549-560
ISBN (print): 9781665452106
ISBN (electronic): 978-1-4503-9303-4
Publication status: Published - 17.10.2022
Event: 19th International Conference on Mining Software Repositories - MSR 2022 - Pittsburgh, United States
Duration: 23.05.2022 - 24.05.2022
Conference number: 19
https://conf.researchr.org/home/msr-2022

Bibliographical note

Publisher Copyright:
© 2022 ACM.
