FaST: A linear time stack trace alignment heuristic for crash report deduplication

Research output: Contributions to collected editions/works › Article in conference proceedings › Research › peer-review

Authors

In software projects, applications are often monitored by systems that automatically identify crashes, collect their information into reports, and submit them to developers. Especially for popular applications, such systems tend to generate a large number of crash reports, a significant portion of which are duplicates. Because of this high submission volume, crash report deduplication is in practice handled by automatic systems, for which efficiency is a critical constraint. In this paper, we focus on improving the throughput of deduplication systems by speeding up stack trace comparison. In contrast to state-of-the-art techniques, we propose FaST, a novel sequence alignment method that computes the similarity score between two stack traces in linear time. Our method independently aligns identical frames in two stack traces by means of a simple alignment heuristic. We evaluate FaST and five competing methods on four datasets from open-source projects using ranking and binary metrics. Despite its simplicity, FaST consistently achieves state-of-the-art performance on all metrics considered. Moreover, our experiments confirm that FaST is substantially more efficient than methods based on optimal sequence alignment.
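The abstract does not spell out the alignment heuristic or the scoring function, so the following Python sketch only illustrates the general idea of matching identical frames in linear time. Everything in it is an assumption made for illustration, not FaST's actual definition: the function names `frame_weight` and `alignment_similarity`, the parameter `alpha`, the position-based weighting, and the rule that pairs the k-th occurrence of a frame in one trace with its k-th occurrence in the other.

```python
from typing import Dict, List, Sequence


def frame_weight(position: int, alpha: float = 0.5) -> float:
    """Hypothetical weight: frames nearer the top of the stack count more."""
    return 1.0 / (1.0 + position) ** alpha


def alignment_similarity(
    trace_a: Sequence[str], trace_b: Sequence[str], alpha: float = 0.5
) -> float:
    """Illustrative similarity in [-1, 1] computed in one pass over each trace.

    Assumption: the k-th occurrence of a frame in trace_a is paired with the
    k-th occurrence of the same frame in trace_b. This is NOT the published
    FaST scoring function, only a sketch of a linear-time alignment.
    """
    # Record where each frame occurs in trace_b (one pass, hash-map lookups).
    positions_b: Dict[str, List[int]] = {}
    for j, frame in enumerate(trace_b):
        positions_b.setdefault(frame, []).append(j)

    used: Dict[str, int] = {}   # occurrences of each frame already paired
    gain = 0.0                  # reward collected from matched pairs
    matched_weight = 0.0        # total weight of frames that found a partner
    for i, frame in enumerate(trace_a):
        k = used.get(frame, 0)
        occurrences = positions_b.get(frame, [])
        if k < len(occurrences):
            j = occurrences[k]
            used[frame] = k + 1
            pair = frame_weight(i, alpha) + frame_weight(j, alpha)
            matched_weight += pair
            # Pairs that sit at different depths earn a smaller reward.
            gain += pair / (1.0 + abs(i - j))

    total = sum(frame_weight(i, alpha) for i in range(len(trace_a)))
    total += sum(frame_weight(j, alpha) for j in range(len(trace_b)))
    if total == 0.0:
        return 0.0  # two empty traces: no evidence either way

    unmatched = total - matched_weight  # weight of frames without a partner
    return (gain - unmatched) / total
```

Each frame is visited once and every lookup is an average constant-time dictionary operation, so the cost grows with the combined length of the two traces. Methods based on optimal sequence alignment typically fill a dynamic-programming table whose size is the product of the two lengths. In a deduplication setting, a score like this would be computed between an incoming report and stored reports, and then thresholded or ranked.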

Original language: English
Title of host publication: The 2022 Mining Software Repositories Conference: MSR 2022, Proceedings; 18-20 May 2022, Virtual; 23-24 May 2022, Pittsburgh, Pennsylvania
Number of pages: 12
Place of Publication: New York
Publisher: Institute of Electrical and Electronics Engineers Inc.
Publication date: 17.10.2022
Pages: 549-560
ISBN (print): 9781665452106
ISBN (electronic): 978-1-4503-9303-4
Publication status: Published - 17.10.2022
Event: 19th International Conference on Mining Software Repositories - MSR 2022 - Pittsburgh, United States
Duration: 23.05.2022 - 24.05.2022
Conference number: 19
https://conf.researchr.org/home/msr-2022

Bibliographical note

Title of the print edition: 2022 IEEE/ACM 19th International Conference on Mining Software Repositories (MSR 2022)

Funding Information:
We would like to gratefully acknowledge the Natural Sciences and Engineering Research Council of Canada (NSERC), Ericsson, Ciena, and EfficiOS for funding this project. Moreover, this research was enabled in part by the support provided by WestGrid (https://www.westgrid.ca/) and Compute Canada (www.computecanada.ca).

Publisher Copyright:
© 2022 ACM.

    Research areas

  • Automatic Crash Reporting, Crash Report Deduplication, Duplicate Crash Report, Duplicate Crash Report Detection, Stack Trace Similarity
  • Business informatics
