FaST: A linear time stack trace alignment heuristic for crash report deduplication

Research output: Contributions to collected editions/works › Article in conference proceedings › Research › peer-review

Standard

FaST: A linear time stack trace alignment heuristic for crash report deduplication. / Rodrigues, Irving Muller; Aloise, Daniel; Fernandes, Eraldo Rezende.
The 2022 Mining Software Repositories Conference: MSR 2022, Proceedings; 18-20 May 2022, Virtual; 23-24 May 2022, Pittsburgh, Pennsylvania. New York: Institute of Electrical and Electronics Engineers Inc., 2022. p. 549-560 (Proceedings - IEEE/ACM International Conference on Mining Software Repositories).

Harvard

Rodrigues, IM, Aloise, D & Fernandes, ER 2022, FaST: A linear time stack trace alignment heuristic for crash report deduplication. in The 2022 Mining Software Repositories Conference: MSR 2022, Proceedings; 18-20 May 2022, Virtual; 23-24 May 2022, Pittsburgh, Pennsylvania. Proceedings - IEEE/ACM International Conference on Mining Software Repositories, Institute of Electrical and Electronics Engineers Inc., New York, pp. 549-560, 19th International Conference on Mining Software Repositories - MSR 2022, Pittsburgh, Pennsylvania, United States, 23.05.22. https://doi.org/10.1145/3524842.3527951

APA

Rodrigues, I. M., Aloise, D., & Fernandes, E. R. (2022). FaST: A linear time stack trace alignment heuristic for crash report deduplication. In The 2022 Mining Software Repositories Conference: MSR 2022, Proceedings; 18-20 May 2022, Virtual; 23-24 May 2022, Pittsburgh, Pennsylvania (pp. 549-560). (Proceedings - IEEE/ACM International Conference on Mining Software Repositories). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1145/3524842.3527951

Vancouver

Rodrigues IM, Aloise D, Fernandes ER. FaST: A linear time stack trace alignment heuristic for crash report deduplication. In The 2022 Mining Software Repositories Conference: MSR 2022, Proceedings; 18-20 May 2022, Virtual; 23-24 May 2022, Pittsburgh, Pennsylvania. New York: Institute of Electrical and Electronics Engineers Inc. 2022. p. 549-560. (Proceedings - IEEE/ACM International Conference on Mining Software Repositories). doi: 10.1145/3524842.3527951

Bibtex

@inproceedings{09be0d21a29545aeaf7dd56dff26239d,
title = "FaST: A linear time stack trace alignment heuristic for crash report deduplication",
abstract = "In software projects, applications are often monitored by systems that automatically identify crashes, collect their information into reports, and submit them to developers. Especially in popular applications, such systems tend to generate a large number of crash reports in which a significant portion of them are duplicate. Due to this high submission volume, in practice, the crash report deduplication is supported by devising automatic systems whose efficiency is a critical constraint. In this paper, we focus on improving deduplication system throughput by speeding up the stack trace comparison. In contrast to the state-of-the-art techniques, we propose FaST, a novel sequence alignment method that computes the similarity score between two stack traces in linear time. Our method independently aligns identical frames in two stack traces by means of a simple alignment heuristic. We evaluate FaST and five competing methods on four datasets from open-source projects using ranking and binary metrics. Despite its simplicity, FaST consistently achieves state-of-the-art performance regarding all metrics considered. Moreover, our experiments confirm that FaST is substantially more efficient than methods based on optimal sequence alignment.",
keywords = "Automatic Crash Reporting, Crash Report Deduplication, Duplicate Crash Report, Duplicate Crash Report Detection, Stack Trace Similarity, Business informatics",
author = "Rodrigues, {Irving Muller} and Daniel Aloise and Fernandes, {Eraldo Rezende}",
note = "Title of the print edition: 2022 IEEE/ACM 19th International Conference on Mining Software Repositories (MSR 2022). Funding Information: We would like to gratefully acknowledge the Natural Sciences and Engineering Research Council of Canada (NSERC), Ericsson, Ciena, and EfficiOS for funding this project. Moreover, this research was enabled in part by the support provided by WestGrid (https://www.westgrid.ca/) and Compute Canada (www.computecanada.ca). Publisher Copyright: {\textcopyright} 2022 ACM.; 19th International Conference on Mining Software Repositories - MSR 2022, MSR 2022; Conference date: 23-05-2022 Through 24-05-2022",
year = "2022",
month = oct,
day = "17",
doi = "10.1145/3524842.3527951",
language = "English",
isbn = "9781665452106",
series = "Proceedings - IEEE/ACM International Conference on Mining Software Repositories",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
pages = "549--560",
booktitle = "The 2022 Mining Software Repositories Conference",
address = "United States",
url = "https://conf.researchr.org/home/msr-2022"
}

RIS

TY - CHAP

T1 - FaST: A linear time stack trace alignment heuristic for crash report deduplication

AU - Rodrigues, Irving Muller

AU - Aloise, Daniel

AU - Fernandes, Eraldo Rezende

N1 - Conference code: 19

PY - 2022/10/17

Y1 - 2022/10/17

AB - In software projects, applications are often monitored by systems that automatically identify crashes, collect their information into reports, and submit them to developers. Especially in popular applications, such systems tend to generate a large number of crash reports in which a significant portion of them are duplicate. Due to this high submission volume, in practice, the crash report deduplication is supported by devising automatic systems whose efficiency is a critical constraint. In this paper, we focus on improving deduplication system throughput by speeding up the stack trace comparison. In contrast to the state-of-the-art techniques, we propose FaST, a novel sequence alignment method that computes the similarity score between two stack traces in linear time. Our method independently aligns identical frames in two stack traces by means of a simple alignment heuristic. We evaluate FaST and five competing methods on four datasets from open-source projects using ranking and binary metrics. Despite its simplicity, FaST consistently achieves state-of-the-art performance regarding all metrics considered. Moreover, our experiments confirm that FaST is substantially more efficient than methods based on optimal sequence alignment.

KW - Automatic Crash Reporting

KW - Crash Report Deduplication

KW - Duplicate Crash Report

KW - Duplicate Crash Report Detection

KW - Stack Trace Similarity

KW - Business informatics

UR - http://www.scopus.com/inward/record.url?scp=85134075597&partnerID=8YFLogxK

UR - https://ieeexplore.ieee.org/document/9796190

UR - https://www.proceedings.com/64472.html

U2 - 10.1145/3524842.3527951

DO - 10.1145/3524842.3527951

M3 - Article in conference proceedings

AN - SCOPUS:85134075597

SN - 9781665452106

T3 - Proceedings - IEEE/ACM International Conference on Mining Software Repositories

SP - 549

EP - 560

BT - The 2022 Mining Software Repositories Conference

PB - Institute of Electrical and Electronics Engineers Inc.

CY - New York

T2 - 19th International Conference on Mining Software Repositories - MSR 2022

Y2 - 23 May 2022 through 24 May 2022

ER -
