FaST: A linear time stack trace alignment heuristic for crash report deduplication
Research output: Contributions to collected editions/works › Article in conference proceedings › Research › peer-review
Standard
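FaST: A linear time stack trace alignment heuristic for crash report deduplication. / Rodrigues, Irving Muller; Aloise, Daniel; Fernandes, Eraldo Rezende. The 2022 Mining Software Repositories Conference: MSR 2022, Proceedings; 18-20 May 2022, Virtual; 23-24 May 2022, Pittsburgh, Pennsylvania. New York: Institute of Electrical and Electronics Engineers Inc., 2022. p. 549-560 (Proceedings - IEEE/ACM International Conference on Mining Software Repositories).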
RIS
TY - CHAP
T1 - FaST: A linear time stack trace alignment heuristic for crash report deduplication
AU - Rodrigues, Irving Muller
AU - Aloise, Daniel
AU - Fernandes, Eraldo Rezende
N1 - Conference code: 19
PY - 2022/5/23
Y1 - 2022/5/23
N2 - In software projects, applications are often monitored by systems that automatically identify crashes, collect their information into reports, and submit them to developers. Especially for popular applications, such systems tend to generate a large number of crash reports, a significant portion of which are duplicates. Because of this high submission volume, crash report deduplication is in practice supported by automatic systems whose efficiency is a critical constraint. In this paper, we focus on improving deduplication system throughput by speeding up stack trace comparison. In contrast to state-of-the-art techniques, we propose FaST, a novel sequence alignment method that computes the similarity score between two stack traces in linear time. Our method independently aligns identical frames in two stack traces by means of a simple alignment heuristic. We evaluate FaST and five competing methods on four datasets from open-source projects using ranking and binary metrics. Despite its simplicity, FaST consistently achieves state-of-the-art performance on all metrics considered. Moreover, our experiments confirm that FaST is substantially more efficient than methods based on optimal sequence alignment.
KW - Automatic Crash Reporting
KW - Crash Report Deduplication
KW - Duplicate Crash Report
KW - Duplicate Crash Report Detection
KW - Stack Trace Similarity
KW - Business informatics
UR - http://www.scopus.com/inward/record.url?scp=85134075597&partnerID=8YFLogxK
UR - https://ieeexplore.ieee.org/document/9796190
UR - https://www.proceedings.com/64472.html
U2 - 10.1145/3524842.3527951
DO - 10.1145/3524842.3527951
M3 - Article in conference proceedings
AN - SCOPUS:85134075597
SN - 9781665452106
T3 - Proceedings - IEEE/ACM International Conference on Mining Software Repositories
SP - 549
EP - 560
BT - The 2022 Mining Software Repositories Conference
PB - Institute of Electrical and Electronics Engineers Inc.
CY - New York
T2 - 19th International Conference on Mining Software Repositories - MSR 2022
Y2 - 23 May 2022 through 24 May 2022
ER -
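
Illustrative sketch
The abstract describes aligning identical frames independently and scoring two stack traces in linear time. The Python sketch below shows one way such a heuristic could look; it is not the authors' implementation. The pairing rule (k-th occurrence of a frame in one trace with the k-th occurrence in the other), the position weight, the `alpha` parameter, and the normalization to [-1, 1] are all assumptions made for illustration, and every function name is hypothetical.

```python
from collections import defaultdict


def position_weight(pos, alpha=0.5):
    """Hypothetical weight: frames near the top of the stack (pos 0) count more."""
    return 1.0 / (1.0 + pos) ** alpha


def align_identical_frames(query, candidate):
    """Pair the k-th occurrence of each frame in `query` with the k-th
    occurrence of the same frame in `candidate`.

    Each frame name is handled independently of all others, so the work is
    one pass over each trace (expected linear time with hashing).
    """
    positions = defaultdict(list)
    for j, frame in enumerate(candidate):
        positions[frame].append(j)

    used = defaultdict(int)  # occurrences of each frame already paired
    pairs = []
    for i, frame in enumerate(query):
        occurrences = positions.get(frame)
        k = used[frame]
        if occurrences is not None and k < len(occurrences):
            pairs.append((i, occurrences[k]))
            used[frame] = k + 1
    return pairs


def similarity(query, candidate, alpha=0.5):
    """Assumed score in [-1, 1]: matched weight minus unmatched weight,
    divided by the total weight of both traces."""
    total = sum(position_weight(i, alpha) for i in range(len(query)))
    total += sum(position_weight(j, alpha) for j in range(len(candidate)))
    if total == 0.0:
        return 0.0  # both traces are empty
    matched = sum(
        position_weight(i, alpha) + position_weight(j, alpha)
        for i, j in align_identical_frames(query, candidate)
    )
    return (2.0 * matched - total) / total


if __name__ == "__main__":
    trace_a = ["main", "parse", "read", "parse"]
    trace_b = ["main", "parse", "parse", "write"]
    print(align_identical_frames(trace_a, trace_b))
    print(similarity(trace_a, trace_b))
```

Because no frame's alignment depends on any other frame's, the sketch avoids the quadratic dynamic-programming table that optimal sequence alignment needs; the trade-off is that the resulting alignment is not guaranteed to be optimal or order-preserving.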