TraceSim: An Alignment Method for Computing Stack Trace Similarity

Publication: Contributions to journals › Journal articles › Research › peer-reviewed

Authors

  • Irving Muller Rodrigues
  • Aleksandr Khvorov
  • Daniel Aloise
  • Roman Vasiliev
  • Dmitrij Koznov
  • Eraldo Rezende Fernandes
  • George Chernishev
  • Dmitry Luciv
  • Nikita Povarov

Software systems can automatically submit crash reports to a repository for investigation when program failures occur. A significant portion of these crash reports are duplicates, i.e., they are caused by the same software issue. Therefore, if the volume of submitted reports is very large, automatic grouping of duplicate crash reports can significantly ease and speed up the analysis of software failures. This task is known as crash report deduplication. Given a huge volume of incoming reports, improving the quality of deduplication is an important task. The majority of studies address it via information retrieval or sequence matching methods based on the similarity of stack traces from two crash reports. While information retrieval methods disregard the position of a frame in a stack trace, the existing works based on sequence matching algorithms do not fully consider subroutine global frequency and unmatched frames. Besides, due to data distribution differences among software projects, parameters that are learned using machine learning algorithms are necessary to provide more flexibility to the methods. In this paper, we propose TraceSim, an approach for crash report deduplication which combines TF-IDF, optimum global alignment, and machine learning (ML) in a novel way. Moreover, we propose a new evaluation methodology for this task that is more comprehensive and robust than previously used evaluation approaches. TraceSim significantly outperforms seven baselines and state-of-the-art methods in the majority of the scenarios. It is the only approach that achieves competitive results on all datasets regarding all considered metrics. Moreover, we conduct an extensive ablation study that demonstrates the importance of each of TraceSim’s elements to its final performance and robustness. Finally, we provide the source code for all considered methods and the evaluation methodology, as well as the created datasets.
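To make the combination of ideas in the abstract concrete, the following is a minimal sketch of how frame rarity (an IDF-style weight) and a global alignment can be combined into one stack trace similarity. It is not the TraceSim implementation: the function names (`idf_weights`, `trace_similarity`), the smoothed IDF formula, the hand-set `gap_scale` parameter, and the normalization are all assumptions chosen for illustration. The learned parameters mentioned in the abstract are not modeled here.

```python
import math
from collections import Counter


def idf_weights(traces):
    """Smoothed IDF per frame: frames seen in fewer traces get larger weights.

    Hypothetical helper; the smoothing (+1 terms) is an illustrative choice.
    """
    n = len(traces)
    doc_freq = Counter()
    for trace in traces:
        doc_freq.update(set(trace))  # count each frame once per trace
    return {f: math.log((n + 1) / (c + 1)) + 1.0 for f, c in doc_freq.items()}


def trace_similarity(a, b, weights, gap_scale=1.0, default_weight=1.0):
    """Needleman-Wunsch style global alignment over two frame lists.

    A matched frame adds its weight; an unmatched frame (gap) subtracts
    gap_scale times its weight. The score is normalized so that two
    identical non-empty traces score 1.0. Returns 0.0 if both are empty.
    """
    def w(frame):
        return weights.get(frame, default_weight)

    m, n = len(a), len(b)
    dp = [[0.0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        dp[i][0] = dp[i - 1][0] - gap_scale * w(a[i - 1])
    for j in range(1, n + 1):
        dp[0][j] = dp[0][j - 1] - gap_scale * w(b[j - 1])

    for i in range(1, m + 1):
        for j in range(1, n + 1):
            # Either leave a[i-1] or b[j-1] unmatched ...
            best = max(
                dp[i - 1][j] - gap_scale * w(a[i - 1]),
                dp[i][j - 1] - gap_scale * w(b[j - 1]),
            )
            # ... or align the two frames if they are the same subroutine.
            if a[i - 1] == b[j - 1]:
                best = max(best, dp[i - 1][j - 1] + w(a[i - 1]))
            dp[i][j] = best

    total = sum(w(f) for f in a) + sum(w(f) for f in b)
    return 2.0 * dp[m][n] / total if total else 0.0
```

In this sketch, unlike a bag-of-frames comparison, the order of frames matters because the alignment cannot cross matches, and unlike a plain edit distance, rare frames contribute more than ubiquitous ones such as an entry point. A deduplication system could then compare an incoming report against stored reports and group it with the most similar one above some threshold; that threshold is likewise an assumption here.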

Original language: English
Article number: 53
Journal: Empirical Software Engineering
Volume: 27
Issue number: 2
Number of pages: 41
ISSN: 1382-3256
Publication status: Published - 01.03.2022
Externally published: Yes
