Rhetorical Role Identification for Portuguese Legal Documents

Research output: Contributions to collected editions/works, Article in conference proceedings, Research, peer-review


In this paper, we present a new corpus for Rhetorical Role Identification in Portuguese legal documents. The corpus comprises petitions from 70 civil lawsuits filed in the TJMS court and was manually labeled with rhetorical roles specifically tailored for petitions. Since petition documents are created without a standard structure, we had to deal with several issues to clean the extracted textual content. We assessed both classic machine learning and deep learning methods on the proposed corpus. The best performing method obtained an F-score of 80.50. To the best of our knowledge, this is the first work to deal with rhetorical role identification for petitions, given that previous works focused only on judicial decisions. It is also the first work to tackle this task for the Portuguese language. The proposed corpus, as well as the proposed rhetorical roles, can foster new research in the judicial area and also lead to new solutions to improve the flow of Brazilian courthouses.

Original language: English
Title of host publication: Intelligent Systems: 10th Brazilian Conference, BRACIS 2021, Virtual Event, November 29 – December 3, 2021, Proceedings, Part II
Editors: André Britto, Karina Valdivia Delgado
Number of pages: 15
Place of Publication: Cham
Publisher: Springer Schweiz
Publication date: 2021
ISBN (Print): 978-3-030-91698-5
ISBN (Electronic): 978-3-030-91699-2
Publication status: Published - 2021
Externally published: Yes
Event: Brazilian Conference on Intelligent Systems - BRACIS 2021 - Virtual, Online
Duration: 29.11.2021 to 03.12.2021
Conference number: 10