Portuguese part-of-speech tagging with large margin structure learning

Eraldo Rezende Fernandes; Irving Muller Rodrigues; Ruy Luiz Milidiú

doi:10.1109/BRACIS.2014.16

Research output: Contributions to collected editions/works › Article in conference proceedings › Research › peer-review

Part-of-speech tagging is a fundamental task in many Natural Language Processing systems. It consists of identifying the syntactic category, i.e., the part of speech, of each word in a sentence. Although the current state-of-the-art accuracy for this task is around 97%, any improvement has an immediate impact on more complex tasks such as Parsing, Semantic Role Labeling, and Information Extraction, so the task remains worth exploring. In this paper, we introduce a part-of-speech tagger based on the Structure Learning framework that reduces the smallest known error on the Portuguese Mac-Morpho corpus by 7.8%. We also apply our tagger to a recently revised version of Mac-Morpho. On this revised version, our system's accuracy is competitive with that of a semi-supervised Neural Network trained on Mac-Morpho plus a very large non-annotated corpus. Additionally, our system is simpler than previous systems and uses a very limited feature set. It employs a Large Margin training criterion to derive a structure predictor that is more robust on unseen data.
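
The abstract does not spell out the training algorithm or the feature set, so the following is only an illustrative sketch of what large margin structure learning for tagging can look like. It assumes a structured perceptron with loss-augmented first-order Viterbi decoding and a Hamming loss. The feature templates, the three-tag tag set, the toy sentences, and all function names are hypothetical and are not taken from the paper or from Mac-Morpho.

```python
# Sketch only: a margin-based structured perceptron for sequence tagging.
# Features, tags, and data below are hypothetical, not the paper's.

def features(words, i, prev_tag, tag):
    """Tiny assumed feature set: word identity, 2-char suffix, tag bigram."""
    w = words[i].lower()
    return [("word", w, tag), ("suffix", w[-2:], tag), ("trans", prev_tag, tag)]

def local_score(weights, words, i, prev_tag, tag):
    return sum(weights.get(f, 0.0) for f in features(words, i, prev_tag, tag))

def viterbi(weights, words, tags, gold=None, margin=0.0):
    """First-order Viterbi decoding.

    When `gold` is given, every incorrect tag earns an extra `margin`
    (loss-augmented decoding with a Hamming loss), so training only stops
    updating once the gold sequence wins by that margin per wrong token.
    """
    n = len(words)
    best = [{} for _ in range(n)]
    back = [{} for _ in range(n)]
    for i in range(n):
        for t in tags:
            cost = margin if gold is not None and t != gold[i] else 0.0
            if i == 0:
                best[i][t] = local_score(weights, words, 0, "<s>", t) + cost
                back[i][t] = None
            else:
                p = max(tags, key=lambda q: best[i - 1][q]
                        + local_score(weights, words, i, q, t))
                best[i][t] = (best[i - 1][p]
                              + local_score(weights, words, i, p, t) + cost)
                back[i][t] = p
    t = max(tags, key=lambda q: best[n - 1][q])
    path = [t]
    for i in range(n - 1, 0, -1):
        t = back[i][t]
        path.append(t)
    return path[::-1]

def update(weights, words, seq, delta):
    """Add `delta` to the weight of every feature fired by `seq`."""
    prev = "<s>"
    for i, t in enumerate(seq):
        for f in features(words, i, prev, t):
            weights[f] = weights.get(f, 0.0) + delta
        prev = t

def train(data, tags, epochs=10, margin=1.0):
    weights = {}
    for _ in range(epochs):
        for words, gold in data:
            pred = viterbi(weights, words, tags, gold=gold, margin=margin)
            if pred != gold:
                update(weights, words, gold, +1.0)  # promote gold features
                update(weights, words, pred, -1.0)  # demote violator features
    return weights

# Hypothetical toy data with a made-up three-tag set.
tags = ["ART", "N", "V"]
data = [
    (["o", "gato", "dorme"], ["ART", "N", "V"]),
    (["a", "menina", "corre"], ["ART", "N", "V"]),
    (["o", "menino", "dorme"], ["ART", "N", "V"]),
]
weights = train(data, tags)
print(viterbi(weights, ["a", "gato", "corre"], tags))
```

Calling `viterbi` without `gold` gives ordinary prediction, while passing `gold` and a positive `margin` during training is what makes the criterion a large margin one in this sketch. A practical tagger would typically also average the weights over updates and use a richer feature set.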

Original language	English
Title of host publication	BRACIS 2014 : 2014 Brazilian Conference on Intelligent Systems ; 19-23 October 2014, São Carlos, São Paulo, Brazil ; proceedings
Number of pages	6
Place of Publication	Piscataway
Publisher	Institute of Electrical and Electronics Engineers Inc.
Publication date	12.12.2014
Pages	25-30
Article number	6984802
ISBN (print)	978-1-4799-7859-5
ISBN (electronic)	978-1-4799-5618-0
DOIs	https://doi.org/10.1109/BRACIS.2014.16
Publication status	Published - 12.12.2014
Externally published	Yes
Event	Brazilian Conference on Intelligent Systems - BRACIS 2014, São Carlos, São Paulo, Brazil; Duration: 18.10.2014 → 23.10.2014; Conference number: 3; https://ieeexplore.ieee.org/xpl/conhome/6979382/proceeding

Research areas

Machine Learning, Natural Language Processing, POS Tagging, Structure Learning
Informatics
Business informatics

Other publications by the same author(s)

Data practices in apps from Brazil: What do privacy policies inform us about?

Quadros dos Reis, V., Rabello, M. E. R., Lima, A. C., Jardim, G. P. S., Fernandes, E. R. & Brefeld, U., 10.02.2023, In: Journal on Interactive Systems. 14, 1, p. 1-8, 8 p.

Research output: Journal contributions › Journal articles › Research › peer-review

Entity Extraction from Portuguese Legal Documents Using Distant Supervision

Navarezi, L. M., Sakiyama, K., Rodrigues, L. S., Robaldo, C. M. O., Lobato, G. R., Vilela, P. A., Matsubara, E. T. & Fernandes, E. R., 2022, Computational Processing of the Portuguese Language: 15th International Conference, PROPOR 2022, Fortaleza, Brazil, March 21-23, 2022, Proceedings. Pinheiro, V., Gamallo, P., Amaro, R., Scarton, C., Batista, F., Silva, D., Magro, C. & Pinto, H. (eds.). Cham: Springer Nature Switzerland AG, p. 166-176, 11 p. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); vol. 13208 LNAI).

Research output: Contributions to collected editions/works › Article in conference proceedings › Research › peer-review

FaST: A linear time stack trace alignment heuristic for crash report deduplication

Rodrigues, I. M., Aloise, D. & Fernandes, E. R., 17.10.2022, The 2022 Mining Software Repositories Conference: MSR 2022, Proceedings; 18-20 May 2022, Virtual; 23-24 May 2022, Pittsburgh, Pennsylvania. New York: Institute of Electrical and Electronics Engineers Inc., p. 549-560, 12 p. (Proceedings - IEEE/ACM International Conference on Mining Software Repositories).

Research output: Contributions to collected editions/works › Article in conference proceedings › Research › peer-review

Performance predictors for graphics processing units applied to dark-silicon-aware design space exploration

Sonohata, R., Arigoni, D. C. A., Fernandes, E. R., Ribeiro dos Santos, R. & Dessandre Duenha, L., 01.08.2023, In: Concurrency and Computation: Practice and Experience. 35, 17, 16 p., e6877.

Research output: Journal contributions › Journal articles › Research › peer-review

TraceSim: An Alignment Method for Computing Stack Trace Similarity

Rodrigues, I. M., Khvorov, A., Aloise, D., Vasiliev, R., Koznov, D., Fernandes, E. R., Chernishev, G., Luciv, D. & Povarov, N., 01.03.2022, In: Empirical Software Engineering. 27, 2, 41 p., 53.

Research output: Journal contributions › Journal articles › Research › peer-review

DOI

https://doi.org/10.1109/BRACIS.2014.16
Final published version
