Portuguese part-of-speech tagging with large margin structure learning

Research output: Contributions to collected editions/works › Article in conference proceedings › Research › peer-review

Part-of-Speech Tagging is a fundamental task in many Natural Language Processing systems. It consists of identifying the syntactic category, i.e. the part of speech, of each word in a sentence. Although the current state-of-the-art accuracy for this task is around 97%, any improvement has an immediate impact on more complex tasks such as Parsing, Semantic Role Labeling and Information Extraction, so the task remains worth exploring. In this paper, we introduce a part-of-speech tagger based on the Structure Learning framework that reduces the smallest known error on the Portuguese Mac-Morpho corpus by 7.8%. We also apply our tagger to a recently revised version of Mac-Morpho. Our system's accuracy on this latter version is competitive with a semi-supervised Neural Network trained on Mac-Morpho plus a very large non-annotated corpus. Additionally, our system is simpler than previous systems and uses a very limited feature set. Our system employs a Large Margin training criterion to derive a structure predictor that is more robust on unseen data.
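To make the idea of a large margin training criterion for a structure predictor more concrete, here is a minimal sketch of one common instance: a structured perceptron with loss-augmented (margin-rescaled) Viterbi decoding. It is not the authors' system. This record does not give the algorithm or the feature set, so the feature templates, the function names (`features`, `decode`, `train`), the three-tag toy tagset and the two toy sentences are all assumptions made for illustration.

```python
from collections import defaultdict

START = "<s>"  # hypothetical start-of-sentence pseudo-tag


def features(words, i, prev_tag, tag):
    """Assumed, deliberately tiny feature set: word, 3-char suffix, tag bigram."""
    w = words[i].lower()
    return [("word", w, tag), ("suf3", w[-3:], tag), ("trans", prev_tag, tag)]


def decode(words, tags, weights, gold=None, margin=0.0):
    """First-order Viterbi decoding.

    When `gold` is given, every tag that disagrees with the gold tag gets a
    bonus of `margin` (loss-augmented decoding with Hamming loss).
    """
    if not words:
        return []
    best = []  # best[i][t] = (score of best path ending in t at i, previous tag)
    for i in range(len(words)):
        column = {}
        for t in tags:
            cost = margin if gold is not None and t != gold[i] else 0.0
            candidates = []
            for p in ([START] if i == 0 else tags):
                s = sum(weights.get(f, 0.0) for f in features(words, i, p, t)) + cost
                if i > 0:
                    s += best[i - 1][p][0]
                candidates.append((s, p))
            column[t] = max(candidates, key=lambda c: c[0])
        best.append(column)
    # Follow back-pointers from the best final tag.
    tag = max(tags, key=lambda t: best[-1][t][0])
    path = [tag]
    for i in range(len(words) - 1, 0, -1):
        tag = best[i][tag][1]
        path.append(tag)
    return path[::-1]


def train(data, tags, epochs=200, margin=1.0):
    """Perceptron updates against the loss-augmented prediction."""
    weights = defaultdict(float)
    for _ in range(epochs):
        updated = False
        for words, gold in data:
            pred = decode(words, tags, weights, gold=gold, margin=margin)
            if pred == list(gold):
                continue
            updated = True
            prev_gold = prev_pred = START
            for i in range(len(words)):
                # Reward features of the gold sequence, penalize the prediction's.
                for f in features(words, i, prev_gold, gold[i]):
                    weights[f] += 1.0
                for f in features(words, i, prev_pred, pred[i]):
                    weights[f] -= 1.0
                prev_gold, prev_pred = gold[i], pred[i]
        if not updated:  # gold wins by the required margin on every sentence
            break
    return weights


# Toy data for illustration only; not taken from Mac-Morpho.
TAGS = ["ART", "N", "V"]
DATA = [
    ("O gato dorme".split(), ["ART", "N", "V"]),
    ("A casa cai".split(), ["ART", "N", "V"]),
]
weights = train(DATA, TAGS)
print(decode("O gato dorme".split(), TAGS, weights))
```

Because training stops only once the gold sequence outscores every alternative by at least `margin` per wrong tag, the learned weights separate correct from incorrect taggings with some slack, which is the intuition behind the robustness claim in the abstract. A real tagger would add weight averaging and a richer feature set.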

Original language: English
Title of host publication: BRACIS 2014 : 2014 Brazilian Conference on Intelligent Systems ; 19-23 October 2014, São Carlos, São Paulo, Brazil ; proceedings
Number of pages: 6
Place of Publication: Piscataway
Publisher: Institute of Electrical and Electronics Engineers Inc.
Publication date: 12.12.2014
Pages: 25-30
Article number: 6984802
ISBN (print): 978-1-4799-7859-5
ISBN (electronic): 978-1-4799-5618-0
Publication status: Published - 12.12.2014
Externally published: Yes
Event: Brazilian Conference on Intelligent Systems - BRACIS 2014 - Sao Carlos, Sao Paulo, Brazil
Duration: 18.10.2014 to 23.10.2014
Conference number: 3
https://ieeexplore.ieee.org/xpl/conhome/6979382/proceeding