Portuguese part-of-speech tagging with large margin structure learning

Research output: Contributions to collected editions/works › Article in conference proceedings › Research › peer-review

Authors

Part-of-Speech Tagging is a fundamental task in many Natural Language Processing systems. It consists of identifying the syntactic category, i.e. the part of speech, of each word in a sentence. Although the current state-of-the-art accuracy for this task is around 97%, any improvement has an immediate impact on more complex tasks such as Parsing, Semantic Role Labeling and Information Extraction, so the task remains worth exploring. In this paper, we introduce a part-of-speech tagger based on the Structure Learning framework that reduces the smallest known error on the Portuguese Mac-Morpho corpus by 7.8%. We also apply our tagger to a recently revised version of Mac-Morpho. On this revised version, our system's accuracy is competitive with a semi-supervised Neural Network trained on Mac-Morpho plus a very large non-annotated corpus. Additionally, our system is simpler than previous systems and uses a very limited feature set. It employs a Large Margin training criterion to derive a structure predictor that is more robust on unseen data.
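
To make the idea of large margin training for a structure predictor concrete, here is a minimal sketch of one well-known variant: a structured perceptron with loss-augmented Viterbi decoding. It is an illustration only and should not be read as the tagger described in the abstract, which does not specify its algorithm or features here. The three feature templates, the `START` marker, the toy sentences and the tag names are all assumptions made for the example.

```python
from collections import defaultdict

START = "<s>"  # hypothetical start-of-sentence marker

def features(words, i, prev_tag, tag):
    """Hypothetical feature set: word identity, 2-char suffix, tag bigram."""
    w = words[i].lower()
    return [f"w={w}|t={tag}", f"suf={w[-2:]}|t={tag}", f"prev={prev_tag}|t={tag}"]

def local_score(weights, words, i, prev_tag, tag):
    return sum(weights.get(f, 0.0) for f in features(words, i, prev_tag, tag))

def viterbi(weights, words, tags, gold=None, margin=0.0):
    """Highest-scoring tag sequence under a first-order model.

    If `gold` is given, every wrong tag earns an extra `margin`
    (loss-augmented decoding with a Hamming cost), so the gold sequence
    only wins when it beats its competitors by that margin.
    """
    n = len(words)
    best = [{} for _ in range(n)]  # best[i][tag] = (score, previous tag)
    for i in range(n):
        for t in tags:
            cost = margin if gold is not None and t != gold[i] else 0.0
            if i == 0:
                best[i][t] = (local_score(weights, words, 0, START, t) + cost, None)
            else:
                best[i][t] = max(
                    (best[i - 1][p][0] + local_score(weights, words, i, p, t) + cost, p)
                    for p in tags
                )
    tag = max(tags, key=lambda t: best[n - 1][t][0])
    path = [tag]
    for i in range(n - 1, 0, -1):  # follow back-pointers
        tag = best[i][tag][1]
        path.append(tag)
    return path[::-1]

def train(data, tags, epochs=10, margin=1.0):
    """Perceptron updates against the loss-augmented prediction."""
    weights = defaultdict(float)
    for _ in range(epochs):
        mistakes = 0
        for words, gold in data:
            pred = viterbi(weights, words, tags, gold=gold, margin=margin)
            if pred == gold:
                continue
            mistakes += 1
            prev_gold, prev_pred = START, START
            for i in range(len(words)):
                for f in features(words, i, prev_gold, gold[i]):
                    weights[f] += 1.0  # reward features of the gold sequence
                for f in features(words, i, prev_pred, pred[i]):
                    weights[f] -= 1.0  # penalise features of the violator
                prev_gold, prev_pred = gold[i], pred[i]
        if mistakes == 0:  # every sentence is separated by the margin
            break
    return weights

# Toy data, invented for this sketch (not taken from Mac-Morpho).
tags = ["ART", "N", "V"]
data = [
    (["o", "gato", "dorme"], ["ART", "N", "V"]),
    (["a", "menina", "corre"], ["ART", "N", "V"]),
    (["o", "menino", "dorme"], ["ART", "N", "V"]),
]
weights = train(data, tags, epochs=200)
print(viterbi(weights, ["a", "menina", "dorme"], tags))
```

Training stops as soon as an epoch produces no margin violations, so `epochs` acts only as an upper bound. At prediction time `gold` is omitted and the decoder reduces to ordinary Viterbi search. The extra `margin` is used during training only, which is what is meant to make the learned predictor less sensitive to small score changes on unseen sentences.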

Original language: English
Title of host publication: BRACIS 2014 : 2014 Brazilian Conference on Intelligent Systems ; 19-23 October 2014, São Carlos, São Paulo, Brazil ; proceedings
Number of pages: 6
Place of Publication: Piscataway
Publisher: Institute of Electrical and Electronics Engineers Inc.
Publication date: 12.12.2014
Pages: 25-30
Article number: 6984802
ISBN (print): 978-1-4799-7859-5
ISBN (electronic): 978-1-4799-5618-0
Publication status: Published - 12.12.2014
Externally published: Yes
Event: Brazilian Conference on Intelligent Systems - BRACIS 2014 - Sao Carlos, Sao Paulo, Brazil
Duration: 18.10.2014 → 23.10.2014
Conference number: 3
https://ieeexplore.ieee.org/xpl/conhome/6979382/proceeding
