Portuguese part-of-speech tagging with large margin structure learning

Publication: Contributions to collected works › Articles in conference proceedings › Research › Peer-reviewed

Part-of-Speech Tagging is a fundamental task in many Natural Language Processing systems. It consists of identifying the syntactic category, i.e. the part of speech, of each word in a sentence. Although the current state-of-the-art accuracy for this task is around 97%, any improvement has an immediate impact on more complex tasks such as Parsing, Semantic Role Labeling and Information Extraction, so the task remains worth exploring. In this paper, we introduce a part-of-speech tagger based on the Structure Learning framework that reduces the smallest known error on the Portuguese Mac-Morpho corpus by 7.8%. We also apply our tagger to a recently revised version of Mac-Morpho. On this revised version, our system's accuracy is competitive with that of a semi-supervised Neural Network trained on Mac-Morpho plus a very large unannotated corpus. Additionally, our system is simpler than previous systems and uses a very limited feature set. It employs a Large Margin training criterion to derive a structure predictor that is more robust on unseen data.
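
The abstract does not spell out the training procedure, so the following is only a minimal sketch of one common way to realize a large-margin criterion for sequence tagging: a structured perceptron whose Viterbi decoder is loss-augmented during training, so that every wrong tag gets a score bonus and the gold sequence has to win by a margin. It is not the authors' implementation. The feature templates (word/tag emissions and tag/tag transitions), the `margin` parameter, the function names, the tag labels and the toy sentences are all assumptions made for illustration.

```python
from collections import defaultdict

def features(words, seq):
    """Count emission and transition features of a tagged sentence (assumed templates)."""
    feats = defaultdict(float)
    prev = "<s>"  # hypothetical start-of-sentence marker
    for word, tag in zip(words, seq):
        feats[("emit", word, tag)] += 1.0
        feats[("trans", prev, tag)] += 1.0
        prev = tag
    return feats

def viterbi(words, tags, w, gold=None, margin=1.0):
    """Return the highest-scoring tag sequence under weights w.

    If gold is given, decoding is loss-augmented: each tag that differs
    from the gold tag at its position adds `margin` to the score.
    """
    def local(i, prev, tag):
        score = w[("emit", words[i], tag)] + w[("trans", prev, tag)]
        if gold is not None and tag != gold[i]:
            score += margin
        return score

    # best[i][tag] = (score of best path ending in tag at position i, previous tag)
    best = [{t: (local(0, "<s>", t), None) for t in tags}]
    for i in range(1, len(words)):
        best.append({
            t: max((best[i - 1][p][0] + local(i, p, t), p) for p in tags)
            for t in tags
        })
    tag = max(tags, key=lambda t: best[-1][t][0])
    path = [tag]
    for i in range(len(words) - 1, 0, -1):  # follow backpointers
        tag = best[i][tag][1]
        path.append(tag)
    return path[::-1]

def train(data, tags, epochs=10, margin=1.0):
    """Margin-based structured perceptron: update whenever the
    loss-augmented prediction differs from the gold sequence."""
    w = defaultdict(float)
    for _ in range(epochs):
        for words, gold in data:
            pred = viterbi(words, tags, w, gold=gold, margin=margin)
            if pred != gold:
                for f, v in features(words, gold).items():
                    w[f] += v
                for f, v in features(words, pred).items():
                    w[f] -= v
    return w

# Toy data, invented for illustration only (not taken from Mac-Morpho).
TAGS = ["ART", "N", "V"]
DATA = [
    (["o", "gato", "dorme"], ["ART", "N", "V"]),
    (["a", "menina", "canta"], ["ART", "N", "V"]),
]
weights = train(DATA, TAGS)
print(viterbi(["o", "gato", "canta"], TAGS, weights))
```

At test time the decoder is called without `gold`, so the margin term disappears and only the learned weights decide. Requiring the gold sequence to beat every alternative by a margin during training, and not merely to rank first, is one usual motivation for large-margin training when robustness on unseen data is the goal.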

Original language: English
Title of host publication: BRACIS 2014 : 2014 Brazilian Conference on Intelligent Systems ; 19-23 October 2014, São Carlos, São Paulo, Brazil ; proceedings
Number of pages: 6
Place of publication: Piscataway
Publisher: Institute of Electrical and Electronics Engineers Inc.
Publication date: 12.12.2014
Pages: 25-30
Article number: 6984802
ISBN (print): 978-1-4799-7859-5
ISBN (electronic): 978-1-4799-5618-0
Publication status: Published - 12.12.2014
Externally published: Yes
Event: Brazilian Conference on Intelligent Systems - BRACIS 2014 - Sao Carlos, Sao Paulo, Brazil
Duration: 18.10.2014 to 23.10.2014
Conference number: 3
https://ieeexplore.ieee.org/xpl/conhome/6979382/proceeding
