Portuguese part-of-speech tagging with large margin structure learning

Research output: Contributions to collected editions/works › Article in conference proceedings › Research › peer-review

Part-of-speech tagging is a fundamental task in many Natural Language Processing systems. The task consists of identifying the syntactic category, i.e. the part of speech, of each word in a sentence. Although the current state-of-the-art accuracy for this task is around 97%, any improvement has an immediate impact on more complex tasks such as parsing, semantic role labeling, and information extraction, so the task remains worth exploring. In this paper, we introduce a part-of-speech tagger based on the Structure Learning framework that reduces the smallest known error on the Portuguese Mac-Morpho corpus by 7.8%. We also apply our tagger to a recently revised version of Mac-Morpho. On this revised version, our system's accuracy is competitive with a semi-supervised neural network trained on Mac-Morpho plus a very large non-annotated corpus. Additionally, our system is simpler than previous systems and uses a very limited feature set. It employs a large margin training criterion to derive a structure predictor that is more robust on unseen data.
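
To make the idea of large margin structure learning more concrete, the following is a minimal sketch and not the authors' system. It assumes a structured perceptron with loss-augmented Viterbi decoding, which is one common way to impose a margin on a sequence predictor. The feature templates (word/tag and previous-tag/tag), the toy sentences, the tag names, and all function names are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

START = "<s>"  # hypothetical start-of-sentence symbol for transition features


def features(words, tag_seq):
    """Count emission (word, tag) and transition (prev_tag, tag) features."""
    feats = defaultdict(float)
    prev = START
    for word, tag in zip(words, tag_seq):
        feats[("emit", word, tag)] += 1.0
        feats[("trans", prev, tag)] += 1.0
        prev = tag
    return feats


def viterbi(words, tags, w, gold=None, margin=0.0):
    """Return the highest-scoring tag sequence under weights w.

    If gold is given, every position whose tag differs from the gold tag
    receives an extra `margin` (loss-augmented decoding with Hamming loss).
    """
    n = len(words)
    score = [{} for _ in range(n)]  # score[i][t]: best prefix score ending in t
    back = [{} for _ in range(n)]   # back[i][t]: previous tag on that best path
    for i, word in enumerate(words):
        for t in tags:
            local = w.get(("emit", word, t), 0.0)
            if gold is not None and t != gold[i]:
                local += margin
            if i == 0:
                score[i][t] = local + w.get(("trans", START, t), 0.0)
                back[i][t] = None
            else:
                prev = max(
                    tags,
                    key=lambda p: score[i - 1][p] + w.get(("trans", p, t), 0.0),
                )
                score[i][t] = (
                    score[i - 1][prev] + w.get(("trans", prev, t), 0.0) + local
                )
                back[i][t] = prev
    last = max(tags, key=lambda t: score[n - 1][t])
    path = [last]
    for i in range(n - 1, 0, -1):
        path.append(back[i][path[-1]])
    return path[::-1]


def train(data, tags, epochs=10, margin=1.0):
    """Perceptron updates against the loss-augmented prediction."""
    w = defaultdict(float)
    for _ in range(epochs):
        for words, gold in data:
            pred = viterbi(words, tags, w, gold=gold, margin=margin)
            if pred != gold:
                # Move weights toward the gold structure, away from the violator.
                for k, v in features(words, gold).items():
                    w[k] += v
                for k, v in features(words, pred).items():
                    w[k] -= v
    return w


# Toy data (hypothetical): two Portuguese sentences with article/noun/verb tags.
TAGS = ["ART", "N", "V"]
DATA = [
    (["o", "gato", "dorme"], ["ART", "N", "V"]),
    (["a", "casa", "cai"], ["ART", "N", "V"]),
]

weights = train(DATA, TAGS)
print(viterbi(["o", "gato", "dorme"], TAGS, weights))
```

During training, the decoder favors wrong tags by the margin, so an update is triggered whenever the gold sequence does not beat every competitor by at least its Hamming loss. At prediction time, `viterbi` is called without `gold`, so no margin is added.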

Original language: English
Title of host publication: BRACIS 2014 : 2014 Brazilian Conference on Intelligent Systems ; 19-23 October 2014, São Carlos, São Paulo, Brazil ; proceedings
Number of pages: 6
Place of Publication: Piscataway
Publisher: Institute of Electrical and Electronics Engineers Inc.
Publication date: 12.12.2014
Pages: 25-30
Article number: 6984802
ISBN (print): 978-1-4799-7859-5
ISBN (electronic): 978-1-4799-5618-0
Publication status: Published - 12.12.2014
Externally published: Yes
Event: Brazilian Conference on Intelligent Systems - BRACIS 2014 - Sao Carlos, Sao Paulo, Brazil
Duration: 18.10.2014 - 23.10.2014
Conference number: 3
https://ieeexplore.ieee.org/xpl/conhome/6979382/proceeding
