Portuguese part-of-speech tagging with large margin structure learning

Publication: Contributions to collected editions › Articles in conference proceedings › Research › peer-reviewed

Part-of-Speech Tagging is a fundamental task in many Natural Language Processing systems. It consists of identifying the syntactic category, i.e. the part of speech, of each word in a sentence. Although the current state-of-the-art accuracy for this task is around 97%, any improvement has an immediate impact on more complex tasks such as Parsing, Semantic Role Labeling and Information Extraction, so the task remains worth exploring. In this paper, we introduce a part-of-speech tagger based on the Structure Learning framework that reduces the smallest known error on the Portuguese Mac-Morpho corpus by 7.8%. We also apply our tagger to a recently revised version of Mac-Morpho, where its accuracy is competitive with a semi-supervised Neural Network trained on Mac-Morpho plus a very large non-annotated corpus. Additionally, our system is simpler than previous systems and uses a very limited feature set. It employs a Large Margin training criterion to derive a structure predictor that is more robust on unseen data.
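
The abstract does not spell out the training procedure, so the following is only a minimal sketch of one common way to realize large margin structure learning for tagging: a structured perceptron whose Viterbi decoder is loss-augmented during training. It is not the paper's system. The tag set, the two feature templates (word/tag emissions and tag/tag transitions), the toy sentences, and the names `features`, `viterbi`, `train` and `DATA` are all assumptions made for illustration.

```python
from collections import defaultdict

TAGS = ["ART", "N", "V"]  # hypothetical toy tag set, not the Mac-Morpho tag set


def features(words, tags):
    """Count emission (word, tag) and transition (prev_tag, tag) features."""
    feats = defaultdict(float)
    prev = "<s>"
    for word, tag in zip(words, tags):
        feats[("emit", word, tag)] += 1.0
        feats[("trans", prev, tag)] += 1.0
        prev = tag
    return feats


def viterbi(words, weights, gold=None, margin=1.0):
    """Return the highest-scoring tag sequence under a first-order model.

    If `gold` is given, every tag that differs from the gold tag receives a
    bonus of `margin` (Hamming loss), i.e. loss-augmented decoding.
    """
    best = {"<s>": (0.0, [])}  # last tag -> (score, best path ending in it)
    for i, word in enumerate(words):
        new_best = {}
        for tag in TAGS:
            local = weights[("emit", word, tag)]
            if gold is not None and tag != gold[i]:
                local += margin
            score, path = max(
                ((s + weights[("trans", prev, tag)], p)
                 for prev, (s, p) in best.items()),
                key=lambda item: item[0],
            )
            new_best[tag] = (score + local, path + [tag])
        best = new_best
    return max(best.values(), key=lambda item: item[0])[1]


def train(data, max_epochs=200, margin=1.0):
    """Perceptron updates driven by loss-augmented predictions."""
    weights = defaultdict(float)
    for _ in range(max_epochs):
        mistakes = 0
        for words, gold in data:
            pred = viterbi(words, weights, gold=gold, margin=margin)
            if pred != gold:
                mistakes += 1
                for key, count in features(words, gold).items():
                    weights[key] += count
                for key, count in features(words, pred).items():
                    weights[key] -= count
        if mistakes == 0:  # gold beats every alternative by its loss
            break
    return weights


# Hypothetical toy training sentences (illustration only).
DATA = [
    (["o", "gato", "dorme"], ["ART", "N", "V"]),
    (["a", "menina", "canta"], ["ART", "N", "V"]),
    (["canta", "a", "menina"], ["V", "ART", "N"]),
]

weights = train(DATA)
for words, gold in DATA:
    print(words, viterbi(words, weights))
```

Because wrong tags are rewarded during training, an update is triggered whenever the gold sequence fails to outscore an alternative by at least that alternative's Hamming loss. This is the sense in which the criterion enforces a margin instead of mere correctness on the training data.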

Original language: English
Title: BRACIS 2014 : 2014 Brazilian Conference on Intelligent Systems ; 19-23 October 2014, São Carlos, São Paulo, Brazil ; proceedings
Number of pages: 6
Place of publication: Piscataway
Publisher: Institute of Electrical and Electronics Engineers Inc.
Publication date: 12.12.2014
Pages: 25-30
Article number: 6984802
ISBN (print): 978-1-4799-7859-5
ISBN (electronic): 978-1-4799-5618-0
Publication status: Published - 12.12.2014
Externally published: Yes
Event: Brazilian Conference on Intelligent Systems - BRACIS 2014 - Sao Carlos, Sao Paulo, Brazil
Duration: 18.10.2014 to 23.10.2014
Conference number: 3
https://ieeexplore.ieee.org/xpl/conhome/6979382/proceeding