Entropy-guided feature generation for structured learning of Portuguese dependency parsing

Publication: Contributions to collected editions, Article in conference proceedings, Research, peer-reviewed

Feature generation is a difficult, yet highly necessary, subtask of machine learning modeling. Usually, it is partially solved by a domain expert who generates complex and discriminative feature templates by conjoining the available basic features. This is a limited and expensive way to obtain feature templates and is recognized as a modeling bottleneck. In this work, we propose an automatic method to generate feature templates for structured learning algorithms. The method receives as input the training dataset with basic features and produces a set of feature templates by conjoining basic features that are highly discriminative together. We call this method entropy-guided since it is based on the conditional entropy of local decision variables given the feature values. We illustrate our approach on the Portuguese dependency parsing task and report on experiments with the Bosque corpus. We show that the entropy-guided templates outperform the manually built templates used by MSTParser, which was the best-performing system on the Bosque corpus until now. Furthermore, our approach allows an effortless inclusion of two new basic features that automatically generate additional templates. As a result, our system achieves a per-token accuracy of 92.66%, which represents a reduction of more than 15% in the previous lowest error rate for Portuguese dependency parsing.
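The quantity the abstract refers to, the conditional entropy of a local decision variable given feature values, can be illustrated with a minimal sketch. This is not the authors' algorithm: it only estimates the entropy from counts on a toy dataset, and the names `conditional_entropy`, `head_pos`, `mod_pos`, and `label` are hypothetical choices made for the illustration. The toy rows are constructed so that neither basic feature is informative alone, while their conjunction determines the decision completely.

```python
import math
from collections import Counter, defaultdict


def conditional_entropy(rows, feature_names, label_key="label"):
    """Estimate H(label | features) in bits from empirical counts."""
    # Group the label counts by the joint value of the chosen features.
    groups = defaultdict(Counter)
    for row in rows:
        key = tuple(row[name] for name in feature_names)
        groups[key][row[label_key]] += 1

    total = len(rows)
    entropy = 0.0
    for label_counts in groups.values():
        group_size = sum(label_counts.values())
        for count in label_counts.values():
            p = count / group_size
            # Weight each group's entropy by the group's relative frequency.
            entropy -= (group_size / total) * p * math.log2(p)
    return entropy


# Toy data (invented for illustration): label 1 means "an arc is present".
rows = [
    {"head_pos": "V", "mod_pos": "N", "label": 1},
    {"head_pos": "V", "mod_pos": "V", "label": 0},
    {"head_pos": "N", "mod_pos": "N", "label": 0},
    {"head_pos": "N", "mod_pos": "V", "label": 1},
]

h_head = conditional_entropy(rows, ["head_pos"])
h_mod = conditional_entropy(rows, ["mod_pos"])
h_both = conditional_entropy(rows, ["head_pos", "mod_pos"])
print(h_head, h_mod, h_both)
```

In this toy case each single feature leaves the decision as uncertain as before, while the conjunction removes the uncertainty, which is the kind of signal an entropy-guided procedure could use to prefer a conjoined template over its parts.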

Original language: English
Title: Computational Processing of the Portuguese Language: 10th International Conference, PROPOR 2012, Coimbra, Portugal, April 17-20, 2012. Proceedings
Editors: Helena Caseli, Aline Villavicencio, Antonio Teixeira, Fernando Perdigao
Number of pages: 11
Place of publication: Berlin, Heidelberg
Publisher: Springer Verlag
Publication date: 2012
Pages: 146-156
ISBN (print): 978-3-642-28884-5
ISBN (electronic): 978-3-642-28885-2
Publication status: Published - 2012
Externally published: Yes
Event: International Conference on Computational Processing of Portuguese - Coimbra, Portugal
Duration: 17.04.2012 to 20.04.2012
Conference number: 10
https://aclweb.org/portal/content/10th-international-conference-computational-processing-portuguese-propor-2012
