Entropy-guided feature generation for structured learning of Portuguese dependency parsing

Publication: Contributions to collected editions, Article in conference proceedings, Research, peer-reviewed

Authors

Feature generation is a difficult, yet highly necessary, subtask of machine learning modeling. Usually, it is partially solved by a domain expert who generates complex and discriminative feature templates by conjoining the available basic features. This is a limited and expensive way to obtain feature templates and is recognized as a modeling bottleneck. In this work, we propose an automatic method to generate feature templates for structured learning algorithms. The method receives as input the training dataset with basic features and produces a set of feature templates by conjoining basic features that are highly discriminative together. We call this method entropy-guided since it is based on the conditional entropy of local decision variables given the feature values. We illustrate our approach on the Portuguese dependency parsing task and report on experiments with the Bosque corpus. We show that the entropy-guided templates outperform the manually built templates used by MSTParser, which was until now the best-performing system on the Bosque corpus. Furthermore, our approach allows an effortless inclusion of two new basic features that automatically generate additional templates. As a result, our system achieves a per-token accuracy of 92.66%, which represents a reduction of more than 15% in the previous lowest error rate for Portuguese dependency parsing.
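The abstract bases the method on the conditional entropy of a local decision variable given feature values. As a minimal sketch of that quantity only, and not the authors' actual generation procedure, the following Python scores conjunctions of basic features by H(Y | X) on a toy dataset. The feature names `head_pos` and `mod_pos`, the label `is_arc`, the data, and the helper `rank_templates` are all hypothetical and chosen for illustration.

```python
from collections import Counter
from itertools import combinations
from math import log2


def conditional_entropy(rows, feature_names, label_key):
    """Return H(Y | X) in bits, where X is the conjunction of the given features."""
    joint = Counter()      # counts of (feature-value tuple, label)
    marginal = Counter()   # counts of feature-value tuple alone
    for row in rows:
        x = tuple(row[name] for name in feature_names)
        joint[(x, row[label_key])] += 1
        marginal[x] += 1
    total = len(rows)
    entropy = 0.0
    for (x, _label), count in joint.items():
        # p(x, y) * log2 p(y | x), summed with a minus sign
        entropy -= (count / total) * log2(count / marginal[x])
    return entropy


def rank_templates(rows, basic_features, label_key, max_size=2):
    """Score every conjunction of up to max_size basic features (hypothetical helper)."""
    scored = []
    for size in range(1, max_size + 1):
        for template in combinations(basic_features, size):
            scored.append((conditional_entropy(rows, template, label_key), template))
    return sorted(scored)  # lowest conditional entropy first


# Toy candidate arcs (invented data): neither feature alone predicts the label,
# but their conjunction determines it completely.
ROWS = [
    {"head_pos": "V", "mod_pos": "N", "is_arc": 1},
    {"head_pos": "V", "mod_pos": "V", "is_arc": 0},
    {"head_pos": "N", "mod_pos": "N", "is_arc": 0},
    {"head_pos": "N", "mod_pos": "V", "is_arc": 1},
]

if __name__ == "__main__":
    for score, template in rank_templates(ROWS, ["head_pos", "mod_pos"], "is_arc"):
        print(f"{score:.2f} bits  {' & '.join(template)}")
```

On this toy data each single feature leaves 1 bit of uncertainty about the label, while the conjunction leaves none, which is the sense in which features can be "highly discriminative together". Note that conditional entropy never increases as features are added, so a practical procedure needs some additional control on template size; how the paper handles this is not stated in the abstract.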

Original language: English
Title: Computational Processing of the Portuguese Language: 10th International Conference, PROPOR 2012, Coimbra, Portugal, April 17-20, 2012. Proceedings
Editors: Helena Caseli, Aline Villavicencio, Antonio Teixeira, Fernando Perdigao
Number of pages: 11
Place of publication: Berlin, Heidelberg
Publisher: Springer Verlag
Publication date: 2012
Pages: 146-156
ISBN (print): 978-3-642-28884-5
ISBN (electronic): 978-3-642-28885-2
Publication status: Published - 2012
Externally published: Yes
Event: International Conference on Computational Processing of Portuguese - Coimbra, Portugal
Duration: 17.04.2012 to 20.04.2012
Conference number: 10
https://aclweb.org/portal/content/10th-international-conference-computational-processing-portuguese-propor-2012
