Entropy-guided feature generation for structured learning of Portuguese dependency parsing

Publication: Contributions to collected works › Articles in conference proceedings › Research › Peer-reviewed

Feature generation is a difficult, yet highly necessary, subtask of machine learning modeling. Usually, it is partially solved by a domain expert who generates complex and discriminative feature templates by conjoining the available basic features. This is a limited and expensive way to obtain feature templates and is recognized as a modeling bottleneck. In this work, we propose an automatic method to generate feature templates for structured learning algorithms. The method receives as input the training dataset with basic features and produces a set of feature templates by conjoining basic features that are highly discriminative together. We call this method entropy-guided because it is based on the conditional entropy of local decision variables given the feature values. We illustrate our approach on the Portuguese dependency parsing task and report on experiments with the Bosque corpus. We show that the entropy-guided templates outperform the manually built templates used by MSTParser, until now the best-performing system on the Bosque corpus. Furthermore, our approach allows the effortless inclusion of two new basic features, from which additional templates are generated automatically. As a result, our system achieves a per-token accuracy of 92.66%, which represents a reduction of more than 15% in the previous lowest error rate for Portuguese dependency parsing.
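
The criterion named in the abstract, the conditional entropy of a local decision variable given feature values, can be illustrated with a small sketch. The abstract does not describe the actual generation procedure, so the code below is not the authors' algorithm: it simply ranks every conjunction of up to two basic features by H(Y | X), with lower entropy meaning a more discriminative template. The function names, the feature names `head_pos` and `dep_pos`, and the toy rows are all hypothetical.

```python
from collections import Counter
from itertools import combinations
from math import log2


def conditional_entropy(rows, feature_names, label_key="label"):
    """Return H(Y | X) in bits, where X is the conjunction of the named features."""
    joint = Counter()     # counts of (x, y) pairs
    marginal = Counter()  # counts of x alone
    for row in rows:
        x = tuple(row[name] for name in feature_names)
        joint[(x, row[label_key])] += 1
        marginal[x] += 1
    total = len(rows)
    # H(Y|X) = -sum_{x,y} p(x,y) * log2 p(y|x)
    return -sum(
        count / total * log2(count / marginal[x])
        for (x, _y), count in joint.items()
    )


def rank_templates(rows, basic_features, max_size=2, label_key="label"):
    """Sort candidate conjunctions by conditional entropy (lowest first).

    Exhaustive enumeration is an assumption made for this sketch only.
    """
    candidates = [
        combo
        for size in range(1, max_size + 1)
        for combo in combinations(basic_features, size)
    ]
    return sorted(
        candidates,
        key=lambda combo: conditional_entropy(rows, combo, label_key),
    )


# Hypothetical toy data: the label depends on the two features jointly,
# so neither feature is informative on its own.
rows = [
    {"head_pos": "V", "dep_pos": "N", "label": 1},
    {"head_pos": "V", "dep_pos": "V", "label": 0},
    {"head_pos": "N", "dep_pos": "N", "label": 0},
    {"head_pos": "N", "dep_pos": "V", "label": 1},
]

ranking = rank_templates(rows, ["head_pos", "dep_pos"])
```

On this toy data each single feature leaves one full bit of uncertainty about the label, while the conjunction of both removes it entirely, so the two-feature template ranks first. That is the sense in which basic features can be "highly discriminative together".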

Original language: English
Title: Computational Processing of the Portuguese Language : 10th International Conference, PROPOR 2012, Coimbra, Portugal, April 17-20, 2012. Proceedings
Editors: Helena Caseli, Aline Villavicencio, Antonio Teixeira, Fernando Perdigao
Number of pages: 11
Place of publication: Berlin, Heidelberg
Publisher: Springer Verlag
Publication date: 2012
Pages: 146-156
ISBN (print): 978-3-642-28884-5
ISBN (electronic): 978-3-642-28885-2
Publication status: Published - 2012
Externally published: Yes
Event: International Conference on Computational Processing of Portuguese - Coimbra, Portugal
Duration: 17.04.2012 - 20.04.2012
Conference number: 10
https://aclweb.org/portal/content/10th-international-conference-computational-processing-portuguese-propor-2012
