Entropy-guided feature generation for structured learning of Portuguese dependency parsing

Publication: Contributions to collected editions; Articles in conference proceedings; Research; Peer-reviewed

Authors

Feature generation is a difficult, yet highly necessary, subtask of machine learning modeling. Usually, it is partially solved by a domain expert who generates complex and discriminative feature templates by conjoining the available basic features. This is a limited and expensive way to obtain feature templates and is recognized as a modeling bottleneck. In this work, we propose an automatic method to generate feature templates for structured learning algorithms. The method receives as input the training dataset with basic features and produces a set of feature templates by conjoining basic features that are highly discriminative together. We call this method entropy-guided since it is based on the conditional entropy of local decision variables given the feature values. We illustrate our approach on the Portuguese dependency parsing task and report on experiments with the Bosque corpus. We show that the entropy-guided templates outperform the manually built templates used by MSTParser, which was the best-performing system on the Bosque corpus to date. Furthermore, our approach allows an effortless inclusion of two new basic features that automatically generate additional templates. As a result, our system achieves a per-token accuracy of 92.66%, which represents a reduction of more than 15% in the previous smallest error rate for Portuguese dependency parsing.
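The idea of scoring feature conjunctions by conditional entropy can be illustrated with a small sketch. This is not the authors' algorithm, which the abstract does not specify in detail: the exhaustive enumeration of conjunctions, the function names `conditional_entropy` and `rank_templates`, the feature names `head_pos` and `dep_pos`, and the toy data are all assumptions made for illustration. The sketch computes the empirical conditional entropy H(Y | X) of a binary decision variable Y given a conjunction X of basic features, and ranks candidate templates so that the most discriminative (lowest entropy) come first.

```python
import math
from collections import Counter, defaultdict
from itertools import combinations


def conditional_entropy(rows, feature_names, label_key="label"):
    """Empirical H(Y | X) in bits, where X is the conjunction of the named features."""
    # Count label occurrences for each observed value of the conjunction.
    joint = defaultdict(Counter)
    for row in rows:
        value = tuple(row[name] for name in feature_names)
        joint[value][row[label_key]] += 1

    total = len(rows)
    entropy = 0.0
    for label_counts in joint.values():
        n = sum(label_counts.values())
        for count in label_counts.values():
            # p(x, y) * log2 p(y | x), accumulated with a minus sign.
            entropy -= (count / total) * math.log2(count / n)
    return entropy


def rank_templates(rows, basic_features, max_size=2):
    """Score every conjunction of up to max_size basic features (hypothetical helper)."""
    candidates = []
    for size in range(1, max_size + 1):
        for combo in combinations(basic_features, size):
            candidates.append((conditional_entropy(rows, combo), combo))
    return sorted(candidates)  # lowest conditional entropy first


# Toy data (invented): label 1 means "this candidate arc is in the tree".
rows = [
    {"head_pos": "V", "dep_pos": "N", "label": 1},
    {"head_pos": "V", "dep_pos": "V", "label": 0},
    {"head_pos": "N", "dep_pos": "N", "label": 0},
    {"head_pos": "N", "dep_pos": "V", "label": 1},
]

for score, template in rank_templates(rows, ["head_pos", "dep_pos"]):
    print(f"{score:.2f} bits  {' & '.join(template)}")
```

In this toy dataset neither basic feature is informative on its own (1 bit of remaining uncertainty each), while their conjunction determines the decision completely (0 bits), which is the sense in which basic features can be "highly discriminative together". Because empirical conditional entropy never increases when more features are conjoined, a practical method would need some criterion for limiting template size; the abstract does not say how this is handled.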

Original language: English
Title: Computational Processing of the Portuguese Language : 10th International Conference, PROPOR 2012, Coimbra, Portugal, April 17-20, 2012. Proceedings
Editors: Helena Caseli, Aline Villavicencio, Antonio Teixeira, Fernando Perdigao
Number of pages: 11
Place of publication: Berlin, Heidelberg
Publisher: Springer Verlag
Publication date: 2012
Pages: 146-156
ISBN (Print): 978-3-642-28884-5
ISBN (electronic): 978-3-642-28885-2
DOIs
Publication status: Published - 2012
Externally published: Yes
Event: International Conference on Computational Processing of Portuguese - Coimbra, Portugal
Duration: 17.04.2012 to 20.04.2012
Conference number: 10
https://aclweb.org/portal/content/10th-international-conference-computational-processing-portuguese-propor-2012
