Hedge Detection Using the RelHunter Approach

Research output: Contributions to collected editions/works › Article in conference proceedings › Research › peer-review

RelHunter is a machine-learning-based method for extracting structured information from text. Here, we apply RelHunter to the Hedge Detection task, proposed as the CoNLL-2010 Shared Task. RelHunter's key design idea is to model the target structures as a relation over entities. The method decomposes the original task into three subtasks: (i) Entity Identification; (ii) Candidate Relation Generation; and (iii) Relation Recognition. In the Hedge Detection task, we define three types of entities: cue chunk, start scope token and end scope token. Hence, the Entity Identification subtask is further decomposed into three token classification subtasks, one for each entity type. In the Candidate Relation Generation subtask, we apply a simple procedure to generate a ternary candidate relation. Each instance in this relation represents a hedge candidate composed of a cue chunk, a start scope token and an end scope token. For the Relation Recognition subtask, we use a binary classifier to discriminate between true and false candidates. The four classifiers (three for entity identification and one for relation recognition) are trained with the Entropy Guided Transformation Learning algorithm. Compared with the other hedge detection systems of the CoNLL shared task, our scheme shows competitive performance. The F-score of our system is 54.05 on the evaluation corpus.
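The last two subtasks described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the candidate-generation procedure, so the rule used here (a scope must enclose its cue chunk) is an assumption, and `Candidate`, `generate_candidates`, `recognize` and `toy_classifier` are hypothetical names. The toy rule standing in for the binary classifier is likewise a placeholder for a model trained with Entropy Guided Transformation Learning.

```python
from dataclasses import dataclass
from itertools import product
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Candidate:
    """One instance of the ternary relation: (cue chunk, scope start, scope end)."""
    cue: Tuple[int, int]  # (first, last) token index of the cue chunk
    start: int            # index of the start scope token
    end: int              # index of the end scope token


def generate_candidates(cues: List[Tuple[int, int]],
                        starts: List[int],
                        ends: List[int]) -> List[Candidate]:
    """Candidate Relation Generation.

    Assumed rule (not given in the abstract): combine every identified
    entity of each type and keep triples whose scope encloses the cue.
    """
    return [
        Candidate(cue, start, end)
        for cue, start, end in product(cues, starts, ends)
        if start <= cue[0] and cue[1] <= end
    ]


def recognize(candidates: List[Candidate],
              is_hedge: Callable[[Candidate], bool]) -> List[Candidate]:
    """Relation Recognition: keep candidates the binary classifier accepts."""
    return [c for c in candidates if is_hedge(c)]


def toy_classifier(c: Candidate) -> bool:
    """Hypothetical stand-in for the trained classifier: scope starts at the cue."""
    return c.start == c.cue[0]


# Pretend output of the three Entity Identification classifiers for one sentence.
tokens = "These results suggest that the protein may bind DNA .".split()
cues = [(2, 2), (6, 6)]  # "suggest", "may"
starts = [2, 6]          # tokens tagged as start scope
ends = [8]               # tokens tagged as end scope ("DNA")

candidates = generate_candidates(cues, starts, ends)
hedges = recognize(candidates, toy_classifier)

for h in hedges:
    cue_text = " ".join(tokens[h.cue[0]:h.cue[1] + 1])
    scope_text = " ".join(tokens[h.start:h.end + 1])
    print(f"cue={cue_text!r} scope={scope_text!r}")
```

Modelling each hedge as a tuple over separately identified entities keeps the final decision a plain binary classification over candidates, which is the decomposition the abstract describes.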
Original language: English
Title of host publication: Proceedings of the Fourteenth Conference on Computational Natural Language Learning - Shared Task
Editors: Richard Farkas, Veronika Vincze, György Szarvas, György Mora, Janos Csirik
Number of pages: 6
Place of Publication: USA
Publisher: Association for Computational Linguistics (ACL)
Publication date: 2010
Pages: 64–69
ISBN (print): 978-1-932432-84-8
Publication status: Published - 2010
Externally published: Yes
Event: 14th Conference on Computational Natural Language Learning - CoNLL 2010: Shared Task - Uppsala, Sweden
Duration: 15.07.2010 to 17.07.2010
Conference number: 14
http://toc.proceedings.com/08986webtoc.pdf