RelHunter: A machine learning method for relation extraction from text

Research output: Journal contributions › Journal articles › Research › peer-review

Standard

RelHunter: A machine learning method for relation extraction from text. / Fernandes, Eraldo R.; Milidiú, Ruy L.; Rentería, Raúl P.
In: Journal of the Brazilian Computer Society, Vol. 16, No. 3, 18, 09.2010, p. 191-199.

Vancouver

Fernandes ER, Milidiú RL, Rentería RP. RelHunter: A machine learning method for relation extraction from text. Journal of the Brazilian Computer Society. 2010 Sept;16(3):191-199. 18. doi: 10.1007/s13173-010-0018-y

Bibtex

@article{631a377ca8674819a40caf7adcc40e4a,
title = "RelHunter: A machine learning method for relation extraction from text",
abstract = "We propose RelHunter, a machine learning-based method for the extraction of structured information from text. RelHunter's key idea is to model the target structures as a relation over entities. Hence, the modeling effort is reduced to the identification of entities and the generation of a candidate relation, which are simpler problems than the original one. RelHunter fits a very broad spectrum of complex computational linguistic problems. We apply it to five tasks: phrase chunking, clause identification, hedge detection, quotation extraction, and dependency parsing. We compare RelHunter to token classification approaches through several computational experiments on seven multilingual corpora. RelHunter outperforms the token classification approaches by 2.14% on average. Moreover, we compare the derived systems against state-of-the-art systems for each corpus. Our systems achieve state-of-the-art performances for three corpora: Portuguese phrase chunking, Portuguese clause identification, and English quotation extraction. Additionally, the derived systems show good quality performance for the other four corpora.",
keywords = "Entity relation extraction, Entropy Guided Transformation Learning, Machine learning, Natural language processing, Informatics, Business informatics",
author = "Fernandes, {Eraldo R.} and Milidi{\'u}, {Ruy L.} and Renter{\'i}a, {Ra{\'u}l P.}",
note = "This work was partially funded by CNPq and FAPERJ grants 557.128/2009-9 and E-26/170028/2008. The first author holds a CNPq doctoral fellowship and is supported by Instituto Federal de Educa{\c c}{\~a}o, Ci{\^e}ncia e Tecnologia de Goi{\'a}s, Brazil.",
year = "2010",
month = sep,
doi = "10.1007/s13173-010-0018-y",
language = "English",
volume = "16",
pages = "191--199",
journal = "Journal of the Brazilian Computer Society",
issn = "0104-6500",
publisher = "Brazilian Computing Society",
number = "3",
}

RIS

TY - JOUR
T1 - RelHunter
T2 - A machine learning method for relation extraction from text
AU - Fernandes, Eraldo R.
AU - Milidiú, Ruy L.
AU - Rentería, Raúl P.
N1 - This work was partially funded by CNPq and FAPERJ grants 557.128/2009-9 and E-26/170028/2008. The first author holds a CNPq doctoral fellowship and is supported by Instituto Federal de Educação, Ciência e Tecnologia de Goiás, Brazil.
PY - 2010/9
Y1 - 2010/9

N2 - We propose RelHunter, a machine learning-based method for the extraction of structured information from text. RelHunter's key idea is to model the target structures as a relation over entities. Hence, the modeling effort is reduced to the identification of entities and the generation of a candidate relation, which are simpler problems than the original one. RelHunter fits a very broad spectrum of complex computational linguistic problems. We apply it to five tasks: phrase chunking, clause identification, hedge detection, quotation extraction, and dependency parsing. We compare RelHunter to token classification approaches through several computational experiments on seven multilingual corpora. RelHunter outperforms the token classification approaches by 2.14% on average. Moreover, we compare the derived systems against state-of-the-art systems for each corpus. Our systems achieve state-of-the-art performances for three corpora: Portuguese phrase chunking, Portuguese clause identification, and English quotation extraction. Additionally, the derived systems show good quality performance for the other four corpora.

AB - We propose RelHunter, a machine learning-based method for the extraction of structured information from text. RelHunter's key idea is to model the target structures as a relation over entities. Hence, the modeling effort is reduced to the identification of entities and the generation of a candidate relation, which are simpler problems than the original one. RelHunter fits a very broad spectrum of complex computational linguistic problems. We apply it to five tasks: phrase chunking, clause identification, hedge detection, quotation extraction, and dependency parsing. We compare RelHunter to token classification approaches through several computational experiments on seven multilingual corpora. RelHunter outperforms the token classification approaches by 2.14% on average. Moreover, we compare the derived systems against state-of-the-art systems for each corpus. Our systems achieve state-of-the-art performances for three corpora: Portuguese phrase chunking, Portuguese clause identification, and English quotation extraction. Additionally, the derived systems show good quality performance for the other four corpora.

KW - Entity relation extraction
KW - Entropy Guided Transformation Learning
KW - Machine learning
KW - Natural language processing
KW - Informatics
KW - Business informatics
UR - http://www.scopus.com/inward/record.url?scp=84870389863&partnerID=8YFLogxK
UR - https://link.springer.com/journal/13173/volumes-and-issues/16-3
U2 - 10.1007/s13173-010-0018-y
DO - 10.1007/s13173-010-0018-y
M3 - Journal articles
AN - SCOPUS:84870389863
VL - 16
SP - 191
EP - 199
JO - Journal of the Brazilian Computer Society
JF - Journal of the Brazilian Computer Society
SN - 0104-6500
IS - 3
M1 - 18
ER -
