FaQuAD: Reading comprehension dataset in the domain of Brazilian higher education

Research output: Contributions to collected editions/works › Article in conference proceedings › Research › peer-review

Academic secretaries and faculty members of higher education institutions face a common problem: the abundance of questions sent by academics whose answers are found in available institutional documents. The official documents produced by Brazilian public universities are vast and dispersed, which discourages students from searching such sources for answers. To lessen this problem, we present FaQuAD: a novel machine reading comprehension dataset in the domain of Brazilian higher education institutions. FaQuAD follows the format of SQuAD (Stanford Question Answering Dataset) [Rajpurkar et al. 2016]. It comprises 900 questions about 249 reading passages (paragraphs), which were taken from 18 official documents of a computer science college at a Brazilian federal university and 21 Wikipedia articles related to the Brazilian higher education system. As far as we know, this is the first Portuguese reading comprehension dataset in this format. On this dataset we trained a state-of-the-art model based on the Bi-Directional Attention Flow model [Seo et al. 2016]. We report on several ablation tests to assess different aspects of both the model and the dataset. For instance, we report learning curves to assess the amount of training data, the use of different levels of pre-trained models, and the use of more than one correct answer for each question.
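Because the abstract states that FaQuAD follows the SQuAD format and allows more than one correct answer per question, the following minimal sketch shows how a file in that layout might be inspected. It assumes the standard SQuAD v1.1 JSON nesting (data, paragraphs, qas, answers with text and answer_start); the toy record, the helper names, and the simple lowercase/whitespace normalization are illustrative assumptions and are not taken from the paper or from FaQuAD itself.

```python
from typing import Any


def summarize_squad(dataset: dict[str, Any]) -> dict[str, int]:
    """Count articles, paragraphs, and questions in a SQuAD-style dict."""
    articles = dataset["data"]
    paragraphs = [p for a in articles for p in a["paragraphs"]]
    questions = [q for p in paragraphs for q in p["qas"]]
    return {
        "articles": len(articles),
        "paragraphs": len(paragraphs),
        "questions": len(questions),
    }


def answers_are_aligned(dataset: dict[str, Any]) -> bool:
    """Check that every answer text occurs at its answer_start offset."""
    for article in dataset["data"]:
        for paragraph in article["paragraphs"]:
            context = paragraph["context"]
            for qa in paragraph["qas"]:
                for ans in qa["answers"]:
                    start = ans["answer_start"]
                    if context[start:start + len(ans["text"])] != ans["text"]:
                        return False
    return True


def exact_match(prediction: str, qa: dict[str, Any]) -> bool:
    """True if the prediction equals ANY reference answer.

    Normalization here (lowercase, collapsed whitespace) is a simplification
    chosen for illustration, not the official SQuAD evaluation script.
    """
    def norm(s: str) -> str:
        return " ".join(s.lower().split())

    return any(norm(prediction) == norm(a["text"]) for a in qa["answers"])


# Hypothetical toy record in SQuAD v1.1 layout (not real FaQuAD data).
CONTEXT = "The enrollment period opens in March."
toy = {
    "version": "toy",
    "data": [{
        "title": "Enrollment",
        "paragraphs": [{
            "context": CONTEXT,
            "qas": [{
                "id": "q1",
                "question": "When does the enrollment period open?",
                # Two acceptable answers for the same question.
                "answers": [
                    {"text": "March", "answer_start": CONTEXT.index("March")},
                    {"text": "in March", "answer_start": CONTEXT.index("in March")},
                ],
            }],
        }],
    }],
}

if __name__ == "__main__":
    print(summarize_squad(toy))
    print(answers_are_aligned(toy))
```

Scoring a prediction against any of several reference answers, as exact_match does, is one simple way to account for questions with more than one correct answer; the paper's own evaluation procedure may differ.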

Original language: English
Title of host publication: 2019 Brazilian Conference on Intelligent Systems : BRACIS 2019 : 15-18 October 2019, Salvador, Bahia, Brazil : proceedings
Number of pages: 6
Place of Publication: Piscataway
Publisher: Institute of Electrical and Electronics Engineers Inc.
Publication date: 10.2019
Pages: 443-448
Article number: 8923668
ISBN (print): 978-1-7281-4254-8
ISBN (electronic): 978-1-7281-4253-1
Publication status: Published - 10.2019
Externally published: Yes
Event: Brazilian Conference on Intelligent Systems - BRACIS 2019 - Salvador, Bahia, Brazil
Duration: 15.10.2019 to 18.10.2019
Conference number: 8
http://www.bracis2019.ufba.br/#:~:text=The%208th%20Brazilian%20Conference%20on,October%2015%20to%2018%2C%202019.

    Research areas

  • Dataset, Machine Reading Comprehension, Natural Language Processing
  • Business informatics
