FaQuAD: Reading comprehension dataset in the domain of Brazilian higher education

Research output: Contributions to collected editions/works > Article in conference proceedings > Research > peer-review

Authors

Academic secretaries and faculty members of higher education institutions face a common problem: the abundance of questions sent by academics whose answers are found in available institutional documents. The official documents produced by Brazilian public universities are vast and dispersed, which discourages students from searching further for answers in such sources. To lessen this problem, we present FaQuAD: a novel machine reading comprehension dataset in the domain of Brazilian higher education institutions. FaQuAD follows the format of SQuAD (Stanford Question Answering Dataset) [Rajpurkar et al. 2016]. It comprises 900 questions about 249 reading passages (paragraphs), which were taken from 18 official documents of a computer science college from a Brazilian federal university and 21 Wikipedia articles related to the Brazilian higher education system. As far as we know, this is the first Portuguese reading comprehension dataset in this format. On this dataset, we trained a state-of-the-art model based on the Bi-Directional Attention Flow model [Seo et al. 2016]. We report on several ablation tests to assess different aspects of both the model and the dataset. For instance, we report learning curves to assess the effect of the amount of training data, the use of different levels of pre-trained models, and the use of more than one correct answer for each question.
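Since the abstract states that FaQuAD follows the SQuAD format, a minimal sketch of how such a file could be inspected may be useful. It assumes the SQuAD v1.1 JSON layout (`data` > `paragraphs` > `qas` > `answers`); the sample record and the `summarize` helper are hypothetical, and the passage and question are invented for illustration rather than taken from FaQuAD.

```python
# Hypothetical SQuAD v1.1-style record. The passage and question are
# invented for illustration and are NOT taken from FaQuAD.
sample = {
    "version": "1.1",
    "data": [
        {
            "title": "Example document",
            "paragraphs": [
                {
                    "context": "The enrollment period lasts two weeks.",
                    "qas": [
                        {
                            "id": "q1",
                            "question": "How long does the enrollment period last?",
                            "answers": [
                                # answer_start is a character offset into context
                                {"text": "two weeks", "answer_start": 28},
                            ],
                        }
                    ],
                }
            ],
        }
    ],
}


def summarize(dataset):
    """Count documents, paragraphs, and questions, and list the ids of
    questions whose answer text does not match the context at answer_start."""
    n_paragraphs = 0
    n_questions = 0
    misaligned = []
    for doc in dataset["data"]:
        for para in doc["paragraphs"]:
            n_paragraphs += 1
            context = para["context"]
            for qa in para["qas"]:
                n_questions += 1
                for ans in qa["answers"]:
                    start = ans["answer_start"]
                    end = start + len(ans["text"])
                    if context[start:end] != ans["text"]:
                        misaligned.append(qa["id"])
    return {
        "documents": len(dataset["data"]),
        "paragraphs": n_paragraphs,
        "questions": n_questions,
        "misaligned_answers": misaligned,
    }


print(summarize(sample))
```

Under the same assumption, loading a real file with `json.load` and passing the result to `summarize` should yield counts comparable to the 900 questions and 249 passages reported above. Each question may carry several entries in `answers`, which is how the format accommodates more than one correct answer per question.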

Original language: English
Title of host publication: 2019 Brazilian Conference on Intelligent Systems : BRACIS 2019 : 15-18 October 2019, Salvador, Bahia, Brazil : proceedings
Number of pages: 6
Place of Publication: Piscataway
Publisher: Institute of Electrical and Electronics Engineers Inc.
Publication date: 10.2019
Pages: 443-448
Article number: 8923668
ISBN (print): 978-1-7281-4254-8
ISBN (electronic): 978-1-7281-4253-1
DOIs
Publication status: Published - 10.2019
Externally published: Yes
Event: Brazilian Conference on Intelligent Systems - BRACIS 2019 - Salvador, Bahia, Brazil
Duration: 15.10.2019 - 18.10.2019
Conference number: 8
http://www.bracis2019.ufba.br/#:~:text=The%208th%20Brazilian%20Conference%20on,October%2015%20to%2018%2C%202019.

    Research areas

  • Dataset, Machine Reading Comprehension, Natural Language Processing
  • Business informatics
