The SQAD database consists of 3301 records obtained from Czech Wikipedia articles. Each record has the following structure:
- the original sentence(s) from Wikipedia
- a question that is directly answered in the text
- the expected answer to the question as it appears in the original text
- the URL of the Wikipedia web page from which the original text was extracted
- the name of the author of this SQAD record
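
The record structure listed above can be sketched as a small Python data class. This is only an illustration: the class name `SqadRecord`, its field names, the `answer_in_text` helper, and the sample values are assumptions made for this sketch and are not the identifiers or the file format used by the SQAD distribution.

```python
from dataclasses import dataclass


@dataclass
class SqadRecord:
    """Hypothetical in-memory view of one SQAD record (field names are assumed)."""
    text: str      # the original sentence(s) from Wikipedia
    question: str  # a question that is directly answered in the text
    answer: str    # the expected answer as it appears in the original text
    url: str       # the Wikipedia page the text was extracted from
    author: str    # name of the author of this record

    def answer_in_text(self) -> bool:
        # The answer is described as appearing in the original text,
        # so a plain substring check is a simple sanity test.
        return self.answer in self.text


# Illustrative placeholder values, not taken from the database.
record = SqadRecord(
    text="Praha je hlavní město České republiky.",
    question="Jaké je hlavní město České republiky?",
    answer="Praha",
    url="https://cs.wikipedia.org/wiki/Praha",
    author="example annotator",
)

print(record.answer_in_text())
```

A check like `answer_in_text` may be useful when validating records, since an expected answer that is not a literal substring of the text would contradict the structure described above.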
The Simple Question Answering Database version 3.2 (SQAD v3.2) was created from Czech Wikipedia. The new version consists of more than 16000 records. Each SQAD record consists of multiple files: question, answer extraction, answer selection, URL, question metadata, and, in some cases, answer context.