Contributor: Ministerstvo školství, mládeže a tělovýchovy České republiky (Ministry of Education, Youth and Sports of the Czech Republic), LK11221, "Vývoj metod pro návrh statistických mluvených dialogových systémů" (Development of Methods for the Design of Statistical Spoken Dialogue Systems), nationalFunds / Original context has metadata only: false / Publisher: Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL) / Rights: http://creativecommons.org/licenses/by-sa/4.0/

1. A Small Dataset for English-to-Czech Speech Translation in the Travel Domain

Creator:: Cífka, Ondřej and Bojar, Ondřej
Publisher:: Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Type:: audio and corpus
Subject:: speech corpus, ASR, and machine translation
Language:: English and Czech
Description:: This small dataset contains 3 speech corpora collected using the Alex Translate telephone service (https://ufal.mff.cuni.cz/alex#alex-translate). The "part1" and "part2" corpora contain English speech with transcriptions and Czech translations. These recordings were collected from users of the service. Part 1 contains earlier recordings, filtered to include only clean speech; Part 2 contains later recordings with no filtering applied. The "cstest" corpus contains recordings of artificially created sentences, each containing one or more Czech names of places in the Czech Republic. These were recorded by a multinational group of students studying in Prague.
Rights:: Creative Commons - Attribution-ShareAlike 4.0 International (CC BY-SA 4.0), http://creativecommons.org/licenses/by-sa/4.0/, and PUB

2. Alex Context NLG Dataset

Creator:: Dušek, Ondřej and Jurčíček, Filip
Publisher:: Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Type:: text and corpus
Subject:: dialogue system, natural language generation, dialogue alignment, and entrainment
Language:: English
Description:: A dataset intended for fully trainable natural language generation (NLG) systems in task-oriented spoken dialogue systems (SDS), covering the English public transport information domain. It includes preceding context (user utterance) along with each data instance (pair of source meaning representation and target natural language paraphrase to be generated). Taking the form of the previous user utterance into account for generating the system response allows NLG systems trained on this dataset to entrain (adapt) to the preceding utterance, i.e., reuse wording and syntactic structure. This should presumably improve the perceived naturalness of the output, and may even lead to a higher task success rate. Crowdsourcing has been used to obtain natural context user utterances as well as natural system responses to be generated.
Rights:: Creative Commons - Attribution-ShareAlike 4.0 International (CC BY-SA 4.0), http://creativecommons.org/licenses/by-sa/4.0/, and PUB

3. Czech restaurant information dataset for NLG

Creator:: Dušek, Ondřej, Jurčíček, Filip, Dvořák, Josef, Grycová, Petra, Hejda, Matěj, Olivová, Jana, Starý, Michal, and Štichová, Eva
Publisher:: Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Type:: text and corpus
Subject:: natural language generation, dialogue system, and morphological generation
Language:: Czech
Description:: This is a dataset for natural language generation (NLG) in task-oriented spoken dialogue systems with Czech as the target language. It originated as a translation of the English San Francisco Restaurants dataset by Wen et al. (2015). It includes input dialogue acts and the corresponding output natural language paraphrases in Czech. Since the dataset is intended for recurrent neural network based NLG systems using delexicalization, inflection tables for all slot values appearing verbatim in the text are provided.
Rights:: Creative Commons - Attribution-ShareAlike 4.0 International (CC BY-SA 4.0), http://creativecommons.org/licenses/by-sa/4.0/, and PUB

4. Question Dialogs Dataset

Creator:: Vodolán, Miroslav and Jurčíček, Filip
Publisher:: Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Type:: text, other, and lexicalConceptualResource
Subject:: question dialogs and interactive learning
Language:: English
Description:: A dataset collected from natural dialogs that makes it possible to test the ability of dialog systems to interactively learn new facts from user utterances throughout the dialog. The dataset, consisting of 1900 dialogs, allows simulation of interactively obtaining denotations and question explanations from users, which can then be used for interactive learning.
Rights:: Creative Commons - Attribution-ShareAlike 4.0 International (CC BY-SA 4.0), http://creativecommons.org/licenses/by-sa/4.0/, and PUB
