Annotated list of dependency bigrams occurring in the PDT more than five times and having part-of-speech patterns that can possibly form a collocation. Each bigram is assigned to one of the six MWE categories by three annotators.
The GrandStaff-LMX dataset is based on the GrandStaff dataset described in the "End-to-end optical music recognition for pianoform sheet music" paper by Antonio Ríos-Vila et al., 2023, https://doi.org/10.1007/s10032-023-00432-z .
The GrandStaff-LMX dataset contains MusicXML and Linearized MusicXML encodings of all systems from the original dataset, suitable for evaluation with the TEDn metric. It also contains the official GrandStaff train/dev/test split.
70K words, non-validated sentence segmentation, non-validated POS tagging, manual annotation of syntactic dependencies and dependency labels, manual annotation of semantic roles, manual annotation of events based on a shallow domain-specific ontology (only for a 31K-word subset of GDT).
A close-up of Gustav Adolf Procházka, the Patriarch of Czechoslovak Church, from a newsreel segment to mark the 15th anniversary of the Czechoslovak Church in 1935.
Wrestler Gustav Frištejnský in a match against Josef Šmejkal in Prague-Letná in 1913. Frištejnský with his young nephew František at a farm in Lužice. Frištejnský swimming with his dog in the Sitka River. Frištejnský exercising in the countryside. Footage of Frištejnský with his wife Miroslava. Several scenes from the documentary Město Litovel (The Town of Litovel, 1927) are used.
Recording of Jan Wenig's radio interview with film director Gustav Machatý during his visit to Prague in late 1946. Machatý with his fiancée Helga in front of the Alcron Hotel on Štěpánská Street in Prague.