Search Results
2. Amara - universal subtitles
- Type:
- corpus
- Language:
- Arabic, Danish, Dutch, English, German, Modern Greek (1453-), Italian, Japanese, Korean, Portuguese, Russian, Spanish, and Turkish
- Description:
- Large set of subtitles available for download in multiple languages. Can be used as parallel corpus.
- Rights:
- Not specified
3. Ancient echoes in the culture of modern Egypt
- Publisher:
- Charles University in Prague, Faculty of Arts
- Subject:
- Ancient Egypt, cultural identity, cultural influences, modern society, collected volumes, Egypt in prehistory and antiquity, Egyptology, and Czech (Czechoslovak) collected volumes and collective monographs
- Language:
- English and Arabic
- Rights:
- unknown
4. Arabic ACL corpus
- Creator:
- Salah Elfahal Elebaed, Hoyam, Kasbi, Mohammed, Nasri, Mohammed, and Bouzoubaa, Karim
- Publisher:
- International Journal of Computer Science Trends and Technology (IJCST)
- Type:
- text and corpus
- Subject:
- Controlled Natural Language, Arabic CNL, ACL, Arabic Corpus, and TEI
- Language:
- Arabic
- Description:
- This corpus contains all sentences representing the Arabic Controlled Language (ACL). It comprises 551 sentences taken from four textbooks and websites dedicated to teaching Arabic to children, such as a) First grade book, Republic of Sudan (كتاب الصف الاول جمهورية السودان), b) Al Jazeera Educational Site (موقع الجزيرة التعليمي), c) Bella Preparatory School Girls Forum (منتدى مدرسة بيلا الاعدادية بنات), and d) Albahr website (موقع انا البحر). The sentences conform to 52 ACL rules, an average of 10.6 sentences per rule. All sentences in the corpus were analyzed with the Farasa syntactic parser, and the parses were validated manually by expert linguists. The corpus consists of a header and a body. The header holds metadata describing the corpus, such as the corpus name, the authors, the sources, and further metadata. The body contains the rules. Each rule has a code, a structure, and all sentences that conform to that rule. For each sentence, the corpus stores an id, the vowelled and unvowelled text, and the result of parsing with Farasa.
- Rights:
- Not specified
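The header/body layout described for the Arabic ACL corpus can be pictured with a small data model. This is only a sketch: the record does not give the actual TEI element names, so every class and field name below (`Header`, `Rule`, `Sentence`, `sentence_id`, and so on) is a hypothetical stand-in, and the Farasa parse is kept as an opaque string.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Sentence:
    # Field names are hypothetical; the record only says each sentence has
    # an id, vowelled and unvowelled text, and a Farasa parsing result.
    sentence_id: str
    vowelled: str
    unvowelled: str
    parse: str  # parser output kept as an opaque string in this sketch


@dataclass
class Rule:
    code: str
    structure: str
    sentences: List[Sentence] = field(default_factory=list)


@dataclass
class Header:
    # Metadata named in the record: corpus name, authors, sources.
    name: str
    authors: List[str]
    sources: List[str]


@dataclass
class Corpus:
    header: Header
    rules: List[Rule] = field(default_factory=list)

    def average_sentences_per_rule(self) -> float:
        """Mean number of sentences per rule (0.0 for an empty corpus)."""
        if not self.rules:
            return 0.0
        total = sum(len(rule.sentences) for rule in self.rules)
        return total / len(self.rules)
```

With the figures quoted in the description, 551 sentences over 52 rules, the same calculation gives `round(551 / 52, 1) == 10.6`, which matches the stated average.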
5. Češi a Slované v arabských rukopisech
- Creator:
- Bahbouh, Charif
- Type:
- text and translations
- Subject:
- Manuscripts, incunabula, early printed books. Rare and remarkable works, Arabs, travelers, manuscripts, Czechs, Slavs, the Czech lands from the arrival of the Slavs to 1306, history of science, art, culture and technology, cultural relations, and travelogues, travelers
- Language:
- Czech and Arabic
- Description:
- Parallel title in Arabic and chronological overview
- Rights:
- unknown
6. CorpusExplorer
- Creator:
- Rüdiger, Jan Oliver
- Publisher:
- Jan Oliver Rüdiger
- Type:
- tool and toolService
- Subject:
- Corpus Linguistics, NLP, conll, tei, XML, nlp, Natural Language Processing, linguistics, Linguistics, Computational Linguistics, corpus processing, tagger, POS tagger, lemmatization, text cleaning, CommonCrawl, epub, JSON, Twitter, Pandoc, Wikipedia, digital data, DTA, DSpin, MySQL, ElasticSearch, TextGrid, text corpora, TigerXML, and WeblichtXML
- Language:
- German, English, French, Italian, Dutch, Spanish, Polish, Arabic, Chinese, and Portuguese
- Description:
- Software for corpus linguists and text/data mining enthusiasts. The CorpusExplorer combines over 45 interactive visualizations under a user-friendly interface. Routine tasks such as text acquisition, cleaning or tagging are completely automated. The simple interface supports the use in university teaching and leads users/students to fast and substantial results. The CorpusExplorer is open for many standards (XML, CSV, JSON, R, etc.) and also offers its own software development kit (SDK). Source code available at https://github.com/notesjor/corpusexplorer2.0
- Rights:
- Not specified
7. Dutch Bilingualism Data Base (DBD)
- Publisher:
- Radboud University Nijmegen, Max Planck Institute for Psycholinguistics, Meertens Institute KNAW The Netherlands, and Babylon Centre for Studies of Multilingualism in the Multicultural Society
- Type:
- corpus
- Language:
- Arabic, Dutch, and Turkish
- Description:
- Audio recordings, transcripts,
- Rights:
- Not specified
8. ElixirFM
- Creator:
- Smrž, Otakar, Bielický, Viktor, and Buckwalter, Tim
- Publisher:
- Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
- Type:
- toolService
- Subject:
- Arabic morphology and ElixirFM
- Language:
- Arabic
- Description:
- ElixirFM is a high-level implementation of Functional Arabic Morphology documented at http://elixir-fm.wiki.sourceforge.net/. The core of ElixirFM is written in Haskell, while interfaces in Perl support lexicon editing and other interactions.
- Rights:
- http://opensource.org/licenses/GPL-3.0
9. Encyklopedie islámu
- Creator:
- Bahbouh, Charif
- Type:
- text and encyclopedias
- Subject:
- Islam, surveys of world history (chronological), church and religious history, and subject dictionaries
- Language:
- Czech and Arabic
- Rights:
- unknown
10. Hledání skrytého pokladu
- Creator:
- Ostřanský, Bronislav
- Type:
- text and anthologies
- Subject:
- Afro-Asiatic (Hamito-Semitic) literatures, biographies, Arabic literature, anthologies, Middle Ages, Arabic mysticism, Sufism, Islam, literature, writers, world history of the Middle Ages (to 1492), and church and religious history
- Language:
- Czech and Arabic
- Rights:
- unknown