Datasets
Title | Description | Categories |
---|---|---|
Russian state institutions 2024 | This is a collection of full-text datasets based on contents extracted from the websites of Russian institutions. | |
Russian state institutions 2025 (draft) | As long as this notice is displayed, please ignore this work-in-progress section and refer to the 2024 version of this dataset or to other sections of this website. | |
1tv.ru_ru | All items published on the website of Pervy Kanal (1tv.ru) | dataset, Russian media, Russian language |
Prigozhin audio files, transcribed | An automatic transcription of all the audio messages posted on Prigozhin’s official Telegram channel | dataset, automatic transcription, Telegram, Russian language, Russian media |
Zavtra.ru - Summary of articles | A sample of all items published on the website of the Russian weekly magazine ‘Zavtra’, summarised with a locally deployed LLM (gemma:4b) | dataset, Russian media, Russian language, llm, summary |
archive.government.ru_ru_2024 | Corpus based on the archived version of Russia’s government website (in Russian, 2008-2013) | dataset, Russian institutions, Russian government, Russian language |
archive.premier.gov.ru_ru_2024 | Corpus based on the archived version of the website of Russia’s prime minister (in Russian, 2008-2012) | dataset, Russian institutions, Russian government, Russian language |
duma.gov.ru_ru | All news items published on the website of the Russian Duma | dataset, Russian institutions, Russian language |
duma.gov.ru_ru_2024 | Corpus based on the website of Russia’s Duma (in Russian, 2006-2023) | dataset, Russian institutions, Russian government, Russian language |
government.ru_ru_2024 | Corpus based on the website of Russia’s government (in Russian, 2013-2023) | dataset, Russian institutions, Russian government, Russian language |
kp.ru_ru | All items published in the politics section of Komsomolskaya Pravda | dataset, Russian media, Russian language |
kremlin.ru_en | All items published on the English language version of the Kremlin’s website | dataset, Russian institutions, English language |
kremlin.ru_en_2024 | Corpus based on the website of Russia’s president (in English, 1999-2023) | corpus, full corpus, Russian institutions, Russia’s president, English language |
kremlin.ru_ru | All items published on the Russian language version of the Kremlin’s website | dataset, Russian institutions, Russian language |
kremlin.ru_ru_2024 | Corpus based on the website of Russia’s president (in Russian, 1999-2023) | dataset, Russian institutions, Russian language |
mid.ru_en | All English-language news items published on the website of the Russian Ministry of Foreign Affairs | dataset, Russian institutions, English language |
mid.ru_en_2024 | Corpus based on the website of Russia’s MFA (in English, 2003-2023) | corpus, full corpus, Russian institutions, Russia’s MFA, English language |
mid.ru_ru | All Russian-language news items published on the website of the Russian Ministry of Foreign Affairs | dataset, Russian institutions, Russian language |
mid.ru_ru_2024 | Corpus based on the website of Russia’s MFA (in Russian, 2003-2023) | corpus, full corpus, Russian institutions, Russia’s MFA, Russian language |
ng.ru_ru | All items published on Nezavisimaya Gazeta | dataset, Russian media, Russian language |
novostipmr.com_ru | All items published on the website of Transnistria’s news agency Novosti PMR | dataset, Russian language, Transnistria |
patriarhia.ru_ru | All items published on the official website of the Moscow Patriarchate | dataset, Russian language |
rg.ru_ru | All items published on Rossiiskaya Gazeta | dataset, Russian media, Russian language |
transcript.duma.gov.ru_ru_2024 | Corpus based on the website of Russia’s Duma (in Russian, 2006-2023) | dataset, Russian institutions, Russian parliament, Russian language |
tsargrad.tv_ru | All textual items published on the website of the Russian TV broadcaster ‘Tsargrad’ | dataset, Russian media, Russian language |
zavtra.ru_ru | All items published on the website of the Russian weekly magazine ‘Zavtra’ | dataset, Russian media, Russian language |
zavtra.ru_ru_2024 | Corpus based on the website of the Russian weekly newspaper ‘Zavtra’ (in Russian, 1996-2023) | corpus, full corpus, Russian media, Russian language |
zavtra.ru_ru_2025 | Corpus based on the website of the Russian weekly newspaper ‘Zavtra’ (in Russian, 1996-2024) | corpus, full corpus, Russian media, Russian language |