  • Joined Jan 12, 2026

Uyghur-Corpus/README.md
license: mit
language: ug
pretty_name: Uyghur Socio-Political Articles Dataset
size_categories: n<1k
task_categories: text-generation, text-classification
tags: uyghur, nlp, dataset, political-critique, history, llm-training

Uyghur Socio-Political and Literary Dataset (109+ Articles)

ئۇيغۇر ئىجتىمائىي-سىياسىي ۋە ئەدەبىي ماقالىلەر سانلىق مەلۇمات توپلىمى

This repository contains a curated dataset of 109+ Uyghur articles. This is an actively maintained project, and new content is added regularly.

بۇ ئامباردا جەمئىي 109 پارچىدىن ئارتۇق ماقالە جەملەندى. بۇ سانلىق مەلۇمات توپلىمى ئاكتىپ يېڭىلىنىپ تۇرىدىغان تۈر بولۇپ، يېڭى ماقالىلەر قوشۇلۇپ تۇرىدۇ.

🔄 Project Status / يېڭىلىنىش ئەھۋالى

  • Status: Active / ئاكتىپ
  • Update Frequency: Regular updates / يېڭىلىنىپ تۇرىدۇ
  • Current Count: 109 articles (as of Feb 2026)

📋 Dataset Overview / ئومۇمىي ئەھۋال

  • Format: .jsonl (UTF-8)
  • Authors: Burhan Muhammed, Enwer Haji Muhammed (Erturk), Karimjan Ghafuri, Mahmud Muhiti, Muhammad Amin Bughra, etc.
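Because the data is stored as UTF-8 JSON Lines, each line of the file should be one standalone JSON object, so it can be read with the Python standard library alone. The following is a minimal sketch under that assumption. The helper name `read_jsonl` and the file name `articles.jsonl` are hypothetical and not part of this repository; the `content` field is the one used in the usage example in this README.

```python
import json
from pathlib import Path


def read_jsonl(path):
    """Yield one dict per non-empty line of a UTF-8 JSON Lines file."""
    with Path(path).open(encoding="utf-8") as handle:
        for line_number, line in enumerate(handle, start=1):
            line = line.strip()
            if not line:
                continue  # tolerate blank lines between records
            try:
                yield json.loads(line)
            except json.JSONDecodeError as error:
                # Report the offending line so a broken record is easy to find.
                raise ValueError(
                    f"{path}: invalid JSON on line {line_number}"
                ) from error
```

For example, `articles = list(read_jsonl("articles.jsonl"))` would collect every record into a list, after which `articles[0]["content"]` holds the text of the first article. Opening the file with an explicit `encoding="utf-8"` matters here: on some platforms the default encoding is not UTF-8, and Uyghur text would then fail to decode.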

🛠 Usage Guide / تېخنىكىلىق قوللانما

Developers can integrate this dataset using the Hugging Face datasets library:

```python
from datasets import load_dataset

# Load the dataset
dataset = load_dataset("Uyghur-Corpus/Uyghur-Corpus")

# Access an article
print(dataset['train'][0]['content'])
```
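After loading, a quick sanity check on the text field can catch empty or missing articles before training. The sketch below assumes every record is a dict-like row with a `content` string, as in the example above. `summarize_articles` is a hypothetical helper, not something this dataset or the `datasets` library provides.

```python
def summarize_articles(records, field="content"):
    """Return article count and character statistics for one text field.

    Records whose field is missing or empty are skipped.
    """
    lengths = [len(record[field]) for record in records if record.get(field)]
    if not lengths:
        return {"articles": 0, "total_chars": 0, "mean_chars": 0.0}
    return {
        "articles": len(lengths),
        "total_chars": sum(lengths),
        "mean_chars": sum(lengths) / len(lengths),
    }
```

The helper accepts any iterable of dict-like rows, so it could be called as `summarize_articles(dataset['train'])` or on a plain list of dicts. Lengths are counted in Unicode code points rather than bytes or tokens, so they are only a rough size indicator for Uyghur text.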
