Fajri Koto

Simulating Training Data Leakage in Multiple-Choice Benchmarks for LLM Evaluation

May 30, 2025
Via arXiv

Llama-3-Nanda-10B-Chat: An Open Generative Large Language Model for Hindi

Apr 08, 2025
Via arXiv

Crowdsource, Crawl, or Generate? Creating SEA-VL, a Multicultural Vision-Language Dataset for Southeast Asia

Mar 10, 2025
Via arXiv

Llama-3.1-Sherkala-8B-Chat: An Open Large Language Model for Kazakh

Mar 03, 2025
Via arXiv

Unveiling Cultural Blind Spots: Analyzing the Limitations of mLLMs in Procedural Text Comprehension

Feb 20, 2025
Via arXiv

Qorgau: Evaluating LLM Safety in Kazakh-Russian Bilingual Contexts

Feb 19, 2025
Via arXiv

Instruction Tuning on Public Government and Cultural Data for Low-Resource Language: a Case Study in Kazakh

Feb 19, 2025
Via arXiv

Commonsense Reasoning in Arab Culture

Feb 18, 2025
Via arXiv

Synthetic Data Generation for Culturally Nuanced Commonsense Reasoning in Low-Resource Languages

Feb 18, 2025
Via arXiv

KazMMLU: Evaluating Language Models on Kazakh, Russian, and Regional Knowledge of Kazakhstan

Feb 18, 2025
Via arXiv