Post by Marius Dumitran
Lecturer at the University of Bucharest, President of the National Olympiad in Informatics (2023-2025), Former SWE (Google/Palantir), PhD in CS
Article number 3: Full Paper @ AIED 2026, the top venue in the AI in Education field

📖 "𝗥𝗼𝗠𝗮𝘁𝗵𝗘𝘅𝗮𝗺: 𝗔 𝗟𝗼𝗻𝗴𝗶𝘁𝘂𝗱𝗶𝗻𝗮𝗹 𝗗𝗮𝘁𝗮𝘀𝗲𝘁 𝗼𝗳 𝗥𝗼𝗺𝗮𝗻𝗶𝗮𝗻 𝗠𝗮𝘁𝗵 𝗘𝘅𝗮𝗺𝘀 (𝟭𝟴𝟵𝟱–𝟮𝟬𝟮𝟱) 𝘄𝗶𝘁𝗵 𝗮 𝗦𝗲𝘃𝗲𝗻-𝗗𝗲𝗰𝗮𝗱𝗲 𝗖𝗼𝗿𝗲"

My co-authors Luca Cuclea and Sabin-Codruț Badea are bachelor students who met in my optional AI in Education course at the Faculty of Mathematics and Computer Science. They clicked, started collaborating, and what you see here is the result. The quality of their work has been extraordinary, and I'm happy to say we are already working together on more exciting projects. 🌟

---

𝗪𝗵𝗮𝘁 𝗶𝘀 𝗥𝗼𝗠𝗮𝘁𝗵𝗘𝘅𝗮𝗺?

A large-scale, curriculum-grounded dataset of 10,000+ Romanian math exam problems spanning 130 years (1895–2025), with a robust 70-year core (1957–2025). It covers all four official Baccalaureate tracks (M1–M4) and includes:

✅ Normalized problem statements & structured metadata
✅ Curriculum-aligned topic tags (validated against expert annotations, Cohen's kappa: 0.94!)
✅ Similarity signals for near-duplicate detection
✅ A token-based solution complexity proxy for difficulty estimation, using LLMs as "synthetic students"

---

𝗪𝗵𝗮𝘁 𝗱𝗶𝗱 𝘄𝗲 𝗳𝗶𝗻𝗱?

The data tells a fascinating story about Romanian math education:

📊 Pre-2000: a volatile, ever-shifting curriculum; exams looked radically different from year to year
📐 Post-2000: a stable, standardized blueprint; topic diversity converged and stayed consistent
📉 Individual problem complexity has decreased in the modern era, but aggregate cognitive load is maintained through a larger number of problems
🔢 Algebra now dominates at ~60% of exam content, up from a much more balanced distribution in the early 20th century
🎯 National Simulations (March) are measurably harder than Summer finals, confirming what every Romanian student already suspected

---

The paper also got its own podcast episode 🎙️ by Radu-Sebastian Amarie and Ștefan-Gabriel Muscalu. Spotify: https://lnkd.in/dk-b5Fh5

---

𝗪𝗵𝘆 𝗱𝗼𝗲𝘀 𝘁𝗵𝗶𝘀 𝗺𝗮𝘁𝘁𝗲𝗿?
Well-structured assessment datasets for low-resource languages are scarce. RoMathExam opens the door for reproducible AIED research (topic classification, difficulty modeling, similarity-based retrieval, curriculum analytics), all grounded in an authentic Romanian educational context.

🤝 If you work in EdTech, AI, or education policy and see a use case here, my DMs are open.

📄 Read the paper: https://lnkd.in/dBeE9Qjb

#AIED #DatasetRelease #MathEducation #EdTech #LargeLanguageModels #CurriculumAnalytics #Romania #UniversityOfBucharest #StudentResearch #OpenScience