Emre Kıcıman

Applied Scientist @Microsoft | Post-training for M365 Copilot

San Francisco, California, United States

About

I am an Applied Scientist and Researcher with 8 years of experience developing NLP-based systems across product and research teams at Microsoft, UKP Lab (TU Darmstadt), and Lunar Lab (Stony Brook University). Recently, I’ve led research on personalization and domain adaptation for M365 Copilot Enterprise, focusing on pre-training and post-training of LLMs and on building LLM-as-a-judge evaluation techniques, model routing, and cascading frameworks to improve system efficiency. Previously, I deployed embedding-based models for SharePoint's file ranking and developed personalized dialogue systems for millions of Office 365 enterprise users. My work also includes advancing Retrieval-Augmented Generation (RAG) systems, with a focus on knowledge grounding, personalization, and privacy.

Experience

Applied Scientist at Microsoft
Feb 2020 - Present · 6 yrs 1 mo
June 2023 - Present: Researcher on the M365 Copilot Tuning team
Building the next generation of reasoning models for multi-turn interactions in M365 Copilot to solve complex planning tasks. Research lead on M365 Copilot personalization through innovative LLM mid-training, post-training, and synthetic data generation for domain knowledge infusion and user style alignment. Developed robust evaluators for Office 365 and led the LLM-based personalization assessment for Bing Copilot's memory feature.
https://www.microsoft.com/en-us/microsoft-365/enterprise/copilot-for-microsoft-365
Permissive Information-Flow Analysis for LLMs - https://arxiv.org/pdf/2410.03055?
Model Routing for LLMs
As the Lead Applied Scientist, I developed a pioneering cascading framework for Large Language Models that optimizes latency and reduces cost in the LLM-based evaluations used to capture hallucinations in M365 Copilot. Collaborated closely with product teams to align with strategic goals and ensure successful implementation.
October 2022 - May 2023: User Embeddings for Office 365 Content Recommendation
Improved SharePoint's content recommendations with novel user-level embeddings, deploying them to Office 365's enterprise customers. Led the development of the evaluation framework for assessing improvements in user representation, using an ANN-based retrieval system across enterprise documents.
Feb 2020 - Sept 2022: Smart Replies for Microsoft Teams
Core contributor to the deployment of LLM-powered smart replies for M365, focusing on privacy-preserving solutions for enterprise data. Led efforts in model compression, weight pruning, and training for a model serving millions of users. Enhanced engagement with emoji integration and improved model diversity and relevance. Collaborated cross-functionally to scale the transformer-based smart reply model efficiently.
ICLR 2022 - https://arxiv.org/pdf/2204.03084
MSJAR 2021 - https://arxiv.org/pdf/2111.13999
Graduate Teaching Assistant - Natural Language Processing CSE538-01 at Stony Brook University
Aug 2019 - Dec 2019 · 5 mos
Applied Scientist Intern at Microsoft
Jun 2019 - Aug 2019 · 3 mos
As an Applied Scientist intern on the Smart Reply team, I worked on quantifying the diversity of the suggested replies in the smart replies pipeline. This involved experimenting with different classification models for response intents and with lexical diversity measures, in order to analyze the replies along both semantic and lexical axes of diversity.
Graduate Teaching Assistant - Data Mining at Stony Brook University
Jan 2019 - May 2019 · 5 mos
Graded assignments, exams, and the final project for graduate Computer Science students. The course covered a breadth of topics in Data Mining and Machine Learning, including data preprocessing, classification algorithms such as decision trees, deep learning, clustering algorithms, and genetic algorithms. The work also included holding TA hours to answer students' questions on these topics as they came up in assignments, projects, and exams.
Machine Learning Scientist at Fractal Analytics
May 2016 - Dec 2017 · 1 yr 8 mos