Greater Grenoble Metropolitan Area
Driven by adaptability and resilience, I have built a career path focused on Data and AI. My degree in Data Science and my current studies in computer science at UTC have solidified my foundation, further enriched by professional experiences at CIAE and STMicroelectronics (Imaging division). I am passionate about Machine Learning, Deep Learning, Computer Vision, Robotics, and autonomous vehicles in the broad sense. Feel free to reach out if you'd like to connect or discuss opportunities in data science, statistics, or AI!
Managed and optimized an internal GenAI chatbot (LLM + RAG) serving 160+ active users. Developed robust Python data ingestion pipelines, integrating Microsoft Graph API and internal APIs to continuously update the system's knowledge base.
Architected an end-to-end, modular NLP and topic modeling pipeline for large-scale document corpora (millions of patent records plus PDF, PPTX, DOCX), covering ingestion, preprocessing, vectorization, clustering, dimensionality reduction, and topic evaluation (Python, BERTopic, HDBSCAN, UMAP, PCA, TF‑IDF, Doc2Vec, sentence transformers). Built a flexible data ingestion and cleaning layer for multi-source JSONL and OCR outputs, leveraging vision-language models (VLMs) for document OCR and analysis, with source-specific parsing, text normalization, document chunking, and quality checks to ensure robust downstream models. Developed a reproducible, configuration-driven workflow and an interactive Plotly Dash web application with custom JavaScript for force-directed topic maps and similarity graphs, enabling exploratory analysis of patent and document landscapes at scale.
Represented the university at academic events, open days, and student fairs. Guided prospective students, promoted academic programs, and facilitated communication between the student body and university administration.
Developed Python scripts for efficient and secure data migration from external APIs to a PostgreSQL database. Performed advanced statistical analyses and built machine learning pipelines (including NLP, PCA, and multiple linear regression) using scikit-learn. Designed comprehensive data visualizations and reports to support strategic decision-making and operational planning.