Vincenzo Vigna

Machine Learning & AI Researcher | Generative AI, RAG & LLMs | Explainable ML | PhD Computational Chemistry | CNR Postdoc

Greater Cosenza Metropolitan Area

About

Machine Learning Scientist / AI Engineer with a PhD-level background and hands-on experience in machine learning pipelines, Generative AI, RAG systems, NLP, and explainable AI. I design and deploy Python-based ML workflows for prediction, classification, model evaluation, and scientific decision support. My experience includes regression and classification models, XGBoost, neural networks, molecular descriptors, SHAP-based explainability, and large-scale dataset construction.

I also build Generative AI and NLP solutions. I designed and developed RAGMODEX, a Retrieval-Augmented Generation platform powered by Large Language Models that turns complex scientific data into a queryable natural-language interface and supports model interpretation. The system works with both API-based and locally deployed LLMs, enabling flexibility, reproducibility, and data control.

My work spans anticancer drug discovery and CO₂ capture materials, where I have integrated AI, quantum-mechanical methods, and high-performance computing to accelerate in-silico screening and generate actionable research insights.

I am interested in applying machine learning, Generative AI, RAG, and data-driven modeling to industrial R&D, AI products, and decision-support systems.

Experience

  • Postdoctoral Researcher / Data Scientist at CNR-ITM
    Jan 2025 - Present · 1 yr 6 mos

    • Developed machine learning models (XGBoost, neural networks) to predict membrane performance for CO₂ capture applications.
    • Built and curated structured datasets integrating molecular descriptors and experimental data.
    • Designed and implemented a modular Retrieval-Augmented Generation (RAG) system to support the interpretation of complex molecular features and enhance model explainability.
    • Integrated custom domain-specific documentation derived from official technical sources, enabling context-aware analysis.
    • Architected the system to support both paid LLM APIs and locally deployed models, ensuring flexibility, reproducibility, and data control.
    • Leveraged HPC environments to accelerate model training and large-scale in-silico screening workflows.
    • Collaborated within a multidisciplinary EU consortium, translating computational outputs into actionable research directions.

  • PhD Researcher – AI for Drug Discovery & Molecular Modeling at University of Calabria
    Jan 2022 - Jan 2025 · 3 yrs 1 mo

    • Developed machine learning models (regression, classification, ensemble methods, neural networks) for anticancer drug activity prediction.
    • Constructed and curated a dataset of ~9,700 metal-based complexes from Reaxys, engineering molecular descriptors for supervised learning workflows.
    • Benchmarked multiple algorithms (Random Forest, SVM, XGBoost, Neural Networks), achieving state-of-the-art predictive performance in therapeutic window classification.
    • Integrated AI methods with quantum mechanical (QM) computational approaches to investigate mechanisms of action of anticancer compounds.
    • Published results in peer-reviewed journals including Journal of Cheminformatics and Angewandte Chemie.
    • Presented research findings at international conferences and workshops.

  • Visiting PhD Student at Universidade de Coimbra
    Feb 2022 - Jul 2022 · 6 mos