United States
Developed an automated PCSV derivation tool for clinical data processing. Using Python and NLP techniques, I built a pipeline that extracts structured criteria from Statistical Analysis Plans (SAPs), cutting manual processing time from 8 hours to 8 minutes while maintaining 100% accuracy. I used OpenAI GPT models with Retrieval-Augmented Generation (RAG) to process the unstructured clinical documents, and designed a logic engine that turns the extracted criteria into executable Python scripts for creating ADaM-compliant datasets. I also designed and built a prototype user interface for uploading SAP documents and retrieving the generated outputs. I presented the tool to BDM team executives, demonstrating the efficiency gains and outlining how the solution could scale across clinical data management tasks.
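To illustrate the logic-engine step, here is a minimal sketch of how one extracted criterion might be rendered as executable Python. It is not the tool's actual code: the `Criterion` fields, the `render_rule` and `compile_rule` helpers, the flag name, and the ALT threshold are all hypothetical, and records are assumed to be plain dicts with ADaM-style `PARAMCD` and `AVAL` keys.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Criterion:
    """Hypothetical structured criterion, as an extraction step might emit it."""
    paramcd: str      # parameter code, e.g. "ALT"
    operator: str     # comparison operator, must be in _OPERATORS
    threshold: float  # illustrative limit, not taken from any real SAP
    flag: str         # hypothetical name for the generated flag function


_OPERATORS = {">", ">=", "<", "<="}


def render_rule(c: Criterion) -> str:
    """Render one criterion as the source code of a Python function."""
    if c.operator not in _OPERATORS:
        raise ValueError(f"unsupported operator: {c.operator!r}")
    if not c.flag.isidentifier():
        raise ValueError(f"flag is not a valid identifier: {c.flag!r}")
    return (
        f"def {c.flag.lower()}(record):\n"
        f"    # Flag {c.paramcd} records where AVAL {c.operator} {c.threshold}\n"
        f"    if record['PARAMCD'] != {c.paramcd!r} or record['AVAL'] is None:\n"
        f"        return None\n"
        f"    return 'Y' if record['AVAL'] {c.operator} {c.threshold!r} else 'N'\n"
    )


def compile_rule(c: Criterion):
    """Execute the rendered source and return the resulting function."""
    namespace = {}
    exec(render_rule(c), namespace)
    return namespace[c.flag.lower()]


# Illustrative criterion and records (made-up values).
criterion = Criterion(paramcd="ALT", operator=">", threshold=120.0, flag="PCSV_ALT_HI")
rule = compile_rule(criterion)
records = [
    {"PARAMCD": "ALT", "AVAL": 150.0},
    {"PARAMCD": "ALT", "AVAL": 40.0},
    {"PARAMCD": "AST", "AVAL": 300.0},
]
flags = [rule(r) for r in records]
```

Validating the operator and flag name before rendering keeps the generated source limited to a small, known set of shapes, which matters when the criteria come from model output rather than hand-written input.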