Yagiz Kaymak, Ph.D.

Senior Principal Data Scientist

Raritan, New Jersey, United States

About

I am a data science and engineering leader with 10+ years of experience transforming complex healthcare data into actionable insights. My work sits at the intersection of machine learning, real-world evidence, and data strategy, with a current focus on improving outcomes in neuroscience through advanced analytics and predictive modeling. In my current role, I lead initiatives that leverage Rx and medical claims, clinical data, and AI-driven insights to better understand provider behavior, treatment pathways, and patient outcomes in neuropsychiatric and neurodegenerative diseases. My team and I develop and deploy scalable predictive models to inform clinical decision-making, optimize patient engagement, and support evidence-based neuroscience strategies.

Experience

  • Johnson & Johnson (Full-time · 4 yrs 4 mos)
    • Senior Principal Data Scientist
      May 2025 - Present · 1 yr 3 mos

    • Manager Data Engineering
      Apr 2022 - May 2025 · 3 yrs 2 mos

    • Manager, Machine Learning and Data Engineering
      Apr 2022 - May 2025 · 3 yrs 2 mos

  • Data Engineering Lead at Rightway
    Oct 2021 - Apr 2022 · 7 mos

  • Manager, Data Engineer at GSK
    May 2019 - Oct 2021 · 2 yrs 6 mos

    - Worked as one of the data engineering leads who designed and built enterprise-level data pipelines to enable data analysis, machine learning, and artificial intelligence applications for a wide variety of business units at GSK (Talend, Azure Data Factory)
    - Developed and deployed Spark (Databricks) jobs for data ingestion and transformation (Java, PySpark, SparkSQL)
    - Designed and developed tailored data solutions for business units at GSK (Azure Functions, Azure EventHub, Azure SQL Data Warehouse, Python, Azure DevOps)
    - Provisioned, integrated, and automated Azure SQL Data Warehouse (Azure CLI, T-SQL, Python, Azure DevOps, Jenkins)
    - Enabled continuous integration and development using DevOps tools (Azure DevOps and Jenkins)

  • Data Engineering Fellow at Insight Data Science
    Jan 2019 - May 2019 · 5 mos

    - Developed a fault-tolerant Apache Airflow architecture with multiple synchronized schedulers (Python)
    - Designed and implemented a high-volume data pipeline to visualize price trends of stocks traded on the Deutsche Stock Exchange using AWS S3, Apache Spark, PostgreSQL, and Dash by Plotly
    - Tested the robustness of the fault-tolerant Airflow architecture on the implemented data pipeline using real-life scheduler failure scenarios (Python, SQL)

  • New Jersey Institute of Technology (New Jersey, US)
    • Ph.D.
      Jan 2014 - Dec 2018 · 5 yrs

      - Contributed to the development of machine learning algorithms for Internet traffic classification.
      - Contributed to the development of evacuation planning in indoor-fire scenarios using machine learning.
      - Worked on free-space optical communications systems for high-speed trains.
      - Involved in the development of transport protocols and load balancing mechanisms for data center networks.

    • Research Assistant
      Jan 2014 - Dec 2018 · 5 yrs

      - Involved in the development of Internet traffic classification using machine learning and achieved a classification accuracy of 92%.
      - Implemented a per-packet load balancing approach for data center networks and improved the completion time of flows by 20%.
      - Designed a free-space optical communications system for high-speed trains to provide a 1-Gbps data link, in collaboration with China National Railway Locomotive Company (CRRC).
      - Developed an adaptive-divergence beam that provides an average received-power gain of 35 dB over a fixed-divergence beam in a free-space optical communications system for high-speed trains.