Sagar Purkayastha, Ph.D.

Sr. Manager & Lead, DS & Engg, Financial Crimes | Gen AI | LLM | Optimization

Cranbrook, British Columbia, Canada

About

As a Senior Manager/Principal Data Scientist in TD’s Financial Crime division, I draw on 17 years of experience in AI, Machine Learning, and Analytics to deliver solutions that generate multimillion-dollar annual revenue. I lead and develop high-performing teams that deliver end-to-end machine-learning solutions with substantial business value. My organizational skills and strategic vision support project prioritization, time and cost estimation, business case creation, resource allocation, and budgeting. My communication and presentation skills have helped build relationships with executives and stakeholders, ensuring timely and impactful project delivery. For example, under my leadership, the team reduced project delivery times by 75% and doubled the completion rate to 100% in an agile environment, while supporting staff development and growth. My contributions have been recognized with the TD Annual Helix Award, the Manager’s Choice Award at IBM, and the Best Teaching Assistant Award from the University of Calgary. My expertise spans Modeling, Time Series Analysis, Forecasting, Deep Learning, Natural Language Processing (NLP), Generative AI (Gen AI), and Optimization, underpinned by 18 peer-reviewed publications and 7 patents in AI/ML and Mathematical Optimization.
Core Competencies:
Management: People Management, Project Management, Business Case Development, Budgeting, Cost Estimation.
Technical: Mathematical Optimization (Mathematical and Stochastic Programs), Dynamic Programming, Heuristic Optimization (Genetic Algorithm, Ant Colony, Monte Carlo Methods), Portfolio Optimization (Markowitz, Kelly Criterion), Machine Learning (Bayesian and Frequentist Approaches), Deep Learning (RNN, LSTM, CNN, Autoencoders), Dimensionality Reduction and Manifold Learning, Hidden Markov Models, Natural Language Processing (NLP) and Generative AI (Gen AI), Recommendation Systems, Fraud Analytics, Anomaly Detection, State Estimation, Control Systems.
Relevant Analytics Tools:
Programming: Python, SQL, PySpark.
Cloud Platforms: Microsoft Azure, AWS.
MLOps and Deployment: MLflow, Feature Store, Docker.
Tools & Frameworks: GitHub, Databricks, Excel, PowerPoint, JIRA, Confluence.
Soft Skills: Leadership, Communication, Agile Work, Coaching, Mentoring.

Experience

  • TD (Remote)
    • Sr Manager, DS & Engg Financial Crimes
      Oct 2024 - Present · 1 yr 9 mos

      - Leadership and Management: Building a team of 15+ data scientists and engineers to create an industry-leading Financial Crime Analytics program.
      - Strategic Planning: Developing the near- and long-term data, infrastructure, and AI/ML use case roadmap and strategy for Financial Crime.
      - End-to-End AI/ML Development: Leading teams to develop end-to-end data science solutions for Fraud, AML, and Insider Threat detection.
      - Stakeholder Management: Working with 50+ stakeholders across business lines and levels of the organization to manage delivery scope and expectations.

    • Senior Manager, Data Science & Engg/Advanced Analytics, AIML Practice, TD Wealth
      Dec 2021 - Oct 2024 · 2 yrs 11 mos

      Managed and led a high-performing team of 14 data scientists and engineers to ideate, formulate, and deliver high-value end-to-end data science solutions (in NLP, Gen AI, Marketing Analytics, Deep Learning, and Optimization), while creating new opportunities, business cases, budgets, cost estimates, and weekly executive updates.
      - Leadership and Management: Led a team of 14 data scientists and engineers delivering high-impact analytics solutions.
      - Revenue and Cost Efficiency: Drove $25 MM+ in revenue and cost savings annually by delivering over 20 high-value use cases. Achieved $3 MM in monthly savings by leading the migration of use cases from on-premises servers to Azure cloud, ensuring timely project completions.
      - Innovative Solutions: Directed the development of advanced ML solutions, including prediction of customer attrition, active traders, and transfer outs ($11 MM), transformer-based topic modeling and sentiment detection ($4 MM) (patent pending), Gen AI-based chatbot creation ($8 MM), clustering-based client segmentation ($1 MM), and optimization-based pricing and resource allocation ($2 MM).
      - Strategic Planning: Crafted business cases and cost estimates for new AI/ML initiatives, driving projects from ideation through to production in Generative AI, NLP, Propensity Modelling, Causal Inference, Anomaly and Fraud Detection, and Mathematical Optimization.
      - Process Optimization: Implemented streamlined delivery pipelines, improving efficiency and speed in project execution. Managed multiple stakeholders across TD, ensuring high-quality outputs under strict budgetary constraints.
      - Quality Assurance: Tracked intake and project progress using JIRA and Confluence, and oversaw the team’s unit and integration testing of the end-to-end pipeline.
      - Team Development: Established a versatile team structure to address various business needs, fostering talent development across diverse AI and ML domains.

    • Lead Data Scientist
      Jul 2021 - Dec 2021 · 6 mos

      Led end-to-end (ETL, EDA, Modeling, and Deployment) Data Science projects (e.g., classification, clustering, forecasting, and regression) using structured and unstructured data, and conducted A/B testing, fairness, and bias assessments across all machine learning initiatives.
      - Revenue Generation: Delivered $1 MM in revenue by leading the development of TD’s inaugural content recommender using NLP and optimization techniques (patent pending and TD Wealth Helix Award winner). Generated another $1 MM through the creation of TD Wealth’s first internal and external search engine (patent pending, nominated for the Enterprise-wide TD Helix Award).
      - Innovative Collaborations: Led a team to develop a groundbreaking model for detecting advisor misconduct, generating $2 MM in revenue (patent pending). Contributed to a $14 MM revenue boost by developing a product recommendation system for TD’s Advisor-led business. Also conducted profiling analysis to optimize campaign design.
      - Quality Assurance: Led the team in unit and integration testing of the end-to-end pipeline. Created data science and coding best-practice guides, as well as model validation and peer review guidelines.

  • IBM (Calgary, Alberta, Canada)
    • Lead Data Scientist, Watson Advanced Analytics
      Mar 2020 - Jul 2021 · 1 yr 5 mos

      Responsible for leading end-to-end Data Science projects (ETL, EDA, Modeling, and Deployment) for clients across diverse sectors such as oil and gas, mining, construction, and government. Regularly created and presented insights and solutions to client executives.
      - Revenue Achievement: Drove $3 MM in revenue by developing a workforce optimization model for a multinational corporation. Prototyped various NLP algorithms and deep learning models, including sentiment analysis, topic modeling, and forecasting.
      - Optimization Expertise: Engineered innovative solutions in portfolio and heuristic optimization, utilizing Markowitz theory, Ant Colony optimization, and Genetic Algorithms. Developed dockerized model APIs and deployed dashboards using Python and Heroku.
      - Quality Assurance and Mentorship: Led the unit and integration testing of the pipelines prior to deployment. Coached and mentored junior data scientists, enhancing their expertise in machine learning and optimization.

    • Senior Data Scientist, Watson Advanced Analytics
      Jun 2018 - Feb 2020 · 1 yr 9 mos

      Responsible for the development of solutions and algorithms, while providing support to the Lead Data Scientist.
      - Innovative Solutions Development: Spearheaded the creation of a pioneering deep learning solution for subsurface mineralization, using Convolutional Neural Networks (CNN) with Python and PyTorch. This solution delivered $5 MM in revenue for a multimillion-dollar mining client.
      - Reservoir Characterization: Led the development of IBM’s first machine learning-based reservoir characterization system. Employed Dynamic Time Warping and advanced clustering techniques on single and multi-component seismic data using Python, generating $1 MM in revenue for a leading oil and gas producer.
      - Reservoir Reserves Estimation: Engineered an industry-first, end-to-end reservoir reserves estimation solution based on unsupervised similarity recognition and modified lumped diffusive transport ODEs with Python and PySpark. This approach produced $1 MM in revenue for a multimillion-dollar oil and gas firm and resulted in an issued patent.
      - Mentorship and Team Support: Provided guidance and mentorship to junior data scientists, enhancing their proficiency in machine learning, deep learning, and optimization. Supported the Lead Data Scientist in developing and refining complex algorithms and solutions.

  • PhD Student at University of Calgary
    Sep 2013 - Aug 2018 · 5 yrs

    - Enhanced Oil Production: Engineered a groundbreaking multivariable Model Predictive Control (MPC) design for Steam Assisted Gravity Drainage (SAGD) that achieved a 171% increase in oil production and a 35.8% improvement in cumulative Steam-to-Oil Ratio in simulation. This was accomplished through the development of multiple novel MPC architectures in MATLAB, with findings published in peer-reviewed papers.
    - Innovative Scheduling Schemes: Devised and implemented novel scheduling strategies based on the Kelly Criterion and D-RTO methodologies to optimize electricity distribution from SAGD waste heat and a gas turbine cogeneration unit. Utilized MATLAB and GAMS for the development, with research results published in academic papers.
    - Advanced Forecasting Techniques: Applied sophisticated Time-Series analysis, Wavelet analysis, Ledoit-Wolf estimators, and Hidden Markov Models in MATLAB for system identification and parameter forecasting, enhancing predictive accuracy and operational efficiency.
    - Optimization Excellence: Developed innovative Mixed-Integer Nonlinear Programming (MINLP) routines for integrating real-time design and scheduling of a bitumen upgrading facility for SAGD operations. Utilized MATLAB and GAMS to achieve optimized integration, with results documented in published research papers.
    - Improved Thermal Conductivity Estimation: Achieved a 70.3% improvement in estimating effective thermal conductivity within three-phase porous media using an electrical circuit theory-based approach.
    - Predictive Modeling: Created a Hidden Markov Model to accurately forecast steam chamber growth based on the relative mobilities of oil and steam, significantly advancing the understanding of SAGD dynamics.
    - Mentorship: Provided guidance and support to junior researchers, fostering their development in advanced modeling, optimization, and simulation techniques.

  • Research Assistant at Rice University
    Jun 2009 - May 2013 · 4 yrs

    - Logistic Regression Analysis: Applied logistic regression techniques to identify and validate effective strategies for human motor learning. This analysis provided valuable insights into optimizing motor learning processes and improving training outcomes.
    - Advanced Motion Capture Experiments: Conducted comprehensive motion capture experiments utilizing a QuaRC six-camera system. Additionally, evaluated the feasibility of using low-cost gaming controllers (Nintendo Wiimote and Sony PlayStation Sixaxis) as alternative motion capture devices, broadening the accessibility and affordability of motion tracking technology.
    - Haptic and Tactile Display Development: Engineered and tested real-time haptic and tactile feedback systems for smart prostheses using MATLAB, Simulink, and the Quanser Data Acquisition system. This development enhanced sensory feedback and control, significantly improving user experience and functionality.
    - Particle Filter Algorithm Implementation: Developed and implemented a particle filter-based algorithm in Python for open-loop control of multi-robot systems. This algorithm advanced the precision and coordination of robotic operations in complex environments.
    - Comprehensive Controller Development: Designed, simulated, and implemented various control systems for a multivariable magnetic levitation system, including Proportional-Derivative (PD), Proportional-Integral-Derivative (PID), Linear Quadratic Regulator (LQR), and Full-State-Feedback controllers. Both gravity-compensated and non-compensated systems were developed, showcasing versatility in control strategies and system stability.
    - Mentorship: Provided guidance and support to junior researchers, enhancing their skills in advanced modeling, control systems, and experimental techniques.

  • R&D Engineering Intern (Intelligent Production Systems) at Baker Hughes
    May 2012 - Aug 2012 · 4 mos

    - Scale Removal: Conducted an in-depth feasibility study to explore real-time control mechanisms for effective scale removal from downhole Intelligent Well System valves. This research focused on enhancing the operational efficiency and longevity of well systems by addressing scale-related challenges.
    - Innovative Testing Protocols: Researched and developed a comprehensive protocol for vibration testing of equipment used in Intelligent Production Systems. This protocol aimed to ensure the reliability and performance of critical equipment under dynamic conditions.
    - Advanced Sensing Techniques: Investigated the application of optical fibers in Distributed Temperature Sensing (DTS) and Distributed Acoustic Sensing (DAS). Proposed innovative applications for these technologies in flow control, scale characterization, and the monitoring of temperature, acoustics, and vibrations. This research contributed to the development of more precise and versatile sensing solutions for industrial environments.