Chee Yee Lim, PhD

AI / Data Science Tech Lead

Singapore, Singapore

About

I am a hands-on technical leader dedicated to transforming complex data into actionable business intelligence for global organizations. Utilising deep expertise in AI, software development, and cloud solutioning, I actively build alongside high-performing teams to bridge the gap between innovation and execution. I am passionate about cutting-edge tech, but strictly focused on alignment with business goals, so that every technical solution drives measurable commercial value.

Experience

  • Data Science Manager at Johnson & Johnson
    Jul 2023 - Present · 3 yrs

    • Served as a hands-on technical lead for a team of 4 data scientists, balancing team leadership with active development to deliver agentic AI workflows and ML-based predictions across 5 production and 3 POC projects for the global finance team.
    • Architected production-grade agentic AI workflows to automate financial processes - including balance sheet reconciliation - utilising LangGraph and Deep Agents for the agent harness, backed by OpenAI GPT for large language model (LLM) inference and MLflow for LLM monitoring.
    • Engineered scalable data ETL pipelines on Databricks, leveraging Delta Lake for data versioning and Unity Catalog for data lineage tracking to ensure rigorous data quality management and faster processing.
    • Scaled a global demand-forecasting solution utilising Apache Airflow, Polars and Nixtla, interfacing with SAP and Anaplan to create forecasts for 2 million country-SKU combinations, driving the adoption of automated baseline forecasts across 4 key markets owing to a 9% accuracy improvement over existing forecasts.
    • Championed data science initiatives to business audiences by translating complex technical jargon into strategic business concepts, securing stakeholder buy-in.

  • DHL Consulting (Singapore)
    • Data Science Manager
      Jun 2022 - Jun 2023 · 1 yr 1 mo

      • Led a team of 3 data scientists to deliver 8 end-to-end data science solutions (5 successfully deployed to production) spanning customer service analytics, customer retention and fraud detection.
      • Secured SGD 570,000 in revenue by driving business development workshops and cross-divisional alignment with regional and country heads, translating client business gaps into 5 funded data science engagements.
      • Architected and deployed a customer service analytics product that ingested multi-channel communication (chats, emails, calls) to surface trending topics and KPIs, significantly reducing response times and staffing costs while improving service quality for our client.
      • Developed high-accuracy NLP engines - including custom intention classification and sentiment analysis models - by leveraging a manually curated dataset of 20,000 labelled customer texts to drive automated, actionable customer insights.
      • Generated SGD 860,000 in annual savings across delivered portfolios while maintaining an exceptional client satisfaction rating, achieving an average Net Promoter Score (NPS) of 68.8.

    • Senior Data Scientist
      Sep 2020 - Jun 2022 · 1 yr 10 mos

      (See above)

  • Data Scientist at PatSnap
    Jul 2019 - Sep 2020 · 1 yr 3 mos

    • Developed a random forest-based model for patent value prediction by integrating novel NLP-based metrics extracted from patent texts with traditional patent indicators.
    • Deployed the random forest-based model into production in 2 forms: (1) as a dockerised Flask API for generating real-time predictions on new data, and (2) as a PySpark pipeline for generating batch predictions on historical data.
    • Worked closely with a team of 3 product managers and 2 engineers to ensure the product was developed on time while achieving business goals and fitting into the existing IT infrastructure.
    • The patent value product replaced a third-party patent value data provider, saving the company $100,000 per year in subscription fees.

  • Data Scientist at Schroders
    Sep 2017 - May 2019 · 1 yr 9 mos

    • Led the development of the human capital data product to provide summary insights into the board director relationships and career histories for 20,000+ public companies globally.
    • Engineered the backend ETL using distributed processing in Spark and the frontend as an R Shiny dashboard.
    • The final product was perceived as 'a distinct value-add and massive time saver' by the heads of 3 investment research teams, who asked their analysts to use the product as part of their process.
    • Liaised with 9 data vendors and verified the quality of alternative data by checking the vendors' data collection and processing methodology, as well as comparing the data with known information.