Sheng-Lien Lee

Data Scientist

Taiwan

About

Data Scientist with a business background and a track record of solving diverse problems across public safety, healthcare, and finance. Adept at building scalable data pipelines, training machine learning models, and delivering actionable insights. Quick to adapt to new domains and to collaborate with stakeholders to turn data into impact.

Experience

  • Data Scientist at Compal
    Jan 2026 - Present · 6 mos

  • Data Scientist Consultant at SkyEyez AI
    Oct 2024 - May 2025 · 8 mos

    Built and delivered a functional product prototype within two months, presenting it to funders and securing critical ongoing support and funding approval.
    • Enabled rapid data preparation for bounding box labeling by building a semi-automated ETL pipeline in Python, reducing data preprocessing time from 50+ hours to under 30 seconds (a roughly 6,000-fold speedup).
    • Improved the accuracy of two YOLOv8 object detection models for the AI surveillance system to capture potential crime characteristics, achieving 92.4% precision and 90.1% recall, by redefining detected classes to address model weaknesses while ensuring alignment with stakeholder requirements.

  • Data Scientist (Master's Capstone) at Virufy
    Sep 2023 - Dec 2023 · 4 mos

    Optimized the COVID-19 classification model.
    • Implemented a COVID-19 cough classification model on the Google Cloud Platform by converting audio data into spectrograms and training a Convolutional Neural Network (CNN) on them.
    • Increased model accuracy from 63% to 90% by identifying key features through exploratory data analysis, feature selection, and medical research insights, using Python libraries (Matplotlib and Seaborn).

  • National Chi Nan University (Part-time · 1 yr 6 mos)
    • Quantitative Researcher
      Aug 2020 - Jan 2022 · 1 yr 6 mos

      Led a team of three researchers in analyzing the impact of public sentiment on the stock price.
      • Saved $35,000 annually on a Twitter API premium subscription and collected over 1,000,000 tweets by building a daily automated data-collection pipeline using Python and the standard Twitter API.
      • Boosted sentiment analysis accuracy from 53% to 91% by designing and implementing preprocessing steps for tweets using natural language processing (NLP) and regular expressions, and by validating the improvements through an A/B experiment.
      • Identified 2 key trading factors significantly influenced by public sentiment through a Python-based time series regression analysis and presented the research at the 2021 INFORMS Annual Meeting.

    • Teaching Assistant
      Sep 2020 - Jan 2021 · 5 mos

      Information Security Management Course
      • Taught a hands-on lesson for 47 students on implementing an E-mail Security Solution (Actalis SSL certificates) and a self-signed SSL certificate for Apache on Windows and Ubuntu.
      • Designed teaching materials and tutored students, drawing on personal experience of learning the subject.

  • Geospatial Data Analyst at Central Police University
    Aug 2020 - Mar 2021 · 8 mos

    Developed an emergency response platform for coordinating earthquake relief and delivering routine medical services.
    • Integrated medical and geographical data from 143 hospitals and 205 rescue units into a MySQL database by building an automated ETL pipeline using Selenium, Beautiful Soup, and the Google Maps API.
    • Developed a web platform with interactive data dashboards for stakeholders to track historical emergency response events and geographical data to optimize rescue strategies and routes, using Tableau, SQL, PHP, and HTML.
    • Collaborated with stakeholders to develop strategies for emergency resource allocation and road safety enhancements, identifying high-accident locations on street maps through analysis of 770,000 spatial data points on the platform.