Huynh (Bruce) Nguyen

Senior Data Engineer

Singapore, Singapore

About

Building large-scale (TBs to PBs), flexible, and secure Data Lakehouses with modern data stack tools such as Git, Hadoop/S3, Kubernetes, Hive Metastore, Snowflake/Redshift, Iceberg, Glue/EMR/Spark, Trino/Athena, Confluent Kafka, dbt, Airflow, and more. Applying CI/CD with Git and container technologies to make data engineering work reproducible and monitorable, as in software engineering. Designing and building highly scalable, highly available multi-tenant search and reporting services that use big data and the right distributed systems to serve highly concurrent queries over large datasets.

Experience

  • Senior Data Engineer at Temus
    Oct 2024 - Dec 2025 · 1 yr 3 mos

    - Building data tools and pipelines that support AI-powered applications
    - Migrating data for a complex system

  • Senior Data Engineer at Global Fashion Group
    2022 - 2024 · 2 yrs

    - Building an agnostic Data Platform on Cloud using open-source projects and Kubernetes
    - Migrating old EMR pipelines to the new data platform
    - Building data modeling for complex data problems

  • Mid Data Engineer at GalaxyOne
    2021 - 2022 · 1 yr

    Deployed a Data Platform on Cloud:
    - AWS services such as Glue, Athena, Redshift, and S3
    - GitLab CI/CD
    - SQL + dbt + SQL as code + Airflow + self-service CI/CD
    - Airflow
    - DataHub
    - GitLab
    - Terraform
    - Data Platform with Versioning

  • Data Engineer at Công ty Cổ phần Giao Hàng Tiết Kiệm
    Jun 2019 - Oct 2021 · 2 yrs 5 mos

    - Designed and deployed data models for data pipelines and data services.
    - Designed and deployed a highly available, scalable, low-latency search system over hundreds of millions of records using advanced search technologies: Elasticsearch and Vespa.ai.
    - Built a custom Spark Thrift Server with Kerberos authentication and Ranger authorization that let Spark work as an ETL query engine on Hive/Hadoop data.
    - Designed and deployed complex reporting systems using Lambda Architecture.
    - Designed and deployed Data as a Service (API) using Cassandra, Spring Boot, and more.
    - Maintained the Cloudera Platform.
    - Able to deploy Confluent Platform, including Kafka and Kafka Connect Source/Sink (MySQL, Elasticsearch).

  • Data Engineer at Viettel Telecom
    Mar 2018 - Jun 2019 · 1 yr 4 mos

    - Developed and optimized data models and pipelines for large datasets using SparkSQL.
    - Developed a headless crawler tool for crawling single-page applications.