Seattle, Washington, United States
- Redesigning and implementing a modern test infrastructure for Microsoft's Schannel network security team, replacing a decades-old legacy framework spanning the full Windows TLS/SSL stack including registry, certificate management, and protocol negotiation
- Fixed a critical memory leak in TileDB-SOMA sparse array writes (v1.13+) caused by Apache Arrow table lifetime mismanagement, eliminating OOM failures that occurred even on 128 GiB hosts during multi-GB dataset rewrites in production CZI CELLxGENE Census workflows (~50M cells, 700+ datasets)
  > Reported issue: github.com/single-cell-data/TileDB-SOMA/issues/2928
  > Fixed code: github.com/single-cell-data/TileDB-SOMA/pull/2989
- Built and maintained Python packaging and release pipelines (PyPI and Anaconda) with automated nightly testing using GitHub Actions, Azure Pipelines, CMake, Pytest, and Catch2. Used AWS EC2 and S3 to support development and testing workflows, and built and ran Docker images for local development and CI
- Designed and implemented a domain-specific query language for the Python API using Python’s AST library and a formal BNF grammar
  > github.com/TileDB-Inc/TileDB-Py/blob/main/tiledb/query_condition.py
- Led the design and implementation of a C++ API for querying TileDB arrays and groups, enabling zero-copy result transfer to Python (NumPy, Pandas) via Apache Arrow for TileDB-Py and TileDB-SOMA
- Authored design documents and implemented core Python API features, including aggregations, enumerations, and fragment management
- Selected GitHub contributions:
  > TileDB-SOMA: github.com/single-cell-data/TileDB-SOMA/commits/main/?author=nguyenv
  > TileDB-Py: github.com/TileDB-Inc/TileDB-Py/commits/main/?author=nguyenv
- Served as primary QA tester at ARL:UT, confirming the functionality of databases, models, and Windows for US federal government accreditation. Wrote over a dozen publications detailing how products were tested, whether they met the necessary standards, and how they could be improved
- Diagnosed and resolved a critical floating-point precision bug in scientific calculation code used by US federal government agencies and labs: float32 computations produced different results on 32-bit and 64-bit systems because the x87 FPU keeps 80-bit extended-precision intermediates while SSE computes in strict 32-bit precision. Analyzed GCC compiler behavior and the IEEE 754 floating-point standard to ensure deterministic, reproducible results across architectures
- Bound scientific methods written in C/C++ to MATLAB (via MEX) and Java (via JNI) interfaces for use by researchers and analysts
- Selected projects:
  > Implemented an open addressing hash table in C, similar to Python’s dictionary. This C library was exposed as a FORTRAN module using ISO_C_BINDING
  > Wrote Python scripts to translate binary Git bundles into ASCII and vice versa (the lab needed these scripts because its work with classified data prevented downloading binary files)
Led discussion sessions for 60+ students, reviewing fundamental computer science concepts and basic Python programming.
- Worked with two other interns to develop a statewide Incident Response plan, based on NIST publications, covering the prevention and handling of cyberattacks
- Drafted a software penetration testing and Metasploit manual for internal use within the TxDPS Cybersecurity Team