Thuy Nguyen

Principal Scientist at e-therapeutics

United Kingdom

Experience

  • Principal Scientist at Tangram Therapeutics
    Jul 2024 - Apr 2026 · 1 yr 10 mos

    Built and benchmarked LLM workflows leveraging transformer-based representation learning (SapBERT) to harmonise biomedical ontologies (12M+ entities), enabling consistent data integration across heterogeneous sources. Designed a quantitative framework for scoring disease prevalence across pharmaceutical markets, prioritising indications for early-stage drug discovery. Implemented features to support validation of target–indication pairs, mining structured genomic and disease databases to identify causal genes for target prioritisation. Collaborated with engineering and business analysis teams to develop retrieval-based, agentic LLM workflows for extracting structured insights from scientific literature. Developed few-shot prompting approaches for text-to-SQL queries of biomedical databases. Evaluated integration of knowledge graphs with retrieval-augmented generation (GraphRAG) to support conversational AI in drug development. Built and deployed an OpenSearch-based API for lexical retrieval over ontology terms, owning curation, indexing, and implementation end to end.

  • Bioinformatician at Genomics England
    Jan 2022 - Jul 2024 · 2 yrs 7 mos

    First author of “Equity in cancer genomics in the UK: Ancestry analysis for a national cancer cohort”, published in The Lancet Oncology (2025). Analysed effects of patient ancestry on clinical genomic diagnostics in the 100,000 Genomes Project Cancer Programme. Engineered pipelines to aggregate genetic data from Oxford Nanopore sequencing. Designed and executed benchmarking of techniques to detect, correct, and merge noisy genetic variant data.

  • Senior Bioinformatician at Wellcome Sanger Institute
    2016 - 2022 · 6 yrs

    Designed new bioinformatic methods for detecting copy number variation in selective whole-genome-amplified samples. Developed features for a sample tracking system in Django using agile software development methodologies. Analysed effects of various wet-lab protocols and bioinformatic approaches on malaria sample genotypes. Developed a rapid lineage-typing bioinformatic pipeline to identify SARS-CoV-2 variants of concern, informing reports sent to the UK Health Security Agency within hours of sequencing. Investigated the association between vaccination and infection with different SARS-CoV-2 lineages.

  • Graduate Student at BC Centre for Excellence in HIV/AIDS
    Sep 2013 - Apr 2016 · 2 yrs 8 mos

    Analysed viral evolution of resistance to immune response and treatment using next-generation sequencing. Designed primers for hepatitis infection diagnostics. Employed Random Forests to predict sources of phylogenetic reconstruction errors. Developed MPI-based Python software to overcome missing data in phylogenetic reconstructions from shotgun sequencing.

  • Bioinformaticist at Rieseberg Lab, University of British Columbia
    Jan 2012 - Sep 2013 · 1 yr 9 mos

    Developed pipelines to assemble the 3.6 Gbp sunflower genome, integrating physical and genetic maps. Coordinated bioinformatic resources for a ~30-person evolutionary genetics lab. Authored a successful grant proposal for $20K worth of computational resources from Compute Canada.