San Diego, California, United States
I obtained my PhD in genomics and computational biology from the University of Pennsylvania and wrote my thesis on novel methods to identify infectious agents in idiopathic diseases using statistical modeling and next-generation sequencing. I continued this work as a postdoctoral researcher studying the epidemiology of hospital-acquired infections before joining Janssen R&D as a data scientist. My current work focuses on leveraging real-world data and statistical modeling to help design and enhance clinical trials, especially in the rare disease and infectious disease spaces. In addition to the interests above, I've developed multiple widely used open-source bioinformatics tools and data visualization programs in Python, R, Rust, Go, and Java, and I contribute regularly to the wider open-source bioinformatics community on GitHub (https://github.com/eclarke). In my free time, I enjoy hiking, backpacking, reading, and photography (some of which is on https://instagram.com/eclarke.photo).
I design and build statistical approaches and computational models to identify risk factors, design synthetic control groups, and characterize the overall epidemiology of various diseases. I also work to better integrate these methods with broader regulatory and clinical strategies. This work helps refine clinical trials, increasing program efficiency and accelerating the development of novel vaccines and therapeutics.
Investigated the genetic basis of antibiotic-resistant Pseudomonas aeruginosa infections in long-term health care facilities; explored Bayesian modeling of antibiotic resistance acquisition during hospitalization; developed methods for antibiotic resistance detection and comparative genomic assembly across longitudinal cohorts using long-read sequencing technology.
Led the search for an etiologic agent in sarcoidosis using primary clinical samples and unbiased metagenomic sequencing. Developed statistical methods for low-biomass metagenomic sequencing; developed a high-throughput metagenomic analysis pipeline for supercomputer clusters; developed software for host-pathogen genomics. Led the search for microbial triggers in macaque idiopathic chronic enterocolitis. Conducted longitudinal analysis of the immune repertoire and oral/nasal/gut microbiome in SCID-X1 gene therapy patients. Analyzed lymphocyte development of induced pluripotent stem cells in RAG-1 mutations.
Developed novel Gene Ontology evaluation metrics for gene set enrichment analysis; maintained the Gene Wiki project and developed the GeneWiki+ site. Also developed research software and pipelines in Python and Java.