Geneva, Geneva, Switzerland
Led development of PyTorch models for large-scale data workflows, from research prototype to maintainable production pipelines. Improved training throughput by addressing GPU utilization bottlenecks (input pipeline tuning, mixed precision, profiler-guided iteration). Built reproducible experiment templates (configs, logging, checkpoints, regression checks) to strengthen peer review and reliability. Performed deep code reviews of research repositories: flagged numerical stability issues, dtype/device mismatches, and evaluation leakage. Wrote concise technical notes explaining model behavior, failure modes, and trade-offs for engineering and research stakeholders.
Designed and trained Transformer-style models for sequence data and CNN baselines for structured signals, supported by rigorous evaluation and ablations. Ran distributed training with PyTorch DDP and packaged jobs in containers for cluster execution. Reduced inference latency in batch pipelines by optimizing tensor shapes and minimizing CPU-GPU synchronization points. Established review checklists covering data integrity, leakage checks, calibration, reproducibility, and performance claims.
Built and maintained PyTorch research codebases emphasizing correctness, readability, and reproducibility. Ran controlled experiments on training dynamics (optimizer choice, warmup schedules, normalization, precision trade-offs). Introduced profiling-driven performance reviews to identify bottlenecks in data loading, augmentation, and GPU memory pressure. Coached junior researchers on debugging patterns (gradient checks, loss decomposition, unit tests for tensor shapes).
Prototyped deep learning components in Python and evaluated model/feature trade-offs under compute constraints. Delivered an engineering report translating experiments into product-relevant recommendations.