Port Washington, New York, United States
I am a deeply passionate engineer and technical leader who gravitates towards challenging problems in scaling systems, datacenter design, and statistical inference. Over the years, I have built numerous distributed scheduling systems to improve utilization of large clusters; in 2015, I published a book on this topic with O'Reilly Media. More recently, I have also begun working on quantitative research, statistical inference, and machine learning, while remaining deeply involved in software infrastructure, from networking to build systems to customizing datacenters.
- Rewrote build system to unblock ML research usage of the Research Super Cluster’s storage system
- Implemented flash storage backend for fast and reliable access to 2000PB (2 exabytes) of AI storage
- Created API for large language models and synthesized research for efficiency optimizations
- Built out AWS environment, implementing NAT/VPN PrivateLink to firm networks, batch computation environment, dynamically allocated live & paper trading environments for testing complete stack in “prod” without risk
- Implemented ML-based sentiment models with a $50mm book using Spark
- Implemented caching, distributed, snapshotting filesystem for all production data for stable research
- Created numerous research tools for signal construction and tradeability/factor analysis based on Pandas, Dask, and Spark
- Improved existing technical models by engineering new features and improving training techniques
- Migrated PM team to Bazel, integrating 3 disparate build systems into a unified polyglot build environment
- First senior engineer hire at the Boulder site; mentored other engineers and established office culture
- Integrated 7 systems created by teams across 3 other sites to define and launch new attribution product
- Designed and implemented serverless realtime data pipeline on AWS handling 25k transactions per second