Joshua Brot

Staff Software Engineer at Voyage AI (MongoDB)

San Francisco, California, United States

Experience

  • Staff Software Engineer at MongoDB
    Dec 2025 - Present · 7 mos

    Working with the Voyage AI team to scale their production inference stack to handle third-party traffic and the large, global traffic stream of MongoDB auto-embedding.

  • SambaNova (5 yrs 10 mos)
    • Software Architect
      Nov 2024 - Dec 2025 · 1 yr 2 mos

      Scaled SambaCloud (https://cloud.sambanova.ai/) over its first year.
      - Led the implementation of a queuing system in Rust to meet customer SLAs and achieve high hardware utilization.
      - Rewrote the billing system to support thousands of monthly customers, integrating AWS and Stripe.
      - Ensured system reliability as SambaCloud scaled to billions of tokens each day served by thousands of accelerator chips across three data centers around the world.

      Led SambaCloud’s repackaging into several product lines meeting a diverse set of customer needs.
      - Streamlined the SambaCloud administration story to facilitate operation by external users.
      - Enabled a hosted deployment with customer administration but SambaNova-managed hardware.
      - Enabled an on-prem deployment with customer administration and customer-owned hardware.
      - Enabled traffic-sharing between customers and SambaNova to increase reliability and profitability.

    • Senior Principal Software Engineer
      Nov 2023 - Oct 2024 · 1 yr

      Architected SambaCloud (https://cloud.sambanova.ai/).
      - Designed the cloud architecture and worked with a high-performance team to launch it in just 2 months.
      - Implemented the services responsible for handling API requests and routing them to hardware (using Go, Python, and Redis Lua scripts).
      - Wrote the Helm chart that orchestrated the components across the production cluster.

      Designed runtime APIs to enable new product lines.
      - Built consensus with key stakeholders on the new API’s requirements.
      - Led a team to implement the new design quickly and efficiently.
      - Coordinated with several teams across time zones to successfully release the desired product on schedule.

      Created cross-cutting performance improvements in the inference stack.
      - Directly optimized the C++ runtime code to reduce host overhead from 100 ms per invocation to 0.5 ms.
      - Worked with many teams to bring about a new inference flow to amortize host overhead across multiple tokens.
      - Drove the technical implementation of continuous batching across 5 teams, vastly increasing serving efficiency.

    • Principal Software Engineer
      Nov 2022 - Oct 2023 · 1 yr

      Designed a new JIT integration for PyTorch to run on SambaNova’s hardware.
      - Proposed the project (inspired by torch_xla) and created a prototype.
      - Integrated with the PyTorch dispatcher to create a seamless user experience.
      - Led a team to create ATen to SambaNova MLIR lowerings.
      - Developed the first dynamic memory management capabilities for SambaNova hardware.

  • Software Engineer Intern at Productive Edge
    May 2019 - Aug 2019 · 4 mos

    Created an augmented reality remote-assist application for the HoloLens that interfaced with Internet of Things devices over Bluetooth and used machine learning for predictive analytics.