Toronto, Ontario, Canada
- Ultra-low-latency AI inference
- MLIR compilers
- Compilers for spatial architectures
- Heterogeneous computing
- FPGA compute acceleration
Technical lead on performance optimizations for FPGA high-level compiler
- Optimization pathfinding spanning compiler, memory system, runtime, platforms
- Backend overhaul and optimizations for new FPGA family (Stratix 10)
Optimized workload research
- Multi-FPGA custom-precision AI training accelerator
- State-of-the-art GEMM, convolution, stencil, QRD, FFT, GZIP, JPEG, etc. for FPGAs
Next-generation FPGA architecture research
Virtualization labs
- Low-level software stack and abstractions to enable FPGA accelerators in a virtualized environment
Compiler development
- LLVM compiler optimizations for FPGA
- Scheduling and resource allocation for spatial architectures
- Compiler optimizations driven by customer designs
- Proof-of-concept compiler product derivative (HLS)