GPU acceleration engineer

Groupe EOLEN

Paris

Description

GPU Acceleration Engineer - Calculation Engine

🎯 Main Mission

Massively accelerate the sparse calculation engine of a UK B2B SaaS company specializing in Enterprise Planning & Analytics by porting critical algorithms from Rust/C++ to GPU (CUDA). Turn calculations that are currently infeasible (requiring thousands of years of CPU time) into operations that complete in minutes.

📊 Context

The client, a UK B2B SaaS company specializing in Enterprise Planning & Analytics, manages planning models reaching 64 quadrillion cells with billions of time periods. The Hyperblock/Polaris engine is currently limited by:

  • Legacy CPU architecture (Java/Rust/C++)
  • Memory constraints on massive sparse structures
  • Prohibitive calculation times on complex scenarios

Objective: Achieve performance gains of 100x to 1000x via GPU offloading.

🔧 Main Responsibilities

GPU Offloading

  • Port existing Rust/C++ algorithms to CUDA/GPU
  • Identify and extract critical calculation paths to accelerate
  • Optimize sparse matrix operations for GPU architecture
  • Develop performant Rust ↔ CUDA wrappers
  • Benchmark and validate performance gains

Memory Optimization

  • Design GPU memory management strategies for massive datasets
  • Implement efficient patterns for sparse structures
  • Optimize CPU ↔ GPU memory transfers
  • Manage GPU memory limitations on large-scale calculations

Technical Collaboration

  • Work with the engineering team on integration
  • Document GPU porting patterns
  • Participate in code reviews and design reviews
  • Train the team on GPU best practices

💻 Technical Stack

Languages (in order of importance)

  • CUDA - Primary GPU development
  • Rust - Source language for algorithms to port
  • C++ - Legacy components and CUDA interoperability
  • (Java - platform context only; no development required)

Key Technologies

  • NVIDIA CUDA (toolkit, libraries: cuBLAS, cuSPARSE)
  • Rust (ownership model, unsafe blocks, FFI)
  • GPU Programming (kernels, memory hierarchy, optimization)
  • Sparse Matrix Operations (compression, storage formats)
  • Profiling Tools (nvprof, Nsight, perf)

✅ Required Profile

Essential Skills

GPU & CUDA (Essential)

  • ✅ Significant CUDA programming experience (3+ years)
  • ✅ Mastery of GPU kernel optimization
  • ✅ Deep knowledge of NVIDIA GPU architecture (memory hierarchy, warps, occupancy)
  • ✅ Experience with sparse calculations on GPU (cuSPARSE or equivalent)

Rust (Essential)

  • ✅ Production Rust development
  • ✅ Mastery of the ownership and borrowing system
  • ✅ Experience with unsafe Rust and FFI (Foreign Function Interface)
  • ✅ Ability to analyze and refactor existing Rust code

C++ (Required)

  • ✅ Modern C++ (C++11/14/17)
  • ✅ C++ ↔ CUDA integration
  • ✅ Templates and metaprogramming (a plus)

Algorithms (Required)

  • ✅ Data structures for scientific computing
  • ✅ Sparse matrix algorithms (CSR, COO, etc.)
  • ✅ Performance optimization and profiling
  • ✅ Parallelization and concurrency concepts

Highly Valued Experience

  • 🎯 Documented CPU → GPU porting projects
  • 🎯 HPC experience (supercomputers, GPU clusters)
  • 🎯 Memory optimization for large-scale datasets
  • 🎯 Scientific computing or numerical simulation
  • 🎯 Rust interop with other languages (C/C++/Python)

📍 Working Arrangements

Location & Travel

  • 100% remote (France/Europe base preferred)
  • Occasional travel to London (frequency: ~1 week/month for team sprints)
  • Travel also covers project kickoff, key reviews, and intensive collaboration sessions

Start date: As soon as possible