Johannes Kuhlmann

HPC, AI and QC enthusiast

Munich, Bavaria, Germany

About

I am a dedicated, hardworking, proactive and detail-oriented Researcher and Scientific Software Engineer with exceptionally strong problem-solving skills and a strong background in modern software design methods and code optimization for High Performance Computing systems. My background is in complex multi-physics, data analysis, heterogeneous computing, parallelization, software performance, and optimizing compilers. I worked on compile-time performance improvements of AMD's LLVM/MLIR-based AI compiler and on the full stack of Quantum Computing software, mentored team members, and managed DevOps infrastructure. Currently, I work on improving software, system and infrastructure for our neuromorphic computing hardware, and on system design for a new inference accelerator. I combine broad experience across the full system in AI acceleration with deep experience in performance optimization and classical HPC. I seek roles in HPC and AI where software and system performance is vital.

I am currently working on a neuromorphic computing system targeting high-efficiency AI. I improve the architecture of the full software stack, infrastructure and system design. My work covers, for example, networking, resource scheduling, chip (including NoC) and board optimization, the setup of a local cluster and CI/CD system, and the definition of the compute model, the runtime and the compiler. Most interesting to me, however, is my work on hardware-software co-design for our new LLM sparsification pipelines and custom inference accelerator designs.

I worked at AMD on spatial and temporal tiling in the AIE compiler. I significantly improved the compile-time performance through optimization and mathematical abstraction: I parallelized the Design Space Exploration, generalized constraints, and implemented Interval Arithmetic.

I worked at Quantum Brilliance on the full stack of Quantum Computing software, from applications through middleware down to hardware integration, including parallelization and performance optimizations. I guided team members on software architecture and design and on best practices.

I did my PhD studies in turbulent combustion and flame dynamics. There I gained five years of experience in solver, model and tool development for CFD and in the application of LES to turbulent reacting flows on HPC systems. I worked on parallelization and performance improvements, and on data analysis methods such as data fusion and machine learning.

Experience

  • Senior Software Engineer & Team Lead at SpiNNcloud
    Mar 2025 - Present · 1 yr 5 mos

    I work on the full software stack and especially on the runtime engine. I solve the hard problems, align the parts and consult across all teams on software architecture and design, software engineering best practices and modern C++. I am the team lead and main engineer of the infrastructure team, which set up from scratch:
    - Local compute cluster housing classical and neuromorphic hardware
    - CI/CD system via GitLab pipelines, Docker, CMake, gtest, pytest and ctest
    - Unification of the software stack into a modular mono-repository
    - MLOps via gitlab-mlops
    - Remote access via Dockerized JupyterHub

  • Senior Software Engineer - AI Compiler at AMD
    Sep 2024 - Feb 2025 · 6 mos

    I worked on an LLVM/MLIR-based AI compiler in C++ to improve compile-time performance:
    - Parallelized Design Space Exploration (DSE), allowing linear scaling of the tensor tiling calculation
    - Generalized constraints in DSE to reduce the size of the design space and improve runtime
    - Designed and implemented DSE based on Interval Arithmetic to further increase performance
    - Started and actively engaged in a developer forum to foster inter-team best practices and improvements
    - Architectural improvements for better error handling and debuggability

  • Software Engineer (HPC Specialist) at Quantum Brilliance
    Aug 2022 - Aug 2024 · 2 yrs 1 mo

    I worked on the quantum SDK (Qristal) in C++:
    - Maintenance, extension and integration of quantum middleware, i.e. XACC and CUDA Quantum
    - Design and implementation of applications such as join order optimization and transaction scheduling for RDBMS, quantum chemistry or quantum benchmarking with methods like QML or VQE
    - Parallelization of libraries and applications
    - CI/CD with GitLab and Docker
    - Modernization of the CMake-based build system
    - Porting from x86 to Arm and PowerPC
    - Setup of IBM Power and AWS cloud services and integration into pipelines

  • Research Assistant at Thermo-Fluid Dynamics Group at Technical University of Munich
    Apr 2018 - Jul 2022 · 4 yrs 4 mos

    Focus: predictive LES, turbulent combustion, system identification, data fusion and machine learning
    Thesis: "Modelling and Identification of Technically Premix Flame Dynamics"
    - Development of turbulent combustion models, LES solvers, adaptive mesh refinement for combustion, and online data analysis tools; inclusion, patching and extension of external libraries
    - Workflows for predictive LES, including CAD and mesh generation, and parallelization on HPC systems
    - Automation of pre- and postprocessing, data generation and analysis, yielding a workflow with minimal user input
    - Conception and operation of IT infrastructure for HPC and machine learning (hardware and software)
    - Group lead for general numerics & OpenFOAM and HPC

  • Research Assistant at Institute for Thermo-Fluid Dynamics at Ruhr-Universität Bochum
    Jun 2017 - Mar 2018 · 10 mos

    - Scale-resolving simulation of internal flow structures and primary breakup in Diesel injectors
    - Development of numerical methods to predict cavitation and its induced erosion
    - Teaching in turbulence theory and fluid dynamics