Principal Backend Systems Engineer

Larkin Lane Films, LLC

Budapest

Job Description

General Info

We are developing a breakthrough imaging and 3D technology platform currently under provisional patent. The Principal Backend Systems Engineer owns the production server software that delivers this platform to end users — the high-throughput API backend, the ML inference gateway that serves our models, and the internal control-plane services that hold the whole system together. Our technology is developed and trained on our on-premises GPU cluster; this role is responsible for the software that takes those trained models and our product experiences and serves them reliably to hundreds of thousands of concurrent users.

This is, first and foremost, a software engineering role. The person in this position writes the production C++ code on the hot paths where throughput, tail latency, and memory behaviour are first-class product attributes; writes the Python services where development velocity, clarity, and integration breadth matter most; and designs the concurrency, I/O, and state-management strategies that let live user connections survive instance failure, rolling deployment, and capacity rebalance without the end user noticing. The role exists because shipping a hyperspectral imaging and ML platform at scale requires server software written deliberately, not assembled from configuration. Cloud infrastructure, observability, and deployment are part of the job — a serious backend systems engineer is fluent in those — but they are the context in which the software runs, not the primary focus. This is a senior, high-ownership role that is deliberately hard to fill.

Tasks

Server Software & Performance Engineering

  • Design, write, and own the production server software that runs the Company's platform: the API backend that serves the website, mobile, and e-commerce traffic; the ML inference gateway that fronts our GPU-backed model serving; and the internal control-plane services that coordinate them.
  • Write the performance-critical code paths in modern C++ — the request hot paths where latency budgets are measured in milliseconds, the inference dispatch and batching logic where GPU efficiency depends on careful queueing, and the data paths where allocation behaviour and memory layout determine whether the system meets its throughput targets.
  • Write the Python services that surround and complement the C++ core — control-plane logic, integration glue, request orchestration, and the parts of the stack where development velocity and clarity matter more than raw performance.
  • Design and implement graceful connection and session migration so that live user connections survive instance failure, rolling deployment, and capacity rebalance — handed off to healthy instances without the end user perceiving an interruption.
  • Maximise throughput and minimise tail latency at the hot paths through deliberate, justified choices in concurrency model, non-blocking I/O strategy, memory and allocation behaviour, serialization format, and GPU batching. Treat response time as a product attribute, not an operations metric.
  • Profile, diagnose, and optimise performance end-to-end — from CPU and GPU utilisation through network and storage paths — and fix root causes in the server code rather than working around them in infrastructure.

ML Inference Serving

  • Own the inference gateway as a software engineering problem: request batching, GPU memory management, model versioning, request-level latency budgets, graceful degradation under overload, and the dispatch logic that turns a stream of incoming requests into efficient GPU utilisation.
  • Design the training-to-serving handoff in partnership with the ML team: model packaging and artifact format, staged rollouts and version pinning, A/B testing of model variants, and the operational contract between the on-premises training cluster and the cloud serving tier.
  • Build the inference-time observability that makes model performance debuggable in production — request-level traces, per-model latency and accuracy telemetry, and the tooling to investigate inference regressions quickly.
  • Optimise inference throughput and cost per request through careful batching strategy, model-aware scheduling, hardware-aware memory layout, and judicious use of lower-precision computation where the model and product allow.

Platform Architecture & Cross-Functional Ownership

  • Define the service boundaries, deployment patterns, data and caching layers, and performance constraints within which the rest of the engineering team builds features. Own the architectural framework for the website, e-commerce, and mobile API backend.
  • Partner closely with the Lead Mobile Software Engineer on the mobile API backend — authentication, data synchronisation, media upload and download, offline-capable request patterns, and the latency and reliability contract that the mobile client depends on.
  • Own integration architecture with third-party systems: e-commerce engine, product-supply and fulfilment systems, payment processors, and communication services.
  • Define and own the data architecture: operational databases, caches, object storage, analytics pipelines, and the movement of data between on-premises and cloud environments.

Production Operations Competence

  • Deploy, operate, and reason about the systems you write. You are responsible for your code in production, including its observability, its failure modes, and its cost. This is not a configuration-first DevOps role, but a senior backend engineer who cannot run their own code in production is not actually senior.
  • Work fluently with cloud infrastructure (AWS or equivalent) — compute, networking, storage, identity, and the operational surface — to deploy your services and the platform around them. Use infrastructure-as-code (Terraform or equivalent) and CI/CD pipelines as engineering tools, not as a separate discipline.
  • Establish the observability practices for your code: meaningful metrics, structured logs, distributed traces, and dashboards that make production behaviour intelligible. Define service-level objectives for the services you own.
  • Participate in the on-call rotation for the systems you build, and lead incident response when something you own breaks in production. Blameless postmortems and follow-through on remediation are part of the role; running the on-call programme as a function is not.

Technical Leadership

  • Set technical direction for backend systems and ML serving across the engineering organisation. Review architecture proposals from across the team, push back constructively, and raise the bar for production quality.
  • Mentor other engineers technically through code review, design review, and pair-programming on hard problems. This is an individual-contributor leadership role on the principal track, not an engineering management role.
  • Represent the backend systems perspective in cross-team architectural reviews and strategic planning, and translate systems constraints and opportunities clearly to non-engineering stakeholders.
  • Maintain architectural documentation, design records, and operational runbooks as living, accurate artifacts — not write-once shelf-ware.

Requirements

Education & Experience

  • Bachelor's degree in Computer Science, Software Engineering, or a related technical field, or equivalent practical experience. A Master's or advanced degree in a relevant technical field is strongly preferred.
  • Ten or more years of backend systems, distributed-systems, or platform-software engineering experience, with a substantial portion of that time at the staff or principal level on the individual-contributor track.
  • Demonstrated track record of having personally written significant portions of at least one high-throughput production server system, as opposed to having operated one that someone else wrote or having only led teams that wrote them. Code samples, open-source contributions, or detailed system descriptions will be expected at interview.
  • Hands-on experience running GPU-based ML inference in production at scale, including the specific software engineering challenges of batching, scheduling, model versioning, and tail-latency control.

Skills & Competencies

  • Expert-level proficiency in modern C++ (C++17 or later) for production server software — concurrency primitives, non-blocking I/O, memory and allocation behaviour, lock-free or wait-free data structures where appropriate, zero-copy data paths, and the discipline to write C++ that is fast and correct under sustained production load.
  • Strong proficiency in Python for service implementation, control-plane logic, and ML integration tooling. The ability to choose deliberately between C++ and Python for a given component, and to make the two interoperate cleanly, is part of the job.
  • Deep understanding of high-throughput, low-tail-latency server design: concurrency models, event-loop and threaded architectures, connection and session state management, backpressure, graceful degradation, and transparent failover across instances. Demonstrated experience with at least one production system handling hundreds of thousands of concurrent users or equivalent request volume.
  • Practical experience with GPU-based ML inference serving: request batching strategies, GPU memory management, model versioning and rollout, and the latency-versus-throughput tradeoffs specific to inference workloads. Familiarity with serving frameworks (Triton, TorchServe, or custom-built) is welcome, but the ability to reason about inference serving from first principles is more important than experience with any particular framework.
  • Working fluency with cloud infrastructure (AWS preferred; GCP or Azure acceptable) sufficient to deploy and operate the services you write. Infrastructure-as-code (Terraform or equivalent), container orchestration (Kubernetes), and CI/CD pipelines are tools you use confidently, not specialities you lead. Deep DevOps or SRE specialisation is not required and is not what this role is asking for.
  • Solid understanding of modern observability practice — metrics, structured logs, distributed traces, and the discipline of designing for debuggability rather than retrofitting it.
  • Additional fluency in Go or Rust is a plus, particularly for systems-software work where the C++/Python split is not the right answer.
  • Ability to handle confidential and pre-patent technical material with discretion, and to follow the Company's IP and data-handling policies rigorously.
  • High degree of curiosity, craftsmanship, and resilience; calm and methodical under production pressure.

Additional Preferred Experience

  • Experience designing and shipping inference serving infrastructure for computer vision, hyperspectral, or other high-data-volume model classes — where bandwidth, batching, and GPU memory layout dominate the engineering.
  • Experience with hybrid architectures spanning on-premises GPU clusters and cloud serving — model export, artifact management, and the training-to-serving handoff.
  • Open-source contributions to backend systems, distributed-systems, or ML serving infrastructure — published code that demonstrates the engineering judgement we are hiring for.
  • Experience with high-volume transactional backends, payment integration, or multi-region resilience engineering.
  • Prior staff- or principal-level technical leadership in a small, high-talent engineering team where individual contributors set technical direction.

Candidates must be eligible to work in Hungary and be fluent in English, both written and spoken.

Compensation

The base salary range for this position is HUF 2,000,000 – HUF 2,900,000 gross per month, paid in twelve monthly instalments. The range reflects a full-time, top-tier principal-level engagement and is calibrated to the Hungarian market for senior backend systems engineering with hands-on production experience writing high-throughput server software and operating GPU-based ML inference at scale.

The offered starting salary within this range will be determined on the basis of objective, gender-neutral criteria, including the candidate's demonstrated depth of production C++ on high-throughput server systems, breadth of hands-on experience with GPU-backed ML inference, track record of architectural ownership across cross-functional platform boundaries, and prior technical leadership scope on the individual-contributor track. Advanced degrees, specialised domain knowledge, and meaningful contributions in adjacent areas (for example, computer vision or hyperspectral inference serving, high-volume e-commerce backends, or open-source systems software) are factors that may justify positioning toward the upper end of the range.

In addition to base salary, the total compensation package includes a discretionary performance bonus and statutory and customary benefits as applicable under Hungarian law. Details of the bonus component will be discussed with shortlisted candidates.

This salary range is disclosed in accordance with the pay transparency obligations arising from Directive (EU) 2023/970 and its transposition into Hungarian law. Candidates will not be asked about their prior or current compensation at any stage of the recruitment process.