Staff AI Engineer (LLM Systems)

Edge

Islamabad

Description

"Please note that this role is not for Edge"

Founding AI Engineer (LLM Systems)

We’re a funded, early-stage startup building AI systems for a large, regulated, paperwork-heavy industry. Our platform replaces manual workflows involving forms, PDFs, portals, and human decision-making with AI-powered chat, voice, risk assessment, and automation systems.

We’re a small team shipping fast with real users in production, and engineers here have significant ownership and product impact.

The Role

You’ll own the LLM-powered parts of the platform end-to-end, including:

  • Chat and voice intake systems
  • Agent orchestration
  • Risk and similarity pipelines
  • RAG and structured outputs
  • Voice integrations
  • Evaluation infrastructure
  • AI observability, latency, and cost optimization

This is a production engineering role, not just prompt engineering. Reliability, safety, latency, and system design matter.

What You’ll Build

  • Multi-step intake workflows across chat and voice
  • Low-latency voice agents with function calling and barge-in support
  • Background risk scoring and similarity retrieval using pgvector
  • Automation handoffs to browser automation services
  • Evaluation systems with regression testing and CI gates
  • Prompt caching, embedding caching, and model routing for cost and performance optimization
  • Structured agent outputs with human takeover flows when required

Stack

  • Express 5 + TypeScript
  • Postgres + pgvector
  • Next.js 16 + Vercel
  • FastAPI + Playwright
  • Retell, Vapi, ElevenLabs
  • Claude models (primary) + OpenAI
  • GitHub Actions with SOC 2 controls

Requirements

  • 3–6 years of production software engineering experience
  • 1.5+ years building production LLM systems used by real users
  • Strong TypeScript or Python skills
  • Able to become comfortable working in both languages within a short period
  • Experience with Claude or OpenAI APIs
  • Hands-on work with:
      • Tool calling
      • Streaming
      • Structured outputs
      • Prompt caching
      • Batch processing
      • RAG systems
      • Evaluation frameworks
  • Strong understanding of retrieval evaluation and chunking strategies
  • Strong distributed systems fundamentals:
      • Queues
      • Retries
      • Idempotency
      • Timeouts
      • Observability
  • Ability to independently design, build, and operate systems

Nice to Have

  • Voice AI experience (Retell, Vapi, LiveKit, Deepgram, ElevenLabs, etc.)
  • pgvector or vector database experience at scale
  • Playwright or browser automation experience
  • Experience in regulated industries such as fintech, insurance, healthcare, or legal
  • Prompt injection and AI safety hardening experience
  • Open-source AI contributions, agent frameworks, or evaluation tooling

Not a Fit If

  • You treat LLMs as black boxes
  • You’ve only built demos or hackathon projects
  • You need heavily defined tickets to start work
  • You prefer research-only environments without shipping production systems
  • You avoid speaking directly with users

What You’ll Get

  • Strong ownership as an early technical hire
  • Fast shipping environment with real production impact
  • Direct access to users and customer feedback
  • Opportunity to work across voice, agents, retrieval, evaluations, and AI infrastructure

Hiring Process

  • Intro call
  • Technical deep dive
  • Paid take-home assignment
  • Live pairing session
  • Founder conversation
  • References and offer