"Please note that this role is not for Edge"
Founding AI Engineer (LLM Systems)
We’re a funded early-stage startup building AI systems for a large, regulated, paperwork-heavy industry. Our platform replaces manual workflows involving forms, PDFs, portals, and human decision-making with AI-powered chat, voice, risk assessment, and automation systems.
We’re a small team shipping fast with real users in production, and engineers here have significant ownership and product impact.
The Role
You’ll own the LLM-powered parts of the platform end-to-end, including:
- Chat and voice intake systems
- Agent orchestration
- Risk and similarity pipelines
- RAG and structured outputs
- Voice integrations
- Evaluation infrastructure
- AI observability, latency, and cost optimization
This is a production engineering role — not just prompt engineering. Reliability, safety, latency, and system design matter.
What You’ll Build
- Multi-step intake workflows across chat and voice
- Low-latency voice agents with function calling and barge-in support
- Background risk scoring and similarity retrieval using pgvector
- Automation handoffs to browser automation services
- Evaluation systems with regression testing and CI gates
- Prompt caching, embedding caching, and model routing for cost and performance optimization
- Structured agent outputs with human takeover flows when required
Stack
- Express 5 + TypeScript
- Postgres + pgvector
- Next.js 16 + Vercel
- FastAPI + Playwright
- Retell, Vapi, ElevenLabs
- Claude models (primary) + OpenAI
- GitHub Actions with SOC 2 controls
Requirements
- 3–6 years of production software engineering experience
- 1.5+ years building production LLM systems used by real users
- Strong TypeScript or Python skills
- Able to become productive in both languages within a short period
- Experience with Claude or OpenAI APIs
- Hands-on work with:
  - Tool calling
  - Streaming
  - Structured outputs
  - Prompt caching
  - Batch processing
  - RAG systems
  - Evaluation frameworks
- Strong understanding of retrieval evaluation and chunking strategies
- Strong distributed systems fundamentals:
  - Queues
  - Retries
  - Idempotency
  - Timeouts
  - Observability
- Ability to independently design, build, and operate systems
Nice to Have
- Voice AI experience (Retell, Vapi, LiveKit, Deepgram, ElevenLabs, etc.)
- pgvector or vector database experience at scale
- Playwright or browser automation experience
- Experience in regulated industries such as fintech, insurance, healthcare, or legal
- Prompt injection and AI safety hardening experience
- Open-source AI contributions, agent frameworks, or evaluation tooling
Not a Fit If
- You treat LLMs as black boxes
- You’ve only built demos or hackathon projects
- You need heavily defined tickets to start work
- You prefer research-only environments without shipping production systems
- You avoid speaking directly with users
What You’ll Get
- Strong ownership as an early technical hire
- Fast shipping environment with real production impact
- Direct access to users and customer feedback
- Opportunity to work across voice, agents, retrieval, evaluations, and AI infrastructure
Hiring Process
- Intro call
- Technical deep dive
- Paid take-home assignment
- Live pairing session
- Founder conversation
- References and offer