Post by Matt Lowerre
AI Safety and Security
We recently helped a fintech client that had already run automated red teaming on its new LLM “smart analyst,” but whose CISO still wasn’t sleeping well. We built a human‑in‑the‑loop red-teaming program around the client’s highest‑risk use cases. Automation generated large batches of adversarial prompts, and human experts reviewed the most “interesting” failures, labeled them, and decided which guardrails, policies, or access controls to update. Within a few months, jailbreaks in critical workflows declined, new issues were mostly niche edge cases, and the security team finally had something it could show to risk and compliance: an ongoing, human‑steered red-teaming function rather than a one‑off test.