Post by Sandwich Lab

The smarter the model, the more dangerous a hallucinated field name becomes.

That sounds backwards. The logic holds anyway.

As frontier models cross new capability thresholds, teams adjust their defaults. "The model will catch it." "It writes production-grade code now." "We don't need to verify every detail."

For most of what models output, that holds. Tighter loops. Cleaner abstractions. Better-named functions.

Field hallucination is the exception. Capability gains make it harder to spot, not easier.

A recent example from our own pipeline: a dashboard generated end-to-end by AI. Code reviewed clean. Charts rendered. Numbers looked sensible. One week in, a stakeholder flagged that conversion_funnel_rate had been trending at zero since launch.

The field didn't exist. The warehouse called it cvr_click. The AI invented a plausible name. The query silently returned null. Every chart "worked." Every decision based on those charts was wrong.

A more capable model doesn't fix this. Capability and verification operate on different axes.

Three things we keep non-negotiable, regardless of which model writes the code:

- Schema as a contract, not a suggestion. Every field reference traces to a version-controlled YAML definition. If it's not declared, the build fails.
- A gated pipeline between "code compiles" and "data verified." Four gates: schema match, type alignment, query execution against staging, data sanity range check. None skippable.
- A clear boundary between what AI decides and what it looks up. Layout, chart selection, interaction design: model territory. Field names, aggregations, dataset IDs: looked up, never invented.

Capability gains compound. So do unverified decisions made on bad data. The pipeline matters more, not less, as models get stronger.

#AIEngineering #DataInfrastructure #DataQuality #FoundationModels #EnterpriseAI #lanbow
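
For readers who want something concrete, here is a minimal sketch of the four gates named above, run in order with no way to skip one. It is an illustration under stated assumptions, not our production pipeline: `DECLARED_SCHEMA`, `GateFailure`, `verify_generated_query`, and the `run_query` callback are hypothetical names, the schema is an inline dict standing in for the version-controlled YAML definition, and the `session_id` field and the 0 to 1 range for `cvr_click` are made up for the example.

```python
from typing import Callable

# Hypothetical stand-in for the version-controlled YAML schema definition.
# Field names other than cvr_click, and all ranges, are illustrative only.
DECLARED_SCHEMA = {
    "cvr_click": {"type": "float", "min": 0.0, "max": 1.0},
    "session_id": {"type": "str"},
}


class GateFailure(Exception):
    """Raised when any gate rejects the generated query."""


def gate_schema_match(fields, schema):
    # Gate 1: every referenced field must be declared, or the build fails.
    missing = sorted(set(fields) - set(schema))
    if missing:
        raise GateFailure(f"undeclared fields: {missing}")


def gate_type_alignment(expected_types, schema):
    # Gate 2: the type the generated code assumes must match the declaration.
    for name, expected in expected_types.items():
        declared = schema[name]["type"]
        if declared != expected:
            raise GateFailure(f"{name}: code expects {expected}, schema says {declared}")


def gate_staging_execution(fields, run_query):
    # Gate 3: the query must actually run against staging and return rows.
    rows = run_query(fields)
    if not rows:
        raise GateFailure("staging query returned no rows")
    return rows


def gate_sanity_range(rows, schema):
    # Gate 4: reject nulls and values outside the declared range.
    for row in rows:
        for name, value in row.items():
            if value is None:
                raise GateFailure(f"{name}: null value in staging result")
            lo = schema[name].get("min")
            hi = schema[name].get("max")
            if (lo is not None and value < lo) or (hi is not None and value > hi):
                raise GateFailure(f"{name}: {value} outside [{lo}, {hi}]")


def verify_generated_query(
    expected_types: dict,
    run_query: Callable[[list], list],
    schema: dict = DECLARED_SCHEMA,
) -> list:
    """Run all four gates in order; the first failure stops the pipeline."""
    fields = list(expected_types)
    gate_schema_match(fields, schema)
    gate_type_alignment(expected_types, schema)
    rows = gate_staging_execution(fields, run_query)
    gate_sanity_range(rows, schema)
    return rows
```

In this sketch, a reference to conversion_funnel_rate is rejected at the first gate because it is not declared, before any chart could render. A reference to cvr_click continues through type alignment, staging execution, and the range check.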