#labor-market
4 comments
Half my team's tickets now route through an LLM that drafts replies, but the guardrail that saved us was a hard rule: any response touching refunds or account closure waits for a human click. Caught a hallucinated refund policy in week two that would have promised customers money we never offered.
We added a HITL gate to our internal deploy agent after it nearly pushed a schema migration unsupervised, and now anything touching prod data or auth pauses until a human approves or rejects it. Took one engineer about three weeks to wire up the approval queue and audit log, and it has caught 4 bad actions in the two months since.
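For anyone wondering what the bare shape of a gate like that looks like, here is a minimal in-memory sketch, not the setup described above. The tag names (`prod-data`, `auth`), the `Action` and `ApprovalGate` classes, and the log fields are all hypothetical, made up for illustration; a real one would presumably persist the queue and log somewhere durable.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Dict, List, Optional, Set

# Hypothetical labels for actions that must wait for a human.
SENSITIVE_TAGS = {"prod-data", "auth"}


@dataclass
class Action:
    name: str
    tags: Set[str]


@dataclass
class ApprovalGate:
    pending: Dict[int, Action] = field(default_factory=dict)
    audit_log: List[dict] = field(default_factory=list)
    next_ticket: int = 1

    def _log(self, event: str, action: Action, actor: str) -> None:
        # Every decision point is recorded, including auto-runs.
        self.audit_log.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "event": event,
            "action": action.name,
            "actor": actor,
        })

    def submit(self, action: Action) -> Optional[int]:
        """Queue sensitive actions and return a ticket id; return None if it may run now."""
        if action.tags & SENSITIVE_TAGS:
            ticket = self.next_ticket
            self.next_ticket += 1
            self.pending[ticket] = action
            self._log("queued", action, "agent")
            return ticket
        self._log("auto-run", action, "agent")
        return None

    def decide(self, ticket: int, reviewer: str, approve: bool) -> bool:
        """Record the human decision and remove the action from the queue."""
        action = self.pending.pop(ticket)
        self._log("approved" if approve else "rejected", action, reviewer)
        return approve
```

The agent would call `submit()` before every action and only proceed when it gets `None` back or a later `decide()` returns `True`.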
Most "human in the loop" demos stop at a y/n prompt, but the failure mode we hit on a 4-person team was approval fatigue. We wired our agent's shell-exec step through a Slack approve button, and after ~40 pings a day people just started reflex-tapping yes. We fixed it by only escalating writes outside a sandbox dir, which cut prompts to maybe 6 a day. The interesting design question is not whether to ask the human, but how rarely you can ask and still keep them paying attention.
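A rough sketch of the "only escalate writes outside a sandbox dir" rule, in case it helps. This is an assumption about how such a check could be written, not the code from the comment above: the sandbox path and the `needs_approval` name are hypothetical, and it needs Python 3.9+ for `Path.is_relative_to`.

```python
from pathlib import Path

# Hypothetical sandbox location; the agent may write here without asking.
SANDBOX = Path("/tmp/agent-sandbox")


def needs_approval(target: str, sandbox: Path = SANDBOX) -> bool:
    """Return True when a write to `target` should be escalated to a human."""
    root = sandbox.resolve()
    # Relative targets are interpreted inside the sandbox; absolute ones are
    # kept as-is. resolve() collapses ".." so traversal cannot sneak out.
    resolved = (root / target).resolve()
    return not resolved.is_relative_to(root)
```

Resolving before comparing is the part that matters: a plain string prefix check would wave through something like `../etc/passwd`.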
auth on agent actions matters more than the human gate everyone bolts on last