#labor-market

Build a Basic AI Agent from Scratch: Human in the Loop and Security (ruxu.dev)

4 comments

Half my team's tickets now route through an LLM that drafts replies, but the guardrail that saved us was a hard rule: any response touching refunds or account closure waits for a human click. Caught a hallucinated refund policy in week two that would have promised customers money we never offered.

0chinedu_eze·4d

We added a HITL gate to our internal deploy agent after it nearly pushed a schema migration unsupervised, and now anything touching prod data or auth pauses for a human approve/reject. Took one engineer about three weeks to wire up the approval queue and audit log, and it has caught 4 bad actions in the two months since.
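For anyone wondering what a gate like this looks like in code, here is a minimal sketch, not the setup described above. It assumes actions carry tags and that anything tagged as prod data or auth is queued for a human decision, with every step written to an audit log. The names `Action`, `ApprovalGate`, `GATED_TAGS`, and the tag strings are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Hypothetical tags: the comment only says prod data and auth are gated.
GATED_TAGS = {"prod_data", "auth"}


@dataclass
class Action:
    name: str
    tags: set[str]


@dataclass
class ApprovalGate:
    pending: list[Action] = field(default_factory=list)
    audit_log: list[dict] = field(default_factory=list)

    def _record(self, action: Action, event: str, actor: str) -> None:
        # Append-only log of who did what and when.
        self.audit_log.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "action": action.name,
            "event": event,
            "actor": actor,
        })

    def submit(self, action: Action) -> bool:
        """Return True if the action may run now, False if it waits for review."""
        if action.tags & GATED_TAGS:
            self.pending.append(action)
            self._record(action, "queued", "agent")
            return False
        self._record(action, "auto_allowed", "agent")
        return True

    def decide(self, action: Action, approved: bool, reviewer: str) -> bool:
        """Record a human approve/reject decision and return it."""
        self.pending.remove(action)
        self._record(action, "approved" if approved else "rejected", reviewer)
        return approved
```

A caller would run the action only when `submit` returns True, or later when `decide` returns True. A real queue would also need persistence and timeouts, which this sketch leaves out.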

0thomas_weber·3d

Most "human in the loop" demos stop at a y/n prompt, but the failure mode we hit on a 4-person team was approval fatigue. We wired our agent's shell-exec step through a Slack approve button, and after ~40 pings a day people just started reflex-tapping yes. We fixed it by only escalating writes outside a sandbox dir, which cut prompts to maybe 6 a day. The interesting design question is not whether to ask the human, but how rarely you can ask and still keep them paying attention.
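The "only escalate writes outside a sandbox dir" rule can be sketched roughly like this. It is an illustration under assumptions, not the code from the comment above: the function name `needs_approval` and the example paths are hypothetical, and relative paths are assumed to be relative to the sandbox root. It requires Python 3.9+ for `Path.is_relative_to`.

```python
from pathlib import Path


def needs_approval(target: str, sandbox: str) -> bool:
    """Return True when a write to `target` should be escalated to a human."""
    sandbox_root = Path(sandbox).resolve()
    # Joining keeps absolute targets as-is and anchors relative ones in the
    # sandbox; resolve() collapses ".." and symlinks so traversal is caught.
    resolved = (sandbox_root / target).resolve()
    return not resolved.is_relative_to(sandbox_root)
```

Resolving before comparing is the important part: a plain string-prefix check would let something like `sandbox/../secrets` through without a prompt.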

0yara_najjar·3d

auth on agent actions matters more than the human gate everyone bolts on last