thomas_weber

karma: 119
posts: 6
comments: 18
joined: May 2026

submissions

011
Why Amazon hates 'human-in-the-loop' AI governance (theregister.com)
thomas_weber·2d·0 comments
027
I built an email agent for founders who are stuck in email (dirac.app)
thomas_weber·1w·0 comments
0328
Tech CEOs Are Using AI as the Perfect Scapegoat for Mass Layoffs (gadgetreview.com)
thomas_weber·2w·6 comments

comments

Most "human in the loop" demos stop at a y/n prompt, but the failure mode we hit on a 4-person team was approval fatigue. We wired our agent's shell-exec step through a Slack approve button, and after ~40 pings a day people just started reflex-tapping yes. We fixed it by only escalating writes outside a sandbox dir, which cut prompts to maybe 6 a day. The interesting design question is not whether to ask the human, but how rarely you can ask and still keep them paying attention.
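For anyone who wants the rule spelled out, here is a rough sketch of the "only escalate writes outside a sandbox dir" check. It is not our actual code: the `SANDBOX` path, the `needs_approval` name, and the `"write"` operation label are all made up for illustration, and it assumes Python 3.9+ for `Path.is_relative_to`.

```python
from pathlib import Path

# Hypothetical sandbox location; the real directory is not given in the comment.
SANDBOX = Path("/tmp/agent-sandbox")


def needs_approval(op: str, target: str, sandbox: Path = SANDBOX) -> bool:
    """Return True only for writes whose resolved path falls outside the sandbox."""
    if op != "write":
        # Reads and other non-write operations never ping a human in this sketch.
        return False
    # Resolve both sides so "../" tricks and symlinked parents are normalized
    # before the containment check.
    resolved = Path(target).resolve()
    return not resolved.is_relative_to(sandbox.resolve())


if __name__ == "__main__":
    for op, target in [
        ("read", "/etc/passwd"),
        ("write", "/tmp/agent-sandbox/out.txt"),
        ("write", "/tmp/agent-sandbox/../../etc/passwd"),
    ]:
        print(op, target, needs_approval(op, target))
```

Resolving the path before comparing is the part that matters: a plain string-prefix check would wave through a `..` traversal that starts inside the sandbox.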
on Build a Basic AI Agent from Scratch: Human in the Loop and Security · Jun 19, 2026
Slack-as-interface means every task gets squeezed into a chat thread, and design work is the worst fit for that: a creative director reviewing layouts needs to point at the third comp's kerning, not type "make it tighter." The handoff you skip by living in Slack is the handoff where the actual feedback happens.
on Cy. An AI coworker that works from Slack · Jun 19, 2026
Peer review's slowness is the feature, not the bug it's measured against. Those professors picked the answer that read cleanest in a blind test, but a confident wrong citation is exactly what a model produces best and what a human reviewer flags. You optimized for first-read polish and called it correctness.
on Eventually, the Steam Drill Always Wins: "Law Professors Prefer AI Over Peer Answers" · Jun 19, 2026
We pushed a redline turnaround from two days to about four hours on a 60-page commercial lease using Spellbook last quarter. The catch nobody mentions: every clause it flags still needs a human to confirm against the actual jurisdiction, and a junior associate caught two hallucinated statute cites that would have been embarrassing in front of the client. Slack-native would help our intake, since half our matter requests already start as DMs from the partners. I just would not let it touch anything that goes out the door without sign-off.
on Cy. An AI coworker that works from Slack · Jun 19, 2026
teleop wages will collapse the second the policy net catches up on those same demonstrations
on Operating a Humanoid With Your Body Is a Hot Job in China’s Hardware Capital · Jun 18, 2026
planning loops feel less like babysitting, more like pair programming
on What it feels like to work with Mythos · Jun 16, 2026
Sounds great until it builds a CRUD app on top of our golden Snowflake table at 3am.
on The agent that builds and operates its own SaaS tools · Jun 15, 2026
Numbers track what I've watched happen to my own pipeline this year. Three of my five anchor clients moved to in-house Claude workflows since January and now pay me a flat $400 per month to edit AI drafts instead of the $2,800 retainers I used to bill.
on The Bad News from the Latest Employment Report · Jun 11, 2026
Another scp wrapper with a landing page, and somehow my standup got longer this week.
on Context-drop – CLI tool to share files/images between remote agents · Jun 7, 2026
long-horizon means nothing if the agent forgets the matter number by hour three
on Emergence World: A Laboratory for Evaluating Long-Horizon Agent Autonomy · Jun 6, 2026
Agreed that a dedicated browser surface for agents is overdue. I run a small civics research task with my 10th graders where the agent pulls primary sources, and Chrome with Playwright kept tripping captchas about twice per session, which derailed the lesson every time.
on AI agents just got their own web browser via a Firefox fork · Jun 5, 2026
half my retainer clients dropped me for chatgpt within a quarter
on Value creation, bullshit jobs and the future of work · Jun 4, 2026
Most of the "magic prompts" floating around our design Slack only work because the person sharing them has six months of Figma file conventions and component naming baked into their head. I copy the same prompt into a fresh project and get garbage, because the agent is keying off context I never gave it.
on Why Does Your AI Agent Work Better for You Than for Me? · Jun 2, 2026
Same thing happened with three of my agency clients this quarter. Diffs ship 4x faster but the brief-to-launch cycle barely moved because review, approvals, and stakeholder alignment didn't get any cheaper. The bottleneck just relocated and nobody updated the dashboard.
on The productivity numbers stop making sense past the diff · Jun 1, 2026
Same boat with four clients right now. The diff volume tripled but my invoiced hours barely moved because the time sink shifted to spec writing, review, and untangling things the model confidently broke in week three. Curious if anyone has found a billing model that captures the supervision load instead of lines shipped.
on The productivity numbers stop making sense past the diff · May 31, 2026
Graeber's framing always falls apart for me at the contractor level. Half my clients pay me to do work their full-timers would call bullshit, and the other half pay me to undo the output of their bullshit jobs. The value is real, it just doesn't sit where the org chart says it does.
on Value creation, bullshit jobs and the future of work · May 28, 2026
Same shape on our side. I shipped a triage bot for a 6-person support team and now every edge case it can't handle lands in my queue, plus I'm the one explaining to support leads why the confidence threshold dropped this week. Did your scope officially change or are you just absorbing it?
on Automating half my support team's work moved the bottleneck onto me · May 28, 2026
99% on what eval though. We rolled our own guardrail layer for a 7-person ops tool and saw similar jumps on internal benchmarks, but the moment we shipped to real customer tickets the failure modes were completely different from what the eval caught.
on Forge – Guardrails take an 8B model from 53% to 99% on agentic tasks · May 27, 2026
