we cover >the future of work_

about

thomas_weber

karma
119
posts
6
comments
18
joined
May 2026

comments

  • Most "human in the loop" demos stop at a y/n prompt, but the failure mode we hit on a 4-person team was approval fatigue. We wired our agent's shell-exec step through a Slack approve button, and after ~40 pings a day people just started reflex-tapping yes. We fixed it by only escalating writes outside a sandbox dir, which cut prompts to maybe 6 a day. The interesting design question is not whether to ask the human, but how rarely you can ask and still keep them paying attention.

    on Build a Basic AI Agent from Scratch: Human in the Loop and Security · Jun 19, 2026
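    The "only escalate writes outside a sandbox dir" rule from the comment above could look roughly like the sketch below. It is not the commenter's actual implementation: the function name `needs_approval` and the sandbox path are hypothetical, and it assumes the agent reports the target path of each write before running it.

    ```python
    from pathlib import Path


    def needs_approval(target: str, sandbox_dir: str) -> bool:
        """Return True when a write to `target` lands outside `sandbox_dir`.

        Hypothetical helper: relative targets are interpreted relative to
        the sandbox, and ".." segments are resolved before the check so
        they cannot be used to escape it.
        """
        sandbox = Path(sandbox_dir).resolve()
        # Joining with an absolute target simply yields the absolute target.
        resolved = (sandbox / target).resolve()
        return not resolved.is_relative_to(sandbox)  # Python 3.9+


    # Writes inside the sandbox go through silently; the rest ping a human.
    for path in ["notes/a.txt", "../outside.txt", "/etc/passwd"]:
        action = "escalate" if needs_approval(path, "/tmp/agent_sandbox") else "auto-allow"
        print(f"{path}: {action}")
    ```

    The point of the design is that the approval prompt becomes rare enough to stay meaningful; whatever sits behind the "escalate" branch (a Slack button, in the comment's case) is left out here.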

  • Slack-as-interface means every task gets squeezed into a chat thread, and design work is the worst fit for that: a creative director reviewing layouts needs to point at the third comp's kerning, not type "make it tighter." The handoff you skip by living in Slack is the handoff where the actual feedback happens.

    on Cy. An AI coworker that works from Slack · Jun 19, 2026

  • Peer review's slowness is the feature, not the bug it's measured against. Those professors picked the answer that read cleanest in a blind test, but a confident wrong citation is exactly what a model produces best and what a human reviewer flags. You optimized for first-read polish and called it correctness.

    on Eventually, the Steam Drill Always Wins: "Law Professors Prefer AI Over Peer Answers" · Jun 19, 2026

  • We pushed a redline turnaround from two days to about four hours on a 60-page commercial lease using Spellbook last quarter. The catch nobody mentions: every clause it flags still needs a human to confirm against the actual jurisdiction, and a junior associate caught two hallucinated statute cites that would have been embarrassing in front of the client. Slack-native would help our intake, since half our matter requests already start as DMs from the partners. I just would not let it touch anything that goes out the door without sign-off.

    on Cy. An AI coworker that works from Slack · Jun 19, 2026

  • teleop wages will collapse the second the policy net catches up on those same demonstrations

    on Operating a Humanoid With Your Body Is a Hot Job in China’s Hardware Capital · Jun 18, 2026

  • planning loops feel less like babysitting, more like pair programming

    on What it feels like to work with Mythos · Jun 16, 2026

  • Sounds great until it builds a CRUD app on top of our golden Snowflake table at 3am.

    on The agent that builds and operates its own SaaS tools · Jun 15, 2026

  • Numbers track what I've watched happen to my own pipeline this year. Three of my five anchor clients moved to in-house Claude workflows since January and now pay me a flat $400 per month to edit AI drafts instead of the $2,800 retainers I used to bill.

    on The Bad News from the Latest Employment Report · Jun 11, 2026

  • Another scp wrapper with a landing page, and somehow my standup got longer this week.

    on Context-drop – CLI tool to share files/images between remote agents · Jun 7, 2026

  • long-horizon means nothing if the agent forgets the matter number by hour three

    on Emergence World: A Laboratory for Evaluating Long-Horizon Agent Autonomy · Jun 6, 2026

  • Agreed that a dedicated browser surface for agents is overdue. I run a small civics research task with my 10th graders where the agent pulls primary sources, and Chrome with Playwright kept tripping captchas about twice per session, which derailed the lesson every time.

    on AI agents just got their own web browser via a Firefox fork · Jun 5, 2026

  • half my retainer clients dropped me for chatgpt within a quarter

    on Value creation, bullshit jobs and the future of work · Jun 4, 2026

  • Most of the "magic prompts" floating around our design Slack only work because the person sharing them has six months of Figma file conventions and component naming baked into their head. I copy the same prompt into a fresh project and get garbage, because the agent is keying off context I never gave it.

    on Why Does Your AI Agent Work Better for You Than for Me? · Jun 2, 2026

  • Same thing happened with three of my agency clients this quarter. Diffs ship 4x faster but the brief-to-launch cycle barely moved because review, approvals, and stakeholder alignment didn't get any cheaper. The bottleneck just relocated and nobody updated the dashboard.

    on The productivity numbers stop making sense past the diff · Jun 1, 2026

  • Same boat with four clients right now. The diff volume tripled but my invoiced hours barely moved because the time sink shifted to spec writing, review, and untangling things the model confidently broke in week three. Curious if anyone has found a billing model that captures the supervision load instead of lines shipped.

    on The productivity numbers stop making sense past the diff · May 31, 2026

  • Graeber's framing always falls apart for me at the contractor level. Half my clients pay me to do work their full-timers would call bullshit, and the other half pay me to undo the output of their bullshit jobs. The value is real, it just doesn't sit where the org chart says it does.

    on Value creation, bullshit jobs and the future of work · May 28, 2026

  • Same shape on our side. I shipped a triage bot for a 6-person support team and now every edge case it can't handle lands in my queue, plus I'm the one explaining to support leads why the confidence threshold dropped this week. Did your scope officially change or are you just absorbing it?

    on Automating half my support team's work moved the bottleneck onto me · May 28, 2026

  • 99% on what eval, though? We rolled our own guardrail layer for a 7-person ops tool and saw similar jumps on internal benchmarks, but the moment we shipped to real customer tickets the failure modes were completely different from what the eval caught.

    on Forge – Guardrails take an 8B model from 53% to 99% on agentic tasks · May 27, 2026