intheloop
we cover >the work humans still do_

4 comments

andres_mejia · 2d
The framing helps, but in my experience the compounding only kicks in once someone owns the prompt library and eval harness as a real artifact, not a Notion page that rots. Most of the engineering teams I've worked with treat agent setup like a one-off project and then wonder why month three looks like month one.
CamilaTorres · 1d
The compounding only kicks in once you treat agent outputs like any other artifact in the pipeline: versioned, tested, observable. We started logging every agent run with the same lineage tooling we use for dbt models and suddenly the failure modes became debuggable instead of vibes.
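A minimal sketch of what "logging every agent run" as a versioned, traceable record could look like. This is not the commenter's actual tooling: the record fields, the `record_run` helper, and the JSON-lines format are all assumptions made for illustration.

```python
import hashlib
import json
import time
from dataclasses import asdict, dataclass, field
from typing import List


def sha256_text(text: str) -> str:
    """Content hash, so identical inputs or outputs are recognizable across runs."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


@dataclass
class AgentRunRecord:
    # Every field name here is hypothetical, chosen for illustration.
    run_id: str
    prompt_version: str      # e.g. a git SHA or tag of the prompt library
    input_sha256: str
    output_sha256: str
    upstream_run_ids: List[str] = field(default_factory=list)  # lineage edges
    logged_at: float = 0.0   # Unix timestamp


def record_run(run_id, prompt_version, agent_input, agent_output,
               upstream_run_ids=None, now=None) -> AgentRunRecord:
    """Build one record per agent run; hashes stand in for the full payloads."""
    return AgentRunRecord(
        run_id=run_id,
        prompt_version=prompt_version,
        input_sha256=sha256_text(agent_input),
        output_sha256=sha256_text(agent_output),
        upstream_run_ids=list(upstream_run_ids or []),
        logged_at=time.time() if now is None else now,
    )


def to_jsonl(record: AgentRunRecord) -> str:
    """Serialize one record as a single JSON line for an append-only log."""
    return json.dumps(asdict(record), sort_keys=True)
```

With records like these, a bad output can be traced back to the prompt version and upstream runs that produced it, which is the same question lineage tooling answers for a data model.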
SitiRahman · 20h
Most of these guides skip what actually compounds for us, which is the corpus of redlines and exemplar memos the agent can pull from. Without that, every matter starts from zero and the "agent" is just a faster intern.
MiaJ · 8h
The piece I keep waiting for is one that quantifies the overhead. My team of 12 spends maybe 4 hours a week each curating prompts, rules files, and eval harnesses, and I genuinely can't tell if the compounding has crossed that break-even yet. We log agent PR acceptance rates now, which at least gives us something to argue about in retros.
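A rough sketch of the break-even arithmetic, using the team size and weekly hours from the comment above. The function names are hypothetical, and the hours saved per accepted agent PR is a placeholder each team would have to measure for itself; no real figure is implied.

```python
import math


def weekly_overhead_hours(team_size: int, hours_each: float) -> float:
    """Total curation time per week across the team."""
    return team_size * hours_each


def acceptance_rate(accepted: int, opened: int) -> float:
    """Share of agent PRs that were accepted; 0.0 when none were opened."""
    return accepted / opened if opened else 0.0


def break_even_accepted_prs(overhead_hours: float,
                            hours_saved_per_accepted_pr: float) -> int:
    """Accepted agent PRs per week needed to pay back the curation overhead."""
    if hours_saved_per_accepted_pr <= 0:
        raise ValueError("hours_saved_per_accepted_pr must be positive")
    return math.ceil(overhead_hours / hours_saved_per_accepted_pr)


# 12 people at 4 hours a week each, as stated in the comment.
overhead = weekly_overhead_hours(12, 4)  # 48

# Placeholder assumption: each accepted agent PR saves 1.5 engineer-hours.
needed = break_even_accepted_prs(overhead, 1.5)  # 32
```

Under that placeholder, the team would need about 32 accepted agent PRs a week to cover 48 hours of curation. The logged acceptance rate then tells you how many agent PRs must be opened to get there.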