My AI Adoption Journey
- mjr00 - 6772 seconds ago
> Break down sessions into separate clear, actionable tasks. Don't try to "draw the owl" in one mega session.
This is the key one I think. At one extreme you can tell an agent "write a for loop that iterates over the variable `numbers` and computes the sum" and they'll do this successfully, but the scope is so small there's not much point in using an LLM. On the other extreme you can tell an agent "make me an app that's Facebook for dogs" and it'll make so many assumptions about the architecture, code and product that there's no chance it produces anything useful beyond a cool prototype to show mom and dad.
A lot of successful LLM adoption for code is finding this sweet spot. With overly specific instructions you don't feel any more productive, and with overly broad instructions you end up redoing too much of the work.
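For illustration, the "too small to be worth delegating" end of that spectrum is literally something like the snippet below. This is a minimal sketch of the prompt named in the comment above, not anything from the article; the values in `numbers` are invented.

```python
# The "write a for loop that sums `numbers`" end of the spectrum: an agent
# will get this right every time, but it's faster to just type it yourself.
numbers = [3, 1, 4, 1, 5, 9]  # example values; the comment only names the variable

total = 0
for n in numbers:
    total += n

print(total)  # 23
```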
- libraryofbabel - 975 seconds ago
This is such a lovely, balanced, thoughtful, refreshingly hype-free post to read. 2025 really was the year when things shifted and many first-rate developers (often previously AI skeptics, as Mitchell was) found the tools had actually got good enough that they could incorporate AI agents into their workflows.
It's a shame that AI coding tools have become such a polarizing issue among developers. I understand the reasons, but I wish there had been a smoother path to this future. The early LLMs like GPT-3 could sort of code enough for it to look like there was a lot of potential, and so there was a lot of hype to drum up investment and a lot of promises made that weren't really viable with the tech as it was then. This created a large number of AI skeptics (of whom I was one, for a while) and a whole bunch of cynicism and suspicion and resistance amongst a large swathe of developers. But could it have been different? It seems a lot of transformative new tech is fated to evolve this way. Early aircraft were extremely unreliable and dangerous and not yet worthy of the promises being made about them, but eventually with enough evolution and lessons learned we got the Douglas DC-3, and then in the end the 747.
If you're a developer who still doesn't believe that AI tools are useful, I would recommend you go read Mitchell's post, and give Claude Code a trial run like he did. Try and forget about the annoying hype and the vibe-coding influencers and the noise and just treat it like any new tool you might put through its paces. There are many important conversations about AI to be had, it has plenty of downsides, but a proper discussion begins with close engagement with the tools.
- EastLondonCoder - 5275 seconds ago
This matches my experience, especially "don't draw the owl" and the harness-engineering idea.
The failure mode I kept hitting wasn’t just "it makes mistakes", it was drift: it can stay locally plausible while slowly walking away from the real constraints of the repo. The output still sounds confident, so you don’t notice until you run into reality (tests, runtime behaviour, perf, ops, UX).
What ended up working for me was treating chat as where I shape the plan (tradeoffs, invariants, failure modes) and treating the agent as something that does narrow, reviewable diffs against that plan. The human job stays very boring: run it, verify it, and decide what’s actually acceptable. That separation is what made it click for me.
Once I got that loop stable, it stopped being a toy and started being a lever. I've shipped real features this way across a few projects (a git-like tool for heavy media projects, a ticketing/payment flow with real users, a local-first genealogy tool, and a small CMS/publishing pipeline). The common thread is the same: small diffs, fast verification, and continuously tightening the harness so the agent can't drift unnoticed.
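A rough illustration of what such a verification harness can look like, offered only as a hypothetical sketch and not the commenter's actual setup; the individual checks are placeholders for whatever a given repo really uses:

```python
#!/usr/bin/env python3
"""Hypothetical verification harness, run after each agent-produced diff.

The idea: a change only "sticks" if cheap, fast checks pass; anything that
fails gets rejected or reworked. The commands below are placeholders for
whatever checks a real repo uses.
"""
import subprocess
import sys

CHECKS = [
    ["git", "diff", "--stat", "HEAD"],       # show how big the diff actually is
    ["python", "-m", "pytest", "-q", "-x"],  # fast test run, stop at first failure
    ["python", "-m", "ruff", "check", "."],  # lint for drift from repo conventions
]

def main() -> int:
    for cmd in CHECKS:
        print(f"$ {' '.join(cmd)}")
        result = subprocess.run(cmd)
        if result.returncode != 0:
            print("check failed: reject or rework this diff")
            return result.returncode
    print("all checks passed: diff is ready for human review")
    return 0

if __name__ == "__main__":
    sys.exit(main())
```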
- keyle - 1174 seconds ago
It's amusing how everyone seems to be going through the same journey.
I do run multiple models at once now, on different parts of the code base.
I focus solely on the less boring tasks myself and outsource all of the slam-dunk work, then review. I often use another model to validate the previous model's work while doing so myself.
I still git reset quite often, but I'm finding more ways to avoid getting to that point by knowing the tools better and better.
Autocompleting our brains! What a crazy time.
- cal_dent - 485 seconds ago
Just wanted to say that was a nice and very grounded write-up, and as a result very informative. Thank you. More stuff like this is a breath of fresh air in a landscape that has veered into hyperbole on both the pro- and anti-AI sides.
- sho_hn - 6798 seconds ago
Much more pragmatic and less performative than other posts hitting the frontpage. Good article.
- underdeserver - 3206 seconds ago
> At a bare minimum, the agent must have the ability to: read files, execute programs, and make HTTP requests.
That's one very short step removed from Simon Willison's lethal trifecta.
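One hedged sketch of a mitigation, purely illustrative and not something the article prescribes: put the riskiest leg of that trifecta, outbound HTTP, behind an explicit host allowlist so the agent can't post data to arbitrary endpoints. The hosts and function name below are invented placeholders.

```python
# Hypothetical sketch: gate the agent's HTTP tool behind a host allowlist,
# since "read private files + run code + arbitrary outbound HTTP" is the
# exfiltration path in the lethal-trifecta framing. Hosts and the function
# name are made up for illustration.
from urllib.parse import urlparse
from urllib.request import urlopen

ALLOWED_HOSTS = {"docs.python.org", "pypi.org"}  # placeholder allowlist

def fetch_for_agent(url: str, timeout: float = 10.0) -> str:
    host = urlparse(url).hostname or ""
    if host not in ALLOWED_HOSTS:
        raise PermissionError(f"agent may not reach {host!r}")
    with urlopen(url, timeout=timeout) as resp:  # only allowlisted hosts get here
        return resp.read().decode("utf-8", errors="replace")
```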
- senko - 2726 seconds ago
For those wondering how that looks in practice, here's one of OP's past blog posts describing a coding session to implement a non-trivial feature: https://mitchellh.com/writing/non-trivial-vibing (covered on HN here: https://news.ycombinator.com/item?id=45549434)
- davidw - 2846 seconds ago
This seems like a pretty reasonable approach that charts a course between skepticism and "it's a miracle".
I wonder how much all this costs on a monthly basis?
- pton_xd - 1571 seconds ago
Nice writeup!
For those using Emacs, is there a Magit-like interface for interacting with agents? I'd be keen on experimenting with something like that.
- raphinou - 6527 seconds ago
I recently also reflected on the evolution of my use of AI in programming. Same evolution, different path. If anyone is interested: https://www.asfaload.com/blog/ai_use/
- butler14 - 5550 seconds ago
I'd be interested to know which agents you're using. You mention Claude and GPT in passing, but don't actually say which you're using, or for which tasks.
- 0xbadcafebee - 1992 seconds ago
> I'm not [yet?] running multiple agents, and currently don't really want to
This is the main reason to use AI agents, though: multitasking. If I'm working on some Terraform changes and I fire off an agent loop, I know it's going to take a while for it to produce something working. In the meantime I'm waiting for it to come back and claim it's finished (really I'll have to fix it), so I start another agent on something else. I flip back and forth between the finished runs as they notify me. At the end of the day I have five things finished rather than two.
The "agent" doesn't have to be anything special either. Anything you can run in a VM or container (VS Code with Copilot Chat, any CLI tool, etc.) works, so you can enable YOLO mode.
- mwigdahl - 7334 seconds ago
Good article! I especially liked the approach of replicating manual commits with the agent. I didn't do that when learning, but I suspect I'd have been much better off if I had.
- fix4fun - 6204 seconds ago
Thanks for sharing your experiences :)
You mentioned "harness engineering". How do you approach building "actual programmed tools" (like screenshot scripts) specifically for an LLM's consumption rather than a human's? Are there specific output formats or constraints you’ve found most effective?
- apercu - 2393 seconds ago
I find it interesting that this thread is full of pragmatic posts that seem to honestly reflect the real limits of current gen-AI.
Versus other threads (here on HN, and especially on places like LinkedIn) where it's "I set up a pipeline and some agents and now I type two sentences and amazing technology comes out in 5 minutes that would have taken 3 devs 6 months to do".
- polyrand - 2193 seconds ago
> a period of inefficiency
I think this is something people ignore, and it's significant. The only way to get good at coding with LLMs is actually trying to do it, even if it's inefficient or slower at first. It's just another skill to develop [0].
And it's not really about using all the plugins and features available. In fact, many plugins and features are counter-productive. Just learn how to prompt and steer the LLM better.
[0]: https://ricardoanderegg.com/posts/getting-better-coding-llms...
- jonathanstrange - 2816 seconds ago
There are so many stories about how people use agentic AI, but they rarely post how much they spend. Before I can even consider it, I need to know how much it will cost me per month. I'm currently using one pro subscription and it's already quite expensive for me. What are people doing, burning hundreds of dollars per month? Do they also evaluate how much value they get out of it?
- jeffrallen - 2859 seconds ago
> babysitting my kind of stupid and yet mysteriously productive robot friend
LOL, been there, done that. It is much less frustrating and demoralizing than babysitting your kind of stupid colleague though. (Thankfully, I don't have any of those anymore. But at previous big companies? Oh man, if only their commits were ONLY as bad as a bad AI commit.)
- vonneumannstan - 7434 seconds ago
For the AI skeptics reading this: there is an overwhelming probability that Mitchell is a better developer than you. If he gets value out of these tools, you should think about why you can't.