Launch HN: Terminal Use (YC W26) – Vercel for filesystem-based agents

46 points - 25 comments - 11024 seconds ago

Hello Hacker News! We're Filip, Stavros, and Vivek from Terminal Use (https://www.terminaluse.com/). We built Terminal Use to make it easier to deploy agents that work in a sandboxed environment and need filesystems to do work. This includes coding agents, research agents, document processing agents, and internal tools that read and write files.

Here's a demo: https://www.youtube.com/watch?v=ttMl96l9xPA.

Our biggest pain point with hosting agents was that you'd need to stitch together multiple pieces: packaging your agent, running it in a sandbox, streaming messages back to users, persisting state across turns, and moving files to and from the agent workspace.

We wanted something like Cog from Replicate, but for agents: a simple way to package agent code from a repo and serve it behind a clean API/SDK. We wanted to provide a protocol to communicate with your agent, but not constrain the agent logic or harness itself.

On Terminal Use, you package your agent from a repo with a config.yaml and Dockerfile, then deploy it with our CLI. You define the logic of three endpoints (on_create, on_event, and on_cancel) which track the lifecycle of a task (conversation). The config.yaml contains details about resources, build context, etc.
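A rough sketch of that task lifecycle in plain Python. This is not the Terminal Use SDK: the `EchoAgent` class, the `Task` record, the method signatures, and the in-memory state are all assumptions made up for illustration. Only the three hook names come from the description above.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    """Hypothetical in-memory stand-in for one task (conversation)."""
    task_id: str
    status: str = "created"
    events: list = field(default_factory=list)


class EchoAgent:
    """Illustrative agent exposing the three lifecycle hooks named in the post."""

    def __init__(self):
        self.tasks = {}

    def on_create(self, task_id):
        # Called once when a task starts: set up per-task state.
        self.tasks[task_id] = Task(task_id)
        return self.tasks[task_id]

    def on_event(self, task_id, message):
        # Called for each incoming message (one turn of the conversation).
        task = self.tasks[task_id]
        if task.status == "cancelled":
            raise RuntimeError(f"task {task_id} is cancelled")
        task.status = "running"
        task.events.append(message)
        return f"echo: {message}"

    def on_cancel(self, task_id):
        # Called when the task is cancelled: mark it so later events are rejected.
        task = self.tasks[task_id]
        task.status = "cancelled"
        return task
```

In this sketch a real agent would replace the echo in `on_event` with a call into its own harness; the hooks only mark where task state is created, advanced, and torn down.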

Out of the box, we support Claude Agent SDK and Codex SDK agents. By support, we mean that we have an adapter that converts from the SDK message types to ours. If you'd like to use your own custom harness, you can convert and send messages with our types (Vercel AI SDK v6 compatible). For the frontend, we have a Vercel AI SDK provider that lets you use your agent with Vercel's AI SDK, and a messages module so that you don't have to manage streaming and persistence yourself.

The part we think is most different is storage.

We treat filesystems as first-class primitives, separate from the lifecycle of a task. That means you can persist a workspace across turns, share it between different agents, or upload / download files independent of the sandbox being active. Further, our filesystem SDK provides presigned URLs, so your users can upload and download files directly and you don't need to proxy file transfers through your backend.

Because your agent logic and filesystem storage are decoupled, you can iterate on your agents without worrying about the files in the sandbox: if you ship a bug, you can deploy a fix and auto-migrate all your tasks to the new deployment. If you make a breaking change, you can specify that existing tasks stay on the existing version, and only new tasks use the new version.
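The two rollout behaviours can be pictured as a small routing rule. The function below is only a sketch of that rule, assuming tasks are tracked as a mapping from task id to version; the name `assign_versions` and the `breaking` flag are hypothetical and not part of the platform's API.

```python
def assign_versions(task_versions, new_version, breaking):
    """Return the task-id -> version mapping after a deploy.

    task_versions: dict of existing task id -> version it currently runs on.
    breaking: True if the deploy was marked as a breaking change.
    """
    if breaking:
        # Existing tasks stay pinned to the version they started on;
        # only tasks created after the deploy pick up new_version.
        return dict(task_versions)
    # Non-breaking deploy (for example a bug fix): migrate every task.
    return {task_id: new_version for task_id in task_versions}


existing = {"task-a": "v1", "task-b": "v1"}
after_fix = assign_versions(existing, "v2", breaking=False)
after_breaking = assign_versions(existing, "v2", breaking=True)
```

Because the workspace lives outside the deployment, neither branch has to touch the files a task has already written.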

We're also adding support for multi-filesystem mounts with configurable mount paths and read/write modes, so storage stays durable and reusable while mount layout stays task-specific.

On the deployment side, we've been influenced by modern developer platforms: simple CLI deployments, preview/production environments, git-based environment targeting, logs, and rollback. All the configuration you need to build, deploy & manage resources for your agent is stored in the config.yaml file which makes it easy to build & deploy your agent in CI/CD pipelines.

Finally, we've explicitly designed our platform for your CLI coding agents to help you build, test, & iterate with your agents. With our CLI, your coding agents can send messages to your deployed agents, and download filesystem contents to help you understand your agent's output. A common way we test our agents is that we make markdown files with user scenarios we'd like to test, and then ask Claude Code to impersonate our users and chat with our deployed agent.

What we do not have yet: full parity with general-purpose sandbox providers. For example, preview URLs and lower-level sandbox.exec(...) style APIs are still on the roadmap.

We're excited to hear any thoughts, insights, questions, and concerns in the comments below!

Comments (9)

rodchalski - 3181 seconds ago
The K8s-vs-agent-infra debate here is interesting. K8s gives you process and network isolation. What it doesn't give you: per-task authorization scope.
An agent container has a credential surface defined at deploy time. That surface doesn't change between task 1 ("read this repo") and task 2 ("process this user upload"). If the agent is prompt-injected during task 1, it carries the same permissions into task 2.
The missing primitives aren't infra — they're policy: what is this agent authorized to do with the data it can reach, on a per-task basis? Can it write, or only read? Can it exfil to an external URL, or only to /output? And crucially: is there an append-only record of what it actually did, so you can audit post-incident?
K8s handles the container boundary. The authorization layer above that — task-scoped grants, observable action ledger, revocation mid-task — isn't solved by existing infra abstractions. That gap is real regardless of whether you use K8s, Modal, or something like this.
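To make the idea concrete, here is a minimal sketch of a task-scoped grant checked on every file action, with each decision written to a hash-chained, append-only ledger. Everything in it (`Grant`, `Ledger`, `authorize`, the path-prefix rule) is a hypothetical illustration, not a feature of K8s, Modal, or Terminal Use, and it does not normalise paths, so a real implementation would need to handle `..` and symlinks.

```python
import hashlib
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Grant:
    """Hypothetical per-task authorization scope."""
    task_id: str
    read_paths: tuple
    write_paths: tuple


class Ledger:
    """Append-only record; each entry's hash covers the previous entry's hash."""

    def __init__(self):
        self._entries = []

    def append(self, record):
        prev = self._entries[-1]["hash"] if self._entries else ""
        body = json.dumps(record, sort_keys=True)
        digest = hashlib.sha256((prev + body).encode()).hexdigest()
        self._entries.append({"record": record, "prev": prev, "hash": digest})

    def verify(self):
        # Recompute the chain; any edited or removed entry breaks it.
        prev = ""
        for entry in self._entries:
            body = json.dumps(entry["record"], sort_keys=True)
            digest = hashlib.sha256((prev + body).encode()).hexdigest()
            if entry["prev"] != prev or entry["hash"] != digest:
                return False
            prev = entry["hash"]
        return True


def authorize(grant, ledger, action, path):
    """Allow the action only if the path falls under the task's grant; log either way."""
    if action == "read":
        prefixes = grant.read_paths
    elif action == "write":
        prefixes = grant.write_paths
    else:
        prefixes = ()
    allowed = any(
        path == p or path.startswith(p.rstrip("/") + "/") for p in prefixes
    )
    ledger.append(
        {"task": grant.task_id, "action": action, "path": path, "allowed": allowed}
    )
    return allowed
```

A grant such as `Grant("task-1", read_paths=("/repo",), write_paths=("/output",))` would let task 1 read the repo and write only under /output, and denied attempts still land in the ledger for post-incident review.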
adi4213 - 5314 seconds ago
This is really interesting, congrats on the launch. The use case I’m trying to solve for is building a coding agent platform that reliably sets up our development stack well. Few questions! In my case, I’m trying to build a one-shot coding agent platform that nicely spins up a docker-in-docker Supabase environment, runs a NextJS app, and durably listens to CI and iterates.
1) Can I use this with my ChatGPT pro or Claude max subscription? 2)
CharlesW - 8759 seconds ago
> We built Terminal Use to make it easier to deploy agents that work in a sandboxed environment and need filesystems to do work.
When I read this, I think of Fly.io's sprites.dev. Is that reasonable, or do you consider this product to be in a different space? If the latter, can you ELI5?
messh - 1949 seconds ago
how does it compare to https://shellbox.dev? (and others like exe.dev, sprites.dev, and blaxel.ai)
thesiti92 - 9296 seconds ago
have you guys found any of the existing nfs tools helpful (archil, daytona volumes, ...) or did you have to roll your own? i guess i have the same question for checkpointing/retrying too. it feels like the market of tools is very up in the air right now.
oliver236 - 5105 seconds ago
is this a replacement to langgraph?
verdverm - 11588 seconds ago
Can you explain why everyone thinks we should use new tools to deploy agents instead of our existing infra?
eg. I already run Kubernetes
octoclaw - 6833 seconds ago
[dead]
aplomb1026 - 8748 seconds ago
[dead]