Don't Build an Agent
Learnings from building our own media generation agent at Prism. Agents are the new commodities — host an agent like Hermes and give it tools, skills, and a system prompt.

TL;DR: Agents are the new commodities. Don't build your own agent. Host an agent (like Hermes) and give it tools, skills, and a system prompt. We're launching an API that makes this process easy.
Table of Contents
- Noob mistake
- Making a real agent
- What an agent dev needs
- Managed agents
- Fin
Noob mistake
For prismvideos.com, we shipped a media generation agent built on the Vercel AI SDK. Our agent understood which model to recommend to users, could generate images and videos, and could analyze videos and tell users how to recreate them. It was beautiful.
Days later, Higgsfield launched Supercomputer, an agent with observational memory (memory across sessions), skills, automations, a computer, and a filesystem. It would have taken us weeks to add these features. Higgsfield's agent wasn't built with Vercel AI SDK, Claude Agents SDK, or OpenAI Agents SDK. Higgsfield's Supercomputer wraps Hermes, the open-source personal agent with 185k+ GitHub stars (at the time of writing).
I thought Hermes was a fad for nerds (like myself). But I realized if we used Hermes as a primitive, we could get session management (per-session memory and compaction), built-in tools (web search, browser, file system navigation), skills, self-learning, and automations for free. Customers could ask our agent, "every week look at our top-performing influencer video from last week and make five variations" - a true magic moment.
Making a real agent
We deleted our existing agent, and we launched an EC2 instance with a Hono server. The server created a Hermes agent in a Docker container for every customer. It also acted as a reverse proxy for passing messages between our app and the Hermes gateway. Now, we communicate with every user's Hermes agent over a WebSocket connection.
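The per-customer setup described above can be sketched roughly as follows. This is a minimal illustration, not our production code: the image name, port numbers, volume layout, and the `AgentRegistry` and `dockerRunArgs` names are all assumptions made for the example. It only shows the bookkeeping a server like ours needs: one container per customer, and a stable upstream WebSocket address for the reverse proxy to forward to.

```typescript
// Sketch only: HERMES_IMAGE, the ports, and the volume layout are
// hypothetical values chosen for illustration.
const HERMES_IMAGE = "example/hermes-agent:latest"; // hypothetical image name
const BASE_PORT = 20000; // hypothetical first host port handed out
const GATEWAY_PORT = 8080; // hypothetical gateway port inside the container

interface AgentContainer {
  customerId: string;
  containerName: string;
  hostPort: number;
}

class AgentRegistry {
  private byCustomer: Record<string, AgentContainer> = {};
  private count = 0;

  // Returns the customer's existing container record, or allocates a new one.
  getOrCreate(customerId: string): AgentContainer {
    const existing = this.byCustomer[customerId];
    if (existing) return existing;
    const container: AgentContainer = {
      customerId: customerId,
      containerName: "hermes-" + customerId,
      hostPort: BASE_PORT + this.count,
    };
    this.count += 1;
    this.byCustomer[customerId] = container;
    return container;
  }

  // Upstream WebSocket URL the reverse proxy forwards this customer's messages to.
  upstreamUrl(customerId: string): string {
    return "ws://127.0.0.1:" + this.getOrCreate(customerId).hostPort + "/";
  }
}

// Builds the argument list for `docker run`; the caller would pass it to a
// process spawner. The named volume gives each agent a persistent /workspace.
function dockerRunArgs(c: AgentContainer): string[] {
  return [
    "run", "-d",
    "--name", c.containerName,
    "-p", "127.0.0.1:" + c.hostPort + ":" + GATEWAY_PORT,
    "-v", c.containerName + "-data:/workspace",
    HERMES_IMAGE,
  ];
}
```

Binding the published port to 127.0.0.1 keeps each agent reachable only through the proxy on the same host, which is one plausible way to keep the gateway off the public internet.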
Rather than building observational memory, skills, self-learning, automations, and a persistent filesystem, we only needed to focus on the engineering relevant to prismvideos.com. We can give the agent our system prompt, our tools for creating media and determining which models to use via MCP, our skills files (how to create UGC videos, storyboards, visual effects), and our connectors (Meta Ads Manager, Google Drive, Resend).
What an agent dev needs
As consumer-facing agents get better - Claude, ChatGPT, Manus - customer expectations rise (for B2B software too). The Claude app has memory, so now my CEO wants it. What about self-learning? Steering? Can we add the Ralph Wiggum loop?
Companies are pouring billions into research and development on agent harnesses. I have no doubt that there will be a new agent harness after Hermes with a new feature everyone wants (it appears the new thing right now is "dreaming"). It is highly unlikely that an AI agent startup becomes wealthy by creating the best harness for a particular use case. If anything, they only expose themselves to the risk that a competitor ships a more feature-complete agent when the next harness arrives. AI agent startups are most likely to create differentiated value by integrating with their customers' proprietary data and learning their preferences.
The agent is the new primitive. Existing agent frameworks require developers to set up:
- session management (in some cases)
- tools (in some cases)
- memory
- self-learning ("dreaming")
- automations
- persistent filesystem
- container or sandboxed deployment
- skills
- MCP servers
But the first seven items on that list are part of any agent application.
By programmatically creating Hermes instances, as in the following request, developers get the agent and the infrastructure in a single API call:
POST /v1/deployments
Authorization: Bearer $PRISM_API_KEY
Content-Type: application/json

{
  "customer_id": "cus_123",
  "name": "Acme Creative Agent",
  "runtime": "hermes",
  "model": "anthropic/claude-sonnet-4.5",
  "system_prompt": "You are Acme's media generation agent. Help the user plan, create, and iterate on high-performing short-form videos.",
  "sandbox": {
    "enabled": true,
    "type": "docker",
    "persistent_filesystem": true
  },
  "mcp_servers": [
    {
      "name": "prism-media",
      "url": "https://api.prismvideos.com/mcp",
      "tools": [
        "search_models",
        "get_model_schema",
        "get_pricing",
        "generate_image",
        "generate_video",
        "generate_audio"
      ]
    }
  ],
  "skills": [
    {
      "name": "ugc-video-creation",
      "source": "file",
      "path": ".prism/skills/ugc-video-creation/SKILL.md"
    },
    {
      "name": "storyboarding",
      "source": "inline",
      "content": "---\nname: storyboarding\ndescription: Create shot-by-shot storyboards for short-form videos\n---\n# Storyboarding\n..."
    },
    {
      "name": "social-media-visual-effects",
      "source": "url",
      "url": "https://example.com/skills/social-media-visual-effects/SKILL.md"
    }
  ],
  "secrets": {
    "META_ADS_TOKEN": "sec_meta_ads_token",
    "GOOGLE_DRIVE_TOKEN": "sec_google_drive_token"
  },
  "features": {
    "memory": true,
    "dreaming": true,
    "automations": true,
    "steering": true,
    "filesystem_webhooks": true
  }
}
Response:
{
  "deployment_id": "dep_7xK9s2",
  "customer_id": "cus_123",
  "runtime": "hermes",
  "status": "ready",
  "model": "anthropic/claude-sonnet-4.5",
  "thread_id": "thr_default_8a1",
  "filesystem": {
    "workspace_path": "/workspace",
    "persistent": true
  },
  "events": {
    "transport": "sse",
    "url": "https://api.prismagents.com/v1/deployments/dep_7xK9s2/events"
  }
}
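For a sense of what calling this from application code might look like, here is a rough TypeScript sketch. The field names come from the sample request above; everything else is an assumption: the base URL is guessed from the events URL in the sample response, and `buildDeploymentRequest`, `toHttpCall`, `createDeployment`, and the `HttpPost` type are hypothetical helpers, not a published SDK. Only a subset of the request fields is modeled.

```typescript
// Sketch under stated assumptions: API_BASE and all helper names are hypothetical.
const API_BASE = "https://api.prismagents.com"; // assumed from the sample events URL

interface McpServer { name: string; url: string; tools: string[]; }

interface DeploymentRequest {
  customer_id: string;
  name: string;
  runtime: "hermes";
  model: string;
  system_prompt: string;
  mcp_servers: McpServer[];
  features: Record<string, boolean>;
}

// Minimal HTTP shapes so the example does not depend on any runtime's typings.
interface HttpInit { method: string; headers: Record<string, string>; body: string; }
interface HttpResponse { ok: boolean; status: number; json(): Promise<unknown>; }
type HttpPost = (url: string, init: HttpInit) => Promise<HttpResponse>;

// Fills in the per-customer fields; the rest mirrors the sample request.
function buildDeploymentRequest(customerId: string, name: string, systemPrompt: string): DeploymentRequest {
  return {
    customer_id: customerId,
    name: name,
    runtime: "hermes",
    model: "anthropic/claude-sonnet-4.5",
    system_prompt: systemPrompt,
    mcp_servers: [{
      name: "prism-media",
      url: "https://api.prismvideos.com/mcp",
      tools: ["search_models", "generate_image", "generate_video"],
    }],
    features: { memory: true, automations: true },
  };
}

// Turns a request body into the URL, headers, and JSON payload for the POST.
function toHttpCall(body: DeploymentRequest, apiKey: string): { url: string; init: HttpInit } {
  return {
    url: API_BASE + "/v1/deployments",
    init: {
      method: "POST",
      headers: { "Authorization": "Bearer " + apiKey, "Content-Type": "application/json" },
      body: JSON.stringify(body),
    },
  };
}

// Sends the request through an injected HTTP function and returns the parsed JSON.
function createDeployment(post: HttpPost, body: DeploymentRequest, apiKey: string): Promise<unknown> {
  const call = toHttpCall(body, apiKey);
  return post(call.url, call.init).then((res) => {
    if (!res.ok) throw new Error("deployment failed with status " + res.status);
    return res.json();
  });
}
```

Injecting the HTTP function keeps the sketch testable without a network; in a browser or a recent Node.js release, the global `fetch` should fit the `HttpPost` shape and can be passed in directly.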
Bring a system prompt, skills, tools, and connectors and get an endpoint to chat with an agent over SSE.
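On the client side, reading that SSE endpoint mostly comes down to splitting the stream into events. Below is a small, simplified parser as a sketch: it handles `event:` and `data:` fields, multi-line data, and comment lines, and it returns any incomplete trailing event so the caller can prepend it to the next chunk. The function name `parseSse` and the event names used in the usage example are assumptions; the actual event schema of the API is not specified here.

```typescript
interface SseEvent { event: string; data: string; }

// Parses every complete event in `buffer` and returns the unparsed tail.
// Simplified: ignores the `id` and `retry` fields of the SSE format.
function parseSse(buffer: string): { events: SseEvent[]; rest: string } {
  const events: SseEvent[] = [];
  // SSE allows LF, CRLF, or CR line endings; normalize them to LF.
  const blocks = buffer.replace(/\r\n|\r/g, "\n").split("\n\n");
  // The last block has no terminating blank line yet, so it may be incomplete.
  const rest = blocks.pop() || "";
  for (const block of blocks) {
    let event = "message"; // default event type when no `event:` field is present
    const dataLines: string[] = [];
    for (const line of block.split("\n")) {
      if (line === "" || line.charAt(0) === ":") continue; // blank or comment line
      const idx = line.indexOf(":");
      const field = idx === -1 ? line : line.slice(0, idx);
      let value = idx === -1 ? "" : line.slice(idx + 1);
      if (value.charAt(0) === " ") value = value.slice(1); // drop one leading space
      if (field === "event") event = value;
      else if (field === "data") dataLines.push(value);
    }
    // Events without data (for example keep-alive comments) are not dispatched.
    if (dataLines.length > 0) events.push({ event: event, data: dataLines.join("\n") });
  }
  return { events: events, rest: rest };
}
```

A caller would keep a running buffer, for example `const out = parseSse(buffer + chunk); buffer = out.rest;`, and handle each entry of `out.events` as it arrives.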
Managed agents
Deploying an agent involves plenty of schleps. Harness engineering should not be one of them. The insight that prompted us to create this API for managing Hermes installations is likely the same one behind LangChain Managed Deep Agents and Claude Managed Agents. LangChain Managed Deep Agents is a hosted runtime for deploying AI agents. Developers bring their system prompt, MCP tools, skills, and subagent definitions and receive an agent ID to chat with their agent. Likewise, Claude Managed Agents gives developers the agent and the infrastructure in a single API call.
LangChain Managed Deep Agents is a powerful abstraction, but it doesn't expose automations and comes without built-in self-learning or persistent goals (Ralph Wiggum loop).
Claude Managed Agents has self-learning in research preview, but likewise doesn't expose automations or persistent goals, and it doesn't accept video inputs via API (a restriction of their models).
The table below compares our API with their offerings:
| Capability | Managed Hermes Agents | LangChain Managed Deep Agents | Claude Managed Agents |
|---|---|---|---|
| No provider lock-in | ✓ | ✓ | ✗ |
| Session management | ✓ | ✓ | ✓ |
| Agent + infrastructure in one API call | ✓ | ✓ | ✓ |
| Observational memory | ✓ | ✓ | ✓ |
| Built-in tools: web search, browser, file search | ✓ | ✓ | ✓ |
| Persistent filesystem | ✓ | ✓ | ✓ |
| Image & video input | ✓ | ✗ | ✗ |
| Per-container isolation | ✓ | ✓ | ✓ |
| Credential management | ✓ | ✓ | ✓ |
| Automations | ✓ | ✗ | ✗ |
| Subagents | ✓ | ✓ | ✓ |
| Dreaming | ✓ | ✗ | ✓ |
| Ralph Wiggum loop | ✓ | ✗ | ✗ |
| Steering | ✓ | ✗ | ✓ |
Fin
If you're a developer with a customer-facing chat product, ping me at rajit [at] prismvideos [dot] com. We are happy to build your agent for you :).
What's Next?
- Thanks to Alex Liu, Land Tantichot, Mom, Dad, Vivek Hazari, Daniel DiPietro and Stepan Parunashvili for reading drafts of this post.
