Hermes: The AI Agent That Learns From Itself

Here is a strange thing about most AI agents: they are brilliant and they have amnesia at the same time.

You spend Monday teaching one how your project is laid out, which commands to run, the quirks of your setup. It nails the task. Then Tuesday comes, you open a fresh session, and it is a blank slate again. Everything you taught it, gone. You are onboarding the same new hire every single morning, forever. It is exhausting, and it is the quiet ceiling on how useful these things have been.

Hermes, an open-source agent from Nous Research, is one of the first serious attempts to knock that ceiling out. Its whole pitch is in the tagline: the agent that grows with you. Instead of forgetting, it learns: it writes down what worked as a reusable skill, remembers it across sessions, and gets faster and more reliable at the things you actually do. Lately it has been the agent everyone’s arguing about, and it’s usually argued about next to a rival called OpenClaw. So let me walk you through what it is, why it matters, how the two differ, and, honestly, where Hermes still stumbles. No hype, just what practitioners and public reports show.

The core idea: an agent with a memory that compounds

A normal agent
Mon: learns your setup ✓
Tue: forgot everything
Wed: forgot everything
Thu: forgot everything
Flat. Every day starts at zero.

Hermes
Mon: learns, saves a skill ✓
Tue: reuses it, refines it ↑
Wed: faster still ↑
Thu: near-instant ↑
Compounding. Each day builds on the last.

The whole difference in one picture. A standard agent's competence is flat, reset to zero each session. Hermes's competence is a staircase, because it keeps what it learns. That shift from flat to compounding is the entire reason people are paying attention.

If you’ve read my post on Skills, this will feel familiar, and that’s the point. A Skill is a folder with instructions an agent can pull in when a task matches. The twist Hermes adds is: the agent writes its own Skills. You don’t author them by hand. When Hermes finishes something tricky, it distills what worked into a reusable skill file and files it away. Next time a similar task shows up, it reaches for that skill instead of figuring it all out again.
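To make "a folder with instructions" concrete, here is a minimal sketch of what saving one skill to disk could look like. It is not Hermes's actual code or file layout: the `save_skill` helper, the `SKILL.md` file name, and the section labels inside the file are all assumptions made for illustration.

```python
import re
import tempfile
from pathlib import Path


def slugify(title: str) -> str:
    """Turn a human-readable title into a safe folder name."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")


def save_skill(root: Path, title: str, when_to_use: str, steps: list[str]) -> Path:
    """Write one skill as <root>/<slug>/SKILL.md (hypothetical layout) and return the path."""
    folder = root / slugify(title)
    folder.mkdir(parents=True, exist_ok=True)
    lines = [f"# {title}", "", f"When to use: {when_to_use}", "", "Steps:"]
    lines += [f"{i}. {step}" for i, step in enumerate(steps, start=1)]
    path = folder / "SKILL.md"
    path.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return path


# A throwaway directory stands in for the agent's skill library.
skills_root = Path(tempfile.mkdtemp())
skill_path = save_skill(
    skills_root,
    "Deploy the web app",
    "the user asks for a deploy",
    ["Run the test subset", "Build with the project's flags", "Push the release"],
)
print(skill_path.read_text(encoding="utf-8"))
```

The point of the sketch is only that a skill is plain text on disk: something a person can read, edit, or delete, and something the agent can load again in a later session.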

How the learning loop actually works

Under the hood it’s a loop, and once you see the shape it’s not mysterious at all. It sits around the normal agent loop (think, act, observe) and adds a fourth beat: learn.

Hermes does a task, distills what worked into a skill, stores it in persistent memory, and reuses and sharpens it the next time. Round and round, the library of skills grows. Concretely: it writes reusable Markdown skill files and keeps outcomes in a searchable store (it uses SQLite full-text search plus LLM summarization to recall across sessions).
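The shape of that loop fits in a few lines. The toy below is a sketch under heavy assumptions, not Hermes's implementation: the `Agent` class and its `think`, `act`, `observe`, and `learn` methods are hypothetical names, the "model reasoning" is a hard-coded stand-in, and skills live in an in-memory dict rather than Markdown files.

```python
from dataclasses import dataclass, field


@dataclass
class Agent:
    # Maps a kind of task to the steps that worked last time (toy skill library).
    skills: dict[str, list[str]] = field(default_factory=dict)

    def think(self, task_kind: str) -> tuple[list[str], bool]:
        """Plan: reuse a saved skill if one exists, otherwise work it out from scratch."""
        if task_kind in self.skills:
            return self.skills[task_kind], True
        # Stand-in for expensive model reasoning on an unfamiliar task.
        return [f"explore {task_kind}", f"try {task_kind}", f"finish {task_kind}"], False

    def act(self, plan: list[str]) -> list[str]:
        """Execute each step; this toy only records that the step ran."""
        return [f"done: {step}" for step in plan]

    def observe(self, results: list[str]) -> bool:
        """Judge the outcome; a real agent would check tests, exit codes, or user feedback."""
        return all(result.startswith("done:") for result in results)

    def learn(self, task_kind: str, plan: list[str], ok: bool) -> None:
        """The fourth beat: keep only the steps that mattered, and only on success."""
        if ok:
            self.skills[task_kind] = [s for s in plan if not s.startswith("explore")]

    def run(self, task_kind: str) -> tuple[bool, int]:
        plan, reused = self.think(task_kind)
        ok = self.observe(self.act(plan))
        self.learn(task_kind, plan, ok)
        return reused, len(plan)


agent = Agent()
first = agent.run("deploy")   # no skill yet: full three-step plan
second = agent.run("deploy")  # skill reused: shorter plan
print(first, second)  # prints: (False, 3) (True, 2)
```

The second run is shorter because the exploration step was distilled away on the first pass, which is the "reuses and sharpens" behaviour in miniature.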

So when you come back tomorrow, the skills are already there. Run something similar and it leans on what it learned, and executes faster. That “come back the next day and it remembers” behaviour is the headline feature, and it’s baked into the architecture rather than bolted on.
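The "come back tomorrow" part is easy to picture with SQLite's FTS5 full-text index, which Python's standard `sqlite3` module can use when the underlying SQLite build includes it (most do). This is a sketch, not Hermes's schema: the `outcomes` table, its columns, and the `remember` and `recall` helpers are hypothetical, and the summaries are typed by hand where the real system would use LLM summarization.

```python
import os
import sqlite3
import tempfile


def open_store(path: str) -> sqlite3.Connection:
    """Open (or create) the on-disk memory store with a full-text index."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE VIRTUAL TABLE IF NOT EXISTS outcomes USING fts5(task, summary)"
    )
    return conn


def remember(conn: sqlite3.Connection, task: str, summary: str) -> None:
    """Store one outcome; the summary is hand-written here."""
    conn.execute("INSERT INTO outcomes (task, summary) VALUES (?, ?)", (task, summary))
    conn.commit()


def recall(conn: sqlite3.Connection, query: str, limit: int = 3) -> list[tuple[str, str]]:
    """Return the best full-text matches for the query, most relevant first."""
    rows = conn.execute(
        "SELECT task, summary FROM outcomes WHERE outcomes MATCH ? ORDER BY rank LIMIT ?",
        (query, limit),
    )
    return rows.fetchall()


db_path = os.path.join(tempfile.mkdtemp(), "memory.db")

# Monday's session: learn two things, then shut down.
monday = open_store(db_path)
remember(monday, "deploy web app", "Build with release flags, then run the smoke tests.")
remember(monday, "plan trip", "Prefers aisle seats and no red-eye flights.")
monday.close()

# Tuesday's session: a fresh connection still finds Monday's notes.
tuesday = open_store(db_path)
hits = recall(tuesday, "deploy")
tuesday.close()
print(hits)
```

Because the store is a file rather than session state, closing Monday's connection loses nothing. That is all "persistent memory" has to mean at this level.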

Why this matters now (and why it hit #1)

Watch what compounding does over even a short week. This is the intuition, a rough sketch, not a benchmark, of why a self-improving agent pulls ahead:

Day 1: 1 skill
Day 3: 6 skills
Day 7: 18 skills
Day 14: 40+ skills

Illustrative, not measured: the point is the shape. A self-learning agent's usefulness curves upward as its skill library fills, while a forgetful one stays a flat line. This "depth over time" is exactly the bet that, per reports, pushed Hermes to the top of OpenRouter's daily-usage rankings in 2026, a meaningful slice of developers choosing depth of learning over sheer breadth of reach.

Hermes vs OpenClaw: two philosophies, not two products

You cannot read about Hermes without bumping into OpenClaw, its main rival, and the comparison is genuinely useful because they disagree at a deep level. They’re not two versions of the same thing; they’re two different bets about what a personal AI agent should be.

Hermes (Nous Research): depth of learning

"Grow with the user over time."

Self-improving loop: writes its own skills
Persistent, curated memory across sessions
Gets faster at your recurring work
Bet: a private assistant that compounds

OpenClaw: breadth of reach

"Be everywhere, do everything now."

Central gateway wiring 50+ messaging channels
Human-authored skills, batteries included
Fast to deploy, tooling out of the box
Bet: a hub that reaches every channel

The rivalry in one line: Hermes is built around a self-improving agent that learns; OpenClaw is built around a control-plane gateway that connects everywhere. Depth versus breadth. Neither is "right"; they're optimized for different things.

Put plainly: if you want an agent that reaches you on 50+ messaging channels and works out of the box, OpenClaw’s breadth wins. If you want a private assistant that quietly gets better at your specific recurring work, Hermes’s depth wins. The clean way to choose is to ask whether reach or learning matters more for your use case.

Dimension	Hermes	OpenClaw
Core idea	Self-improving skill loop	Central gateway to many channels
Skills	Agent writes its own	Human-authored
Memory	Persistent, curated, cross-session	Session / gateway-centric
Strength	Depth: learns your work	Breadth: 50+ channels, fast setup
Setup time	Longer (2 to 4 hours)	Shorter (under 30 min)
License	Open source (MIT)	Open source

A side-by-side, kept honest. Both are open source. The real fork is philosophy: learning depth vs channel breadth, and that flows into everything else including how long it takes to get going.

What it looks like in real life: three examples

Abstract talk of “self-improving” only lands with concrete scenes. Here’s the same capability across three very different users:

General: your personal assistant. You ask it to plan trips a certain way and book with certain preferences. By week two it just *knows* your style (aisle seat, no red-eyes) and stops re-asking. It learned you.

Technical: a coding teammate. First time, it fumbles your deploy process. It saves a skill for it. Next deploy, it runs the exact steps (your build flags, your test subset) without being re-taught. The tenth deploy is muscle memory.

Marketing: a content operator. It drafts your newsletter, learns your voice and the sections you always want, and remembers which subject-line style performed. Each issue costs you less editing than the last.

Same underlying mechanism, three worlds. A traveller, a developer, a marketer, each gets an agent that starts generic and becomes *theirs*. The value isn't any single task; it's the slope, the fact that next week is easier than this week.

It runs just about anywhere too (a $5 VPS, a GPU box, or serverless), and you reach it through the channels you already use (Telegram, Discord, Slack and more). It’s model-agnostic (200+ models via Nous Portal, OpenRouter, OpenAI-compatible endpoints, or local Ollama), ships with 40-plus built-in tools, and speaks MCP, so it plugs into the same tool ecosystem I wrote about in the MCP post. In other words, it isn’t an island; it’s a learning loop wrapped around the agent ideas you already know.

The honest part: where Hermes falls short

A teaching post that only sells you the upside isn’t teaching, it’s advertising. So here’s the balanced ledger, straight from what practitioners report.

Genuine strengths

Memory compounds: recurring work gets faster and more reliable over time

Auto-generated skills, no hand-authoring each workflow

One-command install; runs on cheap hardware

Open source (MIT), model-agnostic, MCP-native, 40+ tools

Real limitations

Self-learning is OFF by default; many new users never enable it and see only a plain agent

Slow to set up: 2 to 4 hours vs under 30 min for OpenClaw

No managed cloud; you host it yourself

In fuzzy domains it can get confidently faster at the *wrong* thing, because there is no ground truth to check against

Learning is domain-specific; a skill for "summarize a PR" won't transfer to "plan a DB migration"

Two of those limitations matter most. "Self-learning off by default" means many people try Hermes and never actually see its whole point. And "faster at the wrong thing" is the deep one: an agent that reinforces its own habits needs a way to know it's improving toward *correct*, not just toward *confident*.

That last limitation is worth sitting with, because it’s the real intellectual catch of self-improving agents in general. Learning from your own experience is powerful when there’s a clear signal for “did that work?”: tests passed, the deploy succeeded, the user said yes. In domains without that clear signal, a self-reinforcing agent can happily get more efficient at a mistake. Speed is not the same as correctness. Any agent that learns from itself inherits this, and it’s exactly the kind of thing worth designing guardrails around: the same “keep a human in the loop for the ambiguous, irreversible calls” instinct that good agent design already demands.
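One way to picture such a guardrail is to gate what gets saved on whether there was a real success signal. The sketch below is a generic pattern, not a Hermes feature: the `Signal` enum, the `Outcome` record, and the `decide` function are hypothetical names invented for this example.

```python
from dataclasses import dataclass
from enum import Enum


class Signal(Enum):
    PASSED = "passed"    # tests passed, the deploy succeeded, the user said yes
    FAILED = "failed"    # a clear negative result
    UNKNOWN = "unknown"  # no ground truth available


@dataclass
class Outcome:
    task: str
    steps: list[str]
    signal: Signal


def decide(outcome: Outcome) -> str:
    """Decide what to do with a candidate skill based on the evidence behind it."""
    if outcome.signal is Signal.PASSED:
        return "save"       # verified: safe to reinforce
    if outcome.signal is Signal.FAILED:
        return "discard"    # verified wrong: do not reinforce
    return "ask_human"      # ambiguous: keep a person in the loop


outcomes = [
    Outcome("deploy", ["build", "smoke test"], Signal.PASSED),
    Outcome("flaky migration", ["alter table"], Signal.FAILED),
    Outcome("newsletter tone", ["draft intro"], Signal.UNKNOWN),
]
decisions = {o.task: decide(o) for o in outcomes}
print(decisions)
# prints: {'deploy': 'save', 'flaky migration': 'discard', 'newsletter tone': 'ask_human'}
```

The design choice is that "no signal" is treated as its own case rather than as success, so the agent can't quietly turn an unverified habit into a skill.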

The takeaway

Hermes is a bet that the next leap in agents isn’t a smarter model, it’s a better memory. Make the agent write down what it learns, keep it across sessions, and let competence compound instead of resetting to zero every morning. That’s a genuinely different shape from the forgetful assistants we’ve lived with, and it’s why Hermes reportedly climbed to the top of the usage charts.

It’s not magic, and it’s not free. It asks for setup patience, a flipped-on config switch most people miss, and a clear-eyed awareness that an agent improving on its own can drift confidently wrong where there’s no ground truth. But the core idea, an agent that grows with you instead of forgetting you, is the right direction, and Hermes is one of the clearest, most open expressions of it yet. Whether you pick it or OpenClaw comes down to one honest question: do you want reach, or do you want an agent that learns you?