# Hermes Agent
Build your own Agent with Nous Hermes open-weight models
Hermes is a family of open-weight models from Nous Research, fine-tuned from Llama base models. Hermes 4, released August 2025, comes in 14B, 70B, and 405B variants. The 405B hits 96.3% on MATH-500 and 81.9% on AIME'24 — frontier-tier numbers. Two things make it stand out: **native hybrid reasoning** (switch between think/fast modes) and **RefusalBench 57.1%** — the highest willingness-to-answer score, i.e. the fewest refusals, of any evaluated model (GPT-4o: 17.67%, Claude Sonnet 4: 17%).
Why learn it specifically? Because every team building Agents eventually hits two walls: **is tool calling actually reliable?** and **will the model refuse my business case?** Hermes 3 started training `<tool_call>` JSON emission into the weights directly — no external parsing hacks needed. Hermes 4 scaled the post-training corpus from 1M samples / 1.2B tokens to ~5M samples / ~60B tokens blended across reasoning and non-reasoning data.
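Because the `<tool_call>` format is baked into the weights, extracting a tool call from Hermes output is a small parsing job rather than a prompt-engineering exercise. A minimal sketch — the tag wrapper is Hermes's documented convention, but the exact JSON schema inside can vary by version, so treat the field names here as illustrative:

```python
import json
import re

# Hermes emits tool calls as JSON wrapped in <tool_call> tags, e.g.:
#   <tool_call>{"name": "get_weather", "arguments": {"city": "Sydney"}}</tool_call>
TOOL_CALL_RE = re.compile(r"<tool_call>\s*(\{.*?\})\s*</tool_call>", re.DOTALL)

def extract_tool_calls(text: str) -> list[dict]:
    """Return every parseable tool-call JSON object found in model output."""
    calls = []
    for match in TOOL_CALL_RE.finditer(text):
        try:
            calls.append(json.loads(match.group(1)))
        except json.JSONDecodeError:
            continue  # skip malformed emissions instead of crashing the agent loop
    return calls

reply = 'Let me check.\n<tool_call>{"name": "get_weather", "arguments": {"city": "Sydney"}}</tool_call>'
print(extract_tool_calls(reply))
# → [{'name': 'get_weather', 'arguments': {'city': 'Sydney'}}]
```

Skipping malformed JSON rather than raising keeps one bad emission from killing a multi-step agent run; Chapter 8's Agent build revisits this trade-off.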
💰 Salary reference (2026): Self-hosted LLM + Agent framework roles pay AU $160K-$250K, US $180K-$350K total comp. The demand isn't "can you call an API" — it's deployment, tool use tuning, and guardrails.
🏢 Hiring companies: Nous Research, Together AI, OpenRouter, Replicate; finance/healthcare/defense companies that can't send data to OpenAI; every startup building an "AI Agent platform."
This track assumes you already know how to call an LLM API. If not, start with chapters 01-05 of the AI Engineer track.
## 30-Second Quick Start
Try Hermes in 30 seconds — a local Ollama install is enough.
```shell
# After installing Ollama
ollama pull hermes3:70b   # or hermes3:8b if that's all your machine can run

# Chat from the command line
ollama run hermes3:70b "Write a Python snippet that calls a REST API with requests and retries 3 times"

# Or use the OpenAI-compatible API
curl http://localhost:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "hermes3:70b", "messages": [{"role": "user", "content": "Hi"}]}'
```

No GPU locally? Try Hermes 3 405B via OpenRouter's free tier: `nousresearch/hermes-3-llama-3.1-405b:free`. Chapters 6 and 7 cover all three deployment options.
## What You Will Learn
In this tutorial, you will learn:
- ✓ Explain the real differences between Hermes 3 vs Hermes 4 and 14B/70B/405B — know which to pick when
- ✓ Run Hermes both locally and via cloud (OpenRouter / Together), with real cost comparison
- ✓ Master the native `<tool_call>` format — drop the hand-rolled parsers and swap Hermes in for GPT/Claude function calling
- ✓ Build a multi-step autonomous Agent with Hermes + LangGraph, including tool calls, state recovery, and LangSmith tracing
- ✓ Understand uncensored / neutral alignment, its operational risk, and ship basic guardrails before production
## Chapter Overview
Quick preview by section - jump directly to what interests you.
Origins of the Hermes series, Nous Research roadmap, Hermes 3 → 4 evolution
- What is Hermes, Who is Nous Research (20 min)
- Hermes vs Llama / Qwen / DeepSeek / GPT — A Selection Map (25 min)

Parameters, training data, B200 cluster, flex attention, DPO strategy
- Hermes Architecture — Llama Base + Fine-tuning Strategy (30 min)
- What "Neutral Alignment" / "Uncensored" Actually Means (20 min)
- Hybrid Reasoning — think vs fast mode (25 min)

Install Ollama, pull weights, VRAM sizing, OpenAI-compatible API
- Run Hermes Locally — Ollama + hermes3:70b (30 min)
- Run Hermes in the Cloud — OpenRouter / Together / DeepInfra (25 min)
- Structured Output — JSON Schema + `<tool_call>` in Practice (40 min)

State Graph, Tool Node, checkpointer, interrupt — build a working research Agent from scratch
- Build an Autonomous Agent — Hermes 4 + LangGraph (60 min)
- Hermes + RAG — Practical Patterns with Long Context (45 min)

Real numbers: Hermes 4 70B monthly cost / per-token price / GPU amortization
- Deployment Cost — Self-host vs Cloud API vs OpenAI (30 min)
- Before You Ship — Guardrails, Rate Limits, Prompt Injection (35 min)

Distributed training, inference network, datasets from Nous — and Hermes roadmap