Hermes: My Personal AI Agent, and What It Actually Automates
A markdown vault synced across devices, a cron-scheduled agent that runs weekly reflections on my own journal, and the line between script-only jobs and ones that need a real LLM. Here's what's actually running, and what it's caught.
What Hermes Actually Is
Hermes is a scheduled agent, not a chatbot I talk to. It runs from cron, reads a plain markdown vault, and does specific jobs. Some of them a shell script could do; others need an actual model reading the content.
The vault itself is a PARA-style folder structure I call HQ: inbox/ for unsorted dumps, areas/ for ongoing life domains (finance, health, career, one folder per active project), reference/ for lookup docs I rarely edit, journal/ for daily notes named YYYY-MM-DD.md, automations/ for the scripts and schedules themselves. One concern per file, dates in filenames so everything sorts chronologically. No database, no app — just markdown, synced with git.
Two Kinds of Automation: no_agent vs Real LLM
Every automation gets logged with what it does, when it runs, and how — and the "how" splits cleanly into two categories.
Script-only (no_agent mode). Deterministic jobs that don't need a model at all. My git sync between devices runs every 2 hours this way: pull, push, done. Reminders for known dates (a bill due, a visa renewal) are the same — the script reads a date, compares it to today, sends a notification. No LLM needed because the logic is "if date == today, notify." Running these as full agent invocations would be pure token overhead for something cron + bash already does correctly.
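A minimal sketch of what a script-only reminder check could look like in Python. The reminder table passed in and the `due_today` helper are hypothetical names for illustration; the real job may store its dates differently and send a proper notification instead of returning strings.

```python
from datetime import date


def due_today(reminders: dict[str, str], today: date) -> list[str]:
    """Return the messages whose ISO date (YYYY-MM-DD) equals today."""
    key = today.isoformat()
    return [message for day, message in reminders.items() if day == key]


# Hypothetical reminder table; the dates and labels are made up.
EXAMPLE_REMINDERS = {
    "2025-03-01": "Bill due",
    "2025-06-15": "Visa renewal",
}
```

Calling `due_today(EXAMPLE_REMINDERS, date.today())` from a cron job and notifying on each returned message is the whole job. There is nothing here for a model to decide.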
Full agent (scheduled cloud routine). Jobs that need judgment: reading unstructured text and deciding something. My weekly reflection runs every Sunday at 8pm. It reads the week's journal entries, looks for repetitive tasks that show up often enough to be worth automating, and cross-checks a goals file for milestone dates that have already passed without being marked hit. That last check isn't regex pattern-matching. It means reading a milestone's date, comparing it with today's, and reasoning about whether the milestone should already have fired. It caught exactly that once: a milestone date had passed silently, unflagged, until the reflection routine noticed and surfaced it for a manual re-date.
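The deterministic part of that routine, collecting the week's journal files before any model reads them, could be sketched as below. This assumes only the `YYYY-MM-DD.md` naming described earlier; the `week_entries` helper and its parameters are hypothetical, and how the text is then handed to the model is left out.

```python
from datetime import date, timedelta
from pathlib import Path


def week_entries(journal_dir: Path, end: date, days: int = 7) -> list[Path]:
    """Return existing journal files for the `days` days ending on `end`, oldest first."""
    wanted = [end - timedelta(days=offset) for offset in range(days - 1, -1, -1)]
    candidates = [journal_dir / f"{day.isoformat()}.md" for day in wanted]
    # Days without a note are simply skipped.
    return [path for path in candidates if path.is_file()]
```

Because the filenames are ISO dates, no index or database is needed to find a week: the date arithmetic produces the exact paths to look for.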
The split matters because it's tempting to route everything through an LLM once you have one running on a schedule. Most of what I actually automate doesn't need one.
The Failure Mode: an Agent That Reviews Itself
The weekly reflection catching its own missed milestone is the part worth dwelling on. It's not a human noticing a bug and writing a test for it. It's a scheduled agent reading the same vault a human would read, applying the same "wait, this date already passed" logic a human would apply, and surfacing it before I would have noticed on my own. The automation isn't just running tasks; it's auditing whether other tasks, and other automations, are actually keeping up.
That's a different design goal than most personal-automation setups, which tend to stop at "remind me of X on schedule Y." Building in a layer that periodically asks "is anything here stale, silently failed, or worth turning into its own automation" is what makes this feel less like a pile of cron jobs and more like an actual ops system.
The Local Viewer
Because the vault is just markdown files, I built a single-file, stdlib-only Python server (~700 lines, zero dependencies) to browse it without opening a folder tree in Finder. It walks the workspace lazily — it never descends into a ~38GB media folder unless I explicitly open it — renders markdown, previews images, aggregates open todos across every file, surfaces recently-edited notes, does full-text search, and lets me edit a note in place from the browser. One command, opens on localhost:8787, stays local to the machine.
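Two of those features, skipping heavy folders and aggregating open todos, fit in a few lines of stdlib Python. This is a sketch under assumptions, not the viewer's actual code: it assumes open todos use the `- [ ]` checkbox syntax, and the `open_todos` helper and `skip_dirs` parameter are hypothetical names.

```python
import os
import re
from pathlib import Path

# Assumed checkbox syntax: "- [ ] task" (or "* [ ] task") marks an open todo.
OPEN_TODO = re.compile(r"^\s*[-*]\s+\[ \]\s+(.*)$")


def open_todos(root: Path, skip_dirs: set[str]) -> list[tuple[str, int, str]]:
    """Collect (relative path, line number, text) for every open todo under root."""
    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Pruning dirnames in place stops os.walk from descending into them,
        # so a large media folder is never traversed.
        dirnames[:] = sorted(
            d for d in dirnames if d not in skip_dirs and not d.startswith(".")
        )
        for name in sorted(filenames):
            if not name.endswith(".md"):
                continue
            path = Path(dirpath) / name
            text = path.read_text(encoding="utf-8", errors="replace")
            for lineno, line in enumerate(text.splitlines(), start=1):
                match = OPEN_TODO.match(line)
                if match:
                    rel = path.relative_to(root).as_posix()
                    found.append((rel, lineno, match.group(1).strip()))
    return found
```

The in-place assignment to `dirnames` is what keeps the walk cheap: excluded folders are never opened, rather than being read and filtered afterwards.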
The point of keeping it stdlib-only: this is a personal tool I want to still work in five years without an npm install fighting dependency rot. Markdown files and a script that reads them age better than almost any framework choice.
What I'd Change
The git-sync-every-2-hours approach works but isn't real-time — if I edit a note on one device and need it on another immediately, I'm waiting up to 2 hours or triggering a manual sync. A file-watcher-triggered sync instead of a fixed interval is the obvious next step.
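One stdlib-only way to approximate that is to poll modification times at a short interval and trigger a sync only when something changed. To be clear about the swap: this is polling, not a true OS-level file watcher (inotify, FSEvents), which would need a platform API or a third-party library. The `snapshot`, `changed`, and `watch` names are hypothetical, and `on_change` is a placeholder for whatever runs the existing sync.

```python
import os
import time
from pathlib import Path


def snapshot(root: Path) -> dict[str, int]:
    """Map each markdown file's relative path to its mtime in nanoseconds."""
    state = {}
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames[:] = [d for d in dirnames if d != ".git"]  # ignore git internals
        for name in filenames:
            if name.endswith(".md"):
                path = Path(dirpath) / name
                state[path.relative_to(root).as_posix()] = path.stat().st_mtime_ns
    return state


def changed(before: dict[str, int], after: dict[str, int]) -> set[str]:
    """Paths added, removed, or modified between two snapshots."""
    return {p for p in before.keys() | after.keys() if before.get(p) != after.get(p)}


def watch(root: Path, on_change, interval: float = 5.0) -> None:
    """Poll forever, calling on_change(paths) whenever the snapshot differs."""
    previous = snapshot(root)
    while True:
        time.sleep(interval)
        current = snapshot(root)
        diff = changed(previous, current)
        if diff:
            on_change(diff)  # placeholder: this is where the sync would run
        previous = current
```

Polling a vault of small markdown files every few seconds is cheap, and it keeps the tool dependency-free. The tradeoff is a delay of up to one interval instead of an immediate trigger.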
The other gap: automations only get logged when I remember to log them. There's no enforcement that a new script actually gets an entry in the schedule log — it's a convention, not a constraint. For a system whose whole value is "I can see everything that's running," an unlogged automation is a blind spot in the exact thing it's supposed to prevent.
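Turning that convention into a check could be as small as the sketch below, which lists scripts that the log never mentions. It assumes scripts sit directly in `automations/` and that the schedule log is a single text file naming each script by filename; both the layout and the `unlogged_scripts` helper are hypothetical.

```python
from pathlib import Path


def unlogged_scripts(
    automations_dir: Path,
    log_file: Path,
    suffixes: tuple[str, ...] = (".sh", ".py"),
) -> list[str]:
    """Return script filenames in automations_dir that the log never mentions."""
    log_text = log_file.read_text(encoding="utf-8")
    scripts = sorted(
        p.name
        for p in automations_dir.iterdir()
        if p.is_file() and p.suffix in suffixes
    )
    # Crude substring match: a script whose name is contained in another
    # logged name would be counted as logged.
    return [name for name in scripts if name not in log_text]
```

Run from the weekly reflection or a pre-commit hook, a non-empty result is exactly the blind spot described above, made visible.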