SystemsAug 25, 20257 min read

Observability for a system only you will ever debug

Right-sizing observability means logging exactly enough to forgive yourself for what you'll have forgotten by next week.

Most observability advice is written for systems other people depend on. Dashboards, SLOs, on-call rotations, alert fatigue — all of it assumes a team, a pager, and a stranger who will inherit your mistakes at 3am. None of that is true for a homelab. The only person who will ever debug my media server, my reverse proxy, my backup cron, is me, and the version of me doing the debugging is six months removed from the version that built the thing and has forgotten every assumption it was based on.

That changes the math entirely. The job isn't uptime. The job is leaving a trail my future self can follow back to a decision he no longer remembers making.

The audience is one forgetful person

When you write logs for a team, you log for the lowest common denominator: someone who has never seen this code. That produces a lot of noise — context everyone on the team already shares, restated on every line. For a solo system the constraint inverts. I share all the context with the future debugger, because he is me. What I don't share is memory. He won't remember that the transcode queue silently caps at four jobs, or that the DNS rewrite exists to work around one device that hardcodes a public resolver.

So the logs that matter aren't the ones describing what the code does — I can read the code. The ones that matter describe what the code assumes, and what I already tried.

→Log the workaround and the reason for it, not the happy path.
→Log the value that surprised you while building, because it will surprise you again.
→Log boundaries you guessed at — timeouts, retry counts, batch sizes — so a future failure points straight at the guess.

Log what you assumed, not what you did. The code already records what you did.

Right-sizing is mostly subtraction

The failure mode of a homelab isn't too little logging. It's a journalctl wall so dense that the one line that matters is buried under ten thousand healthy heartbeats. Every log line you write is a line you'll have to skim past later, and the cost is paid every single time you debug, not once when you write it.

So I right-size by deleting. Anything that fires on the happy path and never changes — "request handled," "connection established," "still alive" — gets a log level I never read, or no log at all. What survives is the stuff that records a decision: a fallback that triggered, a value clamped to a limit, a retry that finally gave up. A log line should earn its place by changing what I'd do next.

The one exception is the seam between systems I don't control. When my backup script hands a tarball to a remote object store, I log both sides of that handoff loudly — the bytes I sent, the response I got — because that's the boundary where I have no source code to read and no way to reconstruct what happened from first principles.

Make the trail self-explaining

The best log line answers the question I'll actually have at 1am, which is never "what happened" but "is this the thing that's broken." That means stamping enough identity onto each line that I can correlate across services without holding state in my head.

log.sh


log "[backup:run=$RUN_ID] snapshot=$SNAP size=$BYTES retain=$KEEP"
log "[backup:run=$RUN_ID] prune kept=$KEPT dropped=$DROPPED reason=age>$KEEP"

Every line carries the run id, the decision, and the value that drove it.

A run id I can grep, the decision the script made, and the number that drove it. Six months later I don't have to remember how pruning works; the line tells me it dropped three snapshots because they were older than the retention window, and if that's wrong, I now know exactly which knob to turn. No dashboard, no trace viewer, just grep and a line that explains itself.

This is also why I avoid structured logging ceremony for a system this size. JSON logs are for machines that aggregate. My aggregator is a person with a terminal, and that person is better served by lines he can read at a glance than by a schema he'll have to query.

Forgive yourself in advance

The real test of homelab observability comes the day something breaks while you're trying to do something else entirely. You don't have an hour to relearn the system. You have ten minutes and a vague memory that you "fixed this before." Good logging is the note you left so that memory has somewhere to land.

So I treat every log line as a small apology to my future self, written in advance, for everything I'm about to forget. Not an audit trail. A trail of breadcrumbs from the symptom back to the assumption that betrayed me. Right-sizing observability for a system only I will debug means logging exactly that much — enough to forgive myself next week, and not one heartbeat more.

#Systems#Observability#HomelabShare ↗

→ / AUTHOR

Ionut Dumitru

Full-stack engineer and product designer. Writes about building products where the engineering and the design are the same job.

GitHub ↗X ↗

→ / NEXT

AIAug 18, 2025

What a model can't know about your user →

← All writingionutdumitru.com