In the spring of 2022, a 4 TB Seagate I'd been using as my "proper backup" started making a clicking sound during a routine sync. I had another copy on an older 2 TB drive — which I discovered had a corrupt filesystem and hadn't successfully written anything in at least eight months. The result: four years of RAW photography, project source trees, and music recordings. Gone.

I wasn't cavalier about backups. I thought I had a system. The problem was I had drives instead of a system. No integrity verification, no offsite copy, no versioned snapshots. Just two physical disks that could fail at the same moment — and did.

Why not just use restic?

I did use restic for about eighteen months after that. It's genuinely good software and I recommend it to non-technical friends without hesitation. But two things kept nagging me:

  • Repository format: restic's repository format is sensible for most use cases, but I found it suboptimal when archiving thousands of near-duplicate RAW files from burst shooting. Frostholm uses variable-length content-defined chunking (CDC) across the entire repository, so an identical chunk is stored once no matter which file or version it comes from.
  • Cold-tier economics: restic's S3 backend is designed for warm storage. The access patterns assume reasonably frequent reads. When you're targeting B2's cold tier or Glacier, you want to batch chunk fetches and minimize API calls. Frostholm's backend abstraction was designed with this in mind from day one.
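
The cross-file deduplication in the first point can be sketched as a content-addressed chunk store. This is a minimal illustration, not Frostholm's actual repository code: the names chunkStore, put, and addFile are hypothetical, the chunks are assumed to be already split, and SHA-256 from the Go standard library stands in for BLAKE3 so the example has no external dependencies.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// chunkID is the content address of a chunk. SHA-256 is a stand-in for
// BLAKE3 here (assumption made to stay within the standard library).
type chunkID [sha256.Size]byte

// chunkStore is a hypothetical in-memory repository holding one entry
// per unique chunk, regardless of how many files reference it.
type chunkStore struct {
	chunks map[chunkID][]byte
}

func newChunkStore() *chunkStore {
	return &chunkStore{chunks: make(map[chunkID][]byte)}
}

// put stores a chunk under its hash and reports whether it was new.
func (s *chunkStore) put(data []byte) (chunkID, bool) {
	id := chunkID(sha256.Sum256(data))
	if _, ok := s.chunks[id]; ok {
		return id, false // already present: nothing to write
	}
	s.chunks[id] = append([]byte(nil), data...) // keep a private copy
	return id, true
}

// addFile stores every chunk of a file and returns the ordered list of
// chunk IDs needed to reassemble it.
func (s *chunkStore) addFile(chunks [][]byte) []chunkID {
	recipe := make([]chunkID, 0, len(chunks))
	for _, c := range chunks {
		id, _ := s.put(c)
		recipe = append(recipe, id)
	}
	return recipe
}

func main() {
	store := newChunkStore()
	header := []byte("shared camera header")
	trailer := []byte("shared embedded preview")

	// Two burst frames that differ only in their middle chunk.
	frame1 := [][]byte{header, []byte("frame 1 sensor data"), trailer}
	frame2 := [][]byte{header, []byte("frame 2 sensor data"), trailer}
	store.addFile(frame1)
	store.addFile(frame2)

	fmt.Println("chunks referenced:", len(frame1)+len(frame2))
	fmt.Println("chunks stored:    ", len(store.chunks))
}
```

The two frames reference six chunks between them, but only four are stored, because the shared header and trailer are written once.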

I started writing Frostholm on a long train trip in January 2024. It took about six months to reach something I'd trust with data I cared about.

Philosophy

The name comes from a word I found while translating Old Norse texts for a hobby project: frost (obvious) plus holm (a small island). A cold island. Somewhere safe, isolated, unlikely to catch fire at the same moment as your main house.

The goal isn't to be the fastest backup tool or the one with the most features. The goal is to be the tool you actually trust at 2 AM when you realize something went wrong and you need to know: is my data there? Is it intact? Can I get it back right now?

The best backup is the one you actually verify. Frostholm makes verification a first-class operation, not an afterthought.
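
To make "verification" concrete, here is a rough sketch of what a full check over content-addressed chunks might look like. It is an illustration under stated assumptions rather than Frostholm's real verify path: the repository is modelled as a plain map, the helper names idOf and verifyAll are hypothetical, and SHA-256 again stands in for BLAKE3.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// chunkID is the content address of a chunk (SHA-256 as a BLAKE3 stand-in).
type chunkID [sha256.Size]byte

// idOf computes the content address for a chunk.
func idOf(data []byte) chunkID {
	return chunkID(sha256.Sum256(data))
}

// verifyAll re-hashes every stored chunk and returns the IDs whose
// content no longer matches the address it is stored under.
func verifyAll(repo map[chunkID][]byte) []chunkID {
	var damaged []chunkID
	for id, data := range repo {
		if idOf(data) != id {
			damaged = append(damaged, id)
		}
	}
	return damaged
}

func main() {
	good := []byte("intact chunk")
	bad := []byte("chunk that will rot")
	badID := idOf(bad)
	repo := map[chunkID][]byte{idOf(good): good, badID: bad}

	repo[badID][0] ^= 0x01 // simulate a single flipped bit on disk

	damaged := verifyAll(repo)
	fmt.Printf("%d of %d chunks failed verification\n", len(damaged), len(repo))
	for _, id := range damaged {
		fmt.Printf("damaged chunk: %x\n", id[:8])
	}
}
```

Because each chunk is stored under the hash of its own content, a single flipped bit is enough to make the stored address and the recomputed hash disagree.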

Technical stack

  • Written in Go 1.22 — single binary, no runtime dependencies
  • BLAKE3 for content hashing (via zeebo/blake3)
  • ChaCha20-Poly1305 (AEAD) for chunk encryption
  • Argon2id for key derivation from repository password
  • FastCDC-inspired variable-length chunker (target: 2 MB, range: 512 KB – 8 MB)
  • Local index cache with WAL for crash consistency
  • Backends: local, S3-compatible, Backblaze B2 native API
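
For readers unfamiliar with content-defined chunking, the following is a simplified gear-hash chunker using the size limits from the list above. It is a sketch, not Frostholm's chunker: it uses a single cut condition instead of FastCDC's normalized chunking, the function names and the seed are made up for illustration, and the sizes are read as binary units (KiB/MiB), which is an assumption.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// Sizes from the stack list, read here as binary units (an assumption).
const (
	minSize = 512 << 10 // never cut before 512 KiB
	maxSize = 8 << 20   // always cut at 8 MiB
	cutBits = 21        // a cut fires on average once per 2^21 bytes scanned past minSize
)

// gear maps each byte value to a fixed pseudo-random 64-bit constant.
var gear = newGearTable(0x46726f7374) // arbitrary fixed seed

// newGearTable fills the table from a splitmix64 sequence.
func newGearTable(seed uint64) [256]uint64 {
	var t [256]uint64
	x := seed
	for i := range t {
		x += 0x9e3779b97f4a7c15
		z := x
		z = (z ^ (z >> 30)) * 0xbf58476d1ce4e5b9
		z = (z ^ (z >> 27)) * 0x94d049bb133111eb
		t[i] = z ^ (z >> 31)
	}
	return t
}

// pseudoRandom returns n deterministic xorshift64 bytes for the demo.
// The seed must be non-zero.
func pseudoRandom(n int, seed uint64) []byte {
	out := make([]byte, n)
	x := seed
	for i := range out {
		x ^= x << 13
		x ^= x >> 7
		x ^= x << 17
		out[i] = byte(x >> 32)
	}
	return out
}

// nextCut returns the length of the first chunk in data. The boundary
// depends on nearby content, not on the absolute offset in the file.
func nextCut(data []byte, minLen, maxLen int, bits uint) int {
	if len(data) <= minLen {
		return len(data)
	}
	if len(data) < maxLen {
		maxLen = len(data)
	}
	var fp uint64
	for i := minLen; i < maxLen; i++ {
		fp = (fp << 1) + gear[data[i]] // gear rolling hash, 64-byte window
		if fp>>(64-bits) == 0 {        // top bits all zero: cut here
			return i + 1
		}
	}
	return maxLen
}

// split cuts data into content-defined chunks that share its backing array.
func split(data []byte, minLen, maxLen int, bits uint) [][]byte {
	var chunks [][]byte
	for len(data) > 0 {
		n := nextCut(data, minLen, maxLen, bits)
		chunks = append(chunks, data[:n])
		data = data[n:]
	}
	return chunks
}

func main() {
	original := pseudoRandom(32<<20, 1)
	// The same content with 100 bytes inserted at the front.
	edited := append(pseudoRandom(100, 2), original...)

	seen := make(map[[sha256.Size]byte]bool)
	for _, c := range split(original, minSize, maxSize, cutBits) {
		seen[sha256.Sum256(c)] = true
	}
	editedChunks := split(edited, minSize, maxSize, cutBits)
	shared := 0
	for _, c := range editedChunks {
		if seen[sha256.Sum256(c)] {
			shared++
		}
	}
	fmt.Printf("%d of %d chunks in the edited file already exist\n", shared, len(editedChunks))
}
```

With fixed-size blocks, the 100 inserted bytes would shift every later block and nothing would deduplicate. With content-defined boundaries, the cut points tend to realign after the edit, so most later chunks can be reused.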

Who I am

I'm E.V., a systems programmer. I've worked on storage infrastructure professionally, which is probably why losing data to hardware failure stings in a particular way — I know exactly how preventable it was. I maintain Frostholm in my spare time with occasional contributions from a handful of people who found the project on GitHub.

You can reach me at hello [at] frostholm.fun. I read everything; I reply to most things.

Acknowledgements

Parts of the repository format were inspired by reading the restic and Borg documentation and source code. The CDC chunker is a reimplementation of the FastCDC algorithm from the 2016 USENIX ATC paper. Invaluable early feedback came from R. Lindström and M. Halonen, who graciously let me trash-test their NAS arrays.