Documentation
Frostholm (fh) is a command-line archival tool for files you cannot afford to
lose. It stores incremental snapshots in a content-addressed repository, deduplicates data
across snapshots and files, and encrypts everything at rest.
Repositories
A repository is the storage location for all your snapshots. It contains:
- Pack files — immutable blobs of encrypted, compressed chunks.
- Index — maps chunk hashes to pack files and byte offsets.
- Snapshot manifests — metadata trees describing each backup point.
- Config — repository parameters (chunk size, encryption params, format version).
Repositories are backend-agnostic. The same directory structure is used whether you're storing to a local path, an S3 bucket, or Backblaze B2.
# Initialize a local repository
fh init --repo /mnt/backup/my-repo
# Initialize an S3 repository
fh init --repo s3://mybucket/frostholm
# Initialize a B2 repository
fh init --repo b2://my-b2-bucket/frostholm
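As a rough illustration of how a single --repo value can select a backend, the following Python sketch splits the three location forms shown above. The helper name parse_repo and the (backend, bucket, path) tuple it returns are hypothetical, chosen only for this example; this is not Frostholm's actual parsing code.

```python
from urllib.parse import urlparse


def parse_repo(location: str) -> tuple[str, str, str]:
    """Split a repository location into (backend, bucket, path).

    Hypothetical helper: the tuple layout is an assumption for illustration.
    """
    parsed = urlparse(location)
    if parsed.scheme in ("s3", "b2"):
        # s3://bucket/prefix -> netloc is the bucket, path is the prefix
        return parsed.scheme, parsed.netloc, parsed.path.lstrip("/")
    if parsed.scheme == "":
        # No scheme: treat the whole string as a local filesystem path
        return "local", "", location
    raise ValueError(f"unsupported repository scheme: {parsed.scheme!r}")


print(parse_repo("/mnt/backup/my-repo"))
print(parse_repo("s3://mybucket/frostholm"))
print(parse_repo("b2://my-b2-bucket/frostholm"))
```

Everything after the backend is chosen (pack files, index, manifests, config) can then be laid out identically, which is what backend-agnostic means here.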
Snapshots
A snapshot is a point-in-time record of a file tree. Each snapshot presents a full logical view of your data, but only new or changed chunks are stored. Chunks that are unchanged since an earlier snapshot are referenced by hash, not duplicated.
# Create a snapshot of a directory
fh backup --repo /mnt/backup/my-repo ~/Documents
# List snapshots
fh snapshots --repo /mnt/backup/my-repo
# Restore a specific snapshot
fh restore --repo /mnt/backup/my-repo --snapshot abc12345 --target /tmp/restore
Snapshot IDs are truncated BLAKE3 hashes of the manifest. You can reference them by the first 8 characters.
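Referencing a snapshot by its first characters amounts to a prefix lookup. The sketch below shows one way such a lookup could work. The function resolve_snapshot, the sample IDs, and the choice to reject ambiguous prefixes are all assumptions made for illustration; the source does not say how Frostholm handles a prefix that matches more than one snapshot.

```python
def resolve_snapshot(prefix: str, snapshot_ids: list[str]) -> str:
    """Return the single full snapshot ID that starts with the given prefix."""
    matches = [sid for sid in snapshot_ids if sid.startswith(prefix)]
    if not matches:
        raise KeyError(f"no snapshot matches {prefix!r}")
    if len(matches) > 1:
        # Assumed behavior: refuse to guess when the prefix is not unique
        raise ValueError(f"prefix {prefix!r} is ambiguous ({len(matches)} matches)")
    return matches[0]


# Hypothetical IDs, made up for this example (real ID length is not specified)
ids = ["abc12345d0e1f2a3", "abc12399aa00bb11", "9d41e07b5c6d7e8f"]

print(resolve_snapshot("abc12345", ids))
```

With these sample IDs, the 8-character prefix abc12345 is unique, while the shorter prefix abc123 would raise an error because two IDs share it.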
Deduplication
Frostholm uses content-defined chunking (CDC) to split files into variable-length chunks before storing them. The chunk boundaries depend on the content itself, not on fixed byte offsets, which means that inserting bytes at the beginning of a large file doesn't invalidate every downstream chunk.
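To make the idea concrete, here is a deliberately simplified content-defined chunker in Python. It uses a gear-style rolling hash and cuts wherever the top bits of the hash are zero. The table, the mask, and the function name split_chunks are assumptions for this sketch, not Frostholm's actual chunking parameters, and the minimum and maximum chunk sizes that production chunkers normally enforce are left out for clarity.

```python
import random

rng = random.Random(42)
# One pseudo-random 32-bit value per possible byte (a "gear" table)
GEAR = [rng.getrandbits(32) for _ in range(256)]
# Cut when the top 6 bits are zero: roughly one cut per 64 bytes on random data
BOUNDARY_MASK = 0xFC000000


def split_chunks(data: bytes) -> list[bytes]:
    """Split data at positions chosen by the content, not by fixed offsets."""
    chunks, start, h = [], 0, 0
    for i, byte in enumerate(data):
        # Each shift pushes older bytes out, so h depends on the last 32 bytes only
        h = ((h << 1) + GEAR[byte]) & 0xFFFFFFFF
        if (h & BOUNDARY_MASK) == 0:
            chunks.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])  # trailing bytes after the last cut
    return chunks


original = bytes(rng.getrandbits(8) for _ in range(20_000))
edited = b"8 bytes!" + original  # insert bytes at the very beginning

old_chunks = split_chunks(original)
new_chunks = split_chunks(edited)
shared = set(old_chunks) & set(new_chunks)
print(f"{len(shared)} of {len(set(old_chunks))} distinct original chunks reappear")
```

Because each cut in this sketch depends only on the preceding 32 bytes, every chunk that starts at least 32 bytes into the original file reappears unchanged after the insert. With fixed-size blocks, the same 8-byte insert would shift every block boundary.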
Each chunk is identified by its BLAKE3 hash. Before writing a chunk, Frostholm checks whether that hash already exists in the repository index. If it does, the chunk is referenced rather than written. This works across different files, different snapshots, and different source directories in the same repository.
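The check-before-write step can be sketched with a toy in-memory store. The class ChunkStore and its layout are hypothetical, and BLAKE2b from Python's standard library stands in for BLAKE3, which the standard library does not provide. Encryption and compression are omitted.

```python
import hashlib


class ChunkStore:
    """Toy in-memory repository: one pack buffer plus an index."""

    def __init__(self) -> None:
        self.pack = bytearray()                      # stands in for pack files
        self.index: dict[str, tuple[int, int]] = {}  # chunk hash -> (offset, length)

    def put(self, chunk: bytes) -> str:
        # BLAKE2b is a stand-in here; the real tool is described as using BLAKE3
        digest = hashlib.blake2b(chunk, digest_size=32).hexdigest()
        if digest not in self.index:                 # only unseen chunks are written
            self.index[digest] = (len(self.pack), len(chunk))
            self.pack += chunk
        return digest

    def get(self, digest: str) -> bytes:
        offset, length = self.index[digest]
        return bytes(self.pack[offset:offset + length])


store = ChunkStore()
# Each "snapshot" is just the ordered list of chunk hashes it references
snapshot_1 = [store.put(c) for c in (b"alpha", b"beta", b"gamma")]
snapshot_2 = [store.put(c) for c in (b"alpha", b"beta", b"delta")]

print(len(store.index), "unique chunks,", len(store.pack), "bytes stored")
# -> 4 unique chunks, 19 bytes stored
print(b"".join(store.get(d) for d in snapshot_2))
# -> b'alphabetadelta'
```

The second snapshot adds only the 5 bytes of the new chunk, yet it can still be restored in full because its first two entries point at chunks written by the first snapshot.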
Typical deduplication ratios on real workloads:
| Workload | Raw size | Stored (after dedup) | Ratio |
|---|---|---|---|
| Daily dev work, 90 days | 42 GB | 6.1 GB | 6.9× |
| Photo library, weekly snapshots | 180 GB | 188 GB | 0.96× |
| Mixed docs & code, 1 year | 28 GB | 3.4 GB | 8.2× |
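Photo libraries don't deduplicate well: each file is unique, already-compressed data, so the stored size ends up slightly above the raw size once repository overhead is included (a ratio below 1×). Code and documents deduplicate heavily because most lines and functions recur unchanged across versions.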
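The Ratio column appears to be raw size divided by stored size. That definition is inferred from the numbers rather than stated in the source, and the short check below reproduces the column from the other two. The function name dedup_ratio is made up for this example.

```python
# (workload, raw GB, stored GB) copied from the table above
rows = [
    ("Daily dev work, 90 days", 42.0, 6.1),
    ("Photo library, weekly snapshots", 180.0, 188.0),
    ("Mixed docs & code, 1 year", 28.0, 3.4),
]


def dedup_ratio(raw_gb: float, stored_gb: float) -> float:
    """Raw size over stored size; values below 1 mean the repository grew."""
    return raw_gb / stored_gb


for name, raw, stored in rows:
    print(f"{name}: {dedup_ratio(raw, stored):.2f}x")
```

Rounded to the precision used in the table, the three results are 6.9, 0.96, and 8.2.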
Encryption
All chunks are encrypted with ChaCha20-Poly1305 before being written to the repository. The encryption key is derived from your repository password using Argon2id (time=3, memory=64 MB, parallelism=4 by default).
The repository password is never stored anywhere. If you lose it, you lose access to the
repository. Frostholm will warn you about this during fh init and prompt you
to confirm.
# Set password via environment variable (recommended for scripts)
export FH_REPO_PASSWORD="your-passphrase"
fh backup --repo /mnt/backup/my-repo ~/Documents
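The reason a lost password cannot be recovered is that the key exists only as the output of the derivation function. The sketch below illustrates that property with Python's standard library. Note the substitution: scrypt stands in for Argon2id, which the standard library does not include, so the cost parameters here are illustrative and unrelated to the Argon2id defaults listed above. Storing the salt in the repository config is also an assumption.

```python
import hashlib
import os


def derive_key(password: str, salt: bytes) -> bytes:
    # scrypt is a stand-in for Argon2id; n, r, p are illustrative values only
    return hashlib.scrypt(
        password.encode("utf-8"), salt=salt, n=2**14, r=8, p=1, dklen=32
    )


# Random and not secret; assumed to be kept alongside the other encryption params
salt = os.urandom(16)
key = derive_key("your-passphrase", salt)  # 32 bytes, the key size ChaCha20-Poly1305 expects

# The same password and salt always give back the same key...
assert derive_key("your-passphrase", salt) == key
# ...and any other password gives a different one, so the chunks stay unreadable
assert derive_key("wrong-passphrase", salt) != key
print(len(key), "byte key derived; nothing but the salt needs to be stored")
```

Only the salt and cost parameters need to be persisted. Without the password there is nothing in the repository from which the key could be rebuilt.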
Local backend
The simplest backend is a directory on any filesystem. It needs no additional configuration and works well for backing up to an external drive, a NAS mount, or a separate partition.
fh init --repo /Volumes/BackupDrive/frostholm-repo
S3-compatible backend
Works with AWS S3, Wasabi, MinIO, Cloudflare R2, and any other S3-compatible service.
Credentials are read from environment variables or ~/.aws/credentials.
export AWS_ACCESS_KEY_ID="..."
export AWS_SECRET_ACCESS_KEY="..."
fh init --repo s3://my-bucket/frostholm --s3-region us-east-1
# For non-AWS endpoints
fh init --repo s3://my-bucket/frostholm --s3-endpoint https://s3.wasabisys.com
Backblaze B2 backend
This backend uses B2's native API directly rather than B2's S3-compatible API. Compared with reaching B2 through the S3-compatible backend, it provides better multipart upload handling and lower API call overhead on large repositories.
export B2_APPLICATION_KEY_ID="..."
export B2_APPLICATION_KEY="..."
fh init --repo b2://my-b2-bucket/frostholm
See the v0.4 release post for B2-specific tips on bucket configuration and lifecycle rules.