- Overview
- Quick Start
- Installation
- Usage
- Exit Codes
- Output Streams
- State Management
- Creation Arguments
- Creation Modes
- Creation Glob Patterns
- Marker Files
- Ignore Files
- Configuration
- Limitations
- License
par2cron is a tool that wraps par2cmdline (a parity-based file recovery tool)
to automate the periodic creation, verification and repair of PAR2 sets within
any given directory tree. It is designed for non-changing, WORM-type files,
making it well suited to adding a degree of protection to media libraries or backups.
The driving idea is that you do not need to invest in a filesystem (like ZFS) that protects all your data, at the cost of additional complexity, when you really only care that important subsets of your data remain protected.
A given directory tree on any filesystem is scanned for marker files, and a
PAR2 set is created for every directory containing such a _par2cron file. For
verification, the program loads the PAR2 sets and verifies that the data which
they are protecting is healthy, otherwise flagging the PAR2 set for repair.
Once repair runs, corrupted or missing files are recovered. Many command-line
tunables, as well as configuration directives, are offered for more granular
adjustment of how to create, when to verify and in what situation to repair.
A set-and-forget setup is as easy as adding three commands to crontab:
par2cron create
par2cron verify
par2cron repair
That being set up, you can simply protect any valuable folder by just placing a
_par2cron file in it; the tool will create a PAR2 set and pick it up into the
periodic verification and repair cycle - now protected from corruption/bitrot.
A default setup involves adding three simple crontab entries:
0 1 * * * par2cron create /mnt/storage
0 3 * * * par2cron verify /mnt/storage
0 5 * * * par2cron repair /mnt/storage
Once configured, protecting a new folder is as simple as:
- Navigating to any directory within /mnt/storage
- Creating an empty "marker" file named _par2cron
- Done - your files are protected after the next scheduled run!
PAR2 sets are then verified and repaired by the periodic tasks set up above.
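The steps above amount to a single `touch`. The following is a minimal sketch, not part of par2cron itself: `protect_dir` is a hypothetical helper name, and the temporary directory stands in for a real folder below /mnt/storage.

```shell
#!/bin/sh
# Hypothetical helper: drop an empty "_par2cron" marker into a directory
# so that the next scheduled "par2cron create" run picks it up.
protect_dir() {
    dir=$1
    [ -d "$dir" ] || { echo "not a directory: $dir" >&2; return 1; }
    touch "$dir/_par2cron"
}

# Demonstration against a throwaway directory (stand-in for a real folder).
demo=$(mktemp -d)
protect_dir "$demo" && ls -A "$demo"   # lists the new marker file
```

Using `touch` rather than a redirection leaves an existing marker (and any configuration inside it) untouched.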
A condensed quick guide and cheatsheet can be found in the QUICKGUIDE file.
One PAR2 per folder: To keep your mental model simple, marker-based PAR2 creation does not recurse into subfolders by default. The flat protection scope ensures that you know exactly which files a PAR2 covers.
To build from source, a Makefile is included with the project's source code.
Running make all will compile the application and pull in any necessary
dependencies. make check runs the test suite and static analysis tools.
For convenience, precompiled static binaries for common architectures are
released through GitHub. These can be installed into /usr/bin/ or respective
system locations; ensure they are executable by running chmod +x before use.
Builds from source are designed to be reproducible, meaning that they should compile byte-identical to the respective released binaries and therefore produce the exact same checksums upon integrity verification.
par2 (the binary of the par2cmdline tool):
- Debian/Ubuntu: apt install par2
- macOS: brew install par2
- Fedora: dnf install par2cmdline
git clone https://github.com/desertwitch/par2cron.git
cd par2cron
make all
./par2cron --help
The program is divided into separate commands to achieve its tasks:
| Command | Purpose |
|---|---|
| `par2cron create` | Creates PAR2 sets for directories with marker files |
| `par2cron verify` | Verifies existing PAR2 sets in a directory tree |
| `par2cron repair` | Repairs corrupted files using PAR2 recovery data |
| `par2cron info` | Shows verification cycle and configuration statistics |
| `par2cron check-config` | Validates a par2cron YAML configuration file |
Scans a directory tree for "_par2cron" marker files
Creates PAR2 sets for directories containing a marker file
Usage:
par2cron create [flags] <dir>... [-- par2-args...]
Examples:
Use configuration file instead of CLI arguments:
par2cron create -c /tmp/par2cron.yaml /mnt/storage
Pass "-r15 -n1" (15% redundancy, 1 recovery file) to par2:
par2cron create /mnt/storage -- -r15 -n1
Run for around 1 hour (as soft limit), hide created files:
par2cron create -d 1h --hidden /mnt/storage
Flags:
-c, --config string path to a par2cron YAML configuration file
-d, --duration duration time budget per run (best effort/soft limit)
-g, --glob string PAR2 set default glob (files to include) (default "*")
-h, --help help for create
--hidden create PAR2 sets and related files as hidden (dotfiles)
--json output structured logs in JSON format
-l, --log-level level minimum level of emitted logs (debug|info|warn|error) (default info)
-m, --mode mode PAR2 set default mode; creates a set per (folder|nested|file|recursive) (default folder)
-v, --verify PAR2 sets must pass verification as part of creation
Verifies all protected data using the existing PAR2 sets
Corrupted/missing files are flagged for the repair operation
Usage:
par2cron verify [flags] <dir>... [-- par2-args...]
Examples:
Use configuration file instead of CLI arguments:
par2cron verify -c /tmp/par2cron.yaml /mnt/storage
Verify all sets, argument "-q" (quiet mode) for par2:
par2cron verify /mnt/storage -- -q
Verify sets not verified < 7 days, run around 2 hours:
par2cron verify -a 7d -d 2h /mnt/storage
Flags:
-a, --age duration minimum time between re-verifications (skip if verified within this period)
-i, --calc-run-interval duration how often you run par2cron verify (for backlog calculations) (default 24h)
-c, --config string path to a par2cron YAML configuration file
-d, --duration duration time budget per run (best effort/soft limit)
-h, --help help for verify
-e, --include-external include PAR2 sets without a par2cron manifest (and create one)
--json output structured logs in JSON format
-l, --log-level level minimum level of emitted logs (debug|info|warn|error) (default info)
--skip-not-created skip PAR2 sets without a par2cron manifest containing a creation record
External PAR2: par2cron can verify existing sets created by other tools. Use the
--include-external flag to pull these into the verification cycle (creating par2cron manifests for them in the process).
Repair all data flagged as repairable during verification
Uses existing PAR2 sets to recover corrupted/missing files
Usage:
par2cron repair [flags] <dir>... [-- par2-args...]
Examples:
Use configuration file instead of CLI arguments:
par2cron repair -c /tmp/par2cron.yaml /mnt/storage
Repair all sets, argument "-q" (quiet mode) for par2:
par2cron repair -u /mnt/storage -- -q
Repair repairable, verify after, run for around 1 hour:
par2cron repair -d 1h -v /mnt/storage
Flags:
-u, --attempt-unrepairables attempt to repair PAR2 sets marked as unrepairable
-c, --config string path to a par2cron YAML configuration file
-d, --duration duration time budget per run (best effort/soft limit)
-h, --help help for repair
--json output structured logs in JSON format
-l, --log-level level minimum level of emitted logs (debug|info|warn|error) (default info)
-t, --min-tested int repair only when verified as corrupted at least X times
-p, --purge-backups remove obsolete backup files (.1, .2, ...) after successful repair
-r, --restore-backups roll back protected files to pre-repair state after unsuccessful repair
--skip-not-created skip PAR2 sets without a par2cron manifest containing a creation record
-v, --verify PAR2 sets must pass verification as part of repair
Analyzes the directory tree for statistics about PAR2 sets
Shows verification statistics and configuration information
Usage:
par2cron info [flags] <dir>...
Examples:
Analyze a 7-day cycle with 2-hour daily runs:
par2cron info -a 7d -d 2h /mnt/storage
Analyze a 14-day cycle with 4-hour weekly runs:
par2cron info -a 14d -d 4h -i 1w /mnt/storage
Output results as JSON (stdout/standard output):
par2cron info --json /mnt/storage
Flags:
-a, --age duration target cycle length (time between re-verifications)
-i, --calc-run-interval duration how often you run par2cron verify (default 24h)
-c, --config string path to a par2cron YAML configuration file
-d, --duration duration target time budget for each verify run (soft limit)
-h, --help help for info
-e, --include-external include external PAR2 sets without a par2cron manifest
--json output in JSON format (result to stdout, logs to stderr)
-l, --log-level level minimum level of emitted logs (debug|info|warn|error) (default info)
--skip-not-created skip PAR2 sets without a par2cron manifest containing a creation record
Validates the syntax of a par2cron YAML configuration
Use the command to check configurations before deploying
Usage:
par2cron check-config <file> [flags]
Examples:
Validate a par2cron YAML configuration file:
par2cron check-config /tmp/par2cron.yaml
Flags:
-h, --help help for check-config
Granular codes allow for integration with scripts and notification services:
| Code | Name | Description |
|---|---|---|
| 0 | Success | All operations completed successfully. |
| 1 | Partial Failure | One or more tasks failed, but the process continued. |
| 2 | Bad Invocation | Invalid command-line arguments or configuration error. |
| 3 | Repairable | Corruption detected, but parity data is sufficient to repair. |
| 4 | Unrepairable | Corruption detected that exceeds available redundancy. |
| 5 | Unclassified | An unexpected or unknown error occurred. |
In general, the program is able to recover from most problematic situations without user interaction, either by retrying failures at a later time or by rebuilding corrupted or missing manifests (read more about manifests below) wherever possible. Failure-related exit codes usually relate directly to encountered errors that require some degree of manual inspection by the user.
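As an illustration of how the exit codes above could feed a script or notification service, here is a minimal sketch. `describe_exit` and `run_and_report` are hypothetical helper names, not part of par2cron; the code-to-name mapping is taken from the table above.

```shell
#!/bin/sh
# Map a par2cron exit code to its name from the exit code table.
describe_exit() {
    case "$1" in
        0) echo "Success" ;;
        1) echo "Partial Failure" ;;
        2) echo "Bad Invocation" ;;
        3) echo "Repairable" ;;
        4) echo "Unrepairable" ;;
        5) echo "Unclassified" ;;
        *) echo "Unknown ($1)" ;;
    esac
}

# Hypothetical wrapper: run any command and report a non-zero result on stderr.
# Replace the echo with a call to your notification service of choice.
run_and_report() {
    "$@"
    code=$?
    if [ "$code" -ne 0 ]; then
        echo "par2cron: $(describe_exit "$code") (exit code $code)" >&2
    fi
    return "$code"
}
```

With the real tool, a scheduled job could then call, for example, `run_and_report par2cron verify /mnt/storage`. The wrapper passes the original exit code through, so surrounding scripts can still branch on it.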
As par2cron needs to coordinate between itself and the par2 program, their
output is clearly and cleanly separated. All par2cron logs, using structured
logging (either text-/JSON-based), are written to standard error (stderr).
Unstructured par2 program output is written to standard output (stdout).
The only anomaly to the above is the info command, which does not use the
par2 program. In non-JSON mode, again structured logging is written to
standard error (stderr), and unstructured information to standard output
(stdout). In JSON mode, all structured logging is written to standard
error (stderr), and the JSON-encoded result to standard output (stdout).
As a general rule of thumb this can be condensed into:
- Structured logging goes to standard error (stderr)
- Command-related output goes to standard output (stdout)
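Because the two streams are kept apart, they can be captured separately. The sketch below shows the idea with a stand-in command, since it is meant to run without par2cron installed; `run_split` and `fake_cmd` are hypothetical names introduced for illustration.

```shell
#!/bin/sh
# Run a command with stdout and stderr captured in separate files.
# Usage: run_split <stdout-file> <stderr-file> <command> [args...]
run_split() {
    out=$1
    log=$2
    shift 2
    "$@" >"$out" 2>"$log"
}

# Stand-in for a par2cron command: one line on each stream.
fake_cmd() {
    echo "result"
    echo "log line" >&2
}

tmp=$(mktemp -d)
run_split "$tmp/stdout.txt" "$tmp/stderr.txt" fake_cmd
cat "$tmp/stdout.txt"   # holds only what was written to stdout
```

With the real tool this could look like `run_split stats.json par2cron.log par2cron info --json /mnt/storage`, leaving the JSON result in one file and the structured logs in the other.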
The program aims to off-load all state directly next to the protected files. As a result, par2cron creates a manifest and lock file next to each PAR2 set. While this may seem like clutter at first, it is a conscious design choice that eliminates the need for a central database and allows these files to travel alongside backups. Verification and repair rely heavily on the manifest file to turn a verification result into an eventual repair or re-verification where needed. The lock file ensures that multiple par2cron instances can run concurrently on the same directory tree without interfering with each other.
/mnt/storage/Pictures/
├── beach.jpg
├── flowers.jpg
├── Pictures.par2 <-- par2 index file
├── Pictures.vol00+01.par2 <-- par2 recovery data
├── Pictures.par2.json <-- par2cron manifest
└── Pictures.par2.lock <-- par2cron lockfile
Because all state is stored locally within the directory tree, you can move your protected folders between different drives or servers. As long as par2cron is running on the new host, it will pick up existing manifests and continue the verification cycle. While the lockfile ensures multiple par2cron instances on the same computer do not collide, you need to ensure that shared locations are only ever accessed by one par2cron instance at a time (network/cloud drives).
The --hidden argument of create can be useful to hide the PAR2 sets, if
the number of additional files gets in the way of your file organization.
If opting for this, it should be noted that some backup programs will not
transfer hidden files (dotfiles) without being configured to do so, so you
should consider this when moving around par2cron-protected directory trees.
By default, no additional arguments are given to the par2 program by the
three par2cron operations that call it. However, it is strongly recommended to
set default par2 arguments for the create command that reflect your
needs and situation. You can decide the default set of arguments to
give to par2 for any of the par2cron commands using either the configuration
file or appending them as [-- par2-args...]:
par2cron create /mnt/storage -- -r15 -n1
par2cron verify /mnt/storage -- -q
par2cron repair /mnt/storage -- -m512
As you can see, anything following -- is treated as default arguments to
pass to the par2 program for that par2cron operation. For the create
operation, this can then be influenced for individual creation jobs by use
of the marker filename or marker configuration (read more about this below).
A list of all the possible par2 arguments can be found here:
https://github.com/Parchive/par2cmdline#using-par2cmdline
The create command offers four distinct operation modes, controlling how many
PAR2 sets are created and where they are placed. The glob pattern (see below)
controls which files are considered for protection in any of these modes.
| Mode | PAR2 sets created (marker-containing folder as base) |
|---|---|
| `folder` (default) | One PAR2 set for all matching files |
| `nested` | One PAR2 set per folder containing matching files |
| `file` | One PAR2 set per matching file |
| `recursive` | One PAR2 set, but using par2 internal recursion |
Folder mode creates a single PAR2 set covering all glob-matching files, placed in the
marker-containing directory. With the default glob *, this means all files in
the marker folder itself, but not folders. With a deep glob like **/*.jpg,
this extends to matching files in subfolders - but still produces only one PAR2
set.
The PAR2 set is named after the directory it resides in. For example, a marker
in /mnt/storage/Pictures produces Pictures.par2. This can be overridden
on a per-job basis using the name directive within the marker configuration.
This is the recommended mode for most use cases.
One marker, one PAR2 set, simple and clean mental model.
Nested mode creates one PAR2 set per folder that contains glob-matching files, starting from
the marker-containing directory and descending into subfolders, provided a deep
glob like **/*.jpg is used. Each PAR2 set is placed in the folder it protects
and named after that folder.
This is useful for structured collections like media libraries. A single marker
in Movies/ with a deep glob will produce Movies/A/A.par2, Movies/B/B.par2
and so on - without needing individual marker files in each subfolder. Every
folder that contains at least one matching file gets its own independent PAR2
set, including deeply nested folders.
Combined with the persist marker directive, nested mode allows handling of
growing collections. New folders are picked up automatically on the next run,
and existing folders with a PAR2 set are skipped (such existing PAR2 sets will
not be updated). To update a folder's PAR2 set (after adding or changing
files), delete the old PAR2 set and the next run will recreate it (with your
changes) without requiring re-creation of a marker file.
File mode creates one PAR2 set per matching file, placed next to the file it protects.
Each PAR2 set is named after its file, so beach.jpg produces beach.jpg.par2.
This naming is not changeable through marker configuration.
This mode is useful for large collections where verifying or repairing a single
combined PAR2 set would take too long. The trade-off is more files: each
protected file produces its own set of PAR2 recovery files. The hidden
argument or marker directive can hide any PAR2-related files for less clutter.
With a deep glob like **/*.jpg, PAR2 sets are created next to each matching
file in its respective subfolder, so they always stay close to each other.
Recursive mode creates a single PAR2 set and delegates recursion entirely to par2 itself
using its -R flag. The glob pattern controls which files and folders (only of
the marker-containing directory itself) par2 receives; par2 will then
greedily include everything within any glob-matching folder.
The PAR2 set is placed in the marker-containing directory and named after it,
like in folder mode. The key difference is that par2 handles the recursion
internally rather than par2cron building a recursive path list from a glob.
Beware that this mode does not support deep glob patterns containing / or **, as
combining par2cron's glob recursion with par2 internal recursion would result
in unpredictable behavior and double recursion. For fine-grained control over
what gets protected across subfolders, use folder, nested or file mode with deep
globs instead. If you just want everything in the marker-containing
directory and below, glob * with recursive mode can be the simple choice.
Recursive mode is not recommended as a default. Use it on a per-job basis through marker configuration when the other modes do not fit your needs.
The --glob argument controls which files are considered for protection. It
defaults to *, matching all non-hidden files in the marker-containing
directory. par2cron uses the
doublestar library, supporting
the full range of glob patterns including ** for crossing directory
boundaries. The glob pattern can be changed in the default configuration or on a
per-job basis using the marker configurations.
Shallow patterns like *, *.jpg or *.{jpg,png} match files within a single
directory. In folder, nested and file modes, this means only files directly in
the marker-containing folder are considered. In recursive mode, the pattern is
applied to files and folders in the marker-containing directory, with par2
recursing into any matching folders.
Patterns containing / or ** such as **/*.jpg or */data/*.csv cross
directory boundaries. This allows all non-recursive modes (folder, nested, file)
to match specific files across subfolders:
- folder collects all matches into one PAR2 set in the marker directory
- nested groups matches by their containing folder, one PAR2 set per folder
- file creates one PAR2 set per match, placed next to each file
Deep patterns are not supported in recursive mode, as par2 performs its own
recursion internally. Combining both would result in unpredictable behavior.
| Pattern | Matches |
|---|---|
| `*` | All files in the marker directory |
| `*.mp4` | All .mp4 files in the marker directory |
| `*.{mkv,srt}` | All .mkv and .srt files in the marker directory |
| `**/*` | All files in the marker directory and its subfolders |
| `**/*.mkv` | All .mkv files in the marker directory and below |
| `data/**/*.iso` | All .iso files in data/ and its subdirectories |
For a full list of supported patterns, refer to the doublestar documentation.
Marker files are the core of the par2cron create operation. A marker file
denotes that elements in the containing directory need protecting. In its most
basic form it is just an empty _par2cron file, with the defaults from the
command-line arguments or configuration file being applied, but more control over
individual creation jobs is possible through the marker files (read more below).
Upon successful creation of the PAR2 set, the marker file is normally deleted. In case of failure, the creation is retried with the next run. If a same-named PAR2 set is already present in the directory, the marker file is skipped and a warning is presented to the user (this does not result in a non-zero exit code).
The above does not apply when a marker file is set to persist (see below), which
allows re-use of marker files for growing folders. New folders are then picked up
automatically on the next run, and existing PAR2 sets are skipped without warning.
This is interesting for nested mode, but does not update existing PAR2 sets.
By default, subfolders are not considered for the created PAR2 set. par2cron promotes a clear mental model of "One PAR2 per folder". This helps to reduce cognitive load and wondering "Which files did this PAR2 protect again?".
/mnt/storage/Pictures/
├── Nature/
├── _par2cron <-- par2cron marker file
├── beach.jpg <-- will be protected using PAR2
└── sunset.jpg <-- will be protected using PAR2
For subfolder protection, consider using nested mode with a deep glob pattern. Recursive mode should only be set on a per-job basis via marker configuration.
Users wanting to re-create any of their PAR2 sets (having added or updated files) simply need to delete that PAR2 set and place a new marker file into the directory. With persistent marker files (e.g. for nested creation mode), only the old PAR2 set needs to be deleted; no new marker file needs to be created.
You can modify the default arguments that are given to par2, settable
through the command-line arguments or configuration file (see Usage and
Configuration sections), for individual creation jobs. An example would be
that your defaults are -r15 -n1, for 15% redundancy and 1 recovery file. Now
you have an especially important dataset that you would like to have 30% of
redundancy for, without wanting to change your configuration or affecting other
creations. This can simply be realized by creating a marker file with the name
_par2cron_r30, which will then create the PAR2 set using -r30 -n1, so
leaving in place the other default arguments (in this case -n1).
If an argument provided as part of a marker filename is not among the default
arguments given to par2, it is simply added for that creation job. An example
would be wanting to add -q to your -r15 -n1 default, in which case you would
simply create a marker file named _par2cron_q. This also applies if no default
arguments to be given to par2 were set, effectively adding any arguments that
are part of the marker filename, again - only for that individual creation job.
| Marker filename | Default arguments | Resulting arguments |
|---|---|---|
| `_par2cron` | `-r15 -n1` | `-r15 -n1` |
| `_par2cron_q` | `-r15 -n1` | `-r15 -n1 -q` |
| `_par2cron_r30` | `-r15 -n1` | `-r30 -n1` |
| `_par2cron_r30_q` | `-r15 -n1` | `-r30 -n1 -q` |
The use case for this is being able to fine-tune individual creation jobs by
just memorizing the often-used, important arguments for e.g. redundancy without
having to remember the entire (set as default) collection of par2 arguments.
The above examples assume -r15 -n1 were set as the par2 default arguments for
the creation task (using par2cron create /mnt/storage -- -r15 -n1). However,
it applies for any combination of default creation arguments passed to par2.
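The merge behavior described above can be modeled in a few lines of shell. This is only an illustrative sketch and not par2cron's actual implementation: `merge_marker_args` is a hypothetical name, and it assumes that an argument from the marker filename replaces a default argument when both share the same flag letter, and is appended otherwise.

```shell
#!/bin/sh
# Illustrative model only (assumption: arguments are matched by flag letter).
# Usage: merge_marker_args <marker-filename> [default-args...]
merge_marker_args() {
    marker=$1
    shift
    defaults=$*
    # "_par2cron_r30_q" -> "r30 q"
    tokens=$(printf '%s' "${marker#_par2cron}" | tr '_' ' ')
    result=""
    used=""
    # Walk the defaults, swapping in a marker argument with the same flag letter.
    for d in $defaults; do
        letter=$(printf '%s' "$d" | cut -c2)
        replaced=""
        for t in $tokens; do
            if [ "$(printf '%s' "$t" | cut -c1)" = "$letter" ]; then
                result="$result -$t"
                used="$used $t"
                replaced=1
                break
            fi
        done
        [ -n "$replaced" ] || result="$result $d"
    done
    # Append marker arguments that did not replace any default.
    for t in $tokens; do
        case " $used " in
            *" $t "*) ;;
            *) result="$result -$t" ;;
        esac
    done
    echo "${result# }"
}

merge_marker_args _par2cron_r30_q -r15 -n1   # prints: -r30 -n1 -q
```

The sketch reproduces the rows of the table above, including the case where no default arguments are set and the marker arguments are simply added.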
For more control, or replacing the entire default arguments that are given
to par2 (again only for the individual creation job), read below about marker
configuration (the optional content that can be placed within a marker file).
For convenience, the par2 argument -R (when found in a marker filename)
will also automatically set creation mode to recursive for that job. As a
result, any marker file named _par2cron_R will override to recursive mode,
even if this was not specifically set through the YAML marker configuration.
In most cases a marker file will have no content, but for maximum control over individual creation jobs it is possible to place YAML directives inside. These directives allow overriding the defaults set via command-line arguments or configuration file, but only for the individual creation job. All settings are optional, so you can just pick the ones needed for the individual creation job.
Below is an example of a configuration using all directives:
# Override name of the PAR2 set to be created
# Beware that this setting does not apply in file mode
name: "Ubuntu"
# Override the arguments passed to par2
# Replaces the default arguments set in CLI/configuration
args: ["-r30", "-n1"]
# Override the glob pattern
# Refer to section "Creation Glob Patterns" of documentation
glob: "*.iso"
# Override the creation mode [folder|nested|file|recursive]
mode: "folder"
# Override whether to verify the PAR2 set after creation
verify: true
# Override whether to create the PAR2 set and related files as hidden
hidden: true
# Do not delete this marker file after PAR2 set creation
# If set, no warnings will be raised about existing PAR2 sets
# Allows re-use of marker file for growing folders (e.g. in nested mode)
persist: true
The directives are designed to be easy to remember. For the rare case that you need such a marker configuration, keeping a small cheat-sheet at hand is still recommended, because YAML errors will result in a non-zero exit code.
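A marker file with content can be written in one step from the shell. The sketch below uses only directives from the example above and combines nested mode with a persistent marker; the temporary directory is a stand-in for a real folder such as a media library root.

```shell
#!/bin/sh
# Write a marker file carrying YAML directives into a throwaway directory
# (stand-in for a real folder below your par2cron root).
dir=$(mktemp -d)
cat > "$dir/_par2cron" <<'EOF'
# One PAR2 set per subfolder, 30% redundancy, keep the marker for later runs
mode: "nested"
glob: "**/*.mkv"
args: ["-r30", "-n1"]
persist: true
EOF
cat "$dir/_par2cron"
```

The quoted heredoc delimiter ('EOF') keeps the shell from expanding anything inside the YAML, so the glob pattern reaches the file unchanged.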
A situation may arise where you want to exclude a folder (or directory tree) from all par2cron operations, either temporarily or permanently. You can do so by placing an ignore file in that directory, so that it is excluded from the job enumeration of par2cron. This allows you, for example, to exclude directories with PAR2 sets that you do not want verified or otherwise touched by the program.
- .par2cron-ignore (ignore this folder)
- .par2cron-ignore-all (ignore this folder and subfolders)
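Both ignore files are created like any other empty file. A minimal sketch, using a throwaway tree with made-up folder names (Scratch, Downloads) in place of a real par2cron root:

```shell
#!/bin/sh
# Throwaway tree standing in for a real par2cron root (folder names are examples).
root=$(mktemp -d)
mkdir -p "$root/Scratch/tmp" "$root/Downloads"

# Ignore Scratch/ and all of its subfolders.
touch "$root/Scratch/.par2cron-ignore-all"
# Ignore only Downloads/ itself; its subfolders are not covered by this file.
touch "$root/Downloads/.par2cron-ignore"

find "$root" -name '.par2cron-ignore*'
```

Removing the file again brings the folder back into the job enumeration on the next run, which makes temporary exclusions easy to undo.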
A configuration file can be given to par2cron, which is reusable and replaces the need to achieve complex setups through the command-line arguments entirely.
You should verify the configuration using par2cron check-config, as malformed
configuration will prevent the program from starting (bad invocation exit code).
For a full configuration example, refer to the par2cron.yaml file.
par2cron, and PAR2 in general, is mostly designed to operate on non-changing data. It simply has no concept of data being updated, instead flagging such updates as possible corruption. If you need to update any protected files, you will need to manually delete the PAR2 set and then have it recreated using the marker file approach (the same process as for newly protected data).
A par2cron-generated PAR2 set will consist of at least 4 files and possibly more
depending on your par2 arguments. This can cause significant file clutter in
directories, which can be mitigated by using the --hidden argument with
create (read more about this in the above section State Management).
While the lockfile ensures multiple par2cron instances on the same computer do not collide, you need to ensure that shared (network) locations are only ever accessed by one par2cron computer at a time (e.g. different weekdays).
par2 (the dependency) itself does not support protecting files through
symbolic links ("symlinks"). It is also strongly discouraged to organize
important files this way, because it makes keeping files physically close more
difficult and backup planning more brittle and error-prone. par2cron skips
over symbolic links with a warning, and rejects with an error any glob patterns
where it is already obvious from the non-glob parts that a symlink will need to
be followed to glob the files that need protecting (e.g. symlink/**/*.txt).
All code is licensed under the MIT License.
