ADR-0017: Pre-commit hook implementation
Status: Accepted
Date: 2026-05-19
Context
ADR-0009 specified that a pre-commit hook should enforce the “AI hard rules” from CONTEXT.md so that drift becomes structurally impossible — failed hook = failed commit = AI must fix and retry. After 2 successful pilot batches, the patterns to enforce have stabilized enough to encode them.
Decision
Implement scripts/pre-commit-hook.py — a standard-library-only Python script (no external dependencies) installed as a git pre-commit hook via symlink: .git/hooks/pre-commit → ../../scripts/pre-commit-hook.py.
The hook reads staged .md files (git diff --cached --name-only --diff-filter=ACM) and applies type-specific checks based on the type: field in frontmatter.
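The file-selection and dispatch step can be sketched roughly as below. This is an illustrative sketch, not the contents of scripts/pre-commit-hook.py: the function names `staged_markdown_files` and `parse_frontmatter_type` are hypothetical, and it assumes frontmatter is a `---`-delimited block at the very top of each file.

```python
from __future__ import annotations

import re
import subprocess

# Assumption: frontmatter is a '---' delimited block at the start of the file.
FRONTMATTER_RE = re.compile(r"\A---\s*\n(.*?)\n---\s*(?:\n|\Z)", re.DOTALL)
TYPE_RE = re.compile(r"^type:\s*(\S+)\s*$", re.MULTILINE)


def staged_markdown_files() -> list[str]:
    """Return staged .md paths that were added, copied, or modified."""
    out = subprocess.run(
        ["git", "diff", "--cached", "--name-only", "--diff-filter=ACM"],
        capture_output=True, text=True, check=True,
    ).stdout
    return [path for path in out.splitlines() if path.endswith(".md")]


def parse_frontmatter_type(text: str) -> str | None:
    """Return the value of the `type:` field, or None if it is absent."""
    block = FRONTMATTER_RE.match(text)
    if block is None:
        return None
    found = TYPE_RE.search(block.group(1))
    return found.group(1).strip("\"'") if found else None
```

A caller would then pick the checks to run from the returned type, skipping files where the result is `None`.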
Checks implemented (numbered to match CONTEXT.md “AI hard rules”)
| # | Rule | Implementation |
|---|---|---|
| 1 | Tag must be in _meta/tags.md registry | Parse tag list from registry; verify every tags: entry in staged file is present |
| 2 | Atomic body ≤ 300 words | Strip markdown formatting + wikilinks + footnote refs, count space-separated tokens |
| 3 | Threads have ## Counter-argument AND ## Response sections | Substring presence check |
| 4 | Counter has cited critic OR “source not located” disclaimer | Regex for wikilinks [[…]], inline (Author Year) citation, or the literal disclaimer string |
| 5 | Resource wikilinks (containing /) resolve to existing files | Path lookup under resources/ |
| 7 | Person/Glossary ## Referenced by matches actual vault wikilink graph | Scan INDEX_FOLDERS for files containing [[<slug>]], compare with parsed list; report missing or extra entries |
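Two of the checks in the table, rule 2 and rule 7, might look roughly like the sketch below. The regexes and function names are assumptions made for illustration. In particular, this version keeps a wikilink's visible text when counting words, and it takes note contents as a dict rather than scanning INDEX_FOLDERS on disk; the real script may do either differently.

```python
from __future__ import annotations

import re

# Hypothetical patterns; the actual hook may strip markup differently.
WIKILINK_RE = re.compile(r"\[\[([^\]|#]+)(?:#[^\]|]*)?(?:\|([^\]]*))?\]\]")
FOOTNOTE_REF_RE = re.compile(r"\[\^[^\]]+\]")
MARKUP_RE = re.compile(r"[*_`>#]+")


def body_word_count(body: str) -> int:
    """Rule 2: count whitespace-separated tokens after stripping markup."""
    # Keep the visible text of a wikilink (alias if present, else the target).
    text = WIKILINK_RE.sub(lambda m: m.group(2) or m.group(1), body)
    text = FOOTNOTE_REF_RE.sub("", text)
    text = MARKUP_RE.sub("", text)
    return len(text.split())


def files_linking_to(slug: str, notes: dict[str, str]) -> set[str]:
    """Rule 7: names of notes containing [[slug]], [[slug|alias]] or [[slug#anchor]]."""
    pattern = re.compile(r"\[\[" + re.escape(slug) + r"(?:[#|][^\]]*)?\]\]")
    return {name for name, text in notes.items() if pattern.search(text)}


def referenced_by_diff(listed: set[str], actual: set[str]) -> tuple[set[str], set[str]]:
    """Rule 7: return (missing, extra) entries of a '## Referenced by' list."""
    return actual - listed, listed - actual
```

Returning missing and extra entries separately lets the hook report exactly which lines to add or remove.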
Checks NOT implemented
- Rule 6 (no commit outside active staging batch) — workflow discipline, not commit-time check. Hook has no concept of “active batch.”
- Frontmatter schema validation (required fields per type) — could add later; current cost/benefit suggests waiting until a missing-field bug actually causes harm.
- Citation grammar shape per class (rule 5 strong form): the current implementation only checks that resource paths resolve; it does not verify that the anchor format matches the per-class grammar in CONTEXT.md (e.g., #3:16 for Bible vs #section-heading for BR). Defer until a real grammar drift bug appears.
Failure mode
On any error: prints failures to stderr with ✗ prefix, prints total error count, exits 1 — git aborts the commit. AI must fix and re-stage.
On success: prints ✓ pre-commit checks passed (N markdown file(s)), exits 0.
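The reporting behaviour described above could be implemented along these lines. This is a sketch only: the function name `report` and the exact wording of the error-count line are assumptions, while the ✗ prefix, the success message, and the exit codes follow the description above.

```python
from __future__ import annotations

import sys


def report(errors: list[str], n_files: int) -> int:
    """Print the outcome and return the exit code the hook should use."""
    if errors:
        for err in errors:
            print(f"✗ {err}", file=sys.stderr)
        # Wording of this summary line is an assumption.
        print(f"{len(errors)} error(s) found", file=sys.stderr)
        return 1  # non-zero exit makes git abort the commit
    print(f"✓ pre-commit checks passed ({n_files} markdown file(s))")
    return 0
```

The script's entry point would end with something like `sys.exit(report(errors, len(files)))`.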
Install
chmod +x scripts/pre-commit-hook.py
ln -sf ../../scripts/pre-commit-hook.py .git/hooks/pre-commit
(The symlink makes the hook portable: clone the repo, run the install commands once, and the hook is live.)
Manual invocation
python3 scripts/pre-commit-hook.py # validates staged set (what's about to commit)
python3 scripts/pre-commit-hook.py --all # validates every tracked .md file in the vault
--all is useful for retroactive sweeps: after schema changes, after manual edits, or periodically as a sanity check that nothing has drifted between commits.
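One way the two modes could share a code path is sketched below. It assumes `--all` is backed by `git ls-files`, which is a guess consistent with "every tracked .md file" rather than something the ADR states; the names `build_git_command`, `parse_args`, and `select_files` are hypothetical.

```python
from __future__ import annotations

import argparse
import subprocess


def build_git_command(check_all: bool) -> list[str]:
    """Choose the git command that lists the files to validate."""
    if check_all:
        # Assumption: tracked files are listed with git ls-files.
        return ["git", "ls-files", "--", "*.md"]
    return ["git", "diff", "--cached", "--name-only", "--diff-filter=ACM"]


def parse_args(argv: list[str]) -> argparse.Namespace:
    parser = argparse.ArgumentParser(description="Vault pre-commit checks")
    parser.add_argument("--all", action="store_true", dest="check_all",
                        help="validate every tracked .md file, not just staged ones")
    return parser.parse_args(argv)


def select_files(check_all: bool) -> list[str]:
    """Run the chosen git command and keep only .md paths."""
    out = subprocess.run(build_git_command(check_all),
                         capture_output=True, text=True, check=True).stdout
    return [path for path in out.splitlines() if path.endswith(".md")]
```

Because git runs the hook with no arguments, the staged set remains the default and `--all` only takes effect on manual runs.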
Alternatives considered
- pre-commit framework (the popular Python hook manager): rejected because it adds a dependency and a .pre-commit-config.yaml. A single Python file with stdlib only is simpler and matches ADR-0009’s “small Python script (~50 lines)” intent. (The final implementation is ~180 lines; the larger size buys precise error messages and the full rule 7 graph check.)
- PyYAML for frontmatter parsing: rejected because we control the schema; a small regex parser handles our subset deterministically and adds zero dependencies.
- Subprocess + grep for the wikilink graph (rule 7): rejected in favor of pure-Python pathlib.Path.glob + compiled regex, which is portable across shells and has no piping fragility.
- Strict citation-grammar enforcement (rule 5 full form): deferred. The current resolves-to-file check catches the most common failure mode (broken or typo’d citations); stronger grammar matching is a future enhancement.
Consequences
- (+) AI cannot land a commit with drift in any of the enforced rules. The previously-manual “verify all ## Referenced by sections match” step is now automatic on every commit.
- (+) Future batches no longer need the manual grep-verification step at finalization (saves a few minutes per batch and eliminates a class of human error).
- (+) New contributors / future Claude sessions inherit the rules mechanically — the hook is the single source of truth for what “drift” means operationally.
- (−) Test/dev workflows that intentionally stage incomplete work get blocked. Workaround: git commit --no-verify exists, but per global instructions it should only be used with explicit user permission. That is acceptable, since the hook exists precisely to stop the kind of commit a bypass would let through.
- (−) ~180 lines of script to maintain. Mitigated by stdlib-only design and comprehensive comments.