$ttl
Prereq
Book PDF in `10-sources/`. Know the pack slug (e.g. `visual-identity`) and a short source-slug (e.g. `bokhua-principles-of-logo-design-2022`).
Steps
1. Verify text vs scan — on ≥20 pages
pdftotext -f 1 -l 20 book.pdf - | tr -d '[:space:]' | wc -c
Gotcha: check 20pp, NOT 5 — front matter (cover/title) is image-only and returns ~0 chars, falsely flagging a text book as scan. If genuinely scan (<1500 ch on 20pp) → OCR:
~/Sync/digitalGarden/bin/ocr.sh book.pdf -p 1-18 -l en+de -o /tmp/book.txt
Gotcha: for IA-sourced books prefer `ocr.sh` over the IA `_djvu.txt` (the latter is heavily degraded).
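The text-vs-scan decision in step 1 can be sketched as a small helper. This is a minimal sketch, assuming the 1500-character threshold on 20 pages quoted above; the function name `classify_text_layer` is hypothetical and not part of any existing tool.

```shell
# classify_text_layer CHARS
# Prints "scan" if CHARS (non-whitespace characters found on the first 20 pages)
# is below 1500, otherwise "text". The function name is a hypothetical helper.
classify_text_layer() {
  # Some wc implementations pad the count with spaces, so strip whitespace first.
  chars=$(printf '%s' "$1" | tr -d '[:space:]')
  if [ "$chars" -lt 1500 ]; then
    echo scan
  else
    echo text
  fi
}

# Usage (needs pdftotext and a real PDF, so it is left commented out):
#   n=$(pdftotext -f 1 -l 20 book.pdf - | tr -d '[:space:]' | wc -c)
#   classify_text_layer "$n"
```

A result of `scan` is the cue to run the OCR step; `text` means `pdftotext` output is usable as-is.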
2. Ground concepts
pdftotext book.pdf /tmp/book.txt
grep -niE "concept|anchor|terms" /tmp/book.txt | head -20
Read anchors; abstract concepts into your own words (summaries, never verbatim).
3. Bulk-create EU files + register (mkeu pattern)
cd ~/Sync/digitalGarden/10-extracted
APP=~/.claude/skills/runa-classify/append-event-v13.sh
SRC=$(md5 -q ~/Sync/digitalGarden/10-sources/book.pdf | cut -c1-32)
RUN="vi-<book>-2026-05-25" # substitute <book>; quoted so < and > are not parsed as redirections
mkeu(){ id="$1";tier="$2";ttl="$3";b1="$4"
cat > "$id.podlite" <<EOF
EU statement
$b1
EOF
snip=$(md5 -q "$id.podlite"|cut -c1-16)
"$APP" --event=eu_created --eu-id="$id" --details="$ttl" --tier-impact="$tier+1" \
--run-id="$RUN" --source-file-hash="$SRC" --snippet-hash="$snip" \
--ingestion-version=L1-manual --plugin=manual.read --pack=visual-identity; }
mkeu my-eu-slug T2 "Title" "Statement."
# ... repeat per EU
4. Close session + verify pyramid
"$APP" --event=runa_session_completed --eu-id="$RUN" --details="N EUs (T1:a T2:b ...)" \
--tier-impact="T1+a,T2+b,..." --run-id="$RUN" --pack=visual-identity \
--source-file-hash="$SRC" --ingestion-version=L1-manual --plugin=manual.read
~/.claude/skills/runa-pyramid/invariant-check.sh --session "$RUN" # must be 0 violations
Gotcha: `runa_session_completed` REQUIRES `--ingestion-version` AND `--plugin` (errors without). The declared `--tier-impact` aggregate MUST equal the sum of the run's `eu_created` tier-impacts, or the pyramid fails.
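Because the declared aggregate must equal the sum of the per-EU tier-impacts, it may help to compute it rather than count by hand. The following is a sketch only: it assumes you can supply the run's per-EU `--tier-impact` values one per line (for example `T2+1`), and the function name `aggregate_tier_impact` is hypothetical. How to pull those values out of the journal is not shown, since the journal format is not specified here.

```shell
# aggregate_tier_impact
# stdin:  one per-EU tier-impact per line, e.g. "T2+1"
# stdout: comma-separated totals per tier, sorted by tier name, e.g. "T1+1,T2+2"
# Hypothetical helper; lines that are not of the form TIER+N are ignored.
aggregate_tier_impact() {
  awk -F'+' 'NF == 2 { sum[$1] += $2 }
             END { for (t in sum) print t "+" sum[t] }' |
    sort |
    paste -sd, -
}

# Usage: feed it the same tier values you passed to mkeu.
#   printf 'T2+1\nT1+1\nT2+1\n' | aggregate_tier_impact
```

The output string is in the same shape as the `--tier-impact` value passed to `runa_session_completed`, so it can be pasted into that call.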
Gotchas (cost real time 2026-05-25)
Pyramid correction: a mis-declared `session_completed` is NOT fixed by appending a corrected one — `invariant-check` reads the first. Fix the original line via targeted `sed` (unique string) + remove the dup, with a journal backup first.
Batch-edit by `:pack`, not filename glob: `sed 10-extracted/muller-*.podlite` collides Jens Müller (visual-identity) with Müller-Brockmann (design-functionalism). Use `grep -l ':pack<slug' *.podlite`. See [[pack-batch-edit-by-pack-not-glob]].
index.podlite is high-level, not a per-EU registry — discovery = file-grep by `:pack`.
Acquire URLs: IA download `https://archive.org/download/<id>/<urlencoded-filename>`; Are.na full-book PDFs live at `attachments.are.na/*.pdf` (curl-able).
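For the pyramid-correction gotcha above, the "unique string plus backup" guard can be sketched as follows. This is an illustration under assumptions: the function name `fix_unique_line` and the `.bak` suffix are hypothetical, the journal path is whatever file you pass in, and `awk` is swapped in for `sed` so the search string is matched literally rather than as a regex. Removing the duplicate `session_completed` line is a separate step and is not covered.

```shell
# fix_unique_line FILE OLD NEW
# Replaces the first occurrence of the literal string OLD with NEW, but only if
# OLD appears on exactly one line of FILE. Writes a backup to FILE.bak first.
# Hypothetical helper; OLD and NEW should not contain backslashes, because
# awk -v interprets escape sequences in assigned values.
fix_unique_line() {
  file="$1"; old="$2"; new="$3"
  [ -f "$file" ] || { echo "no such file: $file" >&2; return 1; }

  # Count lines containing OLD as a fixed string (no regex interpretation).
  hits=$(grep -cF -- "$old" "$file")
  if [ "$hits" -ne 1 ]; then
    echo "refusing: '$old' matches $hits lines in $file" >&2
    return 1
  fi

  # Back up before touching the file, then rewrite from the backup.
  cp -p "$file" "$file.bak" || return 1
  awk -v old="$old" -v new="$new" '
    { i = index($0, old)
      if (i) $0 = substr($0, 1, i - 1) new substr($0, i + length(old))
      print }' "$file.bak" > "$file"
}
```

The refusal on zero or multiple matches is the point of the design: a string that is not unique would silently rewrite the wrong event.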
Verification
grep -lr ":pack<visual-identity" 10-extracted/ | grep -vcE "hub-|index" # file count
# The count should equal the number of journal eu_created events carrying pack=visual-identity (no retroactive gap when EUs are tagged at creation)
Related
pack-creation-guide — Phase 3 methodology