Implement phase 5 critical branch extraction

2026-03-14 10:21:26 +01:00
parent b2f61c3d73
commit 60c5d886a4
20 changed files with 589 additions and 399 deletions

@@ -63,7 +63,6 @@ The current implementation supports:
The current implementation does not yet support:
- OCR/image-based PDFs such as `Void.pdf`
-- normalized `critical_branch` population
- normalized `critical_effect` population
- automatic confidence scoring beyond validation errors
@@ -210,10 +209,6 @@ The importer now explicitly rejects cells that still look structurally wrong aft
This keeps the phase-2.1 safety goal in place while allowing broader standard-table layouts that render a single affix block either before or after the prose block.
-## Planned Future Phases
-The current architecture is intended to support additional phases:
### Phase 3: Broader Table Coverage
Phase 3 expands the manifest and validates the shared `standard` parser across a broader set of `A-E` tables.
@@ -494,11 +489,12 @@ Affix-like classification is intentionally conservative. Numeric prose lines suc
The current implementation stores:
-- `RawCellText`
-- `DescriptionText`
-- `RawAffixText`
+- base `RawCellText`
+- base `DescriptionText`
+- base `RawAffixText`
+- parsed conditional branches with condition text, branch prose, and branch affix text
-It does not yet normalize branches or effects into separate tables.
+It does not yet normalize effects into separate tables.
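As a rough illustration of the stored shape described in this hunk, the sketch below models a cell with base text fields plus a list of parsed conditional branches. It is written in Python only for brevity; the class names, the snake_case field names, and the placeholder values are assumptions made for illustration and are not taken from the importer's actual code.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ConditionalBranch:
    # Hypothetical names: the README only states that a branch carries
    # condition text, branch prose, and branch affix text.
    condition_text: str
    description_text: str
    raw_affix_text: str = ""


@dataclass
class ParsedCell:
    # Base fields mirror RawCellText, DescriptionText, and RawAffixText.
    raw_cell_text: str
    description_text: str
    raw_affix_text: str = ""
    # Branches stay attached to their cell; effects are not normalized here.
    branches: List[ConditionalBranch] = field(default_factory=list)


# Placeholder values only; no real table content is implied.
cell = ParsedCell(
    raw_cell_text="<full cell text>",
    description_text="<base prose>",
    raw_affix_text="<base affixes>",
    branches=[
        ConditionalBranch(
            condition_text="<condition>",
            description_text="<branch prose>",
            raw_affix_text="<branch affixes>",
        )
    ],
)
```

Keeping the branch affix text as a raw string, rather than a parsed structure, matches the statement that effects are not yet normalized into separate tables.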
## Validation Rules