Document completed phase 5 branch support

2026-03-14 10:59:13 +01:00
parent a2b3a049b8
commit 35c250666f
2 changed files with 59 additions and 58 deletions

@@ -57,8 +57,10 @@ The current implementation supports:
- `unbalance`
- row-boundary repair for trailing affix leakage
- split row-label reconstruction for tables that render labels such as `99-` / `100` as two fragments
- conditional branch extraction into `critical_branch`
- footer/page-number filtering during body parsing
- transactional loading into SQLite
- conditional branch display through the web critical lookup
The current implementation does not yet support:
@@ -270,9 +272,16 @@ Phase-4 notes:
### Phase 5: Conditional Branch Extraction
- split branch-heavy cells into `critical_branch`
- preserve the base cell text and branch text separately
- support branch conditions such as `with helmet` and `w/o leg greaves`
Phase 5 is complete.
Phase-5 notes:
- branch-heavy cells are split into base result content plus ordered `critical_branch` rows
- branch parsing is shared across `standard`, `variant_column`, and `grouped_variant` table families
- branch conditions are preserved as display text and normalized into condition keys such as `with_leg_greaves`
- branch payloads can contain prose, affix notation, or both
- the importer now upgrades older SQLite files to add the `CriticalBranches` table before load
- the web critical lookup now returns and renders conditional branches alongside the base result
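
As a rough illustration of the condition-key normalization described in these notes, the following Python sketch maps display text to a key and builds ordered branch rows. It is not the importer's actual code (which lives in the C# `CriticalCellTextParser`). The function names, the abbreviation map, and the input tuple shape are all assumptions made for this example.

```python
import re

# Hypothetical abbreviation map (an assumption): only the form quoted in the notes.
ABBREVIATIONS = {"w/o": "without"}


def normalize_condition_key(condition_text: str) -> str:
    """Map display text such as 'w/o leg greaves' to a key such as 'without_leg_greaves'."""
    words = condition_text.strip().lower().split()
    expanded = " ".join(ABBREVIATIONS.get(word, word) for word in words)
    # Collapse every run of non-alphanumeric characters into one underscore.
    return re.sub(r"[^a-z0-9]+", "_", expanded).strip("_")


def build_branches(parsed):
    """Turn (condition_text, description, raw_affix_text) tuples into ordered branch rows."""
    return [
        {
            "condition_key": normalize_condition_key(condition_text),
            "condition_text": condition_text,  # original text is kept for display
            "description": description,
            "raw_affix_text": raw_affix_text,
            "sort_order": index,  # preserves the order in which branches were parsed
        }
        for index, (condition_text, description, raw_affix_text) in enumerate(parsed, start=1)
    ]
```

Keeping `condition_text` next to `condition_key` mirrors the note that the display text is preserved while the key is normalized.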
### Phase 6: Effect Normalization
@@ -493,6 +502,7 @@ The current implementation stores:
- base `DescriptionText`
- base `RawAffixText`
- parsed conditional branches with condition text, branch prose, and branch affix text
- parsed conditional branches in debug artifacts and persisted SQLite rows
It does not yet normalize effects into separate tables.
@@ -530,13 +540,15 @@ The loader is transactional.
The current load path:
1. ensures the SQLite database exists
2. deletes the existing subtree for the targeted critical table
3. inserts:
2. upgrades older SQLite files to the current importer-owned critical schema where needed
3. deletes the existing subtree for the targeted critical table
4. inserts:
- `critical_table`
- `critical_column`
- `critical_roll_band`
- `critical_result`
4. commits only after the full table is saved
- `critical_branch`
5. commits only after the full table is saved
This means importer iterations can target one table without resetting unrelated database content.
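
The load path above can be sketched with Python's standard `sqlite3` module. This is a simplified illustration, not the actual `CriticalImportLoader`. The three-table schema, the column names, the `slug` key, and the `load_table` function are assumptions made for the example, and `critical_column` and `critical_roll_band` are left out for brevity.

```python
import sqlite3

# Assumed minimal schema; the real importer-owned schema has more tables and columns.
BASE_SCHEMA = """
CREATE TABLE IF NOT EXISTS critical_table (
    id INTEGER PRIMARY KEY,
    slug TEXT NOT NULL UNIQUE
);
CREATE TABLE IF NOT EXISTS critical_result (
    id INTEGER PRIMARY KEY,
    table_id INTEGER NOT NULL,
    description TEXT
);
"""

# Upgrade step: older files lack the branch table, so it is added if missing.
BRANCH_UPGRADE = """
CREATE TABLE IF NOT EXISTS critical_branch (
    id INTEGER PRIMARY KEY,
    result_id INTEGER NOT NULL,
    condition_key TEXT NOT NULL,
    sort_order INTEGER NOT NULL
);
"""


def load_table(conn, slug, results):
    """Replace one table's subtree; `results` is a list of (description, [condition_key, ...])."""
    conn.executescript(BASE_SCHEMA)     # 1. ensure the database schema exists
    conn.executescript(BRANCH_UPGRADE)  # 2. upgrade older files where needed
    with conn:  # commits on success, rolls back if any statement raises
        row = conn.execute("SELECT id FROM critical_table WHERE slug = ?", (slug,)).fetchone()
        if row:
            # 3. delete the existing subtree for this table only, children first
            conn.execute(
                "DELETE FROM critical_branch WHERE result_id IN "
                "(SELECT id FROM critical_result WHERE table_id = ?)",
                (row[0],),
            )
            conn.execute("DELETE FROM critical_result WHERE table_id = ?", (row[0],))
            conn.execute("DELETE FROM critical_table WHERE id = ?", (row[0],))
        # 4. insert the table, its results, and their branches
        table_id = conn.execute("INSERT INTO critical_table (slug) VALUES (?)", (slug,)).lastrowid
        for description, condition_keys in results:
            result_id = conn.execute(
                "INSERT INTO critical_result (table_id, description) VALUES (?, ?)",
                (table_id, description),
            ).lastrowid
            for order, key in enumerate(condition_keys, start=1):
                conn.execute(
                    "INSERT INTO critical_branch (result_id, condition_key, sort_order) "
                    "VALUES (?, ?, ?)",
                    (result_id, key, order),
                )
    # 5. leaving the `with` block commits only after the full table is saved
```

Because the delete and the inserts share one transaction, a failed import leaves the previously loaded rows in place, and rows belonging to other tables are never touched.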
@@ -562,6 +574,8 @@ Important files in the current implementation:
- command orchestration
- `src/RolemasterDb.ImportTool/CriticalImportLoader.cs`
- transactional SQLite load/reset behavior
- `src/RolemasterDb.ImportTool/Parsing/CriticalCellTextParser.cs`
- shared base-vs-branch parsing for cell content
- `src/RolemasterDb.ImportTool/CriticalImportManifestLoader.cs`
- manifest loading
- `src/RolemasterDb.ImportTool/PdfXmlExtractor.cs`
@@ -574,8 +588,14 @@ Important files in the current implementation:
- positioned text fragment model
- `src/RolemasterDb.ImportTool/Parsing/ParsedCriticalCellArtifact.cs`
- debug cell artifact model
- `src/RolemasterDb.ImportTool/Parsing/ParsedCriticalBranch.cs`
- parsed branch artifact model
- `src/RolemasterDb.ImportTool/Parsing/ImportValidationReport.cs`
- validation output model
- `src/RolemasterDb.App/Data/RolemasterDbSchemaUpgrader.cs`
- SQLite upgrade hook for branch-table rollout
- `src/RolemasterDb.App/Components/Shared/CriticalLookupResultCard.razor`
- web rendering of base results and conditional branches
## Adding a New Table

@@ -127,6 +127,11 @@ Each branch can carry:
- branch raw affix text
- parsed JSON
Current implementation note:
- `critical_branch` is now populated by the importer and returned by the web critical lookup
- condition keys are normalized for lookup/API use, while the original condition text remains available for display
### 7. `critical_effect`
Normalized machine-readable effects parsed from the symbol line and, over time, from prose.
@@ -204,82 +209,58 @@ or:
## Example return object
This is the shape I would return from a lookup:
This is close to the current lookup shape, while still leaving room for future `critical_effect` normalization:
```json
{
"critical_type": "slash",
"table_name": "Slash Critical Strike Table",
"group": null,
"column": {
"key": "B",
"label": "B",
"role": "severity"
},
"roll": {
"input": 38,
"band": "36-45",
"min": 36,
"max": 45
},
"column": "B",
"column_label": "B",
"column_role": "severity",
"roll": 38,
"roll_band": "36-45",
"roll_band_min": 36,
"roll_band_max": 45,
"description": "Strike foe in shin.",
"raw_affix_text": null,
"branches": [
{
"branch_kind": "conditional",
"condition_key": "with_leg_greaves",
"condition_text": "with leg greaves",
"description": "",
"raw_affix_text": "+2H - must_parry",
"affixes": [
{
"effect_code": "direct_hits",
"value": 2
}
],
"conditions": [
{
"when": "with leg greaves",
"description": null,
"raw_affix_text": "+2H - must_parry",
"affixes": [
{
"effect_code": "direct_hits",
"value": 2
"sort_order": 1
},
{
"effect_code": "must_parry_rounds",
"value": 1
}
]
},
{
"when": "without leg greaves",
"branch_kind": "conditional",
"condition_key": "without_leg_greaves",
"condition_text": "w/o leg greaves",
"description": "You slash open foe's shin.",
"raw_affix_text": "+2H - bleed",
"affixes": [
{
"effect_code": "direct_hits",
"value": 2
},
{
"effect_code": "bleed_per_round",
"value": 1
}
]
"sort_order": 2
}
],
"raw_text": "Original full cell text as extracted from the PDF",
"raw_cell_text": "Original full cell text as extracted from the PDF",
"source": {
"pdf": "Slash.pdf",
"page": 1,
"extraction_method": "text"
"extraction_method": "xml"
}
}
```
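
For illustration only, a consumer of a lookup object with the `branches`, `condition_text`, and `sort_order` fields shown above could render the base result followed by its ordered branches roughly as below. The `render_lookup` function and its output format are assumptions for this sketch. The actual rendering is done by the `CriticalLookupResultCard.razor` component.

```python
def render_lookup(result: dict) -> list[str]:
    """Return display lines: the base result first, then each branch in sort order."""
    lines = [result["description"]]
    if result.get("raw_affix_text"):
        lines.append(result["raw_affix_text"])
    branches = sorted(result.get("branches", []), key=lambda branch: branch["sort_order"])
    for branch in branches:
        # A branch payload can contain prose, affix notation, or both.
        parts = [branch.get("description"), branch.get("raw_affix_text")]
        payload = " ".join(part for part in parts if part)
        lines.append(branch["condition_text"] + ": " + payload)
    return lines
```

The original condition text is used for display here, while `condition_key` stays available for lookup and API use.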
## Ingestion notes
Recommended import flow:
Current import flow:
1. Create `critical_table`, `critical_group`, `critical_column`, and `critical_roll_band` from each PDF's visible axes.
2. Store each cell in `critical_result.raw_cell_text` exactly as extracted.
3. Parse the symbol line into `critical_effect`.
4. Split explicit conditional branches into `critical_branch`.
5. Gradually enrich prose-derived effects such as death, blindness, paralysis, limb loss, initiative changes, and item breakage.
6. Route image PDFs like `Void.pdf` through OCR before the same parser.
2. Store each base cell in `critical_result` with base raw/description/affix text.
3. Split explicit conditional branches into `critical_branch`.
4. Return the base result plus ordered branches through the web critical lookup.
5. Parse symbolic affixes into `critical_effect` in the next phase.
6. Gradually enrich prose-derived effects such as death, blindness, paralysis, limb loss, initiative changes, and item breakage.
7. Route image PDFs like `Void.pdf` through OCR before the same parser.
The important design decision is: never throw away the original text. The prose is too irregular to rely on normalized fields alone.