Document completed phase 5 branch support

2026-03-14 10:59:13 +01:00
parent a2b3a049b8
commit 35c250666f
2 changed files with 59 additions and 58 deletions

@@ -57,8 +57,10 @@ The current implementation supports:
 - `unbalance`
 - row-boundary repair for trailing affix leakage
 - split row-label reconstruction for tables that render labels such as `99-` / `100` as two fragments
+- conditional branch extraction into `critical_branch`
 - footer/page-number filtering during body parsing
 - transactional loading into SQLite
+- conditional branch display through the web critical lookup
 The current implementation does not yet support:
@@ -270,9 +272,16 @@ Phase-4 notes:
 ### Phase 5: Conditional Branch Extraction
-- split branch-heavy cells into `critical_branch`
-- preserve the base cell text and branch text separately
-- support branch conditions such as `with helmet` and `w/o leg greaves`
+Phase 5 is complete.
+
+Phase-5 notes:
+- branch-heavy cells are split into base result content plus ordered `critical_branch` rows
+- branch parsing is shared across `standard`, `variant_column`, and `grouped_variant` table families
+- branch conditions are preserved as display text and normalized into condition keys such as `with_leg_greaves`
+- branch payloads can contain prose, affix notation, or both
+- the importer now upgrades older SQLite files to add the `CriticalBranches` table before load
+- the web critical lookup now returns and renders conditional branches alongside the base result
 ### Phase 6: Effect Normalization
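The Phase-5 notes above describe condition text being kept for display while a normalized condition key is derived from it. The sketch below is illustrative only and is not code from this commit: the function name `normalize_condition_key` and the abbreviation map are hypothetical, and the only behavior taken from the notes is that `with leg greaves` maps to `with_leg_greaves` and `w/o leg greaves` to `without_leg_greaves`.

```python
import re

# Hypothetical abbreviation map; "w/o" is the only abbreviation the notes mention.
_ABBREVIATIONS = {"w/o": "without"}


def normalize_condition_key(condition_text):
    """Derive a lookup key from display text, e.g. 'w/o leg greaves'."""
    words = condition_text.strip().lower().split()
    expanded = [_ABBREVIATIONS.get(word, word) for word in words]
    # Collapse every run of non-alphanumeric characters into one underscore.
    key = re.sub(r"[^a-z0-9]+", "_", " ".join(expanded))
    return key.strip("_")
```

The display text is never modified here; a caller would store it next to the derived key so the original wording stays available for rendering.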
@@ -493,6 +502,7 @@ The current implementation stores:
 - base `DescriptionText`
 - base `RawAffixText`
 - parsed conditional branches with condition text, branch prose, and branch affix text
+- parsed conditional branches in debug artifacts and persisted SQLite rows
 It does not yet normalize effects into separate tables.
@@ -530,13 +540,15 @@ The loader is transactional.
 The current load path:
 1. ensures the SQLite database exists
-2. deletes the existing subtree for the targeted critical table
-3. inserts:
+2. upgrades older SQLite files to the current importer-owned critical schema where needed
+3. deletes the existing subtree for the targeted critical table
+4. inserts:
 - `critical_table`
 - `critical_column`
 - `critical_roll_band`
 - `critical_result`
-4. commits only after the full table is saved
+- `critical_branch`
+5. commits only after the full table is saved
 This means importer iterations can target one table without resetting unrelated database content.
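The load path above (upgrade, delete the targeted subtree, insert, commit only at the end) can be sketched with Python's standard `sqlite3` module. This is a minimal illustration, not the importer's C# code: the column names `TableKey`, `ConditionKey`, and `SortOrder` and both function names are assumptions, and only the `CriticalBranches` table name comes from the notes.

```python
import sqlite3


def upgrade_schema(conn):
    # Assumed upgrade hook: add the branch table only when it is missing.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS CriticalBranches ("
        " Id INTEGER PRIMARY KEY,"
        " TableKey TEXT NOT NULL,"
        " ConditionKey TEXT NOT NULL,"
        " SortOrder INTEGER NOT NULL)"
    )


def load_table(conn, table_key, condition_keys):
    upgrade_schema(conn)
    # The connection context manager commits on success and rolls back
    # on any exception, so a failed load leaves the old rows in place.
    with conn:
        conn.execute(
            "DELETE FROM CriticalBranches WHERE TableKey = ?", (table_key,)
        )
        conn.executemany(
            "INSERT INTO CriticalBranches (TableKey, ConditionKey, SortOrder)"
            " VALUES (?, ?, ?)",
            [(table_key, key, order)
             for order, key in enumerate(condition_keys, start=1)],
        )
```

Because the delete is scoped to one `TableKey`, reloading one table leaves rows for other tables untouched, which matches the behavior described above.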
@@ -562,6 +574,8 @@ Important files in the current implementation:
 - command orchestration
 - `src/RolemasterDb.ImportTool/CriticalImportLoader.cs`
 - transactional SQLite load/reset behavior
+- `src/RolemasterDb.ImportTool/Parsing/CriticalCellTextParser.cs`
+- shared base-vs-branch parsing for cell content
 - `src/RolemasterDb.ImportTool/CriticalImportManifestLoader.cs`
 - manifest loading
 - `src/RolemasterDb.ImportTool/PdfXmlExtractor.cs`
@@ -574,8 +588,14 @@ Important files in the current implementation:
 - positioned text fragment model
 - `src/RolemasterDb.ImportTool/Parsing/ParsedCriticalCellArtifact.cs`
 - debug cell artifact model
+- `src/RolemasterDb.ImportTool/Parsing/ParsedCriticalBranch.cs`
+- parsed branch artifact model
 - `src/RolemasterDb.ImportTool/Parsing/ImportValidationReport.cs`
 - validation output model
+- `src/RolemasterDb.App/Data/RolemasterDbSchemaUpgrader.cs`
+- SQLite upgrade hook for branch-table rollout
+- `src/RolemasterDb.App/Components/Shared/CriticalLookupResultCard.razor`
+- web rendering of base results and conditional branches
 ## Adding a New Table

@@ -127,6 +127,11 @@ Each branch can carry:
 - branch raw affix text
 - parsed JSON
+Current implementation note:
+- `critical_branch` is now populated by the importer and returned by the web critical lookup
+- condition keys are normalized for lookup/API use, while the original condition text remains available for display
 ### 7. `critical_effect`
 Normalized machine-readable effects parsed from the symbol line and, over time, from prose.
@@ -204,82 +209,58 @@ or:
 ## Example return object
-This is the shape I would return from a lookup:
+This is close to the current lookup shape, while still leaving room for future `critical_effect` normalization:
 ```json
 {
 "critical_type": "slash",
 "table_name": "Slash Critical Strike Table",
 "group": null,
-"column": {
-"key": "B",
-"label": "B",
-"role": "severity"
-},
-"roll": {
-"input": 38,
-"band": "36-45",
-"min": 36,
-"max": 45
-},
+"column": "B",
+"column_label": "B",
+"column_role": "severity",
+"roll": 38,
+"roll_band": "36-45",
+"roll_band_min": 36,
+"roll_band_max": 45,
 "description": "Strike foe in shin.",
+"raw_affix_text": null,
+"branches": [
+{
+"branch_kind": "conditional",
+"condition_key": "with_leg_greaves",
+"condition_text": "with leg greaves",
+"description": "",
 "raw_affix_text": "+2H - must_parry",
-"affixes": [
-{
-"effect_code": "direct_hits",
-"value": 2
-}
-],
-"conditions": [
-{
-"when": "with leg greaves",
-"description": null,
-"raw_affix_text": "+2H - must_parry",
-"affixes": [
-{
-"effect_code": "direct_hits",
-"value": 2
+"sort_order": 1
 },
 {
-"effect_code": "must_parry_rounds",
-"value": 1
-}
-]
-},
-{
-"when": "without leg greaves",
+"branch_kind": "conditional",
+"condition_key": "without_leg_greaves",
+"condition_text": "w/o leg greaves",
 "description": "You slash open foe's shin.",
 "raw_affix_text": "+2H - bleed",
-"affixes": [
-{
-"effect_code": "direct_hits",
-"value": 2
-},
-{
-"effect_code": "bleed_per_round",
-"value": 1
-}
-]
+"sort_order": 2
 }
 ],
-"raw_text": "Original full cell text as extracted from the PDF",
+"raw_cell_text": "Original full cell text as extracted from the PDF",
 "source": {
 "pdf": "Slash.pdf",
-"page": 1,
-"extraction_method": "text"
+"extraction_method": "xml"
 }
 }
 ```
 ## Ingestion notes
-Recommended import flow:
+Current import flow:
 1. Create `critical_table`, `critical_group`, `critical_column`, and `critical_roll_band` from each PDF's visible axes.
-2. Store each cell in `critical_result.raw_cell_text` exactly as extracted.
-3. Parse the symbol line into `critical_effect`.
-4. Split explicit conditional branches into `critical_branch`.
-5. Gradually enrich prose-derived effects such as death, blindness, paralysis, limb loss, initiative changes, and item breakage.
-6. Route image PDFs like `Void.pdf` through OCR before the same parser.
+2. Store each base cell in `critical_result` with base raw/description/affix text.
+3. Split explicit conditional branches into `critical_branch`.
+4. Return the base result plus ordered branches through the web critical lookup.
+5. Parse symbolic affixes into `critical_effect` in the next phase.
+6. Gradually enrich prose-derived effects such as death, blindness, paralysis, limb loss, initiative changes, and item breakage.
+7. Route image PDFs like `Void.pdf` through OCR before the same parser.
 The important design decision is: never throw away the original text. The prose is too irregular to rely on normalized fields alone.
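A consumer of the updated lookup shape would read the base result first and then the branches in `sort_order`. The sketch below is an illustration under that assumption and is not the Razor component from this commit: the function `render_lookup` is hypothetical, and the field names are taken from the example return object in the new version of the document.

```python
def render_lookup(result):
    """Flatten a lookup result into display lines: base text, then branches."""
    lines = [result["description"]]
    if result.get("raw_affix_text"):
        lines.append(result["raw_affix_text"])
    # Branches are shown in their stored order, whatever order they arrive in.
    ordered = sorted(result.get("branches", []), key=lambda b: b["sort_order"])
    for branch in ordered:
        # A branch may carry prose, affix notation, or both.
        parts = [p for p in (branch["description"], branch["raw_affix_text"]) if p]
        lines.append(branch["condition_text"] + ": " + " ".join(parts))
    return lines
```

The original condition text is used for display, while `condition_key` would remain the stable identifier for lookups.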