Implement phase 6 critical effect normalization

2026-03-14 11:31:13 +01:00
parent 35c250666f
commit 521f0ff8d5
29 changed files with 932 additions and 55 deletions

@@ -65,7 +65,6 @@ The current implementation supports:
The current implementation does not yet support:
- OCR/image-based PDFs such as `Void.pdf`
- normalized `critical_effect` population
- automatic confidence scoring beyond validation errors
## High-Level Architecture
@@ -285,9 +284,16 @@ Phase-5 notes:
### Phase 6: Effect Normalization
- parse symbolic affix lines into normalized effects
- populate `critical_effect`
- gradually enrich prose-derived effects over time
Phase 6 is complete for symbol-driven affixes.
Phase-6 notes:
- footer legends are parsed into table-specific affix metadata before effect normalization
- symbolic affix lines are normalized into `critical_effect` rows for both base results and conditional branches
- the normalized pass currently covers direct hits, must-parry, no-parry, stun, bleed, foe penalties, attacker bonuses, and `Mana` power-point modifiers
- result and branch `parsed_json` payloads now store the normalized symbol effects
- the web critical lookup now returns and renders parsed affix effects alongside the raw affix text
- prose-derived effects remain future work
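
The legend-driven normalization described in the Phase-6 notes can be pictured with a minimal Python sketch. It is not the code in `AffixEffectParser.cs`: the symbol letters, the `direct_hits` effect code, the `Effect` shape, and the token grammar (a signed integer followed by one symbol) are assumptions made only for illustration. In the importer the legend would come from the table's parsed footer rather than a hard-coded dictionary.

```python
import re
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical legend: symbol -> effect_code. The letters are illustrative,
# not the symbols used in the actual PDFs.
LEGEND: Dict[str, str] = {"H": "direct_hits", "B": "bleed_per_round"}


@dataclass
class Effect:
    effect_code: str
    value_integer: int


# Assumed token grammar: a signed integer immediately followed by one symbol.
TOKEN = re.compile(r"([+-]\d+)(\S)")


def normalize_affix(raw_affix: str, legend: Dict[str, str]) -> List[Effect]:
    """Turn a raw affix line into normalized effects using the table legend."""
    effects = []
    for amount, symbol in TOKEN.findall(raw_affix):
        code = legend.get(symbol)
        if code is None:
            # Unknown symbol: emit nothing and rely on the stored raw text.
            continue
        effects.append(Effect(code, int(amount)))
    return effects
```

Under these assumptions the same function would be called once for the base affix text and once per conditional branch, so both produce rows of the same shape.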
### Phase 7: OCR and Manual Fallback
@@ -501,11 +507,10 @@ The current implementation stores:
- base `RawCellText`
- base `DescriptionText`
- base `RawAffixText`
- parsed conditional branches with condition text, branch prose, and branch affix text
- normalized base affix effects in `critical_effect`
- parsed conditional branches with condition text, branch prose, branch affix text, and normalized branch affix effects
- parsed conditional branches in debug artifacts and persisted SQLite rows
It does not yet normalize effects into separate tables.
## Validation Rules
The current validation pass is intentionally strict.
@@ -548,6 +553,7 @@ The current load path:
- `critical_roll_band`
- `critical_result`
- `critical_branch`
- `critical_effect`
5. commits only after the full table is saved
This means importer iterations can target one table without resetting unrelated database content.
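
A rough sketch of that per-table reload, written in Python with the standard `sqlite3` module, is shown below. It is not the code in `CriticalImportLoader.cs`. The single-table schema with a `table_slug` column and the `reload_table` helper are hypothetical simplifications; the point is only that the delete and the inserts share one transaction.

```python
import sqlite3
from typing import Iterable


def reload_table(conn: sqlite3.Connection, table_slug: str, cells: Iterable[str]) -> None:
    """Replace one table's rows atomically, leaving other tables untouched."""
    # Used as a context manager, the connection commits if the block succeeds
    # and rolls back if an exception escapes it.
    with conn:
        conn.execute(
            "delete from critical_result where table_slug = ?", (table_slug,)
        )
        conn.executemany(
            "insert into critical_result (table_slug, raw_cell_text) values (?, ?)",
            [(table_slug, text) for text in cells],
        )
```

If any insert fails, the delete is rolled back as well, so a broken importer iteration leaves the previously loaded rows in place.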
@@ -575,7 +581,11 @@ Important files in the current implementation:
- `src/RolemasterDb.ImportTool/CriticalImportLoader.cs`
- transactional SQLite load/reset behavior
- `src/RolemasterDb.ImportTool/Parsing/CriticalCellTextParser.cs`
- shared base-vs-branch parsing for cell content
- shared base-vs-branch parsing for cell content and affix extraction
- `src/RolemasterDb.ImportTool/Parsing/AffixEffectParser.cs`
- footer-legend-aware symbol effect normalization
- `src/RolemasterDb.ImportTool/Parsing/AffixLegend.cs`
- parsed footer legend model used for affix classification and effect mapping
- `src/RolemasterDb.ImportTool/CriticalImportManifestLoader.cs`
- manifest loading
- `src/RolemasterDb.ImportTool/PdfXmlExtractor.cs`
@@ -589,13 +599,15 @@ Important files in the current implementation:
- `src/RolemasterDb.ImportTool/Parsing/ParsedCriticalCellArtifact.cs`
- debug cell artifact model
- `src/RolemasterDb.ImportTool/Parsing/ParsedCriticalBranch.cs`
- parsed branch artifact model
- parsed branch artifact model with normalized effects
- `src/RolemasterDb.ImportTool/Parsing/ParsedCriticalEffect.cs`
- parsed effect artifact model
- `src/RolemasterDb.ImportTool/Parsing/ImportValidationReport.cs`
- validation output model
- `src/RolemasterDb.App/Data/RolemasterDbSchemaUpgrader.cs`
- SQLite upgrade hook for branch-table rollout
- SQLite upgrade hook for branch/effect-table rollout
- `src/RolemasterDb.App/Components/Shared/CriticalLookupResultCard.razor`
- web rendering of base results and conditional branches
- web rendering of base results, conditional branches, and parsed affix effects
## Adding a New Table

@@ -145,6 +145,7 @@ Recommended canonical `effect_code` values:
- `bleed_per_round`
- `foe_penalty`
- `attacker_bonus_next_round`
- `power_point_modifier`
- `initiative_gain`
- `initiative_loss`
- `drop_item`
@@ -169,6 +170,11 @@ Each effect should point to either:
This lets you keep the raw text but still filter/query on effects.
Current implementation note:
- symbol-driven affixes are now normalized for both base results and conditional branch affixes
- `value_expression` is used when the affix contains a formula instead of a flat integer, which is currently needed for `Mana` power-point adjustments such as `+(2d10-18)P`
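
The choice between a flat integer and `value_expression` can be illustrated with a small Python sketch. The `split_value` helper is hypothetical, and it assumes the trailing symbol (the `P` in `+(2d10-18)P`) has already been stripped. Whether the real importer keeps the sign and parentheses inside the stored expression is not stated here, so the sketch keeps the text unchanged.

```python
import re
from typing import Optional, Tuple

# A flat amount is an optionally signed run of digits and nothing else.
FLAT_INTEGER = re.compile(r"[+-]?\d+")


def split_value(amount: str) -> Tuple[Optional[int], Optional[str]]:
    """Return (value_integer, value_expression); exactly one is not None."""
    amount = amount.strip()
    if FLAT_INTEGER.fullmatch(amount):
        return int(amount), None
    # Anything else, such as a dice formula, is kept verbatim as text.
    return None, amount
```

Keeping the formula as text avoids evaluating dice at import time, and filters on `value_integer` keep working for the flat cases.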
## Why this works for your lookup
Your lookup target is mostly:
@@ -258,8 +264,8 @@ Current import flow:
1. Create `critical_table`, `critical_group`, `critical_column`, and `critical_roll_band` from each PDF's visible axes.
2. Store each base cell in `critical_result` with base raw/description/affix text.
3. Split explicit conditional branches into `critical_branch`.
4. Return the base result plus ordered branches through the web critical lookup.
5. Parse symbolic affixes into `critical_effect` in the next phase.
4. Parse symbolic affixes for both the base result and any branch affix payloads into `critical_effect`.
5. Return the base result plus ordered branches and parsed affix effects through the web critical lookup.
6. Gradually enrich prose-derived effects such as death, blindness, paralysis, limb loss, initiative changes, and item breakage.
7. Route image PDFs like `Void.pdf` through OCR before the same parser.

@@ -91,6 +91,7 @@ create table critical_effect (
target text,
value_integer integer,
value_decimal numeric(10, 2),
value_expression text,
duration_rounds integer,
per_round integer,
modifier integer,
@@ -139,4 +140,3 @@ create index critical_branch_parsed_json_gin
-- and c.column_key = 'C'
-- and 38 >= rb.min_roll
-- and (rb.max_roll is null or 38 <= rb.max_roll);