Document completed phase 5 branch support
@@ -57,8 +57,10 @@ The current implementation supports:
 - `unbalance`
 - row-boundary repair for trailing affix leakage
 - split row-label reconstruction for tables that render labels such as `99-` / `100` as two fragments
+- conditional branch extraction into `critical_branch`
 - footer/page-number filtering during body parsing
 - transactional loading into SQLite
+- conditional branch display through the web critical lookup

 The current implementation does not yet support:

@@ -270,9 +272,16 @@ Phase-4 notes:

 ### Phase 5: Conditional Branch Extraction

-- split branch-heavy cells into `critical_branch`
-- preserve the base cell text and branch text separately
-- support branch conditions such as `with helmet` and `w/o leg greaves`
+Phase 5 is complete.
+
+Phase-5 notes:
+
+- branch-heavy cells are split into base result content plus ordered `critical_branch` rows
+- branch parsing is shared across `standard`, `variant_column`, and `grouped_variant` table families
+- branch conditions are preserved as display text and normalized into condition keys such as `with_leg_greaves`
+- branch payloads can contain prose, affix notation, or both
+- the importer now upgrades older SQLite files to add the `CriticalBranches` table before load
+- the web critical lookup now returns and renders conditional branches alongside the base result

 ### Phase 6: Effect Normalization

@@ -493,6 +502,7 @@ The current implementation stores:
 - base `DescriptionText`
 - base `RawAffixText`
 - parsed conditional branches with condition text, branch prose, and branch affix text
+- parsed conditional branches in debug artifacts and persisted SQLite rows

 It does not yet normalize effects into separate tables.

@@ -530,13 +540,15 @@ The loader is transactional.
 The current load path:

 1. ensures the SQLite database exists
-2. deletes the existing subtree for the targeted critical table
-3. inserts:
+2. upgrades older SQLite files to the current importer-owned critical schema where needed
+3. deletes the existing subtree for the targeted critical table
+4. inserts:
    - `critical_table`
    - `critical_column`
    - `critical_roll_band`
    - `critical_result`
-4. commits only after the full table is saved
+   - `critical_branch`
+5. commits only after the full table is saved

 This means importer iterations can target one table without resetting unrelated database content.

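The load path described in this hunk can be sketched in a few lines. The project itself is written in C#, so the Python below is only an illustration built on the standard `sqlite3` module. The two-table schema, the column names, and the `load_table` helper are assumptions made for this example and do not reflect the importer's actual schema.

```python
import sqlite3

# Hypothetical, trimmed-down schema: the real importer owns more tables and columns.
SCHEMA = """
CREATE TABLE IF NOT EXISTS critical_result (
    table_key TEXT NOT NULL, roll_band TEXT NOT NULL, description TEXT NOT NULL
);
CREATE TABLE IF NOT EXISTS critical_branch (
    table_key TEXT NOT NULL, roll_band TEXT NOT NULL,
    condition_key TEXT NOT NULL, sort_order INTEGER NOT NULL
);
"""


def load_table(conn, table_key, results, branches):
    """Replace the rows of one critical table; other tables stay untouched."""
    # Ensure/upgrade step: IF NOT EXISTS adds the branch table to older files.
    conn.executescript(SCHEMA)
    # `with conn` commits if the block succeeds and rolls back on any exception.
    with conn:
        conn.execute("DELETE FROM critical_branch WHERE table_key = ?", (table_key,))
        conn.execute("DELETE FROM critical_result WHERE table_key = ?", (table_key,))
        conn.executemany(
            "INSERT INTO critical_result VALUES (?, ?, ?)",
            [(table_key, band, text) for band, text in results],
        )
        conn.executemany(
            "INSERT INTO critical_branch VALUES (?, ?, ?, ?)",
            [(table_key, band, key, order) for band, key, order in branches],
        )
```

Because the delete and the inserts share one transaction, a failure partway through a reload leaves the previously loaded rows for that table in place, which is the property the numbered steps rely on.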
@@ -562,6 +574,8 @@ Important files in the current implementation:
   - command orchestration
 - `src/RolemasterDb.ImportTool/CriticalImportLoader.cs`
   - transactional SQLite load/reset behavior
+- `src/RolemasterDb.ImportTool/Parsing/CriticalCellTextParser.cs`
+  - shared base-vs-branch parsing for cell content
 - `src/RolemasterDb.ImportTool/CriticalImportManifestLoader.cs`
   - manifest loading
 - `src/RolemasterDb.ImportTool/PdfXmlExtractor.cs`
@@ -574,8 +588,14 @@ Important files in the current implementation:
   - positioned text fragment model
 - `src/RolemasterDb.ImportTool/Parsing/ParsedCriticalCellArtifact.cs`
   - debug cell artifact model
+- `src/RolemasterDb.ImportTool/Parsing/ParsedCriticalBranch.cs`
+  - parsed branch artifact model
 - `src/RolemasterDb.ImportTool/Parsing/ImportValidationReport.cs`
   - validation output model
+- `src/RolemasterDb.App/Data/RolemasterDbSchemaUpgrader.cs`
+  - SQLite upgrade hook for branch-table rollout
+- `src/RolemasterDb.App/Components/Shared/CriticalLookupResultCard.razor`
+  - web rendering of base results and conditional branches

 ## Adding a New Table

@@ -127,6 +127,11 @@ Each branch can carry:
 - branch raw affix text
 - parsed JSON

+Current implementation note:
+
+- `critical_branch` is now populated by the importer and returned by the web critical lookup
+- condition keys are normalized for lookup/API use, while the original condition text remains available for display
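
As an illustration of the normalization described in this note, the sketch below maps a condition's display text to a lookup key. The `condition_key` function and its rules (lowercasing, expanding `w/o` to `without`, collapsing other characters into underscores) are assumptions inferred from the documented keys `with_leg_greaves` and `without_leg_greaves`; the importer's real rules may differ. Python is used only for illustration, since the project itself is C#.

```python
import re


def condition_key(condition_text: str) -> str:
    """Derive a lookup key from display text (hypothetical rules, see lead-in)."""
    text = condition_text.strip().lower()
    # Expand the "w/o" abbreviation before punctuation is stripped.
    text = re.sub(r"\bw/o\b", "without", text)
    # Collapse every run of non-alphanumeric characters into one underscore.
    return re.sub(r"[^a-z0-9]+", "_", text).strip("_")


# The display text is stored unchanged next to the derived key.
branch = {"condition_text": "w/o leg greaves"}
branch["condition_key"] = condition_key(branch["condition_text"])
```

Keeping both fields means the key can change shape later without touching what users see.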
+
 ### 7. `critical_effect`

 Normalized machine-readable effects parsed from the symbol line and, over time, from prose.
@@ -204,82 +209,58 @@ or:

 ## Example return object

-This is the shape I would return from a lookup:
+This is close to the current lookup shape, while still leaving room for future `critical_effect` normalization:

 ```json
 {
   "critical_type": "slash",
   "table_name": "Slash Critical Strike Table",
   "group": null,
-  "column": {
-    "key": "B",
-    "label": "B",
-    "role": "severity"
-  },
-  "roll": {
-    "input": 38,
-    "band": "36-45",
-    "min": 36,
-    "max": 45
-  },
+  "column": "B",
+  "column_label": "B",
+  "column_role": "severity",
+  "roll": 38,
+  "roll_band": "36-45",
+  "roll_band_min": 36,
+  "roll_band_max": 45,
   "description": "Strike foe in shin.",
-  "raw_affix_text": "+2H - must_parry",
-  "affixes": [
+  "raw_affix_text": null,
+  "branches": [
     {
-      "effect_code": "direct_hits",
-      "value": 2
-    }
-  ],
-  "conditions": [
-    {
-      "when": "with leg greaves",
-      "description": null,
+      "branch_kind": "conditional",
+      "condition_key": "with_leg_greaves",
+      "condition_text": "with leg greaves",
+      "description": "",
       "raw_affix_text": "+2H - must_parry",
-      "affixes": [
-        {
-          "effect_code": "direct_hits",
-          "value": 2
-        },
-        {
-          "effect_code": "must_parry_rounds",
-          "value": 1
-        }
-      ]
+      "sort_order": 1
     },
     {
-      "when": "without leg greaves",
+      "branch_kind": "conditional",
+      "condition_key": "without_leg_greaves",
+      "condition_text": "w/o leg greaves",
       "description": "You slash open foe's shin.",
       "raw_affix_text": "+2H - bleed",
-      "affixes": [
-        {
-          "effect_code": "direct_hits",
-          "value": 2
-        },
-        {
-          "effect_code": "bleed_per_round",
-          "value": 1
-        }
-      ]
+      "sort_order": 2
     }
   ],
-  "raw_text": "Original full cell text as extracted from the PDF",
+  "raw_cell_text": "Original full cell text as extracted from the PDF",
   "source": {
     "pdf": "Slash.pdf",
     "page": 1,
-    "extraction_method": "text"
+    "extraction_method": "xml"
   }
 }
 ```

 ## Ingestion notes

-Recommended import flow:
+Current import flow:

 1. Create `critical_table`, `critical_group`, `critical_column`, and `critical_roll_band` from each PDF's visible axes.
-2. Store each cell in `critical_result.raw_cell_text` exactly as extracted.
-3. Parse the symbol line into `critical_effect`.
-4. Split explicit conditional branches into `critical_branch`.
-5. Gradually enrich prose-derived effects such as death, blindness, paralysis, limb loss, initiative changes, and item breakage.
-6. Route image PDFs like `Void.pdf` through OCR before the same parser.
+2. Store each base cell in `critical_result` with base raw/description/affix text.
+3. Split explicit conditional branches into `critical_branch`.
+4. Return the base result plus ordered branches through the web critical lookup.
+5. Parse symbolic affixes into `critical_effect` in the next phase.
+6. Gradually enrich prose-derived effects such as death, blindness, paralysis, limb loss, initiative changes, and item breakage.
+7. Route image PDFs like `Void.pdf` through OCR before the same parser.

 The important design decision is: never throw away the original text. The prose is too irregular to rely on normalized fields alone.

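To make the lookup shape and the keep-the-original-text rule concrete, here is a small consumer sketch. It assumes the field names from the example return object (`description`, `raw_cell_text`, `branches`, `condition_text`, `raw_affix_text`, `sort_order`); the `render_lookup` helper and the trimmed `SAMPLE` payload are hypothetical and are not part of the web app, which is C# and Razor.

```python
# Trimmed copy of the example return object, with branches deliberately out of order.
SAMPLE = {
    "description": "Strike foe in shin.",
    "raw_affix_text": None,
    "raw_cell_text": "Original full cell text as extracted from the PDF",
    "branches": [
        {"condition_text": "w/o leg greaves",
         "description": "You slash open foe's shin.",
         "raw_affix_text": "+2H - bleed", "sort_order": 2},
        {"condition_text": "with leg greaves", "description": "",
         "raw_affix_text": "+2H - must_parry", "sort_order": 1},
    ],
}


def render_lookup(result: dict) -> list[str]:
    """Return display lines: the base result first, then branches by sort_order."""
    # Fall back to the untouched cell text when no base description was parsed.
    lines = [result["description"] or result["raw_cell_text"]]
    for branch in sorted(result.get("branches", []), key=lambda b: b["sort_order"]):
        # A branch may carry prose, affix notation, or both; skip empty parts.
        parts = [branch["description"], branch["raw_affix_text"]]
        detail = " ".join(part for part in parts if part)
        lines.append(f'{branch["condition_text"]}: {detail}')
    return lines
```

Sorting on `sort_order` rather than trusting array order keeps the display stable even if a client reorders the payload, and the raw-text fallback means a cell that parsed poorly still shows something useful.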