# Critical Tables DB Model

## What the PDFs look like

The PDFs are not one uniform table shape. I found three families:

1. Standard tables
   - Columns are severity-like keys such as `A` through `E`.
   - Rows are roll bands such as `01-05`, `66`, `96-99`, or `100`.
   - Examples: `Slash.pdf`, `Puncture.pdf`, `Arcane Aether.pdf`.

2. Variant-column tables
   - Columns are not severity letters; they are variant keys such as `normal`, `magic`, `mithril`, `holy arms`, `slaying`.
   - Rows are still roll bands.
   - Example: `Large Creature - Weapon.pdf`.

3. Grouped variant tables
   - There is an extra grouping axis above the column axis.
   - Example: `Large Creature - Magic.pdf` has:
     - group: `large`, `super_large`
     - column: `normal`, `slaying`
     - row: roll band
   - In the current importer manifest, the grouped magic PDF is loaded once as `large_creature_magic` because the `Large Creature - Magic.pdf` and `Super Large Creature - Magic.pdf` source files are duplicates.

There are also extraction constraints:

- Most PDFs are text extractable with `pdftohtml -xml`.
- `Void.pdf` appears image-based and will need OCR or manual transcription.
- A single cell can contain:
  - base description text
  - symbolic affixes such as `+5H - 2S - 3B`
  - conditional branches such as `with helmet`, `w/o leg greaves`, `if foe has shield`

Because of that, the safest model is hybrid:

- relational tables for lookup axes and indexed effects
- raw text storage for fidelity
- structured JSON for irregular branches that are hard to normalize perfectly on first pass

## Recommended logical model

### 1. `critical_table`

One record per PDF/table, which is the primary "critical type" for lookup.

Examples:

- `slash`
- `puncture`
- `arcane_aether`
- `large_creature_weapon`
- `large_creature_magic`

### 2. `critical_group`

Optional extra axis for tables that need more than type + column + roll.

Examples:

- `large`
- `super_large`

Most tables will have no group rows.

### 3. `critical_column`

Generalized "severity/column" axis.

Examples:

- `A`, `B`, `C`, `D`, `E`
- `normal`, `magic`, `mithril`, `holy_arms`, `slaying`

Do not hardcode this as a single severity enum. Treat it as a table-defined dimension.

### 4. `critical_roll_band`

Stores row bands and supports exact row lookup by roll.

Examples:

- `01-05`
- `66`
- `96-99`
- `251+`

Recommended fields:

- `min_roll`
- `max_roll`, nullable for open-ended rows like `251+`
- display label
- sort order

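
The band labels above map onto `min_roll` / `max_roll` fairly mechanically. The following is a minimal sketch of that mapping, not the importer's actual code: the `RollBand` class and the `parse_roll_band` and `find_band` helpers are hypothetical names, and the sketch assumes labels use a plain ASCII hyphen.

```python
import re
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class RollBand:
    """Hypothetical in-memory form of one critical_roll_band row."""
    label: str
    min_roll: int
    max_roll: Optional[int]  # None means open-ended, e.g. "251+"

    def contains(self, roll: int) -> bool:
        if roll < self.min_roll:
            return False
        return self.max_roll is None or roll <= self.max_roll


# Accepts "01-05" (range), "66" (single value), and "251+" (open-ended).
_BAND_RE = re.compile(r"^(\d+)(?:-(\d+)|(\+))?$")


def parse_roll_band(label: str) -> RollBand:
    """Turn a display label into numeric bounds."""
    cleaned = label.strip()
    match = _BAND_RE.match(cleaned)
    if match is None:
        raise ValueError(f"unrecognized roll band label: {label!r}")
    low, high, open_ended = match.groups()
    min_roll = int(low)
    if open_ended:
        max_roll = None
    elif high is not None:
        max_roll = int(high)
    else:
        max_roll = min_roll  # a single-value band such as "66"
    if max_roll is not None and max_roll < min_roll:
        raise ValueError(f"inverted roll band: {label!r}")
    return RollBand(cleaned, min_roll, max_roll)


def find_band(bands: Iterable[RollBand], roll: int) -> Optional[RollBand]:
    """Return the first band containing the roll, or None if no band matches."""
    for band in bands:
        if band.contains(roll):
            return band
    return None
```

Keeping `max_roll` nullable rather than storing a sentinel such as `9999` means an open-ended band never needs revisiting if a table later gains higher rows.
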
### 5. `critical_result`

One record per lookup cell:

- table
- optional group
- column
- roll band

This stores:

- `raw_cell_text`
- `description_text`
- `raw_affix_text`
- `parsed_json`
- parse status / source metadata

### 6. `critical_branch`

Optional conditional branches inside a result cell.

Examples:

- `with helmet`
- `without helmet`
- `with leg greaves`
- `if foe has shield`

Each branch can carry:

- `condition_text`
- optional structured `condition_json`
- branch description text
- branch raw affix text
- parsed JSON

Current implementation note:

- `critical_branch` is now populated by the importer and returned by the web critical lookup
- condition keys are normalized for lookup/API use, while the original condition text remains available for display

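
One plausible way to derive a normalized condition key from the display text is sketched below. This is an assumption about the approach, not the importer's actual rule set: `normalize_condition_key` and the abbreviation table are hypothetical.

```python
import re

# Hypothetical shorthand table; extend it as new abbreviations show up in the PDFs.
_ABBREVIATIONS = {"w/o": "without", "w/": "with"}


def normalize_condition_key(condition_text: str) -> str:
    """Lowercase the text, expand shorthand, and join the words with underscores."""
    words = condition_text.strip().lower().split()
    expanded = [_ABBREVIATIONS.get(word, word) for word in words]
    # Collapse every run of non-alphanumeric characters into one underscore.
    key = re.sub(r"[^a-z0-9]+", "_", " ".join(expanded))
    return key.strip("_")
```

With this sketch, `w/o leg greaves` becomes `without_leg_greaves` and `with leg greaves` becomes `with_leg_greaves`, while the original `condition_text` is stored untouched for display.
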
### 7. `critical_effect`

Normalized machine-readable effects parsed from the symbol line and, over time, from prose.

Recommended canonical `effect_code` values:

- `direct_hits`
- `must_parry_rounds`
- `no_parry_rounds`
- `stunned_rounds`
- `bleed_per_round`
- `foe_penalty`
- `attacker_bonus_next_round`
- `power_point_modifier`
- `initiative_gain`
- `initiative_loss`
- `drop_item`
- `item_breakage_check`
- `limb_useless`
- `knockdown`
- `prone`
- `coma`
- `paralyzed`
- `blind`
- `deaf`
- `mute`
- `dies_in_rounds`
- `instant_death`
- `armor_destroyed`
- `weapon_stuck`

Each effect should point to either:

- the base `critical_result`, or
- a `critical_branch`

This lets you keep the raw text but still filter/query on effects.

Current implementation note:

- symbol-driven affixes are now normalized for both base results and conditional branch affixes
- `value_expression` is used when the affix contains a formula instead of a flat integer, which is currently needed for `Mana` power-point adjustments such as `+(2d10-18)P`

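
As a rough illustration of how a symbol line could be split into effects, here is a minimal sketch. It is not the importer's parser. The symbol-to-`effect_code` mapping for `H`, `S`, and `B` is an assumption made for the example (only the `P` power-point case is described in these notes), and `ParsedEffect`, `value_integer`, and `parse_affix_text` are hypothetical names.

```python
import re
from dataclasses import dataclass
from typing import List, Optional, Tuple

# ASSUMPTION: illustrative symbol meanings, not confirmed against the PDFs.
SYMBOL_TO_EFFECT = {
    "H": "direct_hits",
    "S": "stunned_rounds",
    "B": "bleed_per_round",
    "P": "power_point_modifier",
}


@dataclass
class ParsedEffect:
    effect_code: str
    value_integer: Optional[int] = None     # hypothetical field for flat values
    value_expression: Optional[str] = None  # formula text such as "2d10-18"


# One token: optional "+", then a flat integer or a parenthesized formula, then a symbol.
_TOKEN_RE = re.compile(r"^\+?(?:(\d+)|\(([^()]+)\))([A-Z])$")


def parse_affix_text(raw_affix_text: str) -> Tuple[List[ParsedEffect], List[str]]:
    """Return (effects, unparsed_tokens) for a raw affix string."""
    effects: List[ParsedEffect] = []
    unparsed: List[str] = []
    # Tokens are separated by a dash with whitespace on both sides, so the
    # dash inside a formula like "(2d10-18)" does not split the token.
    for token in re.split(r"\s+-\s+", raw_affix_text.strip()):
        if not token:
            continue
        match = _TOKEN_RE.match(token)
        code = SYMBOL_TO_EFFECT.get(match.group(3)) if match else None
        if code is None:
            unparsed.append(token)  # keep it so nothing is silently dropped
            continue
        number, expression, _symbol = match.groups()
        if number is not None:
            effects.append(ParsedEffect(code, value_integer=int(number)))
        else:
            effects.append(ParsedEffect(code, value_expression=expression))
    return effects, unparsed
```

Returning the unparsed tokens alongside the effects fits the hybrid model: anything the parser does not recognize stays visible for manual curation instead of being discarded.
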
## Why this works for your lookup

Your lookup target is mostly:

- `critical type`
- `severity(column)`
- `roll`

That maps cleanly to:

- `critical_table.slug`
- `critical_column.column_key`
- numeric roll matched against `critical_roll_band`

For the outlier tables, add an optional `group_key`.

That means the API can still stay simple:

```json
{
  "critical_type": "slash",
  "column": "C",
  "roll": 38,
  "group": null
}
```

or:

```json
{
  "critical_type": "large_creature_magic",
  "group": "super_large",
  "column": "slaying",
  "roll": 88
}
```

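
To make the mapping concrete, the sketch below resolves such a request against an in-memory SQLite database. It is an illustration under stated assumptions, not the real schema: only `slug`, `column_key`, `group_key`, `min_roll`, and `max_roll` come from these notes, and every other column name, plus the `lookup` and `build_demo_db` helpers, is hypothetical. The single demo row reuses the Slash `B` / `36-45` cell that appears elsewhere in these notes.

```python
import sqlite3

# Minimal illustrative schema; names other than slug, column_key, group_key,
# min_roll and max_roll are assumptions made for this sketch.
LOOKUP_SCHEMA = """
CREATE TABLE critical_table (id INTEGER PRIMARY KEY, slug TEXT NOT NULL UNIQUE);
CREATE TABLE critical_group (
    id INTEGER PRIMARY KEY,
    table_id INTEGER NOT NULL REFERENCES critical_table(id),
    group_key TEXT NOT NULL
);
CREATE TABLE critical_column (
    id INTEGER PRIMARY KEY,
    table_id INTEGER NOT NULL REFERENCES critical_table(id),
    column_key TEXT NOT NULL
);
CREATE TABLE critical_roll_band (
    id INTEGER PRIMARY KEY,
    table_id INTEGER NOT NULL REFERENCES critical_table(id),
    label TEXT NOT NULL,
    min_roll INTEGER NOT NULL,
    max_roll INTEGER  -- NULL for open-ended bands such as 251+
);
CREATE TABLE critical_result (
    id INTEGER PRIMARY KEY,
    table_id INTEGER NOT NULL REFERENCES critical_table(id),
    group_id INTEGER REFERENCES critical_group(id),  -- NULL for most tables
    column_id INTEGER NOT NULL REFERENCES critical_column(id),
    roll_band_id INTEGER NOT NULL REFERENCES critical_roll_band(id),
    description_text TEXT NOT NULL
);
"""

LOOKUP_SQL = """
SELECT r.id, b.label, r.description_text
FROM critical_result AS r
JOIN critical_table AS t ON t.id = r.table_id
JOIN critical_column AS c ON c.id = r.column_id
JOIN critical_roll_band AS b ON b.id = r.roll_band_id
LEFT JOIN critical_group AS g ON g.id = r.group_id
WHERE t.slug = :critical_type
  AND c.column_key = :column_key
  AND b.min_roll <= :roll
  AND (b.max_roll IS NULL OR :roll <= b.max_roll)
  AND ((:group_key IS NULL AND r.group_id IS NULL) OR g.group_key = :group_key)
"""


def lookup(conn, critical_type, column, roll, group=None):
    """Return (result_id, roll_band_label, description_text), or None if no cell matches."""
    params = {
        "critical_type": critical_type,
        "column_key": column,
        "roll": roll,
        "group_key": group,
    }
    return conn.execute(LOOKUP_SQL, params).fetchone()


def build_demo_db():
    """Create an in-memory database holding one Slash cell for demonstration."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(LOOKUP_SCHEMA)
    conn.execute("INSERT INTO critical_table (id, slug) VALUES (1, 'slash')")
    conn.execute(
        "INSERT INTO critical_column (id, table_id, column_key) VALUES (1, 1, 'B')"
    )
    conn.execute(
        "INSERT INTO critical_roll_band (id, table_id, label, min_roll, max_roll) "
        "VALUES (1, 1, '36-45', 36, 45)"
    )
    conn.execute(
        "INSERT INTO critical_result "
        "(id, table_id, group_id, column_id, roll_band_id, description_text) "
        "VALUES (1, 1, NULL, 1, 1, 'Strike foe in shin.')"
    )
    conn.commit()
    return conn
```

The `LEFT JOIN` on `critical_group` is what keeps the group optional: ungrouped tables match only when no group is requested, and grouped tables match only on the requested `group_key`.
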
## Example return object

This is close to the current lookup shape, while still leaving room for future `critical_effect` normalization:

```json
{
  "critical_type": "slash",
  "table_name": "Slash Critical Strike Table",
  "group": null,
  "column": "B",
  "column_label": "B",
  "column_role": "severity",
  "roll": 38,
  "roll_band": "36-45",
  "roll_band_min": 36,
  "roll_band_max": 45,
  "description": "Strike foe in shin.",
  "raw_affix_text": null,
  "branches": [
    {
      "branch_kind": "conditional",
      "condition_key": "with_leg_greaves",
      "condition_text": "with leg greaves",
      "description": "",
      "raw_affix_text": "+2H - must_parry",
      "sort_order": 1
    },
    {
      "branch_kind": "conditional",
      "condition_key": "without_leg_greaves",
      "condition_text": "w/o leg greaves",
      "description": "You slash open foe's shin.",
      "raw_affix_text": "+2H - bleed",
      "sort_order": 2
    }
  ],
  "raw_cell_text": "Original full cell text as extracted from the PDF",
  "source": {
    "pdf": "Slash.pdf",
    "extraction_method": "xml"
  }
}
```

## Ingestion notes

Current import flow:

1. Create `critical_table`, `critical_group`, `critical_column`, and `critical_roll_band` from each PDF's visible axes.
2. Store each base cell in `critical_result` with base raw/description/affix text.
3. Split explicit conditional branches into `critical_branch`.
4. Parse symbolic affixes for both the base result and any branch affix payloads into `critical_effect`.
5. Return the base result plus ordered branches and parsed affix effects through the web critical lookup.
6. Gradually enrich prose-derived effects such as death, blindness, paralysis, limb loss, initiative changes, and item breakage.
7. Route image PDFs like `Void.pdf` through OCR before the same parser.

The important design decision is: never throw away the original text. The prose is too irregular to rely on normalized fields alone.

## Manual curation workflow

Because the import path depends on OCR, PDF XML extraction, and heuristics, the web app now treats manual repair as a first-class capability instead of an out-of-band database operation.

Current curation flow:

1. Browse a table on the `/tables` page.
2. Hover a populated cell to identify editable entries.
3. Open the popup editor for that cell.
4. Edit the entire `critical_result` graph:
   - base raw cell text
   - curated prose / description
   - raw affix text
   - parse status
   - parsed JSON
   - nested `critical_branch` rows
   - nested `critical_effect` rows for both the base result and branches
5. Save the result back through the API.

The corresponding API endpoints are:

- `GET /api/tables/critical/{slug}/cells/{resultId}`
- `PUT /api/tables/critical/{slug}/cells/{resultId}`

The save operation replaces the stored branches and effects for that cell with the submitted payload. That keeps manual edits deterministic and avoids trying to reconcile partial child-row diffs against importer-generated data.
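
The replace-on-save behavior can be sketched as a single transaction that deletes the cell's child rows and reinserts them from the payload. This is a minimal illustration using SQLite, not the web app's actual handler: the schema, the payload shape, the `value_integer` column, and the `save_cell` helper are all assumptions. The `CHECK` constraint also shows one way to enforce that each effect belongs to either the base result or a branch, never both.

```python
import sqlite3

# Illustrative schema only; column names beyond those in the notes are assumptions.
CURATION_SCHEMA = """
CREATE TABLE critical_result (
    id INTEGER PRIMARY KEY,
    raw_cell_text TEXT NOT NULL,
    description_text TEXT,
    raw_affix_text TEXT
);
CREATE TABLE critical_branch (
    id INTEGER PRIMARY KEY,
    result_id INTEGER NOT NULL REFERENCES critical_result(id),
    condition_key TEXT NOT NULL,
    condition_text TEXT NOT NULL,
    raw_affix_text TEXT,
    sort_order INTEGER NOT NULL
);
CREATE TABLE critical_effect (
    id INTEGER PRIMARY KEY,
    result_id INTEGER REFERENCES critical_result(id),
    branch_id INTEGER REFERENCES critical_branch(id),
    effect_code TEXT NOT NULL,
    value_integer INTEGER,
    -- exactly one owner: the base result or a branch
    CHECK ((result_id IS NULL) <> (branch_id IS NULL))
);
"""


def _insert_effects(conn, effects, *, result_id=None, branch_id=None):
    """Insert effect rows owned by either a result or a branch."""
    conn.executemany(
        "INSERT INTO critical_effect (result_id, branch_id, effect_code, value_integer) "
        "VALUES (?, ?, ?, ?)",
        [(result_id, branch_id, e["effect_code"], e.get("value_integer")) for e in effects],
    )


def save_cell(conn, result_id, payload):
    """Replace the cell's branches and effects with the submitted payload."""
    with conn:  # commits on success, rolls back if any statement raises
        conn.execute(
            "UPDATE critical_result SET description_text = ?, raw_affix_text = ? WHERE id = ?",
            (payload.get("description_text"), payload.get("raw_affix_text"), result_id),
        )
        # Remove effects first (base and branch-owned), then the branches themselves.
        conn.execute(
            "DELETE FROM critical_effect WHERE result_id = ? "
            "OR branch_id IN (SELECT id FROM critical_branch WHERE result_id = ?)",
            (result_id, result_id),
        )
        conn.execute("DELETE FROM critical_branch WHERE result_id = ?", (result_id,))
        _insert_effects(conn, payload.get("effects", []), result_id=result_id)
        for order, branch in enumerate(payload.get("branches", []), start=1):
            cursor = conn.execute(
                "INSERT INTO critical_branch "
                "(result_id, condition_key, condition_text, raw_affix_text, sort_order) "
                "VALUES (?, ?, ?, ?, ?)",
                (result_id, branch["condition_key"], branch["condition_text"],
                 branch.get("raw_affix_text"), order),
            )
            _insert_effects(conn, branch.get("effects", []), branch_id=cursor.lastrowid)
```

Because the whole graph is rewritten inside one transaction, saving the same payload twice leaves the same rows behind, which is the determinism the save operation is after.
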