Initial commit
docs/critical_tables_db_model.md
# Critical Tables DB Model

## What the PDFs look like

The PDFs are not one uniform table shape. I found three families:

1. Standard tables
   - Columns are severity-like keys such as `A` through `E`.
   - Rows are roll bands such as `01-05`, `66`, `96-99`, or `100`.
   - Examples: `Slash.pdf`, `Puncture.pdf`, `Arcane Aether.pdf`.

2. Variant-column tables
   - Columns are not severity letters; they are variant keys such as `normal`, `magic`, `mithril`, `holy arms`, `slaying`.
   - Rows are still roll bands.
   - Example: `Large Creature - Weapon.pdf`.

3. Grouped variant tables
   - There is an extra grouping axis above the column axis.
   - Example: `Large Creature - Magic.pdf` has:
     - group: `large`, `super_large`
     - column: `normal`, `slaying`
     - row: roll band

There are also extraction constraints:

- Most PDFs are text extractable with `pdftotext -layout`.
- `Void.pdf` appears image-based and will need OCR or manual transcription.
- A single cell can contain:
  - base description text
  - symbolic affixes such as `+5H - 2S - 3B`
  - conditional branches such as `with helmet`, `w/o leg greaves`, `if foe has shield`

Because of that, the safest model is hybrid:

- relational tables for lookup axes and indexed effects
- raw text storage for fidelity
- structured JSON for irregular branches that are hard to normalize perfectly on first pass

## Recommended logical model

### 1. `critical_table`

One record per PDF/table, which is the primary "critical type" for lookup.

Examples:

- `slash`
- `puncture`
- `arcane_aether`
- `large_creature_weapon`
- `large_creature_magic`

### 2. `critical_group`

Optional extra axis for tables that need more than type + column + roll.

Examples:

- `large`
- `super_large`

Most tables will have no group rows.

### 3. `critical_column`

Generalized "severity/column" axis.

Examples:

- `A`, `B`, `C`, `D`, `E`
- `normal`, `magic`, `mithril`, `holy_arms`, `slaying`

Do not hardcode this as a single severity enum. Treat it as a table-defined dimension.

### 4. `critical_roll_band`

Stores row bands and supports exact row lookup by roll.

Examples:

- `01-05`
- `66`
- `96-99`
- `251+`

Recommended fields:

- `min_roll`
- `max_roll` nullable for open-ended rows like `251+`
- display label
- sort order
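
A minimal sketch of how a band label could be turned into `min_roll` / `max_roll` and matched against a roll. The `RollBand` class and the `parse_band` and `find_band` helpers are hypothetical names, not part of the model above, and the label grammar (a single number, `low-high`, or `low+`) is assumed from the examples listed.

```python
import re
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class RollBand:
    label: str               # display label, e.g. "01-05"
    min_roll: int
    max_roll: Optional[int]  # None means open-ended, e.g. "251+"

    def contains(self, roll: int) -> bool:
        upper_ok = self.max_roll is None or roll <= self.max_roll
        return roll >= self.min_roll and upper_ok


# Assumed label grammar: "66", "01-05" (hyphen or en dash), or "251+".
_BAND_RE = re.compile(r"^(\d+)(?:\s*[-\u2013]\s*(\d+)|\s*(\+))?$")


def parse_band(label: str) -> RollBand:
    match = _BAND_RE.match(label.strip())
    if match is None:
        raise ValueError(f"unrecognized roll band label: {label!r}")
    low = int(match.group(1))
    if match.group(3):  # trailing "+" marks an open-ended band
        return RollBand(label, low, None)
    high = int(match.group(2)) if match.group(2) else low
    if high < low:
        raise ValueError(f"band upper bound below lower bound: {label!r}")
    return RollBand(label, low, high)


def find_band(bands: Iterable[RollBand], roll: int) -> Optional[RollBand]:
    # Linear scan is enough for a few dozen rows per table.
    for band in bands:
        if band.contains(roll):
            return band
    return None
```

Storing `min_roll` and a nullable `max_roll` keeps the database lookup a plain range comparison, while the label is kept only for display.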

### 5. `critical_result`

One record per lookup cell:

- table
- optional group
- column
- roll band

This stores:

- `raw_cell_text`
- `description_text`
- `raw_affix_text`
- `parsed_json`
- parse status / source metadata

### 6. `critical_branch`

Optional conditional branches inside a result cell.

Examples:

- `with helmet`
- `without helmet`
- `with leg greaves`
- `if foe has shield`

Each branch can carry:

- `condition_text`
- optional structured `condition_json`
- branch description text
- branch raw affix text
- parsed JSON
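
As a rough illustration, the extracted condition phrases (`w/o leg greaves`, `with helmet`, `if foe has shield`) could be normalized into a canonical `condition_text` plus a small `condition_json`. The `normalize_condition` helper and the `{"item", "present"}` shape are assumptions made for this sketch; the document does not define the JSON structure. Phrases that do not match fall back to text only, which fits `condition_json` being optional.

```python
import re
from typing import Optional, Tuple

# (pattern, item present?, canonical template). Only covers the phrasings
# quoted in this document; anything else is left as plain text.
_CONDITION_PATTERNS = [
    (re.compile(r"^(?:w/o|without)\s+(.+)$", re.IGNORECASE), False, "without {item}"),
    (re.compile(r"^(?:w/|with)\s+(.+)$", re.IGNORECASE), True, "with {item}"),
    (re.compile(r"^if foe has\s+(.+)$", re.IGNORECASE), True, "if foe has {item}"),
]


def normalize_condition(raw: str) -> Tuple[str, Optional[dict]]:
    """Return (condition_text, condition_json or None)."""
    text = " ".join(raw.split())  # collapse layout whitespace from extraction
    for pattern, present, template in _CONDITION_PATTERNS:
        match = pattern.match(text)
        if match:
            item = match.group(1).lower()
            # Hypothetical JSON shape: which item, and whether it must be present.
            return template.format(item=item), {"item": item, "present": present}
    return text, None  # unrecognized: keep the text, no structured form
```

For example, `normalize_condition("w/o leg greaves")` yields the text `without leg greaves` together with `{"item": "leg greaves", "present": False}`.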

### 7. `critical_effect`

Normalized machine-readable effects parsed from the symbol line and, over time, from prose.

Recommended canonical `effect_code` values:

- `direct_hits`
- `must_parry_rounds`
- `no_parry_rounds`
- `stunned_rounds`
- `bleed_per_round`
- `foe_penalty`
- `attacker_bonus_next_round`
- `initiative_gain`
- `initiative_loss`
- `drop_item`
- `item_breakage_check`
- `limb_useless`
- `knockdown`
- `prone`
- `coma`
- `paralyzed`
- `blind`
- `deaf`
- `mute`
- `dies_in_rounds`
- `instant_death`
- `armor_destroyed`
- `weapon_stuck`

Each effect should point to either:

- the base `critical_result`, or
- a `critical_branch`

This lets you keep the raw text but still filter/query on effects.
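
A hedged sketch of parsing a symbol line such as `+5H - 2S - 3B` into effect records. Only `H` mapping to `direct_hits` is implied by this document (`+2H` appears alongside `direct_hits: 2`); treating `S` as `stunned_rounds` and `B` as `bleed_per_round` is an assumption, as are the `parse_affixes` name and the word tokens `must_parry` and `bleed` with an implied value of 1. Tokens that are not recognized are returned separately rather than dropped, so the raw text stays the source of truth.

```python
import re
from typing import Dict, List, Tuple

# ASSUMED symbol mapping; verify against the legend of each PDF.
SYMBOL_TO_EFFECT: Dict[str, str] = {
    "H": "direct_hits",
    "S": "stunned_rounds",   # assumption
    "B": "bleed_per_round",  # assumption
}
# Word tokens as they appear in this document's sample affix strings.
WORD_TO_EFFECT: Dict[str, str] = {
    "must_parry": "must_parry_rounds",
    "bleed": "bleed_per_round",
}

_SYMBOL_TOKEN_RE = re.compile(r"^\+?(\d+)\s*([A-Za-z])$")
_SEPARATOR_RE = re.compile(r"\s+[-\u2013]\s+")  # " - " between affixes


def parse_affixes(raw_affix_text: str) -> Tuple[List[dict], List[str]]:
    """Return (effects, unparsed_tokens) for one symbol line."""
    effects: List[dict] = []
    unparsed: List[str] = []
    for token in _SEPARATOR_RE.split(raw_affix_text.strip()):
        token = token.strip()
        if not token:
            continue
        match = _SYMBOL_TOKEN_RE.match(token)
        if match and match.group(2).upper() in SYMBOL_TO_EFFECT:
            code = SYMBOL_TO_EFFECT[match.group(2).upper()]
            effects.append({"effect_code": code, "value": int(match.group(1))})
        elif token.lower() in WORD_TO_EFFECT:
            # Bare word tokens carry an implied value of 1 in this sketch.
            effects.append({"effect_code": WORD_TO_EFFECT[token.lower()], "value": 1})
        else:
            unparsed.append(token)  # keep for manual review
    return effects, unparsed
```

A non-empty `unparsed` list would be a natural input for the parse status stored on `critical_result`.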

## Why this works for your lookup

Your lookup target is mostly:

- `critical type`
- `severity(column)`
- `roll`

That maps cleanly to:

- `critical_table.slug`
- `critical_column.column_key`
- numeric roll matched against `critical_roll_band`

For the outlier tables, add an optional `group_key`.

That means the API can still stay simple:

```json
{
  "critical_type": "slash",
  "column": "C",
  "roll": 38,
  "group": null
}
```

or:

```json
{
  "critical_type": "large_creature_magic",
  "group": "super_large",
  "column": "slaying",
  "roll": 88
}
```
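
The mapping above could be expressed as a single query. The following is a sketch against an in-memory SQLite database, not a finished schema: `slug`, `column_key`, `group_key`, `min_roll`, `max_roll`, and `raw_cell_text` come from this document, while the `id` and `*_id` key columns, the `lookup` function, and the rows inserted by `demo_connection` are assumptions and placeholder data.

```python
import sqlite3
from typing import Optional

# Assumed physical layout: surrogate integer keys plus the fields named above.
SCHEMA = """
CREATE TABLE critical_table (id INTEGER PRIMARY KEY, slug TEXT UNIQUE NOT NULL);
CREATE TABLE critical_group (id INTEGER PRIMARY KEY,
    table_id INTEGER NOT NULL REFERENCES critical_table(id), group_key TEXT NOT NULL);
CREATE TABLE critical_column (id INTEGER PRIMARY KEY,
    table_id INTEGER NOT NULL REFERENCES critical_table(id), column_key TEXT NOT NULL);
CREATE TABLE critical_roll_band (id INTEGER PRIMARY KEY,
    table_id INTEGER NOT NULL REFERENCES critical_table(id),
    label TEXT NOT NULL, min_roll INTEGER NOT NULL, max_roll INTEGER);
CREATE TABLE critical_result (id INTEGER PRIMARY KEY,
    table_id INTEGER NOT NULL REFERENCES critical_table(id),
    group_id INTEGER REFERENCES critical_group(id),
    column_id INTEGER NOT NULL REFERENCES critical_column(id),
    roll_band_id INTEGER NOT NULL REFERENCES critical_roll_band(id),
    raw_cell_text TEXT NOT NULL);
"""

# "IS" is SQLite's NULL-safe comparison, so group_key = NULL matches ungrouped cells.
LOOKUP_SQL = """
SELECT b.label, r.raw_cell_text
FROM critical_result r
JOIN critical_table t ON t.id = r.table_id
JOIN critical_column c ON c.id = r.column_id
JOIN critical_roll_band b ON b.id = r.roll_band_id
LEFT JOIN critical_group g ON g.id = r.group_id
WHERE t.slug = :slug
  AND c.column_key = :column_key
  AND :roll >= b.min_roll
  AND (b.max_roll IS NULL OR :roll <= b.max_roll)
  AND g.group_key IS :group_key
"""


def lookup(conn: sqlite3.Connection, critical_type: str, column: str,
           roll: int, group: Optional[str] = None) -> Optional[dict]:
    params = {"slug": critical_type, "column_key": column,
              "roll": roll, "group_key": group}
    row = conn.execute(LOOKUP_SQL, params).fetchone()
    if row is None:
        return None
    return {"band": row[0], "raw_text": row[1]}


def demo_connection() -> sqlite3.Connection:
    """Build a tiny database with placeholder rows (not real table content)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executescript("""
    INSERT INTO critical_table VALUES (1, 'slash'), (2, 'large_creature_magic');
    INSERT INTO critical_group VALUES (1, 2, 'super_large');
    INSERT INTO critical_column VALUES (1, 1, 'B'), (2, 2, 'slaying');
    INSERT INTO critical_roll_band VALUES (1, 1, '36-45', 36, 45), (2, 2, '251+', 251, NULL);
    INSERT INTO critical_result VALUES
        (1, 1, NULL, 1, 1, 'placeholder slash cell'),
        (2, 2, 1, 2, 2, 'placeholder grouped cell');
    """)
    return conn
```

With this layout, `lookup(demo_connection(), "slash", "B", 38)` resolves the `36-45` band, and passing `group="super_large"` selects a grouped cell through the same code path.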

## Example return object

This is the shape I would return from a lookup:

```json
{
  "critical_type": "slash",
  "table_name": "Slash Critical Strike Table",
  "group": null,
  "column": {
    "key": "B",
    "label": "B",
    "role": "severity"
  },
  "roll": {
    "input": 38,
    "band": "36-45",
    "min": 36,
    "max": 45
  },
  "description": "Strike foe in shin.",
  "raw_affix_text": "+2H - must_parry",
  "affixes": [
    {
      "effect_code": "direct_hits",
      "value": 2
    }
  ],
  "conditions": [
    {
      "when": "with leg greaves",
      "description": null,
      "raw_affix_text": "+2H - must_parry",
      "affixes": [
        {
          "effect_code": "direct_hits",
          "value": 2
        },
        {
          "effect_code": "must_parry_rounds",
          "value": 1
        }
      ]
    },
    {
      "when": "without leg greaves",
      "description": "You slash open foe's shin.",
      "raw_affix_text": "+2H - bleed",
      "affixes": [
        {
          "effect_code": "direct_hits",
          "value": 2
        },
        {
          "effect_code": "bleed_per_round",
          "value": 1
        }
      ]
    }
  ],
  "raw_text": "Original full cell text as extracted from the PDF",
  "source": {
    "pdf": "Slash.pdf",
    "page": 1,
    "extraction_method": "text"
  }
}
```

## Ingestion notes

Recommended import flow:

1. Create `critical_table`, `critical_group`, `critical_column`, and `critical_roll_band` from each PDF's visible axes.
2. Store each cell in `critical_result.raw_cell_text` exactly as extracted.
3. Parse the symbol line into `critical_effect`.
4. Split explicit conditional branches into `critical_branch`.
5. Gradually enrich prose-derived effects such as death, blindness, paralysis, limb loss, initiative changes, and item breakage.
6. Route image PDFs like `Void.pdf` through OCR before the same parser.

The important design decision is: never throw away the original text. The prose is too irregular to rely on normalized fields alone.
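
One way the text-versus-OCR routing in step 6 might be decided is sketched below. It shells out to `pdftotext -layout` (with `-` as the output file so the text goes to stdout) and flags a PDF for OCR when almost no text comes back. The `needs_ocr` heuristic, its `min_chars` threshold, and the function names are assumptions for illustration; the OCR step itself is left out.

```python
import subprocess
from pathlib import Path


def extract_text(pdf_path: Path) -> str:
    """Run `pdftotext -layout <pdf> -` and return the extracted text."""
    result = subprocess.run(
        ["pdftotext", "-layout", str(pdf_path), "-"],
        capture_output=True,
        text=True,
        check=True,  # raise if pdftotext exits with an error
    )
    return result.stdout


def needs_ocr(extracted_text: str, min_chars: int = 200) -> bool:
    """Assumed heuristic: an image-based PDF yields almost no visible text."""
    visible = "".join(extracted_text.split())  # drop whitespace and form feeds
    return len(visible) < min_chars


def choose_extraction_method(pdf_path: Path) -> str:
    """Return "text" or "ocr", usable as a value for `extraction_method`."""
    return "ocr" if needs_ocr(extract_text(pdf_path)) else "text"
```

Whatever method is chosen, the extracted cell text would still be stored verbatim in `raw_cell_text` before any parsing runs.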