Critical Tables DB Model

What the PDFs look like

The PDFs are not one uniform table shape. I found three families:

  1. Standard tables

    • Columns are severity-like keys such as A through E.
    • Rows are roll bands such as 01-05, 66, 96-99, or 100.
    • Examples: Slash.pdf, Puncture.pdf, Arcane Aether.pdf.
  2. Variant-column tables

    • Columns are not severity letters; they are variant keys such as normal, magic, mithril, holy arms, slaying.
    • Rows are still roll bands.
    • Example: Large Creature - Weapon.pdf.
  3. Grouped variant tables

    • There is an extra grouping axis above the column axis.
    • Example: Large Creature - Magic.pdf has:
      • group: large, super_large
      • column: normal, slaying
      • row: roll band
    • In the current importer manifest, the grouped magic PDF is loaded once as large_creature_magic because the Large Creature - Magic.pdf and Super Large Creature - Magic.pdf source files are duplicates.

There are also extraction constraints:

  • Most PDFs are text-extractable with pdftohtml -xml.
  • Void.pdf appears image-based and will need OCR bootstrap, with the existing curation flow handling cleanup.
  • A single cell can contain:
    • base description text
    • symbolic affixes such as +5H - 2S - 3B
    • conditional branches such as with helmet, w/o leg greaves, if foe has shield

Because of that, the safest model is hybrid:

  • relational tables for lookup axes and indexed effects
  • raw text storage for fidelity
  • structured JSON for irregular branches that are hard to normalize perfectly on first pass

1. critical_table

One record per PDF/table. Each record is the primary "critical type" used for lookup.

Examples:

  • slash
  • puncture
  • arcane_aether
  • large_creature_weapon
  • large_creature_magic

2. critical_group

Optional extra axis for tables that need more than type + column + roll.

Examples:

  • large
  • super_large

Most tables will have no group rows.

3. critical_column

Generalized "severity/column" axis.

Examples:

  • A, B, C, D, E
  • normal, magic, mithril, holy_arms, slaying

Do not hardcode this as a single severity enum. Treat it as a table-defined dimension.

4. critical_roll_band

Stores row bands and supports exact row lookup by roll.

Examples:

  • 01-05
  • 66
  • 96-99
  • 251+

Recommended fields:

  • min_roll
  • max_roll (nullable, to support open-ended rows such as 251+)
  • display label
  • sort order
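
To illustrate how these fields support exact row lookup, here is a minimal Python sketch that parses the printed band labels and matches a numeric roll against them. The RollBand class and the parse_band and find_band helpers are hypothetical names used only for illustration; only min_roll, max_roll, the display label, and the sort order come from the field list above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RollBand:
    label: str                # display label as printed, e.g. "01-05" or "251+"
    min_roll: int
    max_roll: Optional[int]   # None marks an open-ended band such as "251+"
    sort_order: int


def parse_band(label: str, sort_order: int) -> RollBand:
    """Turn a printed row label into numeric bounds (hypothetical helper)."""
    text = label.strip()
    if text.endswith("+"):
        # Open-ended band: no upper bound.
        return RollBand(label, int(text[:-1]), None, sort_order)
    if "-" in text:
        low, high = text.split("-", 1)
        return RollBand(label, int(low), int(high), sort_order)
    # Single-value band such as "66".
    value = int(text)
    return RollBand(label, value, value, sort_order)


def find_band(bands: list[RollBand], roll: int) -> Optional[RollBand]:
    """Return the band containing the roll, or None if no band covers it."""
    for band in bands:
        if band.min_roll <= roll and (band.max_roll is None or roll <= band.max_roll):
            return band
    return None


# Labels taken from the examples above; a real table has contiguous bands.
bands = [parse_band(label, i) for i, label in enumerate(["01-05", "66", "96-99", "251+"])]
```

Keeping max_roll nullable means an open-ended row needs no sentinel value, and the same comparison works for single-value rows because min_roll and max_roll are equal there.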

5. critical_result

One record per lookup cell:

  • table
  • optional group
  • column
  • roll band

This stores:

  • is_curated
  • raw_cell_text
  • description_text
  • raw_affix_text
  • parsed_json
  • parse_status
  • source_page_number
  • source_image_path
  • source_image_crop

is_curated is an explicit workflow flag. Once a result is curated in the web editor, later importer runs must preserve curator-owned content instead of replacing the row wholesale.

The source-image fields keep importer provenance separate from the editor snapshot stored in parsed_json:

  • source_page_number points to the rendered PDF page used for review
  • source_image_path stores the importer-managed relative PNG path for the cell crop
  • source_image_crop stores the crop geometry that produced the PNG and can be used for debugging alignment problems

6. critical_branch

Optional conditional branches inside a result cell.

Examples:

  • with helmet
  • without helmet
  • with leg greaves
  • if foe has shield

Each branch can carry:

  • condition_text
  • optional structured condition_json
  • branch description text
  • branch raw affix text
  • parsed JSON

Current implementation note:

  • critical_branch is now populated by the importer and returned by the web critical lookup
  • condition keys are normalized for lookup/API use, while the original condition text remains available for display
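
One way such a normalization might look is sketched below in Python. The normalize_condition_key function and its abbreviation table are assumptions for illustration, not the importer's actual rules; the only mapping taken from this document is "w/o leg greaves" becoming without_leg_greaves.

```python
import re

# Hypothetical abbreviation table. Only "w/o" appears in this document's examples.
_ABBREVIATIONS = {"w/o": "without"}


def normalize_condition_key(condition_text: str) -> str:
    """Derive a lookup-friendly key while leaving the display text untouched."""
    text = condition_text.strip().lower()
    for short, full in _ABBREVIATIONS.items():
        # Expand the abbreviation and make sure a single space follows it.
        text = re.sub(re.escape(short) + r"\s*", full + " ", text)
    # Collapse every run of non-alphanumeric characters into one underscore.
    return re.sub(r"[^a-z0-9]+", "_", text).strip("_")
```

The original condition_text would still be stored alongside the key, so the display string is never reconstructed from the normalized form.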

7. critical_effect

Normalized machine-readable effects parsed from the symbol line and, over time, from prose.

Recommended canonical effect_code values:

  • direct_hits
  • must_parry_rounds
  • no_parry_rounds
  • stunned_rounds
  • bleed_per_round
  • foe_penalty
  • attacker_bonus_next_round
  • power_point_modifier
  • initiative_gain
  • initiative_loss
  • drop_item
  • item_breakage_check
  • limb_useless
  • knockdown
  • prone
  • coma
  • paralyzed
  • blind
  • deaf
  • mute
  • dies_in_rounds
  • instant_death
  • armor_destroyed
  • weapon_stuck

Each effect should point to either:

  • the base critical_result, or
  • a critical_branch

This lets you keep the raw text but still filter/query on effects.

Current implementation note:

  • symbol-driven affixes are now normalized for both base results and conditional branch affixes
  • value_expression is used when the affix contains a formula instead of a flat integer, which is currently needed for Mana power-point adjustments such as +(2d10-18)P
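
The split between a flat integer and a formula can be sketched as a small tokenizer. This Python example is an assumption about how the symbol line could be parsed, not the importer's code: AffixToken and tokenize_affixes are hypothetical names, and the sketch deliberately stops before mapping symbols to effect_code values because that mapping is not specified here.

```python
import re
from dataclasses import dataclass
from typing import Optional

# One affix: optional sign, then an integer or a parenthesized formula, then a symbol.
_TOKEN = re.compile(
    r"^(?P<sign>[+-])?(?:(?P<number>\d+)|\((?P<expr>[^()]+)\))(?P<symbol>[A-Za-z]+)$"
)


@dataclass
class AffixToken:
    symbol: str
    value: Optional[int]              # set when the affix is a flat integer
    value_expression: Optional[str]   # set when the affix is a formula


def tokenize_affixes(raw_affix_text: str) -> list[AffixToken]:
    """Split a symbol line such as '+5H - 2S - 3B' into tokens (hypothetical helper)."""
    tokens = []
    # Assumption: a dash surrounded by whitespace separates affixes,
    # while a dash inside parentheses belongs to a formula.
    for part in re.split(r"\s+-\s+", raw_affix_text.strip()):
        match = _TOKEN.match(part)
        if match is None:
            continue  # unrecognized affix; the raw text is still stored
        sign = -1 if match["sign"] == "-" else 1
        if match["number"] is not None:
            tokens.append(AffixToken(match["symbol"], sign * int(match["number"]), None))
        else:
            expr = match["expr"] if sign == 1 else f"-({match['expr']})"
            tokens.append(AffixToken(match["symbol"], None, expr))
    return tokens
```

Anything the pattern does not recognize is skipped rather than guessed at, which matches the principle of keeping raw_affix_text as the source of truth.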

Why this works for your lookup

Your lookup target is mostly:

  • critical type
  • severity (column)
  • roll

That maps cleanly to:

  • critical_table.slug
  • critical_column.column_key
  • numeric roll matched against critical_roll_band

For the outlier tables, add an optional group_key.

That means the API can still stay simple:

{
  "critical_type": "slash",
  "column": "C",
  "roll": 38,
  "group": null
}

or:

{
  "critical_type": "large_creature_magic",
  "group": "super_large",
  "column": "slaying",
  "roll": 88
}
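
A request like the ones above could be resolved with a single join. The following Python sketch uses the standard sqlite3 module and a heavily reduced schema to show the idea. The slug, column_key, group_key, min_roll, and max_roll names come from this document; the id and foreign-key columns, the label column, and the lookup function are assumptions made for the example.

```python
import sqlite3

# Reduced, hypothetical schema: only the columns needed for the lookup.
SCHEMA = """
CREATE TABLE critical_table (id INTEGER PRIMARY KEY, slug TEXT UNIQUE NOT NULL);
CREATE TABLE critical_group (id INTEGER PRIMARY KEY, table_id INTEGER NOT NULL,
    group_key TEXT NOT NULL);
CREATE TABLE critical_column (id INTEGER PRIMARY KEY, table_id INTEGER NOT NULL,
    column_key TEXT NOT NULL);
CREATE TABLE critical_roll_band (id INTEGER PRIMARY KEY, table_id INTEGER NOT NULL,
    min_roll INTEGER NOT NULL, max_roll INTEGER, label TEXT NOT NULL);
CREATE TABLE critical_result (id INTEGER PRIMARY KEY, table_id INTEGER NOT NULL,
    group_id INTEGER, column_id INTEGER NOT NULL, roll_band_id INTEGER NOT NULL,
    description_text TEXT);
"""

# "IS" is SQLite's null-safe comparison, so a NULL group matches group = None.
LOOKUP = """
SELECT r.id, b.label, r.description_text
FROM critical_result r
JOIN critical_table t ON t.id = r.table_id
JOIN critical_column c ON c.id = r.column_id
JOIN critical_roll_band b ON b.id = r.roll_band_id
LEFT JOIN critical_group g ON g.id = r.group_id
WHERE t.slug = :critical_type
  AND c.column_key = :column_key
  AND :roll >= b.min_roll
  AND (b.max_roll IS NULL OR :roll <= b.max_roll)
  AND g.group_key IS :group_key
"""


def lookup(conn, critical_type, column, roll, group=None):
    """Resolve type + optional group + column + roll to one result cell."""
    params = {"critical_type": critical_type, "column_key": column,
              "roll": roll, "group_key": group}
    row = conn.execute(LOOKUP, params).fetchone()
    if row is None:
        return None
    return {"result_id": row[0], "roll_band": row[1], "description": row[2]}


# Seed one cell, reusing the values from the example return object in this document.
conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.execute("INSERT INTO critical_table (id, slug) VALUES (1, 'slash')")
conn.execute("INSERT INTO critical_column (id, table_id, column_key) VALUES (1, 1, 'B')")
conn.execute("INSERT INTO critical_roll_band (id, table_id, min_roll, max_roll, label) "
             "VALUES (1, 1, 36, 45, '36-45')")
conn.execute("INSERT INTO critical_result (id, table_id, group_id, column_id, "
             "roll_band_id, description_text) VALUES (1, 1, NULL, 1, 1, 'Strike foe in shin.')")
```

Because the group join is a LEFT JOIN with a null-safe comparison, ungrouped tables and grouped tables share the same query, which keeps the optional group_key from leaking into the common case.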

Example return object

This is close to the current lookup shape, while still leaving room for future critical_effect normalization:

{
  "critical_type": "slash",
  "table_name": "Slash Critical Strike Table",
  "group": null,
  "column": "B",
  "column_label": "B",
  "column_role": "severity",
  "roll": 38,
  "roll_band": "36-45",
  "roll_band_min": 36,
  "roll_band_max": 45,
  "description": "Strike foe in shin.",
  "raw_affix_text": null,
  "branches": [
    {
      "branch_kind": "conditional",
      "condition_key": "with_leg_greaves",
      "condition_text": "with leg greaves",
      "description": "",
      "raw_affix_text": "+2H - must_parry",
      "sort_order": 1
    },
    {
      "branch_kind": "conditional",
      "condition_key": "without_leg_greaves",
      "condition_text": "w/o leg greaves",
      "description": "You slash open foe's shin.",
      "raw_affix_text": "+2H - bleed",
      "sort_order": 2
    }
  ],
  "raw_cell_text": "Original full cell text as extracted from the PDF",
  "source": {
    "pdf": "Slash.pdf",
    "extraction_method": "xml"
  }
}

Ingestion notes

Current import flow:

  1. Create critical_table, critical_group, critical_column, and critical_roll_band from each PDF's visible axes.
  2. Store each base cell in critical_result with base raw/description/affix text.
  3. Split explicit conditional branches into critical_branch.
  4. Parse symbolic affixes for both the base result and any branch affix payloads into critical_effect.
  5. Return the base result plus ordered branches and parsed affix effects through the web critical lookup.
  6. Gradually enrich prose-derived effects such as death, blindness, paralysis, limb loss, initiative changes, and item breakage.
  7. Route image PDFs like Void.pdf through OCR bootstrap before the same downstream parser and curation flow.

The important design decision is: never throw away the original text. The prose is too irregular to rely on normalized fields alone.

Manual curation workflow

Because the import path depends on OCR, PDF XML extraction, and heuristics, the web app now treats manual repair as a first-class capability instead of an out-of-band database operation.

Current curation flow:

  1. Browse a table on the /tables page.
  2. Hover a populated cell to identify editable entries.
  3. Open the popup editor for that cell.
  4. Edit the entire critical_result graph:
    • base raw cell text
    • curated prose / description
    • raw affix text
    • curated state
    • parse status
    • parsed JSON
    • nested critical_branch rows
    • nested critical_effect rows for both the base result and branches
  5. Save the result back through the API.

The corresponding API endpoints are:

  • GET /api/tables/critical/{slug}/cells/{resultId}
  • GET /api/tables/critical/{slug}/cells/{resultId}/source-image
  • PUT /api/tables/critical/{slug}/cells/{resultId}

The save operation replaces the stored branches and effects for that cell with the submitted payload and updates the explicit curated flag. Importer-managed source provenance can still be refreshed on later imports without overwriting curated content.
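
The rule that a later import may refresh provenance but must not overwrite curated content can be expressed as a small merge step. The Python sketch below is an illustration under stated assumptions: StoredResult is a hypothetical, trimmed-down stand-in for a critical_result row, merge_import is a hypothetical name, and treating exactly the three source_* fields as importer-owned is an assumption based on the field descriptions in this document.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class StoredResult:
    is_curated: bool
    raw_cell_text: str
    description_text: str
    raw_affix_text: Optional[str]
    source_page_number: Optional[int]
    source_image_path: Optional[str]
    source_image_crop: Optional[str]


# Assumption: these importer-managed fields are always safe to refresh.
PROVENANCE_FIELDS = ("source_page_number", "source_image_path", "source_image_crop")


def merge_import(existing: Optional[StoredResult], imported: StoredResult) -> StoredResult:
    """Decide what a re-import may write for one cell."""
    if existing is None or not existing.is_curated:
        # Nothing curated yet: the importer owns the whole row.
        return imported
    # Curated row: keep curator-owned content and refresh only provenance.
    refreshed = {name: getattr(imported, name) for name in PROVENANCE_FIELDS}
    return replace(existing, **refreshed)
```

With this shape, a curated description survives any number of importer runs, while a re-rendered page or a corrected crop still reaches the editor.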