Critical Curation UX Plan
Goal
Extend the importer and web app so critical results can be curated with direct visual reference to the source PDF cell, tracked with an explicit curated state, and protected from later importer runs.
Current State Summary
- The importer extracts XML text from PDFs, parses tables, writes debug artifacts, then deletes and reloads an entire critical table into SQLite.
- CriticalResult currently stores text/effect data and ParseStatus, but it does not track curation state or source-image metadata.
- The /tables page renders compact cells and opens CriticalCellEditorDialog, but the editor has no source-image panel and the table has no curated-status affordance.
- The importer currently destroys all existing results for a table during reload, which would erase any manual curation.
Recommended Design Decisions
1. Persist both curation state and source-image linkage on CriticalResult
Add explicit fields instead of hiding this inside ParsedJson.
Recommended additions:
- IsCurated bool not null default false
- SourcePageNumber int?
- SourceImagePath string?
- SourceImageCropJson string?
Rationale:
- IsCurated is queried and displayed directly in the UI and import logic.
- The editor needs a stable way to request the PNG for a result without reconstructing crop geometry on every request.
- Keeping source-image metadata separate from ParsedJson avoids mixing importer provenance with manual-editor snapshot state.
2. Treat imported source images as importer-managed artifacts
Store generated page PNGs and per-cell crop PNGs under the existing artifact root, for example:
- artifacts/import/critical/<slug>/pages/page-001.png
- artifacts/import/critical/<slug>/cells/<group-or-none>__<column>__<roll-band>.png
Rationale:
- This fits the existing importer artifact workflow.
- It keeps raw extraction/debug output outside the Blazor app project.
- File naming can be deterministic from the logical result key.
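The deterministic naming described above can be sketched as a small path builder. This is an illustration in Python, not the project's C# code: the function names and the slug rule (lowercase, non-alphanumerics collapsed to hyphens) are assumptions; only the folder layout and the `none` placeholder for a missing group come from the plan.

```python
import re
from typing import Optional


def slugify(value: str) -> str:
    """Lowercase and collapse every run of non [a-z0-9] characters into one hyphen."""
    return re.sub(r"[^a-z0-9]+", "-", value.lower()).strip("-")


def cell_image_path(table_slug: str, group_key: Optional[str],
                    column_key: str, roll_band: str) -> str:
    """Build the relative crop path from the logical result key (hypothetical helper)."""
    group_part = slugify(group_key) if group_key else "none"
    file_name = f"{group_part}__{slugify(column_key)}__{slugify(roll_band)}.png"
    return f"artifacts/import/critical/{table_slug}/cells/{file_name}"
```

Because the path depends only on the logical key, re-running the importer overwrites the same file instead of accumulating stale crops. A real slug rule would need to be checked for collisions between distinct roll-band labels.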
3. Serve source images through an API endpoint, not raw filesystem paths
Add an API endpoint that resolves the result’s stored relative image path and streams the PNG.
Recommended endpoint:
GET /api/tables/critical/{slug}/cells/{resultId:int}/source-image
Rationale:
- Avoids exposing arbitrary filesystem paths to the browser.
- Lets the app return 404 cleanly when an image is missing.
- Keeps the editor response simple by returning either an image URL or enough information to build one.
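One way to keep the endpoint from exposing arbitrary paths is to resolve the stored relative path against the artifact root and reject anything that escapes it. The sketch below is Python for illustration only (the app itself is C#); `resolve_source_image` and its parameters are hypothetical names, and a `None` return stands in for the 404 case.

```python
from pathlib import Path
from typing import Optional


def resolve_source_image(artifact_root: Path,
                         stored_relative_path: Optional[str]) -> Optional[Path]:
    """Return the PNG path to stream, or None when the caller should answer 404."""
    if not stored_relative_path:
        return None
    root = artifact_root.resolve()
    candidate = (root / stored_relative_path).resolve()
    # Reject absolute paths and ".." segments that land outside the artifact root.
    if not candidate.is_relative_to(root):  # requires Python 3.9+
        return None
    # Only existing PNG files are served.
    if candidate.suffix.lower() != ".png" or not candidate.is_file():
        return None
    return candidate
```

The path passed in is the value stored on the result row, looked up by result identity; the browser never supplies a path of its own.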
4. Preserve curated results by switching the importer from table replacement to keyed merge/upsert
Importer matching key:
- table slug
- optional group key
- column key
- roll band label
Behavior:
- If no existing result exists, insert a new result.
- If an existing result exists and IsCurated == false, replace importer-managed fields and child rows.
- If an existing result exists and IsCurated == true, preserve curated text/effect/branch fields and do not overwrite them.
- Still allow importer-managed source-image metadata to be refreshed if it can be done without changing curated content.
Rationale:
- The current delete-and-reload flow is incompatible with requirement 5.
- The logical table coordinates already form a stable identity for each result.
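The keyed merge behavior can be summarized as a per-key decision. The following is a minimal Python sketch under stated assumptions, not the importer's C# implementation: `ExistingResult`, `plan_merge`, and the action labels are hypothetical, and the key tuple mirrors the matching key listed above.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional, Tuple

# (table slug, optional group key, column key, roll band label)
ResultKey = Tuple[str, Optional[str], str, str]


@dataclass
class ExistingResult:
    is_curated: bool


def plan_merge(existing: Dict[ResultKey, ExistingResult],
               parsed_keys: Iterable[ResultKey]) -> Dict[ResultKey, str]:
    """Decide what the importer does for each parsed result key."""
    actions: Dict[ResultKey, str] = {}
    for key in parsed_keys:
        current = existing.get(key)
        if current is None:
            actions[key] = "insert"
        elif current.is_curated:
            # Curated content stays; only source-image metadata may be refreshed.
            actions[key] = "refresh-source-image-only"
        else:
            actions[key] = "replace"
    return actions
```

Keeping the decision separate from the database writes makes the three cases easy to unit-test before any rows are touched.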
Implementation Phases
Phase 1: Schema and Domain Model
1.1 Extend CriticalResult
Planned files:
- src/RolemasterDb.App/Domain/CriticalResult.cs
- src/RolemasterDb.App/Data/RolemasterDbContext.cs
- src/RolemasterDb.App/Data/RolemasterDbSchemaUpgrader.cs
- docs/critical_tables_schema.sql
- docs/critical_tables_db_model.md
Tasks:
- Add the new persisted properties to CriticalResult.
- Configure lengths/indexes in RolemasterDbContext where needed.
- Extend RolemasterDbSchemaUpgrader with additive SQLite migrations for the new columns.
- Document the new fields and their purpose in the schema/model docs.
Acceptance criteria:
- Existing databases upgrade in place.
- New databases include the new columns.
- The schema documentation matches the implementation.
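An additive SQLite upgrade of the kind described in the tasks above usually checks which columns exist and adds only the missing ones. The sketch below uses Python's built-in sqlite3 module purely for illustration; the table name `CriticalResults` and the function name are assumptions, and the real upgrader is the C# RolemasterDbSchemaUpgrader.

```python
import sqlite3
from typing import List

# Column definitions follow the recommended additions; SQLite stores bool as INTEGER.
NEW_COLUMNS = {
    "IsCurated": "INTEGER NOT NULL DEFAULT 0",
    "SourcePageNumber": "INTEGER",
    "SourceImagePath": "TEXT",
    "SourceImageCropJson": "TEXT",
}


def upgrade_critical_results(conn: sqlite3.Connection,
                             table: str = "CriticalResults") -> List[str]:
    """Add any missing columns and return the names that were added."""
    # PRAGMA table_info yields one row per column; index 1 is the column name.
    existing = {row[1] for row in conn.execute(f"PRAGMA table_info({table})")}
    added = []
    for name, ddl in NEW_COLUMNS.items():
        if name not in existing:
            conn.execute(f'ALTER TABLE {table} ADD COLUMN "{name}" {ddl}')
            added.append(name)
    conn.commit()
    return added
```

Running the upgrade twice is a no-op the second time, which is what lets existing databases upgrade in place while new databases get the columns from the initial schema.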
Phase 2: Importer Provenance and Image Extraction
2.1 Capture per-result source geometry during parsing
Planned files:
- src/RolemasterDb.ImportTool/Parsing/ParsedCriticalResult.cs
- src/RolemasterDb.ImportTool/Parsing/ParsedCriticalCellArtifact.cs
- src/RolemasterDb.ImportTool/Parsing/StandardCriticalTableParser.cs
- src/RolemasterDb.ImportTool/Parsing/VariantColumnCriticalTableParser.cs
- src/RolemasterDb.ImportTool/Parsing/GroupedVariantCriticalTableParser.cs
- shared parser helpers under src/RolemasterDb.ImportTool/Parsing/
Tasks:
- Extend the parsed-result models to carry source page number and crop/bounding-box metadata.
- Derive a reliable bounding rectangle from the fragments assigned to each parsed cell.
- Ensure grouped/variant parsers emit the same provenance data shape as standard tables.
- Include the geometry in debug artifacts to make extraction problems inspectable.
Acceptance criteria:
- Every parsed result has enough provenance to locate and crop its source region.
- parsed-cells.json shows page/crop metadata for manual debugging.
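Deriving a bounding rectangle from the fragments assigned to a cell amounts to taking the union of their boxes. A minimal Python sketch follows, assuming each fragment carries left/top/width/height values in page coordinates; the `Rect` type, its field names, and the optional padding are illustrative assumptions rather than the importer's actual model.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Rect:
    left: float
    top: float
    width: float
    height: float


def union_bounds(fragments: Iterable[Rect], padding: float = 0.0) -> Rect:
    """Smallest rectangle covering every fragment, grown by padding on each side."""
    frags = list(fragments)
    if not frags:
        raise ValueError("a parsed cell needs at least one fragment to locate it")
    left = min(f.left for f in frags) - padding
    top = min(f.top for f in frags) - padding
    right = max(f.left + f.width for f in frags) + padding
    bottom = max(f.top + f.height for f in frags) + padding
    return Rect(left, top, right - left, bottom - top)
```

For example, fragments at (10, 20, 30×10) and (15, 35, 40×10) yield a cell box at (10, 20) measuring 45×25.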
2.2 Add page rendering and per-cell PNG crop generation
Planned files:
- src/RolemasterDb.ImportTool/PdfXmlExtractor.cs
- src/RolemasterDb.ImportTool/CriticalImportCommandRunner.cs
- src/RolemasterDb.ImportTool/ImportArtifactPaths.cs
- new helper classes in src/RolemasterDb.ImportTool/
- docs/critical_import_tool.md
Tasks:
- Extend extraction so the importer renders source PDF pages to PNG in addition to XML.
- Add a crop step that uses the parsed cell bounding boxes to generate one PNG per critical result.
- Store deterministic relative image paths on parsed results before load.
- Extend artifact path helpers to include pages/ and cells/ folders.
- Document any required external tool dependency, expected command line, and artifact layout.
Implementation note:
- Validate which Poppler/image tool is already available in this environment before coding. Prefer reusing the existing PDF toolchain where possible.
Acceptance criteria:
- Running the importer for a table produces page PNGs and cell PNGs alongside the existing JSON artifacts.
- Each parsed result can be linked to exactly one crop image.
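Because the XML text geometry and the rendered page PNG may use different scales, the crop step has to convert each bounding box into pixel coordinates before cutting. The Python sketch below shows one hedged way to do that conversion; the function name and parameters are hypothetical, and it assumes both coordinate systems share a top-left origin.

```python
import math
from typing import Tuple


def to_pixel_box(left: float, top: float, width: float, height: float,
                 xml_page_width: float, xml_page_height: float,
                 png_width: int, png_height: int) -> Tuple[int, int, int, int]:
    """Convert an XML-space box into a (x0, y0, x1, y1) pixel box clamped to the image."""
    scale_x = png_width / xml_page_width
    scale_y = png_height / xml_page_height
    # Round outward so the crop never clips text at the edges.
    x0 = max(0, math.floor(left * scale_x))
    y0 = max(0, math.floor(top * scale_y))
    x1 = min(png_width, math.ceil((left + width) * scale_x))
    y1 = min(png_height, math.ceil((top + height) * scale_y))
    return (x0, y0, x1, y1)
```

With a 900×1200 XML page rendered at 1800×2400 pixels, a box at (10, 20) of size 45×25 maps to pixels (20, 40) through (110, 90). Computing the scale per axis from the actual page and image sizes avoids hard-coding a DPI.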
Phase 3: Importer Load Strategy That Preserves Curated Results
3.1 Replace destructive table reload with merge/upsert
Planned files:
- src/RolemasterDb.ImportTool/CriticalImportLoader.cs
- src/RolemasterDb.ImportTool/CriticalImportCommandRunner.cs
- importer tests under src/RolemasterDb.ImportTool.Tests/
- docs/critical_import_tool.md
Tasks:
- Stop deleting an entire table before re-import.
- Load existing table metadata, columns, groups, roll bands, and results keyed by logical identity.
- Rebuild table-level axes as needed, but preserve existing result rows when identity matches.
- For non-curated results, replace importer-managed text/effect/branch/source-image fields.
- For curated results, preserve manual content and child rows.
- Decide whether missing results from the source should remain, be deleted only when uncurated, or be flagged for review. Recommended default: only delete unmatched results when they are uncurated.
Acceptance criteria:
- Re-import keeps curated results intact.
- Re-import still updates uncurated results and newly discovered cells.
- Re-import remains transactional.
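The recommended default for unmatched results (delete only when uncurated) combined with the transactional requirement can be sketched as below. This is Python with the built-in sqlite3 module against a deliberately flattened, hypothetical `Results` table; the real schema uses separate axis tables, and the loader is C#.

```python
import sqlite3
from typing import Iterable, List, Optional, Tuple

# (group key, column key, roll band label) within one table slug
Key = Tuple[Optional[str], str, str]


def prune_unmatched(conn: sqlite3.Connection, table_slug: str,
                    seen: Iterable[Key]) -> List[int]:
    """Delete results the source no longer contains, but only uncurated ones."""
    seen_keys = set(seen)
    rows = conn.execute(
        "SELECT Id, GroupKey, ColumnKey, RollBand, IsCurated "
        "FROM Results WHERE TableSlug = ?",
        (table_slug,),
    ).fetchall()
    stale = [r[0] for r in rows
             if (r[1], r[2], r[3]) not in seen_keys and not r[4]]
    # "with conn" commits on success and rolls back if any statement raises.
    with conn:
        conn.executemany("DELETE FROM Results WHERE Id = ?", [(i,) for i in stale])
    return stale
```

Returning the deleted ids gives the importer something to log, which helps make re-import behavior for missing results deterministic and inspectable.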
3.2 Define importer ownership boundaries clearly
Tasks:
- Explicitly identify which fields are importer-owned versus curator-owned.
- Make the merge code preserve curator-owned fields on curated rows.
- Keep source provenance/image fields importer-owned unless that causes user-visible regressions.
Recommended ownership split:
- Importer-owned: raw extracted text, generated quick-parse baseline, parse status, source page/image/crop metadata.
- Curator-owned when curated: description overrides, branch edits, effect edits, curated flag.
Open point to resolve during implementation:
- Whether RawCellText should remain importer-owned even for curated rows, or be frozen once curated. Recommended approach: freeze it for curated rows to preserve the exact human-reviewed context shown in the editor.
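The ownership split can be enforced in a single merge function. The Python sketch below is one possible reading, not a settled policy: it freezes RawCellText on curated rows as recommended above and refreshes only the source page/image/crop fields there, leaving ParseStatus and the quick-parse baseline untouched on curated rows until that policy is decided. Field names are snake_case stand-ins for the real properties, and `merge_result` is a hypothetical helper.

```python
from typing import Any, Dict

# Fields the importer may refresh even on curated rows (assumed policy).
SOURCE_FIELDS = ("source_page_number", "source_image_path", "source_image_crop_json")


def merge_result(existing: Dict[str, Any], imported: Dict[str, Any]) -> Dict[str, Any]:
    """Combine an existing row with freshly imported values according to ownership."""
    merged = dict(existing)
    if not existing.get("is_curated", False):
        # Uncurated: the importer replaces everything it produced.
        merged.update(imported)
        merged["is_curated"] = False
        return merged
    # Curated: keep manual content and frozen raw text; refresh provenance only.
    for field in SOURCE_FIELDS:
        if field in imported:
            merged[field] = imported[field]
    return merged
```

Routing every re-import through one function like this keeps the curated/uncurated rule in a single place, so a later policy change touches one tuple rather than several code paths.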
Phase 4: API and Service Contract Changes
4.1 Extend editor and table-detail contracts
Planned files:
- src/RolemasterDb.App/Features/LookupContracts.cs
- src/RolemasterDb.App/Features/CriticalCellEditorResponse.cs
- src/RolemasterDb.App/Features/CriticalCellUpdateRequest.cs
- src/RolemasterDb.App/Features/LookupService.cs
- src/RolemasterDb.App/Program.cs
- src/RolemasterDb.App/Components/Pages/Api.razor
Tasks:
- Add curated-state fields to table-cell detail and editor responses.
- Add source-image URL or source-image presence/metadata to the editor response.
- Extend update requests so the editor can mark a result curated or not curated.
- Add the source-image streaming endpoint.
- Update API documentation page to reflect the new payload shape and endpoint.
Acceptance criteria:
- /tables can render curation state without opening the editor.
- The editor can load the image URL and curated flag from a single response.
- Saving a cell persists the curated flag.
Phase 5: Editor UX for Curation
5.1 Show the source PNG inside CriticalCellEditorDialog
Planned files:
- src/RolemasterDb.App/Components/Shared/CriticalCellEditorDialog.razor
- src/RolemasterDb.App/Components/Shared/CriticalCellEditorModel.cs
- src/RolemasterDb.App/wwwroot/app.css
Tasks:
- Add a dedicated source-reference panel near the editor header or beside the quick-input section.
- Render the PNG with useful alt text containing table, roll band, group, and column context.
- Handle missing-image state gracefully with a compact fallback message.
- Ensure the dialog still works on smaller screens without the image overwhelming the form.
Acceptance criteria:
- A curator can see the original source cell while editing.
- The editor remains usable on desktop and mobile widths.
5.2 Add explicit curation controls in the editor
Tasks:
- Add a clear curated-state badge and a toggle or checkbox for marking a result curated.
- Make the current state visible in the header so the curator knows whether the cell is protected from importer overwrite.
- Consider a short explanatory note when toggling to curated, since that changes importer behavior.
Acceptance criteria:
- A curator can mark and unmark a result without leaving the editor.
- The protected state is obvious before saving.
Phase 6: Table UX for Curated vs Needs Curation
6.1 Surface curation state on the /tables grid
Planned files:
- src/RolemasterDb.App/Components/Pages/Tables.razor
- src/RolemasterDb.App/Components/Shared/CompactCriticalCell.razor
- src/RolemasterDb.App/wwwroot/app.css
Tasks:
- Add a high-contrast, low-noise visual distinction between curated and needs-curation cells.
- Keep the state visible even when the cell text is dense.
- Add accessible labels/tooltips so keyboard and screen-reader users get the same state information.
- Decide whether empty cells should remain neutral rather than “needs curation”.
Recommended UI pattern:
- Curated: subtle success-toned chip or corner marker plus label in tooltip.
- Needs curation: warm warning-toned chip or border treatment.
- Empty: unchanged neutral empty state.
Acceptance criteria:
- A table can be scanned quickly for work that still needs curation.
- The curation marker does not reduce readability of the compact cell content.
6.2 Consider lightweight summary affordances
Optional follow-up tasks if the basic state marker is not enough:
- Add curated vs needs-curation counts in the table header.
- Add a filter toggle to show only needs-curation cells.
These should be deferred unless the initial implementation still feels too noisy or too hard to scan.
Phase 7: Tests and Verification
7.1 Importer tests
Planned files:
- src/RolemasterDb.ImportTool.Tests/
Tests to add:
- Parsed result includes source page/crop metadata.
- Import run generates expected image artifacts for a known sample table.
- Re-import updates uncurated results.
- Re-import preserves curated results and their edited child rows.
- Re-import behavior for missing/unmatched results is deterministic.
7.2 App/service tests
Planned scope:
- Add or extend tests around LookupService if a test project already exists or can be added cheaply.
Tests to add:
- Editor response includes curated state and image URL.
- Update request persists curated state.
- Table detail response carries curated state for each cell.
- Source-image endpoint returns 404 for missing images and 200 for valid ones.
7.3 Manual verification
Manual checks:
- Import a representative table and confirm cell PNGs look correctly cropped.
- Open several cells in CriticalCellEditorDialog and confirm the displayed image matches the edited result.
- Mark a result curated, re-run importer, and confirm no curated fields are overwritten.
- Mark a result back to needs curation, re-run importer, and confirm importer updates resume.
- Check /tables scanning on both desktop and mobile layouts.
Documentation Updates Required During Implementation
Planned files:
- docs/critical_import_tool.md
- docs/critical_tables_db_model.md
- docs/critical_tables_schema.sql
- src/RolemasterDb.App/Components/Pages/Api.razor
Required updates:
- Describe the new importer artifact layout for page and cell PNGs.
- Document the curated flag and source-image metadata in the logical data model.
- Document the non-destructive importer behavior for curated rows.
- Document the new API response fields and source-image endpoint.
Suggested Execution Order
- Implement schema/domain changes first so both importer and app can compile against the new fields.
- Add parser provenance and image artifact generation next.
- Refactor importer load behavior from destructive replace to merge/upsert with curated preservation.
- Extend service/API contracts.
- Add editor image panel and curated toggle.
- Add /tables curated-state indicators.
- Finish with docs, tests, and manual verification.
Risks and Mitigations
- Risk: Crop coordinates are slightly wrong because PDF text geometry and rendered page PNG scale differ. Mitigation: Persist page number plus raw bounding box metadata, and verify crop alignment against a few known tables before broad rollout.
- Risk: Preserving curated rows while rebuilding axes could break foreign-key relationships if columns/groups/roll bands are recreated blindly. Mitigation: Reuse or reconcile axis rows by logical key before touching result rows.
- Risk: Importer ownership versus curator ownership is ambiguous for RawCellText, ParseStatus, and ParsedJson. Mitigation: Set this policy explicitly before coding and enforce it in one merge path.
- Risk: Serving images from artifact storage could expose unsafe file access if implemented with arbitrary path input. Mitigation: Resolve images by result identity and stored relative path only.
Approval Gate
Implementation should start only after this plan is approved, because the importer-load refactor and schema changes affect both data safety and future curation workflow.