Add critical result curation metadata

2026-03-17 22:03:09 +01:00
parent 9e16605168
commit 99e7da0d21
5 changed files with 69 additions and 3 deletions

@@ -102,11 +102,23 @@ One record per lookup cell:
This stores:
- `is_curated`
- `raw_cell_text`
- `description_text`
- `raw_affix_text`
- `parsed_json`
- parse status / source metadata
- `parse_status`
- `source_page_number`
- `source_image_path`
- `source_image_crop`
`is_curated` is an explicit workflow flag. Once a result is curated in the web editor, later importer runs must preserve curator-owned content instead of replacing the row wholesale.
The source-image fields keep importer provenance separate from the editor snapshot stored in `parsed_json`:
- `source_page_number` points to the rendered PDF page used for review
- `source_image_path` stores the importer-managed relative PNG path for the cell crop
- `source_image_crop` stores the crop geometry that produced the PNG and can be used for debugging alignment problems
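The preservation rule for `is_curated` could be implemented as a conditional upsert. The following is only a sketch under stated assumptions, not the importer's actual code: it uses SQLite through Python's `sqlite3` module instead of PostgreSQL, a cut-down hypothetical version of `critical_result`, and a hypothetical helper `import_result`. Which columns count as curator-owned is also an assumption.

```python
import sqlite3

# Simplified, hypothetical stand-in for critical_result (SQLite here, not PostgreSQL).
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    create table critical_result (
        id integer primary key,
        is_curated integer not null default 0,
        raw_cell_text text not null,
        description_text text,
        source_page_number integer,
        source_image_path text
    )
    """
)

# Assumption: raw_cell_text and description_text become curator-owned once
# is_curated is set; the source_* columns stay importer-managed and are
# refreshed on every import.
UPSERT_SQL = """
    insert into critical_result
        (id, raw_cell_text, description_text, source_page_number, source_image_path)
    values
        (:id, :raw_cell_text, :description_text, :source_page_number, :source_image_path)
    on conflict (id) do update set
        raw_cell_text = case when critical_result.is_curated = 1
                             then critical_result.raw_cell_text
                             else excluded.raw_cell_text end,
        description_text = case when critical_result.is_curated = 1
                                then critical_result.description_text
                                else excluded.description_text end,
        source_page_number = excluded.source_page_number,
        source_image_path = excluded.source_image_path
"""


def import_result(conn, row):
    """Insert or refresh one importer row without clobbering curated text."""
    conn.execute(UPSERT_SQL, row)
```

Re-importing an uncurated row replaces its text columns. Re-importing a curated row only moves the provenance columns forward.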
### 6. `critical_branch`
@@ -284,6 +296,7 @@ Current curation flow:
- base raw cell text
- curated prose / description
- raw affix text
- curated state
- parse status
- parsed JSON
- nested `critical_branch` rows
@@ -293,6 +306,7 @@ Current curation flow:
The corresponding API endpoints are:
- `GET /api/tables/critical/{slug}/cells/{resultId}`
- `GET /api/tables/critical/{slug}/cells/{resultId}/source-image`
- `PUT /api/tables/critical/{slug}/cells/{resultId}`
The save operation replaces the stored branches and effects for that cell with the submitted payload. That keeps manual edits deterministic and avoids trying to reconcile partial child-row diffs against importer-generated data.
The save operation replaces the stored branches and effects for that cell with the submitted payload and updates the explicit curated flag. Importer-managed source provenance can still be refreshed on later imports without overwriting curated content.
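As an illustration of the replace-on-save behavior, the sketch below deletes and reinserts a cell's child rows in one transaction and then sets the curated flag. It is not the service's actual implementation. It assumes SQLite through Python's `sqlite3` module; the `critical_branch` columns (`critical_result_id`, `condition_text`), the payload shape, and the helper `save_cell` are all hypothetical.

```python
import sqlite3


def create_schema(conn):
    # Hypothetical, cut-down tables; the real critical_branch columns are not shown here.
    conn.execute(
        """
        create table critical_result (
            id integer primary key,
            is_curated integer not null default 0,
            description_text text,
            source_image_path text
        )
        """
    )
    conn.execute(
        """
        create table critical_branch (
            id integer primary key,
            critical_result_id integer not null,
            condition_text text not null
        )
        """
    )


def save_cell(conn, result_id, payload):
    """Replace the cell's branches wholesale and mark the result as curated."""
    with conn:  # one transaction: commits on success, rolls back on error
        conn.execute(
            "delete from critical_branch where critical_result_id = ?",
            (result_id,),
        )
        conn.executemany(
            "insert into critical_branch (critical_result_id, condition_text) values (?, ?)",
            [(result_id, branch["condition_text"]) for branch in payload["branches"]],
        )
        # Source provenance columns are deliberately left untouched by the editor save.
        conn.execute(
            "update critical_result set description_text = ?, is_curated = 1 where id = ?",
            (payload["description_text"], result_id),
        )
```

Replacing the child rows wholesale means no per-branch diffing is needed, and the transaction ensures a failed save leaves the previous branches in place.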

@@ -52,12 +52,15 @@ create table critical_result (
critical_group_id bigint references critical_group(id) on delete cascade,
critical_column_id bigint not null references critical_column(id) on delete cascade,
critical_roll_band_id bigint not null references critical_roll_band(id) on delete cascade,
is_curated boolean not null default false,
raw_cell_text text not null,
description_text text,
raw_affix_text text,
parsed_json jsonb not null default '{}'::jsonb,
parse_status text not null default 'raw' check (parse_status in ('raw', 'partial', 'parsed', 'verified')),
source_bbox jsonb,
source_page_number integer,
source_image_path text,
source_image_crop jsonb,
created_at timestamptz not null default now()
);