Add critical result curation metadata

2026-03-17 22:03:09 +01:00
parent 9e16605168
commit 99e7da0d21
5 changed files with 69 additions and 3 deletions

@@ -102,11 +102,23 @@ One record per lookup cell:
This stores:
+- `is_curated`
- `raw_cell_text`
- `description_text`
- `raw_affix_text`
- `parsed_json`
-- parse status / source metadata
+- `parse_status`
+- `source_page_number`
+- `source_image_path`
+- `source_image_crop`
+
+`is_curated` is an explicit workflow flag. Once a result is curated in the web editor, later importer runs must preserve curator-owned content instead of replacing the row wholesale.
+
+The source-image fields keep importer provenance separate from the editor snapshot stored in `parsed_json`:
+
+- `source_page_number` points to the rendered PDF page used for review
+- `source_image_path` stores the importer-managed relative PNG path for the cell crop
+- `source_image_crop` stores the crop geometry that produced the PNG and can be used for debugging alignment problems
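
The preservation rule above can be sketched as a merge step that an importer run might apply per row. This is a minimal illustration, not the project's importer: `CriticalResult`, `merge_import`, and the field types are hypothetical, and the split into curator-owned versus importer-owned fields is an assumption (only the three `source_*` fields are treated as importer-owned once a row is curated).

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class CriticalResult:
    # Field names follow the list above; the types are assumptions.
    is_curated: bool
    raw_cell_text: str
    description_text: str
    raw_affix_text: Optional[str]
    parsed_json: str
    parse_status: str
    source_page_number: Optional[int]
    source_image_path: Optional[str]
    source_image_crop: Optional[str]


# Assumption: on a curated row, only these fields belong to the importer.
PROVENANCE_FIELDS = ("source_page_number", "source_image_path", "source_image_crop")


def merge_import(existing: Optional[CriticalResult],
                 imported: CriticalResult) -> CriticalResult:
    """Hypothetical merge of one freshly imported row into the stored row."""
    if existing is None or not existing.is_curated:
        # New or uncurated rows can be replaced wholesale.
        return imported
    # Curated rows keep their content; only provenance is refreshed.
    refreshed = {name: getattr(imported, name) for name in PROVENANCE_FIELDS}
    return replace(existing, **refreshed)
```

Keeping the owned-field list in one constant makes the ownership boundary easy to audit if more importer-managed columns are added later.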
### 6. `critical_branch`
@@ -284,6 +296,7 @@ Current curation flow:
- base raw cell text
- curated prose / description
- raw affix text
+- curated state
- parse status
- parsed JSON
- nested `critical_branch` rows
@@ -293,6 +306,7 @@ Current curation flow:
The corresponding API endpoints are:
- `GET /api/tables/critical/{slug}/cells/{resultId}`
+- `GET /api/tables/critical/{slug}/cells/{resultId}/source-image`
- `PUT /api/tables/critical/{slug}/cells/{resultId}`
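
For a client, the three routes listed above share one base path. The helper below is a hypothetical sketch of how they could be built; `cell_paths` is not part of the project, and treating `resultId` as an integer is an assumption.

```python
from urllib.parse import quote


def cell_paths(slug: str, result_id: int) -> dict:
    """Build (method, path) pairs for one critical result cell."""
    # Percent-encode the slug so it cannot introduce extra path segments.
    base = f"/api/tables/critical/{quote(slug, safe='')}/cells/{result_id}"
    return {
        "load": ("GET", base),
        "source_image": ("GET", f"{base}/source-image"),
        "save": ("PUT", base),
    }
```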
-The save operation replaces the stored branches and effects for that cell with the submitted payload. That keeps manual edits deterministic and avoids trying to reconcile partial child-row diffs against importer-generated data.
+The save operation replaces the stored branches and effects for that cell with the submitted payload and updates the explicit curated flag. Importer-managed source provenance can still be refreshed on later imports without overwriting curated content.
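
The replace-on-save behaviour described above can be modelled in memory as follows. This is a sketch under assumptions, not the actual handler: `apply_save` is hypothetical, the payload keys `branches`, `effects`, and `is_curated` are assumed names, and the real endpoint presumably works against database rows rather than dictionaries.

```python
from typing import Any


def apply_save(stored: dict[str, Any], payload: dict[str, Any]) -> dict[str, Any]:
    """Hypothetical model of the PUT save for one cell."""
    updated = dict(stored)  # leave the caller's copy untouched
    # Child rows are replaced as a whole; no per-row diffing is attempted.
    updated["branches"] = list(payload.get("branches", []))
    updated["effects"] = list(payload.get("effects", []))
    # Assumption: the editor submits the curated state with the payload.
    updated["is_curated"] = bool(payload["is_curated"])
    return updated
```

Fields that the payload does not mention, such as the importer-managed provenance columns, pass through unchanged in this model.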