Add high-res critical image refresh import
@@ -33,7 +33,7 @@ The current implementation supports:
 - `variant_column` critical tables with non-severity columns
 - `grouped_variant` critical tables with a group axis plus variant columns
 - XML-based extraction using `pdftohtml -xml`
-- XML-aligned page rendering and per-cell PNG crops using `pdftoppm -png -r 108`
+- XML-aligned page rendering and per-cell PNG crops using `pdftoppm -png -r 432`
 - geometry-based parsing across the currently enabled table set:
   - `arcane-aether`
   - `arcane-nether`
@@ -359,6 +359,22 @@ Example:
 dotnet run --project .\src\RolemasterDb.ImportTool\RolemasterDb.ImportTool.csproj -- import slash
 ```

+### `reimport-images <table>`
+
+Reuses `source.xml`, regenerates page PNGs and cell PNGs, rewrites the JSON artifacts, and refreshes only source-image metadata in SQLite.
+
+Use this when:
+
+- crop resolution or render settings changed
+- you want better source images without reloading result text
+- you want to keep curated and uncurated content untouched while refreshing artifacts
+
+Example:
+
+```powershell
+dotnet run --project .\src\RolemasterDb.ImportTool\RolemasterDb.ImportTool.csproj -- reimport-images slash
+```
+
 ## Manifest

 The importer manifest is stored at:
@@ -433,7 +449,7 @@ Each parsed cell now includes:

 ### `pages/page-001.png`

-Rendered PDF page images at `108 DPI`, which matches the coordinate space emitted by `pdftohtml -xml`.
+Rendered PDF page images at `432 DPI`, using a central render scale factor of `4` over the XML coordinate space emitted by `pdftohtml -xml`.

 Use this when:
@@ -607,10 +623,14 @@ The importer now uses two Poppler tools:

 - `pdftohtml -xml -i -noframes`
   - extracts geometry-aware XML text
-- `pdftoppm -png -r 108`
+- `pdftoppm -png -r 432`
   - renders page PNGs and per-cell crop PNGs

-The `108 DPI` render setting is deliberate: for the current PDFs and Poppler output, it produces page images whose pixel dimensions match the XML `page width` and `page height`, so crop coordinates can be applied directly without an extra scale-conversion step.
+The importer keeps a central render scale factor of `4`. The XML still defines bounds in its original coordinate space, but rendered PNGs and stored crop metadata now use the scaled coordinate space and a `432 DPI` render setting. In practice:
+
+- XML coordinates are multiplied by `4` before crop extraction
+- page and crop metadata stored with each result reflect the scaled PNG coordinate space
+- crop alignment remains deterministic without changing the parsing pipeline

 ## Interaction With Web App Startup
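Note on the scaling described in the last hunk: the arithmetic can be sketched as below. This is an illustration in Python, not the importer's C# code. The `Bounds` type, the `scale_bounds` helper, and the sample cell values are hypothetical; only the scale factor `4`, the `108` base DPI, and the resulting `432` DPI come from the diff.

```python
from dataclasses import dataclass

RENDER_SCALE = 4  # central render scale factor named in the diff
BASE_DPI = 108    # DPI at which PNG pixels matched XML units, per the removed text
RENDER_DPI = BASE_DPI * RENDER_SCALE  # 432, the value now passed to `pdftoppm -r`


@dataclass(frozen=True)
class Bounds:
    """Axis-aligned box (hypothetical type, not taken from the importer)."""
    left: int
    top: int
    width: int
    height: int


def scale_bounds(xml_bounds: Bounds, scale: int = RENDER_SCALE) -> Bounds:
    """Map a box from XML coordinate space into rendered-PNG pixel space."""
    return Bounds(
        left=xml_bounds.left * scale,
        top=xml_bounds.top * scale,
        width=xml_bounds.width * scale,
        height=xml_bounds.height * scale,
    )


# Made-up cell bounds in XML units, as they might appear in `source.xml`.
cell = Bounds(left=120, top=200, width=90, height=40)
crop = scale_bounds(cell)
print(RENDER_DPI, crop)
```

Because the factor is an integer, every XML unit maps to a whole number of pixels, which is consistent with the hunk's claim that crop alignment stays deterministic.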