xTr1m/GameList

Fork 0

Files

Frank Tovar 608c5368b3 Update architecture risk review

2026-02-18 19:24:36 +01:00

7.5 KiB

Raw Blame History

Review - Pick'n'Play (State-of-the-Art Assessment)

Date: 2026-02-18

Scope

Evaluated backend (Program.cs, Endpoints/*, Infrastructure/*, Data/*), frontend (wwwroot/*), and CI/deployment scripts.
Focused on risks in maintainability, extensibility, scalability, and security.

Executive summary

The app is solid for a small private group, but it still has several architectural bottlenecks compared to current best practice for long-term product growth. The largest risks are read amplification from client polling, hard-coded workflow/permission modeling, and security hardening gaps (CSRF posture and crypto modernization).

Findings

1) High - Scalability - Single-node SQLite bottleneck

Evidence:

SQLite is the primary DB (Program.cs:42).

Risk:

SQLite is excellent for small single-node deployments, but write concurrency and horizontal scale are limited for larger or bursty usage.

Alternative:

Keep SQLite for local/dev and migrate production to PostgreSQL/SQL Server with provider-specific migrations and connection pooling.

2) High - Scalability - Polling causes read amplification

Evidence:

Frontend polling runs continuously with 3s-20s cadence (wwwroot/app.js:30, wwwroot/app.js:58).
Each refresh can hit multiple endpoints (wwwroot/js/data.js:21, wwwroot/js/data.js:109).
/api/state also executes multiple aggregate counts each time (Endpoints/StateWorkflowService.cs:14).

Risk:

As concurrent users increase, backend read load grows quickly and mostly serves unchanged data.

Alternative:

Move to event-driven updates (SSE/WebSocket) plus conditional GET (ETag/If-None-Match) and/or a consolidated bootstrap endpoint.

3) High - Security - CSRF protection is implicit, not explicit

Evidence:

Cookie authentication is used for API auth (Program.cs:100, Program.cs:104).
Many state-changing endpoints rely on cookie auth (Endpoints/SuggestEndpoints.cs:24, Endpoints/VoteEndpoints.cs:24, Endpoints/AdminEndpoints.cs:14).
No antiforgery middleware/tokens are configured in startup.

Risk:

SameSite=Strict helps (Program.cs:104) but is not a full long-term CSRF strategy across browser/proxy edge cases.

Alternative:

Add explicit anti-forgery tokens for mutating requests (or move to bearer tokens for API calls) and verify origin headers server-side.

4) High - Extensibility - Workflow is hard-coded across backend and frontend

Evidence:

Phase behavior is encoded via enum and many explicit checks/switches (Domain/Phase.cs:3, Endpoints/StateWorkflowService.cs:70, Endpoints/EndpointHelpers.cs:97).
Frontend also hard-codes phase logic in many files (wwwroot/app.js:99, wwwroot/js/data.js:47, wwwroot/js/votes-ui.js:167).

Risk:

Adding a new phase or changing transitions requires touching many scattered branches, increasing regression risk.

Alternative:

Introduce a shared workflow/state-machine model (transition table) and consume it in both backend and frontend.

5) High - Extensibility - Role model is fixed to booleans

Evidence:

Role/state flags are booleans on player (Domain/Player.cs:22, Domain/Player.cs:23).
Admin checks are tightly coupled to that model (Infrastructure/AdminOnlyFilter.cs:13, Endpoints/AdminWorkflowService.cs:93).

Risk:

Future needs (moderator, read-only admin, per-feature permissions) require schema and logic rewrites instead of additive changes.

Alternative:

Move to role/permission tables (or claims-based capability model) and policy-based authorization.

6) Medium - Maintainability - Frontend is string-template heavy with global mutable state

Evidence:

Single global state object (wwwroot/js/state.js:1).
Heavy innerHTML rendering across modules (wwwroot/js/suggestions-ui.js:115, wwwroot/js/votes-ui.js:35, wwwroot/js/results-ui.js:72).

Risk:

Harder refactoring, weaker static guarantees, and easy XSS regressions when new contributors add templates.

Alternative:

Incrementally move to TypeScript + componentized rendering (or at minimum typed JSDoc + stricter lint rules + centralized safe render helpers).

7) Medium - Scalability/Security - In-memory dictionaries are unbounded

Evidence:

Auth attempt monitor stores failures in unbounded ConcurrentDictionary (Infrastructure/AuthAttemptMonitor.cs:14).
Image reachability cache is a static dictionary without size limits (Endpoints/SuggestionValidator.cs:7, Endpoints/SuggestionValidator.cs:42).

Risk:

High-cardinality traffic can grow memory and become a denial-of-service vector.

Alternative:

Replace with bounded MemoryCache (size limits + eviction) or distributed cache (Redis) with TTL and cardinality controls.

8) Medium - Scalability - Linking/results workflows load full sets into memory

Evidence:

Link/unlink loads all suggestions then computes roots in memory (Endpoints/AdminWorkflowService.cs:136, Endpoints/AdminWorkflowService.cs:188).
Results projection includes full vote lists per suggestion (Endpoints/ResultsWorkflowService.cs:23, Endpoints/ResultsWorkflowService.cs:34, Endpoints/ResultsWorkflowService.cs:50).

Risk:

Memory and query cost rise non-linearly with larger datasets.

Alternative:

Introduce persisted link-group IDs and push aggregation to SQL; add pagination/windowing for large result sets.

9) Medium - Security - Crypto is good but not state-of-the-art

Evidence:

Password hashing uses PBKDF2-SHA256 (Infrastructure/PasswordHasher.cs:33) with fixed iteration count (Infrastructure/PasswordHasher.cs:10).

Risk:

PBKDF2 remains acceptable, but modern guidance favors memory-hard KDFs (Argon2id/scrypt) against GPU/ASIC attacks.

Alternative:

Add versioned password hashes and migrate to Argon2id on login/re-hash.

10) Medium - Security - CSP remains permissive for inline style and mixed image origins

Evidence:

CSP allows style-src 'unsafe-inline' and img-src ... https: http: (Program.cs:138).

Risk:

Wider policy surface than needed, especially for long-term hardening.

Alternative:

Remove inline style dependence (e.g., CSS classes or nonce/hash), and restrict image sources to https and/or a media proxy allowlist.

11) Medium - Maintainability/Extensibility - API contract sync is manual

Evidence:

Frontend endpoints are hard-coded in JS (wwwroot/js/api.js:37, wwwroot/js/api.js:70).
Contract is also maintained manually in markdown (API.md).

Risk:

Contract drift between backend DTOs and frontend consumers over time.

Alternative:

Generate OpenAPI from endpoints and produce typed client code for frontend consumption.

12) Medium - Scalability/Resilience - External image validation is synchronous on write path

Evidence:

Suggestion validation calls network reachability checks (Endpoints/SuggestionValidator.cs:40).
Validation does outbound DNS/connect/HEAD/GET in request path (Endpoints/EndpointHelpers.cs:196, Endpoints/EndpointHelpers.cs:240, Endpoints/EndpointHelpers.cs:270).

Risk:

User write latency is coupled to third-party host responsiveness.

Alternative:

Accept URL quickly, then validate asynchronously (background job + status flag), optionally with trusted media proxying.

Recommended remediation order

Reduce read amplification: event-driven updates + state endpoint optimization.
Harden security baseline: explicit CSRF, CSP tightening, and password-hash migration plan.
Decouple growth hotspots: workflow state machine + role/permission model.
Improve operational scale: replace unbounded in-memory structures and large in-memory link/result operations.
Introduce contract tooling: OpenAPI + generated frontend client to reduce drift.

7.5 KiB Raw Blame History

Review - Pick'n'Play (State-of-the-Art Assessment)

Scope

Executive summary

Findings

1) High - Scalability - Single-node SQLite bottleneck

2) High - Scalability - Polling causes read amplification

3) High - Security - CSRF protection is implicit, not explicit

4) High - Extensibility - Workflow is hard-coded across backend and frontend

5) High - Extensibility - Role model is fixed to booleans

6) Medium - Maintainability - Frontend is string-template heavy with global mutable state

7) Medium - Scalability/Security - In-memory dictionaries are unbounded

8) Medium - Scalability - Linking/results workflows load full sets into memory

9) Medium - Security - Crypto is good but not state-of-the-art

10) Medium - Security - CSP remains permissive for inline style and mixed image origins

11) Medium - Maintainability/Extensibility - API contract sync is manual

12) Medium - Scalability/Resilience - External image validation is synchronous on write path

Recommended remediation order

7.5 KiB

Raw Blame History