
Stage 2 — CF Sync Engine & Validation Matrix #5

Description

@KineticTactic

What this is: The pipeline between a player clicking Sync and the server knowing whether they received an Accepted verdict. Everything downstream depends on this being correct and race-safe.

Owner: Someone comfortable with async queues, HTTP API integration, and writing tests for edge cases.

Tasks

  1. POST /api/contests/sync endpoint

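    • Atomically claim ratelimit:sync:<userId> in Redis with a 60s TTL, using a single SET with NX and EX rather than a separate check followed by a set, so two concurrent requests cannot both pass. If the key already exists: return HTTP 429.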
    • Enqueue a cf_sync job on cf_sync_queue: { roomId, userId, teamId, cfHandle, problemId }.
    • Set sync:<roomId>:<userId> Hash: { status: 'queued', position: N, createdAt: now }.
    • Publish sync.queued to events:user:<userId>: { position: N }.
    • Return HTTP 202: { queued: true }.
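A minimal sketch of the rate-limit gate in task 1, assuming the check and the set are collapsed into one atomic step. `CooldownStore`, `handleSync`, the in-memory array standing in for cf_sync_queue, and the 429 response body are all hypothetical names for illustration; a real handler would be async and talk to Redis and BullMQ.

```typescript
// Job payload fields follow the task list above; the string types are an assumption.
interface SyncJob {
  roomId: string;
  userId: string;
  teamId: string;
  cfHandle: string;
  problemId: string;
}

type SyncResponse =
  | { status: 202; body: { queued: true } }
  | { status: 429; body: { error: string } }; // 429 body shape is an assumption

const COOLDOWN_MS = 60000;

// In-memory stand-in for Redis. With a real client the equivalent is a single
// `SET ratelimit:sync:<userId> 1 EX 60 NX`, which checks and sets atomically.
class CooldownStore {
  private expiresAt = new Map<string, number>();

  // Returns true if the key was absent (or expired) and has now been set.
  setIfAbsent(key: string, ttlMs: number, nowMs: number): boolean {
    const current = this.expiresAt.get(key);
    if (current !== undefined && current > nowMs) return false;
    this.expiresAt.set(key, nowMs + ttlMs);
    return true;
  }
}

function handleSync(
  store: CooldownStore,
  queue: SyncJob[],
  job: SyncJob,
  nowMs: number,
): SyncResponse {
  const key = `ratelimit:sync:${job.userId}`;
  if (!store.setIfAbsent(key, COOLDOWN_MS, nowMs)) {
    // Cooldown still active: reject without enqueueing anything.
    return { status: 429, body: { error: "sync cooldown active" } };
  }
  queue.push(job); // stands in for enqueueing a cf_sync job
  return { status: 202, body: { queued: true } };
}
```

Because the claim either succeeds or fails in one step, a second request inside the 60s window can never reach the enqueue line, which is the property the double-sync test in the Testing Gate relies on.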
  2. cf_sync_queue worker (full implementation)

    • Dequeue job.
    • Call cf.user.status for the handle (last 20 submissions).
    • Run the Validation Matrix (all 5 checks — verdict, lower timestamp, upper timestamp, problem ID, handle match).
    • On AC: emit the internal sync.detected event (consumed by the Room engine in Stage 3): { roomId, userId, teamId, problemId, cfSubmissionId, cfTimestamp, verdict: 'OK', pointsAwarded: null }. Also publish sync.detected to events:user:<userId>.
    • On non-AC or validation failure: publish sync.failed to events:user:<userId>: { verdict: 'WA' | 'invalid' }. No Redis state is touched.
    • Circuit breaker: if CF returns HTTP 429, pause the queue for 30s.
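The five Validation Matrix checks could be sketched as a pure function over one submission. The `CfSubmission` fields below are the subset of the Codeforces Submission object the checks need; `ValidationContext`, the `contestId + index` problem ID format, the source of the two timestamp bounds, and the case-insensitive handle comparison are assumptions, since the issue does not define them.

```typescript
// Subset of the Codeforces Submission object used by the checks.
interface CfSubmission {
  id: number;
  creationTimeSeconds: number;
  verdict?: string; // absent while the submission is still being judged
  problem: { contestId?: number; index: string };
  author: { members: { handle: string }[] };
}

// Hypothetical context; field names are illustrative only.
interface ValidationContext {
  cfHandle: string;
  problemId: string; // assumed format: contestId + index, e.g. "1850A"
  windowStartSec: number; // lower timestamp bound (assumed: reveal time)
  windowEndSec: number; // upper timestamp bound (assumed: room end time)
}

type CheckName = "verdict" | "lowerTimestamp" | "upperTimestamp" | "problemId" | "handle";

type ValidationResult = { ok: true } | { ok: false; failed: CheckName };

// Runs the checks in order and reports the first one that fails.
function validateSubmission(sub: CfSubmission, ctx: ValidationContext): ValidationResult {
  if (sub.verdict !== "OK") return { ok: false, failed: "verdict" };
  if (sub.creationTimeSeconds < ctx.windowStartSec) return { ok: false, failed: "lowerTimestamp" };
  if (sub.creationTimeSeconds > ctx.windowEndSec) return { ok: false, failed: "upperTimestamp" };

  const subProblemId = `${sub.problem.contestId ?? ""}${sub.problem.index}`;
  if (subProblemId !== ctx.problemId) return { ok: false, failed: "problemId" };

  const wanted = ctx.cfHandle.toLowerCase();
  const handleMatches = sub.author.members.some((m) => m.handle.toLowerCase() === wanted);
  if (!handleMatches) return { ok: false, failed: "handle" };

  return { ok: true };
}

// Returns the first submission that passes all five checks, if any.
function findAccepted(subs: CfSubmission[], ctx: ValidationContext): CfSubmission | undefined {
  return subs.find((s) => validateSubmission(s, ctx).ok);
}
```

Keeping the matrix free of I/O means each Testing Gate case (WA, pre-reveal, wrong problem) can be covered by a unit test with a hand-built submission, without calling CF.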
  3. BullMQ retry behaviour

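    • Verify exponential backoff on failure: 3 retries after the initial attempt, with delays of 5s → 10s → 20s. BullMQ's attempts option counts the initial try, so this corresponds to attempts: 4 with backoff: { type: 'exponential', delay: 5000 }.
    • On permanent failure (all 3 retries exhausted): publish sync.failed to events:user:<userId>: { reason: 'cf_unavailable' }.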
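The retry schedule can be checked with a small calculation. This sketch assumes BullMQ's built-in exponential strategy, which doubles a base delay on each failed attempt, and assumes three retries are intended; since BullMQ's `attempts` option includes the first try, the three delays 5s, 10s, 20s then need `attempts: 4`. `exponentialDelayMs`, `retrySchedule`, and `cfSyncJobOptions` are illustrative names.

```typescript
// Assumed to mirror BullMQ's built-in "exponential" backoff:
// baseDelay * 2^(attemptsMade - 1), where attemptsMade counts failures so far.
function exponentialDelayMs(attemptsMade: number, baseDelayMs: number): number {
  return Math.round(Math.pow(2, attemptsMade - 1) * baseDelayMs);
}

// Job options in the shape BullMQ accepts when adding a job.
const cfSyncJobOptions = {
  attempts: 4, // 1 initial try + 3 retries
  backoff: { type: "exponential", delay: 5000 },
};

// Delays waited before each retry, given the total number of attempts.
function retrySchedule(attempts: number, baseDelayMs: number): number[] {
  const delays: number[] = [];
  for (let failed = 1; failed < attempts; failed++) {
    delays.push(exponentialDelayMs(failed, baseDelayMs));
  }
  return delays;
}

const schedule = retrySchedule(cfSyncJobOptions.attempts, cfSyncJobOptions.backoff.delay);
// schedule is [5000, 10000, 20000]; with attempts: 3 it would be [5000, 10000].
```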
  4. CF API adapter layer

    • Thin wrapper in lib/cf-api.ts. All CF API calls go through this file — no ad-hoc CF fetches elsewhere.
    • Parse errors trigger an alert log.
    • Reuses the existing fetch pattern from src/lib/potd/submit.ts.
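One possible shape for lib/cf-api.ts, as a sketch only. It uses the public Codeforces `user.status` method with its `handle`, `from`, and `count` parameters and the `{ status, result }` / `{ status: "FAILED", comment }` response envelope. The error class names and the split into `buildUserStatusUrl`, `parseUserStatus`, and `fetchUserStatus` are assumptions, and the fetch pattern in src/lib/potd/submit.ts is not shown in this issue, so plain `fetch` stands in for it.

```typescript
const CF_API_BASE = "https://codeforces.com/api";

// Subset of the Codeforces Submission object that the worker needs.
interface CfSubmission {
  id: number;
  creationTimeSeconds: number;
  verdict?: string;
  problem: { contestId?: number; index: string };
  author: { members: { handle: string }[] };
}

// Hypothetical error types: CF reported a failure vs. the payload was malformed.
class CfApiError extends Error {
  name = "CfApiError";
}
class CfParseError extends Error {
  name = "CfParseError";
}

function buildUserStatusUrl(handle: string, count = 20): string {
  const params = new URLSearchParams({ handle, from: "1", count: String(count) });
  return `${CF_API_BASE}/user.status?${params.toString()}`;
}

// Validates the response envelope and the fields the Validation Matrix reads.
function parseUserStatus(payload: unknown): CfSubmission[] {
  if (typeof payload !== "object" || payload === null) {
    throw new CfParseError("payload is not an object");
  }
  const body = payload as { status?: unknown; comment?: unknown; result?: unknown };
  if (body.status === "FAILED") {
    throw new CfApiError(String(body.comment ?? "unknown CF error"));
  }
  if (body.status !== "OK" || !Array.isArray(body.result)) {
    throw new CfParseError("unexpected response envelope");
  }
  return body.result.map((raw, i) => {
    const s = raw as Partial<CfSubmission>;
    if (typeof s.id !== "number" || typeof s.creationTimeSeconds !== "number") {
      throw new CfParseError(`submission ${i} is missing id or creationTimeSeconds`);
    }
    return s as CfSubmission;
  });
}

// The only place that performs a network call to CF in this sketch.
async function fetchUserStatus(handle: string, count = 20): Promise<CfSubmission[]> {
  const res = await fetch(buildUserStatusUrl(handle, count));
  if (!res.ok) throw new CfApiError(`CF responded with HTTP ${res.status}`);
  return parseUserStatus(await res.json());
}
```

Separating parsing from fetching lets the alert log hook onto `CfParseError` specifically, and lets the parser be tested against recorded payloads without network access.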
  5. Cooldown UI contract

    • The 60s cooldown is server-enforced via ratelimit:sync:<userId>. The client may mirror it in the UI, but the server is always authoritative. Document this contract clearly in STAGE1_DONE.md.

Testing Gate

  • Real CF handle with an AC submission from the last hour: POST /api/contests/sync returns 202, a sync.queued SSE event arrives on the user stream, and within ~2s sync.detected arrives with the correct data.
  • WA submission: sync.failed received with { verdict: 'WA' }. No Redis state updated.
  • Pre-reveal test: set revealedAt to 1 hour in the future. An existing (older) AC submission must be rejected by the Validation Matrix.
  • Wrong problem: AC exists for a different CF problem → problem ID check rejects it.
  • Double-sync within 60s: second call returns HTTP 429. No new job enqueued.
  • CF downtime (mock HTTP 500): worker retries 3 times with backoff, then publishes sync.failed to user stream.

Handoff Contract

  • All testing gate items pass with real CF credentials.
  • cf_sync_queue worker is running as a BullMQ worker process.
  • The sync.detected internal event shape is documented: { roomId, userId, teamId, problemId, cfSubmissionId, cfTimestamp, verdict: 'OK', pointsAwarded: null }. pointsAwarded is null here; Stage 3 fills it.
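One way to document the sync.detected shape is as a TypeScript type with a runtime guard that Stage 3 can reuse. The field names come from the contract above; the field types (string IDs, numeric `cfSubmissionId`, `cfTimestamp` as Unix seconds) and the name `isSyncDetectedEvent` are assumptions.

```typescript
interface SyncDetectedEvent {
  roomId: string;
  userId: string;
  teamId: string;
  problemId: string;
  cfSubmissionId: number; // assumed numeric, like the CF submission id
  cfTimestamp: number; // assumed: Unix seconds
  verdict: "OK";
  pointsAwarded: number | null; // null in Stage 2; Stage 3 fills it
}

// Runtime check for consumers that receive the event as untyped JSON.
function isSyncDetectedEvent(value: unknown): value is SyncDetectedEvent {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v.roomId === "string" &&
    typeof v.userId === "string" &&
    typeof v.teamId === "string" &&
    typeof v.problemId === "string" &&
    typeof v.cfSubmissionId === "number" &&
    typeof v.cfTimestamp === "number" &&
    v.verdict === "OK" &&
    (v.pointsAwarded === null || typeof v.pointsAwarded === "number")
  );
}
```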
