04 — WebSocket Protocol

Purpose: Deep dive into WebSocket lifecycle, state management, and edge cases for LLMs working on multiplayer features. Concatenate with 00-project-overview.md and 99-current-task.md.

Last updated: 2026-05-15

Depends on: 00-project-overview.md, 03-api-contract.md


Connection Lifecycle

1. Upgrade

Client: GET wss://api.lilastudy.com/ws
        Headers: Cookie: better-auth.session=...

Server: Validates session via Better Auth (reads cookie, looks up in DB)
        → Valid: 101 Switching Protocols, connection established
        → Invalid: 401 Unauthorized, connection rejected

Auth is mandatory. No anonymous WebSocket connections. Guest play (if implemented) would need a different auth strategy here.
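
The accept/reject decision above can be sketched as a pure function. This is a minimal illustration, not the code in apps/api/src/ws/auth.ts: `readCookie`, `authorizeUpgrade`, `Session`, and the synchronous `SessionLookup` are hypothetical names, and the real lookup goes through Better Auth and the database (and is likely asynchronous). Only the cookie name is taken from the request shown above.

```typescript
// Hypothetical types: stand-ins for the Better Auth session lookup.
type Session = { userId: string };
type SessionLookup = (token: string) => Session | null;

type UpgradeDecision =
  | { status: 101; session: Session } // Switching Protocols
  | { status: 401 };                  // Unauthorized

// Pull a single cookie value out of a raw Cookie header.
function readCookie(header: string | undefined, name: string): string | null {
  if (!header) return null;
  for (const part of header.split(";")) {
    const eq = part.indexOf("=");
    if (eq === -1) continue;
    if (part.slice(0, eq).trim() === name) {
      return part.slice(eq + 1).trim();
    }
  }
  return null;
}

// Decide whether the upgrade may proceed. No cookie or no session means 401.
function authorizeUpgrade(
  cookieHeader: string | undefined,
  lookup: SessionLookup,
): UpgradeDecision {
  const token = readCookie(cookieHeader, "better-auth.session");
  if (!token) return { status: 401 };
  const session = lookup(token);
  return session ? { status: 101, session } : { status: 401 };
}
```

Keeping the decision separate from the socket plumbing makes the "no anonymous connections" rule easy to unit-test without opening a real connection.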

2. Message Routing

After connection, all messages flow through:

Raw JSON message
     ↓
Zod safeParse against WebSocketMessageSchema (discriminated union on `type`)
     ↓
Router switches on `type` → dispatches to handler
     ↓
Handler executes business logic → broadcasts to room

Invalid messages: Parse failures are logged and silently dropped. The client receives no error response — this is intentional to prevent error spam from malformed clients.
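
The parse, route, and drop pipeline can be sketched without dependencies. In this illustration a hand-written type guard stands in for Zod's `safeParse` against WebSocketMessageSchema; the payload fields (`code`, `optionId`) and the function names `parseMessage` and `route` are assumptions, and the real schemas live in packages/shared/src/schemas/.

```typescript
// Assumed message shapes; the real discriminated union is defined with Zod.
type WsMessage =
  | { type: "lobby:join"; code: string }
  | { type: "lobby:leave" }
  | { type: "lobby:start" }
  | { type: "game:answer"; optionId: number };

// Stand-in for safeParse: returns null instead of throwing on bad input.
function parseMessage(raw: string): WsMessage | null {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return null; // not valid JSON
  }
  if (typeof data !== "object" || data === null) return null;
  const msg = data as Record<string, unknown>;
  switch (msg.type) {
    case "lobby:join":
      return typeof msg.code === "string" ? { type: "lobby:join", code: msg.code } : null;
    case "lobby:leave":
      return { type: "lobby:leave" };
    case "lobby:start":
      return { type: "lobby:start" };
    case "game:answer":
      return typeof msg.optionId === "number"
        ? { type: "game:answer", optionId: msg.optionId }
        : null;
    default:
      return null; // unknown discriminator
  }
}

// Route one raw frame. Invalid frames are logged and dropped; nothing is sent back.
function route(raw: string, log: (line: string) => void): string | null {
  const msg = parseMessage(raw);
  if (msg === null) {
    log("dropped invalid message");
    return null;
  }
  switch (msg.type) {
    case "lobby:join":
      return `join ${msg.code}`;
    case "lobby:leave":
      return "leave";
    case "lobby:start":
      return "start";
    case "game:answer":
      return `answer ${msg.optionId}`;
  }
}
```

The returned strings only mark which handler would run; in the real router each case would call into the lobby or game handlers.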

3. Disconnect

When a client disconnects (browser close, network loss, page navigation):

Connection close event
     ↓
Handler removes player from lobby (if in one)
     ↓
Broadcasts updated lobby:state to remaining players
     ↓
If game in progress and player disconnects:
     → Player is marked as "disconnected" (not removed from game state)
     → Their answer slot is treated as "no answer" (timeout)
     → Game continues

No automatic reconnect. The client must manually reconnect and re-join the lobby. Graceful reconnect with state restoration is planned (listed under "next" in BACKLOG.md).
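
A minimal sketch of the close handling described above, assuming a simplified in-memory `Room` shape. `Room`, `PlayerStatus`, and `handleDisconnect` are hypothetical names; in the real system lobby membership lives in PostgreSQL and game state in the in-memory stores.

```typescript
type PlayerStatus = "connected" | "disconnected";

// Hypothetical room shape combining both tiers for illustration only.
interface Room {
  lobbyPlayers: Set<string>;              // lobby membership
  gameInProgress: boolean;
  gamePlayers: Map<string, PlayerStatus>; // game state keeps disconnected players
}

// Apply a close event and return the lobby:state payload to broadcast.
function handleDisconnect(
  room: Room,
  playerId: string,
): { type: "lobby:state"; players: string[] } {
  // The player leaves the lobby listing...
  room.lobbyPlayers.delete(playerId);
  // ...but stays in game state, marked as disconnected, so the round can finish.
  if (room.gameInProgress && room.gamePlayers.has(playerId)) {
    room.gamePlayers.set(playerId, "disconnected");
  }
  return { type: "lobby:state", players: Array.from(room.lobbyPlayers) };
}
```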


State Management

Two-Tier Storage

| State Type | Storage | Durability | Use Case |
|---|---|---|---|
| Lobby membership | PostgreSQL (lobbies, lobby_players) | Durable | Who is in which room, who is host |
| Game state | In-memory (InMemoryLobbyGameStore, InMemoryGameSessionStore) | Ephemeral | Current question, scores, timer, answers received |

Why the split? Lobby membership must survive server restarts (players shouldn't be kicked on deploy). Game state is ephemeral by design — a game lasts minutes, and losing state on restart is acceptable for MVP.

In-Memory Store Structure

// Conceptual; the actual implementation lives in apps/api/src/gameSessionStore/
type PlayerId = string; // assumed alias so the Map keys below are valid TypeScript

interface InMemoryGameState {
  // One entry per lobby, keyed by lobby code
  [lobbyCode: string]: {
    status: "waiting" | "question" | "result" | "finished";
    currentRound: number;
    totalRounds: number;
    currentQuestion: GameQuestion | null; // null when no question is active
    answers: Map<PlayerId, { optionId: number; timestamp: number }>;
    scores: Map<PlayerId, number>;
    timer: NodeJS.Timeout | null; // handle of the 15s server-side round timer
    questionStartTime: number; // For speed-based tiebreaking
  };
}

The 15-Second Timer

Implementation

Host sends lobby:start
     ↓
Server generates questions, stores in game state
     ↓
Broadcast game:question to all players
     ↓
START 15-second timer (NodeJS setTimeout)
     ↓
Player answers collected in Map<playerId, answer>
     ↓
Timer expires OR all players answered
     ↓
STOP timer, evaluate answers, broadcast game:answer_result
     ↓
If more rounds: wait 3s → broadcast next game:question → restart timer
     ↓
If last round: broadcast game:finished
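
The round flow above (start timer, collect answers, close on expiry or when everyone has answered) can be sketched as follows. This is an illustration under assumptions, not the code in multiplayerGameService.ts: `startRound`, `Scheduler`, and `Answer` are hypothetical names, and the timer functions are injected so the logic can be exercised without waiting 15 seconds.

```typescript
type Answer = { optionId: number; timestamp: number };

// Injectable timer functions; production code would wrap setTimeout/clearTimeout.
interface Scheduler {
  set(fn: () => void, ms: number): unknown;
  clear(handle: unknown): void;
}

function startRound(
  players: string[],
  onClose: (answers: Map<string, Answer>) => void,
  scheduler: Scheduler,
  durationMs: number = 15000,
) {
  const expected = new Set(players);
  const answers = new Map<string, Answer>();
  let closed = false;
  let handle: unknown = null;

  // Runs exactly once: on timer expiry or when the last player answers.
  const close = (): void => {
    if (closed) return;
    closed = true;
    scheduler.clear(handle);
    onClose(answers); // evaluate answers, broadcast game:answer_result
  };

  handle = scheduler.set(close, durationMs);

  return {
    // Returns false for late, duplicate, or unknown-player answers.
    submit(playerId: string, answer: Answer): boolean {
      if (closed || !expected.has(playerId) || answers.has(playerId)) return false;
      answers.set(playerId, answer);
      if (answers.size === expected.size) close(); // everyone answered early
      return true;
    },
  };
}
```

The `closed` flag is what makes an answer arriving after expiry a rejection rather than a second evaluation, and it also makes a timer callback that fires after an early close a no-op.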

Timer Edge Cases

| Scenario | Behavior |
|---|---|
| Player answers at 14.9s | Valid, collected before timer expiry |
| Player answers at 15.1s | Rejected, treated as timeout. Timer already fired. |
| All players answer early | Timer is cleared early, round proceeds immediately |
| No one answers | All players get 0 points for that round, next round starts |
| Host disconnects mid-game | Game continues, any player can see results. No "host transfer" logic yet. |
| Non-host sends lobby:start | Silently ignored or rejected (check implementation) |

Message Broadcasting

Room-Based Broadcasting

The server maintains a mapping of lobbyCode → Set<WebSocket connections>. When a message needs to broadcast:

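// Pseudo-code from ws/connections.ts
function broadcastToRoom(code: string, message: WebSocketMessage) {
  const connections = roomConnections.get(code);
  if (!connections) return; // no connections registered for this lobby code
  const payload = JSON.stringify(message); // serialize once, not per recipient
  for (const ws of connections) {
    // Skip sockets that are still connecting, closing, or already closed
    if (ws.readyState === WebSocket.OPEN) {
      ws.send(payload);
    }
  }
}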

Self-broadcast: The sender receives their own broadcast. The frontend must handle this (e.g., ignore their own lobby:state if they already updated optimistically).

Message Ordering

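WebSocket guarantees in-order delivery per connection and per direction. It gives no ordering between different connections, or between a client's outgoing messages and the server's broadcasts, so race conditions can occur:

  • Player A sends game:answer at 14.5s
  • Player B's game:answer is still in flight on a lagging connection when the timer fires
  • Player B receives game:answer_result while still waiting on their own game:answer (the server sends no per-answer ack)
  • Frontend must handle these out-of-order situations gracefully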
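
One way a frontend could cope with this race is to treat game:answer_result as authoritative for the round and discard any answer it still considers pending. The sketch below assumes the result message carries a `round` number, which the real schema may not; `ClientRound`, `markSubmitted`, `onAnswerResult`, and `canSubmit` are hypothetical names.

```typescript
// Hypothetical client-side view of the current round.
interface ClientRound {
  round: number;
  phase: "question" | "result";
  pendingOptionId: number | null; // answer sent, outcome not yet known
}

// Assumed message shape; `round` is not confirmed by the real schema.
type AnswerResult = { type: "game:answer_result"; round: number };

// Record that the local player has sent an answer.
function markSubmitted(state: ClientRound, optionId: number): ClientRound {
  return { ...state, pendingOptionId: optionId };
}

// The server result wins, even if it arrives while an answer is pending.
function onAnswerResult(state: ClientRound, msg: AnswerResult): ClientRound {
  if (msg.round !== state.round) return state; // result for another round: ignore
  return { ...state, phase: "result", pendingOptionId: null };
}

// Only allow answering while the question is open and nothing is pending.
function canSubmit(state: ClientRound): boolean {
  return state.phase === "question" && state.pendingOptionId === null;
}
```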

Edge Cases & Failure Modes

Mid-Game Disconnect

Player disconnects during question phase
     ↓
Connection close handler triggered
     ↓
Player NOT removed from game state (they might reconnect)
     ↓
Timer continues
     ↓
On timer expiry: player has no answer → treated as wrong
     ↓
Result broadcast includes "disconnected" status for that player

Current gap: There is no reconnect with state restoration. The player must re-join the lobby, and their game state is not recovered. A fix is planned (listed under "next" in BACKLOG.md).

Double Join

Player joins lobby ABC
     ↓
Player joins lobby ABC again (accidental double-click, retry)
     ↓
Server: idempotent — player already in lobby, return 200
     ↓
No duplicate entries in lobby_players table
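
The idempotent join can be illustrated with a set per lobby, which plays the role a uniqueness constraint on lobby_players would play in the database. `joinLobby` and the in-memory `members` map are hypothetical; the real membership is stored in PostgreSQL, and this sketch creates the lobby entry on first join purely for brevity.

```typescript
// Joining twice leaves exactly one membership entry.
function joinLobby(
  members: Map<string, Set<string>>,
  code: string,
  playerId: string,
): { joined: boolean; players: string[] } {
  let players = members.get(code);
  if (!players) {
    players = new Set<string>();
    members.set(code, players);
  }
  const joined = !players.has(playerId); // false on a repeat join
  players.add(playerId);                 // Set.add is a no-op for existing members
  return { joined, players: Array.from(players) };
}
```

Both calls succeed from the client's point of view; only the `joined` flag tells the server whether anything actually changed.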

Rapid Start/Stop

Host clicks "Start" twice rapidly
     ↓
First click: game starts, state changes to "in_progress"
     ↓
Second click: server checks state, sees "in_progress", ignores
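
A state check like the one described can be sketched as a small guard. `Lobby`, `LobbyStatus`, and `tryStart` are hypothetical names, and the "waiting" initial status is an assumption; the host check reflects the timer edge case noted earlier, where a non-host lobby:start is ignored or rejected.

```typescript
type LobbyStatus = "waiting" | "in_progress";

interface Lobby {
  status: LobbyStatus;
  hostId: string;
}

// Returns true only for the first valid start request from the host.
function tryStart(lobby: Lobby, playerId: string): boolean {
  if (playerId !== lobby.hostId) return false;  // non-host: ignore
  if (lobby.status !== "waiting") return false; // already started: ignore
  lobby.status = "in_progress";
  return true;
}
```

Because the status flips before any asynchronous work would begin, a second click processed on the same event loop turn or later always sees "in_progress".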

Client-Side Message Loss

If a client's game:answer never reaches the server (network blip):

  • Server never receives the answer
  • Timer expires
  • Player gets 0 points for that round
  • No retry mechanism — client sends once, no ack expected

Planned Improvements (Not Yet Implemented)

From BACKLOG.md next:

  1. Graceful WS reconnect — Exponential back-off, restore game state on reconnection if game still in progress
  2. Heartbeat/ping — Detect stale connections faster than TCP timeout
  3. Valkey for game state — Replace in-memory store with Redis-compatible storage for horizontal scaling and persistence across restarts
  4. Configurable game settings — Host sets round count, timer duration, target score via lobby settings jsonb column
  5. Additional game modes — TV Quiz Show, Race to the Top, Chain Link, Elimination Round, Cooperative Challenge (see design/GAME_MODES.md)
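
For item 1, the delay schedule of an exponential back-off could look like the sketch below. Nothing here is implemented yet; `backoffDelay`, the 500 ms base, and the 30 s cap are illustrative assumptions, and a production version would typically add random jitter.

```typescript
// Delay before reconnect attempt N (0-based): base * 2^N, capped at maxMs.
function backoffDelay(attempt: number, baseMs: number = 500, maxMs: number = 30000): number {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}
```

With these defaults the first attempts wait 500 ms, 1 s, 2 s, and 4 s, and every attempt from the sixth doubling onward waits the 30 s cap.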

Key Files

| File | Purpose |
|---|---|
| apps/api/src/ws/index.ts | WebSocket server setup, attach to HTTP server |
| apps/api/src/ws/auth.ts | Session validation on upgrade |
| apps/api/src/ws/router.ts | Message routing by type |
| apps/api/src/ws/connections.ts | Connection management, room mapping |
| apps/api/src/ws/handlers/lobbyHandlers.ts | lobby:join, lobby:leave, lobby:start |
| apps/api/src/ws/handlers/gameHandlers.ts | game:answer |
| apps/api/src/services/multiplayerGameService.ts | Game logic, timer, scoring |
| apps/api/src/lobbyGameStore/ | In-memory lobby state storage |
| packages/shared/src/schemas/lobby.ts | WS message Zod schemas |
| packages/shared/src/schemas/game.ts | Game state Zod schemas |