diff --git a/documentation/decisions.md b/documentation/decisions.md index 4b705c0..cdf41b3 100644 --- a/documentation/decisions.md +++ b/documentation/decisions.md @@ -359,3 +359,71 @@ All deferred post-MVP, purely additive (new tables referencing existing `terms`) - `noun_forms` — gender, singular, plural, articles per language (source: Wiktionary) - `verb_forms` — conjugation tables per language (source: Wiktionary) - `term_pronunciations` — IPA and audio URLs per language (source: Wiktionary / Forvo) + +--- + +## Deployment + +### Reverse proxy: Caddy (not Nginx, not Traefik) + +Caddy provides automatic HTTPS via Let's Encrypt with zero configuration beyond specifying domain names. The entire Caddyfile is ~10 lines. Nginx would require manual certbot setup and more verbose config. Traefik's auto-discovery of Docker containers (via labels) is powerful but overkill for a stable three-service stack where routing rules never change. Caddy runs as a Docker container alongside the app — no native install. + +### Subdomain routing (not path-based) + +`lilastudy.com` serves the frontend, `api.lilastudy.com` serves the API, `git.lilastudy.com` serves Forgejo. Cleaner separation than path-based routing — any service can be moved to a different server just by changing DNS. Requires CORS configuration since the browser sees different origins, and cross-subdomain cookies via `COOKIE_DOMAIN=.lilastudy.com`. Wildcard DNS (`*.lilastudy.com`) means new subdomains require no DNS changes. + +### Frontend served by nginx:alpine (not Node, not Caddy) + +Vite builds to static files. Serving them with nginx inside the container is lighter than running a Node process and keeps the container at ~7MB. Caddy could serve them directly, but using a separate container maintains the one-service-per-container principle and keeps Caddy's config purely about routing. 
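The subdomain-routing decision above means the browser treats `https://lilastudy.com` and `https://api.lilastudy.com` as different origins, so the API has to answer with explicit CORS headers. The following is a minimal sketch of that check, not the project's actual middleware: `corsHeadersFor` and the allowlist are hypothetical names introduced for illustration. It assumes credentialed requests (cookies), which is why it echoes the specific origin instead of `*`.

```typescript
// Hypothetical helper, not taken from the project: decides which CORS
// response headers to send for a given request Origin header.
type CorsHeaders = Record<string, string>;

function corsHeadersFor(
  requestOrigin: string | undefined,
  allowedOrigins: readonly string[],
): CorsHeaders {
  // Requests without an Origin header (e.g. curl) get no CORS headers.
  if (requestOrigin === undefined) return {};

  // An origin is scheme + host + port, so an exact string match is enough.
  if (allowedOrigins.indexOf(requestOrigin) === -1) return {};

  return {
    // With credentials, the wildcard "*" is not permitted; echo the origin.
    "Access-Control-Allow-Origin": requestOrigin,
    "Access-Control-Allow-Credentials": "true",
    // Tell caches that the response varies with the Origin header.
    "Vary": "Origin",
  };
}

// Assumed allowlist: the frontend origin named in the text above.
const allowedOrigins = ["https://lilastudy.com"];

console.log(corsHeadersFor("https://lilastudy.com", allowedOrigins));
console.log(corsHeadersFor("https://evil.example", allowedOrigins));
```

In an Express app this logic would normally be delegated to a CORS middleware configured with the same origin and credentials enabled; the sketch only shows what that configuration has to achieve.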
+ +### SPA fallback via nginx `try_files` + +Without `try_files $uri $uri/ /index.html`, refreshing on `/play` returns 404 because there's no actual `play` file. With the directive in place, nginx falls back to `index.html` for any path that doesn't match a file and lets TanStack Router handle client-side routing. + +### Forgejo as git server + container registry (not GitHub, not Docker Hub) + +Keeps everything self-hosted on one VPS. Forgejo's built-in package registry doubles as a container registry, eliminating a separate service. Git push and image push go to the same server. + +### Forgejo SSH on port 2222 (not 22) + +Port 22 is the VPS's own SSH. Mapping Forgejo's SSH to 2222 avoids conflicts. Dev laptop `~/.ssh/config` maps `git.lilastudy.com` to port 2222 so git commands work without specifying the port every time. + +### `packages/db` and `packages/shared` exports: compiled JS paths + +Exports in both package.json files point to `./dist/src/index.js`, not TypeScript source. In dev, `tsx` can run TypeScript, but in production Node cannot. This means packages must be built before the API starts in dev, which is acceptable since these packages change infrequently. Alternative approaches (conditional exports, tsconfig paths) were considered but added complexity for no practical benefit. + +### Environment-driven config for production vs dev + +CORS origin, Better Auth base URL, cookie domain, API URL, and OAuth credentials are all read from environment variables with localhost fallbacks. The same code runs in both environments without changes. `VITE_API_URL` is the exception: it's baked in at build time via Docker build arg because Vite replaces `import.meta.env` at compile time, not runtime. + +### Cross-subdomain cookies + +Better Auth's `defaultCookieAttributes` sets `domain: .lilastudy.com` in production (from env var `COOKIE_DOMAIN`). Without this, the auth cookie scoped to `api.lilastudy.com` wouldn't be sent on requests from `lilastudy.com`. Setting the `Domain` attribute makes the cookie valid for `lilastudy.com` and all of its subdomains; under RFC 6265 the leading dot itself is ignored by browsers. 
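The two decisions above (environment-driven config and cross-subdomain cookies) can be sketched together. This is an illustration under stated assumptions, not the project's code: only `COOKIE_DOMAIN` is named in the text, while `CORS_ORIGIN`, `AUTH_BASE_URL`, the localhost ports used as fallbacks, and both function names are hypothetical. `cookieDomainMatches` is a simplified form of the domain-match rule from RFC 6265 (it ignores IP-address hosts).

```typescript
// Hypothetical config shape; only COOKIE_DOMAIN is named in the document.
interface AppConfig {
  corsOrigin: string;
  authBaseUrl: string;
  // undefined means a host-only cookie, which is what local dev wants.
  cookieDomain: string | undefined;
}

type Env = Record<string, string | undefined>;

// Reads config from an env-like object, falling back to localhost values.
// CORS_ORIGIN, AUTH_BASE_URL and the ports are assumptions for illustration.
function loadConfig(env: Env): AppConfig {
  return {
    corsOrigin: env["CORS_ORIGIN"] ?? "http://localhost:5173",
    authBaseUrl: env["AUTH_BASE_URL"] ?? "http://localhost:3000",
    cookieDomain: env["COOKIE_DOMAIN"],
  };
}

// Simplified RFC 6265 domain-match: a cookie with Domain=lilastudy.com
// (a leading dot is stripped) applies to that host and every subdomain.
function cookieDomainMatches(requestHost: string, cookieDomain: string): boolean {
  const domain = cookieDomain.replace(/^\./, "").toLowerCase();
  const host = requestHost.toLowerCase();
  if (domain === "") return false;
  return host === domain || host.endsWith("." + domain);
}

const prod = loadConfig({
  CORS_ORIGIN: "https://lilastudy.com",
  AUTH_BASE_URL: "https://api.lilastudy.com",
  COOKIE_DOMAIN: ".lilastudy.com",
});

console.log(cookieDomainMatches("api.lilastudy.com", prod.cookieDomain ?? ""));
console.log(cookieDomainMatches("lilastudy.com", prod.cookieDomain ?? ""));
```

Passing the environment in as a parameter, rather than reading a global, keeps the loader easy to test with both a production-like and an empty environment.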
+ +--- + +## CI/CD + +### Forgejo Actions with SSH deploy (not webhooks, not manual) + +CI builds images natively on the ARM64 VPS (no QEMU cross-compilation). The runner uses the host's Docker socket to build. After pushing to the registry, the workflow SSHs into the VPS to pull and restart containers. Webhooks were considered but add an extra listener service to maintain and secure. Manual deploy was the initial approach but doesn't scale with frequent pushes. + +### Dedicated CI SSH key + +A separate `ci-runner` SSH key pair (not the developer's personal key) is used for CI deploys. The private key is stored in Forgejo's secrets. If compromised, only this key needs to be revoked — the developer's access is unaffected. + +### Runner config: `docker_host: "automount"` + `valid_volumes` + explicit config path + +The Forgejo runner's `automount` setting mounts the host Docker socket into job containers. `valid_volumes` must include `/var/run/docker.sock` or the mount is blocked. The runner command must explicitly reference the config file (`-c /data/config.yml`) — without this flag, config changes are silently ignored. `--group-add 989` in container options adds the host's docker group so job containers can access the socket. + +### Docker CLI installed per job (not baked into runner image) + +The job container (`node:24-bookworm`) doesn't include Docker CLI. It's installed via `apt-get install docker.io` as the first workflow step. This adds ~20 seconds per run but avoids maintaining a custom runner image. The CLI sends commands through the mounted socket to the host's Docker engine. + +--- + +## Backups + +### pg_dump cron + dev laptop sync (not WAL archiving, not managed service) + +Daily compressed SQL dumps with 7-day retention. Dev laptop auto-syncs new backups on login via rsync. Simple, portable, sufficient for current scale. WAL archiving gives point-in-time recovery but is complex to set up. 
Offsite storage (Hetzner Object Storage) is the planned next step, since backups on the same VPS don't protect against VPS failure. diff --git a/documentation/notes.md b/documentation/notes.md index 48673c8..c750683 100644 --- a/documentation/notes.md +++ b/documentation/notes.md @@ -11,6 +11,10 @@ verify if hetzner domain needs to be pushed, there's a change on hetzner and some domains need to be migrated +### redirect or page not found + +subdomains or pages that don't exist should have page not found or should redirect + ### docker credential helper WARNING! Your credentials are stored unencrypted in '/home/languagedev/.docker/config.json'. @@ -24,15 +28,6 @@ laptop: verify if docker containers run on startup (they shouldn't) ### vps setup - monitoring and logging (e.g. via chkrootkit or rkhunter, logwatch/monit => mails daily with summary) -- keep the vps clean (e.g. old docker images/containers) - -### cd/ci pipeline - -forgejo actions? smth else? where docker registry, also forgejo? - -### postgres backups - -how? ### try now option diff --git a/documentation/roadmap.md b/documentation/roadmap.md index 825cbc9..7c4b4ed 100644 --- a/documentation/roadmap.md +++ b/documentation/roadmap.md @@ -91,7 +91,7 @@ Each phase produces a working increment. Nothing is built speculatively. --- -## Phase 3 — Auth +## Phase 3 — Auth ✅ **Goal:** Users can log in via Google or GitHub and stay logged in. **Done when:** Better Auth session is validated on protected routes; unauthenticated users are redirected to login; user row is created on first social login. @@ -109,6 +109,68 @@ Each phase produces a working increment. Nothing is built speculatively. --- +## Phase 6 — Production Deployment ✅ + +**Goal:** App is live on Hetzner, accessible via HTTPS on all subdomains. +**Done when:** `https://lilastudy.com` loads; `https://api.lilastudy.com` responds; auth flow works end-to-end; CI/CD deploys on push to main. 
+ +_Note: Deployment was moved ahead of multiplayer — the app is useful without multiplayer but not without deployment._ + +### Infrastructure + +- [x] Hetzner VPS provisioned (Debian 13, ARM64, 4GB RAM) +- [x] SSH hardening, ufw firewall, fail2ban +- [x] Docker + Docker Compose installed +- [x] Domain DNS: A record + wildcard `*.lilastudy.com` pointing to VPS + +### Reverse proxy + +- [x] Caddy container with automatic HTTPS (Let's Encrypt) +- [x] Subdomain routing: `lilastudy.com` → web, `api.lilastudy.com` → API, `git.lilastudy.com` → Forgejo + +### Docker stack + +- [x] Production `docker-compose.yml` with all services on shared network +- [x] No ports exposed on internal services — only Caddy (80/443) and Forgejo SSH (2222) +- [x] Production Dockerfile stages for API (runner) and frontend (nginx:alpine) +- [x] Monorepo package exports fixed for production (dist/src paths) +- [x] Production `.env` with env-driven CORS, auth URLs, cookie domain + +### Git server + container registry + +- [x] Forgejo running with built-in container registry +- [x] SSH on port 2222, dev laptop `~/.ssh/config` configured +- [x] Repository created, code pushed + +### CI/CD + +- [x] Forgejo Actions enabled +- [x] Forgejo Runner container on VPS with Docker socket access +- [x] `.forgejo/workflows/deploy.yml` — build, push, deploy via SSH on push to main +- [x] Registry and SSH secrets configured in Forgejo + +### Database + +- [x] Initial seed via pg_dump from dev laptop +- [x] Seeding script is idempotent (onConflictDoNothing) for future data additions +- [x] Schema migrations via Drizzle (migrate first, deploy second) + +### OAuth + +- [x] Google and GitHub OAuth redirect URIs configured for production +- [x] Cross-subdomain cookies via COOKIE_DOMAIN=.lilastudy.com + +### Backups + +- [x] Daily cron job (3 AM) with pg_dump, 7-day retention +- [x] Dev laptop auto-syncs backups on login via rsync + +### Documentation + +- [x] `deployment.md` covering full infrastructure setup + +--- 
+ ## Phase 4 — Multiplayer Lobby **Goal:** Players can create and join rooms; the host sees all joined players in real time. @@ -148,32 +210,21 @@ Each phase produces a working increment. Nothing is built speculatively. --- -## Phase 6 — Production Deployment - -**Goal:** App is live on Hetzner, accessible via HTTPS on all subdomains. -**Done when:** `https://app.yourdomain.com` loads; `wss://api.yourdomain.com` connects; auth flow works end-to-end. - -- [ ] `docker-compose.prod.yml`: all services + `nginx-proxy` + `acme-companion` -- [ ] Nginx config per container: `VIRTUAL_HOST` + `LETSENCRYPT_HOST` -- [ ] Production `.env` files on VPS -- [ ] Drizzle migration runs on `api` container start -- [ ] Seed production DB -- [ ] Smoke test: login → solo game → multiplayer game end-to-end - ---- - ## Phase 7 — Polish & Hardening **Goal:** Production-ready for real users. +- [x] CI/CD pipeline (Forgejo Actions → SSH deploy) +- [x] Database backups (cron → dev laptop sync) - [ ] Rate limiting on API endpoints - [ ] Graceful WS reconnect with exponential back-off - [ ] React error boundaries - [ ] `GET /users/me/stats` endpoint + profile page - [ ] Accessibility pass (keyboard nav, ARIA on quiz buttons) - [ ] Favicon, page titles, Open Graph meta -- [ ] CI/CD pipeline (GitHub Actions → SSH deploy) -- [ ] Database backups (cron → Hetzner Object Storage) +- [ ] Offsite backup storage (Hetzner Object Storage) +- [ ] Monitoring/logging (uptime, centralized logs) +- [ ] Valkey for game session store (replace in-memory) --- @@ -183,9 +234,9 @@ Each phase produces a working increment. Nothing is built speculatively. 
Phase 0 (Foundation) ✅ └── Phase 1 (Vocabulary Data + API) ✅ └── Phase 2 (Singleplayer UI) ✅ - └── Phase 3 (Auth) - ├── Phase 4 (Multiplayer Lobby) - │ └── Phase 5 (Multiplayer Game) - │ └── Phase 6 (Deployment) - └── Phase 7 (Hardening) + ├── Phase 3 (Auth) ✅ + │ └── Phase 6 (Deployment + CI/CD) ✅ + └── Phase 4 (Multiplayer Lobby) + └── Phase 5 (Multiplayer Game) + └── Phase 7 (Hardening) ``` diff --git a/documentation/spec.md b/documentation/spec.md index 8539dac..4bf2835 100644 --- a/documentation/spec.md +++ b/documentation/spec.md @@ -63,9 +63,9 @@ These are not deleted from the plan — they are deferred. The architecture is a ## 4. Technology Stack -The monorepo structure and tooling are already set up. This is the full stack — the MVP uses a subset of it. +The monorepo structure and tooling are already set up. This is the full stack. -| Layer | Technology | MVP? | +| Layer | Technology | Status | | ------------ | ------------------------------ | ----------- | | Monorepo | pnpm workspaces | ✅ | | Frontend | React 18, Vite, TypeScript | ✅ | @@ -77,10 +77,11 @@ The monorepo structure and tooling are already set up. This is the full stack | Database | PostgreSQL + Drizzle ORM | ✅ | | Validation | Zod (shared schemas) | ✅ | | Testing | Vitest, supertest | ✅ | -| Auth | Better Auth (Google + GitHub) | ❌ post-MVP | +| Auth | Better Auth (Google + GitHub) | ✅ | +| Deployment | Docker Compose, Caddy, Hetzner | ✅ | +| CI/CD | Forgejo Actions | ✅ | | Realtime | WebSockets (`ws` library) | ❌ post-MVP | | Cache | Valkey | ❌ post-MVP | -| Deployment | Docker Compose, Hetzner, Nginx | ❌ post-MVP | --- @@ -88,14 +89,20 @@ The monorepo structure and tooling are already set up. 
This is the full stack ```text vocab-trainer/ +├── .forgejo/ +│ └── workflows/ +│ └── deploy.yml — CI/CD pipeline (build, push, deploy) ├── apps/ │ ├── api/ │ │ └── src/ -│ │ ├── app.ts — createApp() factory, express.json(), error middleware +│ │ ├── app.ts — createApp() factory, CORS, auth handler, error middleware │ │ ├── server.ts — starts server on PORT │ │ ├── errors/ │ │ │ └── AppError.ts — AppError, ValidationError, NotFoundError +│ │ ├── lib/ +│ │ │ └── auth.ts — Better Auth config (Google + GitHub providers) │ │ ├── middleware/ +│ │ │ ├── authMiddleware.ts — session validation for protected routes │ │ │ └── errorHandler.ts — central error middleware │ │ ├── routes/ │ │ │ ├── apiRouter.ts — mounts /health and /game routers @@ -111,10 +118,17 @@ vocab-trainer/ │ │ ├── InMemoryGameSessionStore.ts │ │ └── index.ts │ └── web/ +│ ├── Dockerfile — multi-stage: dev + production (nginx:alpine) +│ ├── nginx.conf — SPA fallback routing │ └── src/ │ ├── routes/ │ │ ├── index.tsx — landing page -│ │ └── play.tsx — the quiz +│ │ ├── play.tsx — the quiz +│ │ ├── login.tsx — Google + GitHub login buttons +│ │ ├── about.tsx +│ │ └── __root.tsx +│ ├── lib/ +│ │ └── auth-client.ts — Better Auth React client │ ├── components/ │ │ └── game/ │ │ ├── GameSetup.tsx — settings UI @@ -131,7 +145,7 @@ vocab-trainer/ │ └── db/ │ ├── drizzle/ — migration SQL files │ └── src/ -│ ├── db/schema.ts — Drizzle schema +│ ├── db/schema.ts — Drizzle schema (terms, translations, auth tables) │ ├── models/termModel.ts — getGameTerms(), getDistractors() │ ├── seeding-datafiles.ts — seeds terms + translations from JSON │ ├── seeding-cefr-levels.ts — enriches translations with CEFR data @@ -139,7 +153,9 @@ vocab-trainer/ │ └── index.ts ├── scripts/ — Python extraction/comparison/merge scripts ├── documentation/ — project docs -├── docker-compose.yml +├── docker-compose.yml — local dev stack +├── docker-compose.prod.yml — production config reference +├── Caddyfile — reverse proxy routing └── 
pnpm-workspace.yaml ``` @@ -178,13 +194,28 @@ HTTP Request **Key principle:** all database code lives in `packages/db`. `apps/api` never imports `drizzle-orm` for queries — it only calls functions exported from `packages/db`. +### Production Infrastructure + +```text +Internet → Caddy (HTTPS termination) + ├── lilastudy.com → web container (nginx, static files) + ├── api.lilastudy.com → api container (Express, port 3000) + └── git.lilastudy.com → forgejo container (git + registry, port 3000) + +SSH (port 2222) → forgejo container (git push/pull) +``` + +All containers communicate over an internal Docker network. Only Caddy (80/443) and Forgejo SSH (2222) are exposed to the internet. + --- ## 7. Data Model (Current State) Words are modelled as language-neutral concepts (terms) separate from learning curricula (decks). Adding a new language pair requires no schema changes — only new rows in `translations`, `decks`. -**Core tables:** `terms`, `translations`, `term_glosses`, `decks`, `deck_terms`, `categories`, `term_categories` +**Core tables:** `terms`, `translations`, `term_glosses`, `decks`, `deck_terms`, `topics`, `term_topics` + +**Auth tables (managed by Better Auth):** `user`, `session`, `account`, `verification` Key columns on `terms`: `id` (uuid), `pos` (CHECK-constrained), `source`, `source_id` (unique pair for idempotent imports) @@ -201,9 +232,10 @@ Full schema is in `packages/db/src/db/schema.ts`. 
### Endpoints ```text -POST /api/v1/game/start GameRequest → GameSession -POST /api/v1/game/answer AnswerSubmission → AnswerResult -GET /api/v1/health Health check +POST /api/v1/game/start GameRequest → GameSession (requires auth) +POST /api/v1/game/answer AnswerSubmission → AnswerResult (requires auth) +GET /api/v1/health Health check (public) +ALL /api/auth/* Better Auth handlers (public) ``` ### Schemas (packages/shared) @@ -235,7 +267,7 @@ Typed error classes (`AppError` base, `ValidationError` 400, `NotFoundError` 404 - **Session length**: 3 or 10 questions (configurable) - **Scoring**: +1 per correct answer (no speed bonus for MVP) - **Timer**: none in singleplayer MVP -- **No auth required**: anonymous users +- **Auth required**: users must log in via Google or GitHub - **Submit-before-send**: user selects, then confirms (prevents misclicks) --- @@ -258,14 +290,15 @@ After completing a task: share the code, ask what to refactor and why. The LLM s ## 11. Post-MVP Ladder -| Phase | What it adds | -| ----------------- | -------------------------------------------------------------- | ----------------------------------------------------------------------- | -| Auth | Auth | Better Auth (Google + GitHub), embedded in Express API, user rows in DB | -| User Stats | Games played, score history, profile page | -| Multiplayer Lobby | Room creation, join by code, WebSocket connection | -| Multiplayer Game | Simultaneous answers, server timer, live scores, winner screen | -| Deployment | Docker Compose prod config, Nginx, Let's Encrypt, Hetzner VPS | -| Hardening | Rate limiting, error boundaries, CI/CD, DB backups | +| Phase | What it adds | Status | +| ----------------- | ------------------------------------------------------------------------------- | ------ | +| Auth | Better Auth (Google + GitHub), embedded in Express API, user rows in DB | ✅ | +| Deployment | Docker Compose, Caddy, Forgejo, CI/CD, Hetzner VPS | ✅ | +| Hardening (partial) | CI/CD pipeline, DB 
backups | ✅ | +| User Stats | Games played, score history, profile page | ❌ | +| Multiplayer Lobby | Room creation, join by code, WebSocket connection | ❌ | +| Multiplayer Game | Simultaneous answers, server timer, live scores, winner screen | ❌ | +| Hardening (rest) | Rate limiting, error boundaries, monitoring, accessibility | ❌ | ### Future Data Model Extensions (deferred, additive) @@ -285,11 +318,16 @@ All are new tables referencing existing `terms` rows via FK. No existing schema - Game mechanic: simultaneous answers, 15-second server timer, all players see same question - Valkey for ephemeral room state, PostgreSQL for durable records -### Infrastructure (deferred) +### Infrastructure (current) -- `app.yourdomain.com` → React frontend -- `api.yourdomain.com` → Express API + WebSocket + Better Auth -- Docker Compose with `nginx-proxy` + `acme-companion` for automatic SSL +- `lilastudy.com` → React frontend (nginx serving static files) +- `api.lilastudy.com` → Express API + Better Auth +- `git.lilastudy.com` → Forgejo (git server + container registry) +- Docker Compose with Caddy for automatic HTTPS via Let's Encrypt +- CI/CD via Forgejo Actions (build on push to main, deploy via SSH) +- Daily DB backups with cron, synced to dev laptop + +See `deployment.md` for full infrastructure documentation. --- @@ -312,14 +350,14 @@ See `roadmap.md` for the full roadmap with task-level checkboxes. ### Dependency Graph ```text -Phase 0 (Foundation) -└── Phase 1 (Vocabulary Data + API) - └── Phase 2 (Singleplayer UI) - └── Phase 3 (Auth) - ├── Phase 4 (Room Lobby) - │ └── Phase 5 (Multiplayer Game) - │ └── Phase 6 (Deployment) - └── Phase 7 (Hardening) +Phase 0 (Foundation) ✅ +└── Phase 1 (Vocabulary Data + API) ✅ + └── Phase 2 (Singleplayer UI) ✅ + ├── Phase 3 (Auth) ✅ + │ └── Phase 6 (Deployment + CI/CD) ✅ + └── Phase 4 (Multiplayer Lobby) + └── Phase 5 (Multiplayer Game) + └── Phase 7 (Hardening) ``` ---