updating documentation
All checks were successful
Build and Deploy / build-and-deploy (push) Successful in 1m3s

lila 2026-04-14 19:35:49 +02:00
parent 201f462447
commit e5595b5039
5 changed files with 268 additions and 61 deletions


@@ -359,3 +359,71 @@ All deferred post-MVP, purely additive (new tables referencing existing `terms`)
- `noun_forms` — gender, singular, plural, articles per language (source: Wiktionary)
- `verb_forms` — conjugation tables per language (source: Wiktionary)
- `term_pronunciations` — IPA and audio URLs per language (source: Wiktionary / Forvo)
---
## Deployment
### Reverse proxy: Caddy (not Nginx, not Traefik)
Caddy provides automatic HTTPS via Let's Encrypt with zero configuration beyond specifying domain names. The entire Caddyfile is ~10 lines. Nginx would require manual certbot setup and more verbose config. Traefik's auto-discovery of Docker containers (via labels) is powerful but overkill for a stable three-service stack where routing rules never change. Caddy runs as a Docker container alongside the app — no native install.
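A Caddyfile for this setup can indeed stay around ten lines — a sketch, with the upstream container names and internal ports assumed from the production stack description:

```caddyfile
lilastudy.com {
    reverse_proxy web:80
}

api.lilastudy.com {
    reverse_proxy api:3000
}

git.lilastudy.com {
    reverse_proxy forgejo:3000
}
```

Caddy obtains and renews Let's Encrypt certificates for each listed domain with no further configuration.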
### Subdomain routing (not path-based)
`lilastudy.com` serves the frontend, `api.lilastudy.com` serves the API, `git.lilastudy.com` serves Forgejo. Cleaner separation than path-based routing — any service can be moved to a different server just by changing DNS. Requires CORS configuration since the browser sees different origins, and cross-subdomain cookies via `COOKIE_DOMAIN=.lilastudy.com`. Wildcard DNS (`*.lilastudy.com`) means new subdomains require no DNS changes.
### Frontend served by nginx:alpine (not Node, not Caddy)
Vite builds to static files. Serving them with nginx inside the container is lighter than running a Node process and keeps the container at ~7MB. Caddy could serve them directly, but using a separate container maintains the one-service-per-container principle and keeps Caddy's config purely about routing.
### SPA fallback via nginx `try_files`
Without `try_files $uri $uri/ /index.html`, refreshing on `/play` returns 404 because there's no actual `play` file. Nginx serves `index.html` for all routes and lets TanStack Router handle client-side routing.
### Forgejo as git server + container registry (not GitHub, not Docker Hub)
Keeps everything self-hosted on one VPS. Forgejo's built-in package registry doubles as a container registry, eliminating a separate service. Git push and image push go to the same server.
### Forgejo SSH on port 2222 (not 22)
Port 22 is the VPS's own SSH. Mapping Forgejo's SSH to 2222 avoids conflicts. Dev laptop `~/.ssh/config` maps `git.lilastudy.com` to port 2222 so git commands work without specifying the port every time.
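The dev-laptop entry is a short `~/.ssh/config` block — a sketch, assuming Forgejo's default `git` SSH user:

```text
Host git.lilastudy.com
    Port 2222
    User git
```

With this in place, `git clone git@git.lilastudy.com:owner/repo.git` works without specifying `-p 2222` (`owner/repo` is a placeholder).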
### `packages/db` and `packages/shared` exports: compiled JS paths
Exports in both package.json files point to `./dist/src/index.js`, not TypeScript source. In dev, `tsx` can run TypeScript, but in production Node cannot. This means packages must be built before the API starts in dev — acceptable since these packages change infrequently. Alternative approaches (conditional exports, tsconfig paths) were considered but added complexity for no practical benefit.
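As a sketch, the relevant part of each `package.json` (the package name is illustrative):

```json
{
  "name": "@vocab-trainer/db",
  "exports": {
    ".": "./dist/src/index.js"
  }
}
```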
### Environment-driven config for production vs dev
CORS origin, Better Auth base URL, cookie domain, API URL, and OAuth credentials are all read from environment variables with localhost fallbacks. The same code runs in both environments without changes. `VITE_API_URL` is the exception — it's baked in at build time via Docker build arg because Vite replaces `import.meta.env` at compile time, not runtime.
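The pattern is a one-liner per setting — a sketch in which `CORS_ORIGIN` and the dev defaults are illustrative names, while `COOKIE_DOMAIN` and `VITE_API_URL` come from the docs:

```typescript
// Read a setting from the environment, falling back to a localhost dev default.
function envOr(name: string, fallback: string): string {
  return process.env[name] ?? fallback;
}

const corsOrigin = envOr("CORS_ORIGIN", "http://localhost:5173"); // illustrative name
const cookieDomain = process.env.COOKIE_DOMAIN; // unset in dev → host-only cookie

// VITE_API_URL is the exception: Vite inlines import.meta.env.VITE_API_URL at
// build time, so it must be passed as a Docker build arg, not a runtime variable.
```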
### Cross-subdomain cookies
Better Auth's `defaultCookieAttributes` sets `domain: .lilastudy.com` in production (from env var `COOKIE_DOMAIN`). Without this, the auth cookie scoped to `api.lilastudy.com` wouldn't be sent on requests from `lilastudy.com`. The leading dot makes the cookie valid across all subdomains.
---
## CI/CD
### Forgejo Actions with SSH deploy (not webhooks, not manual)
CI builds images natively on the ARM64 VPS (no QEMU cross-compilation). The runner uses the host's Docker socket to build. After pushing to the registry, the workflow SSHs into the VPS to pull and restart containers. Webhooks were considered but add an extra listener service to maintain and secure. Manual deploy was the initial approach but doesn't scale with frequent pushes.
### Dedicated CI SSH key
A separate `ci-runner` SSH key pair (not the developer's personal key) is used for CI deploys. The private key is stored in Forgejo's secrets. If compromised, only this key needs to be revoked — the developer's access is unaffected.
### Runner config: `docker_host: "automount"` + `valid_volumes` + explicit config path
The Forgejo runner's `automount` setting mounts the host Docker socket into job containers. `valid_volumes` must include `/var/run/docker.sock` or the mount is blocked. The runner command must explicitly reference the config file (`-c /data/config.yml`) — without this flag, config changes are silently ignored. `--group-add 989` in container options adds the host's docker group so job containers can access the socket.
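Put together, the relevant fragment of `/data/config.yml` — a sketch; the `container:` nesting is how the runner groups these settings, and all other keys are omitted:

```yaml
container:
  docker_host: "automount"   # mount the host's Docker socket into job containers
  privileged: true           # required for Docker access from job containers
  valid_volumes:
    - /var/run/docker.sock   # without this entry the socket mount is blocked
  options: "--group-add 989" # host docker group (GID 989) so jobs can use the socket
```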
### Docker CLI installed per job (not baked into runner image)
The job container (`node:24-bookworm`) doesn't ship with the Docker CLI, so it's installed via `apt-get install docker.io` as the first workflow step. This adds ~20 seconds per run but avoids maintaining a custom runner image. The CLI only sends commands through the mounted socket; the builds themselves run on the host's Docker engine.
---
## Backups
### pg_dump cron + dev laptop sync (not WAL archiving, not managed service)
Daily compressed SQL dumps with 7-day retention. Dev laptop auto-syncs new backups on login via rsync. Simple, portable, sufficient for current scale. WAL archiving gives point-in-time recovery but is complex to set up. Offsite storage (Hetzner Object Storage) is the planned next step — backups on the same VPS don't protect against VPS failure.
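As a sketch, the crontab entry behind this — the container name, database name, and paths are assumptions; the 3 AM schedule and 7-day retention are from the setup:

```text
# daily dump at 3 AM, gzip-compressed, entries older than 7 days deleted
0 3 * * * docker exec lila-db pg_dump -U postgres vocab | gzip > /home/lila/backups/db-$(date +\%F).sql.gz && find /home/lila/backups -name 'db-*.sql.gz' -mtime +7 -delete
```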


@@ -225,9 +225,59 @@ Host git.lilastudy.com
This allows standard git commands without specifying the port.
## CI/CD Pipeline
Automated build and deploy via Forgejo Actions. On every push to `main`, the pipeline builds ARM64 images natively on the VPS, pushes them to the Forgejo registry, and restarts the app containers.
### Components
- **Forgejo Actions** — enabled by default, workflow files in `.forgejo/workflows/`
- **Forgejo Runner** — runs as a container (`lila-ci-runner`) on the VPS, uses the host's Docker socket to build images natively on ARM64
- **Workflow file** — `.forgejo/workflows/deploy.yml`
### Pipeline Steps
1. Install Docker CLI and SSH client in the job container
2. Checkout the repository
3. Login to the Forgejo container registry
4. Build API image (target: `runner`)
5. Build Web image (target: `production`, with `VITE_API_URL` baked in)
6. Push both images to `git.lilastudy.com`
7. SSH into the VPS, pull new images, restart `api` and `web` containers, prune old images
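Condensed, the workflow follows this shape — a sketch, with image names, Dockerfile paths, build contexts, and action versions as assumptions; the secrets are the ones listed below:

```yaml
on:
  push:
    branches: [main]

jobs:
  build-and-deploy:
    runs-on: docker
    container:
      image: node:24-bookworm
    steps:
      - name: Install Docker CLI and SSH client
        run: apt-get update && apt-get install -y docker.io openssh-client
      - uses: actions/checkout@v4
      - name: Login to registry
        run: echo "${{ secrets.REGISTRY_PASSWORD }}" | docker login git.lilastudy.com -u "${{ secrets.REGISTRY_USER }}" --password-stdin
      - name: Build and push images
        run: |
          docker build --target runner -t git.lilastudy.com/lila/api:latest .
          docker build --target production --build-arg VITE_API_URL=https://api.lilastudy.com -t git.lilastudy.com/lila/web:latest apps/web
          docker push git.lilastudy.com/lila/api:latest
          docker push git.lilastudy.com/lila/web:latest
      - name: Deploy over SSH
        run: |
          printf '%s\n' "${{ secrets.SSH_PRIVATE_KEY }}" > key && chmod 600 key
          ssh -i key -o StrictHostKeyChecking=accept-new "${{ secrets.SSH_USER }}@${{ secrets.SSH_HOST }}" \
            'docker compose pull api web && docker compose up -d api web && docker image prune -f'
```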
### Secrets (stored in Forgejo repo settings → Actions → Secrets)
| Secret | Value |
|---|---|
| REGISTRY_USER | Forgejo username |
| REGISTRY_PASSWORD | Forgejo password |
| SSH_PRIVATE_KEY | Contents of `~/.ssh/ci-runner` on the VPS |
| SSH_HOST | VPS IP address |
| SSH_USER | `lila` |
### Runner Configuration
The runner config is at `/data/config.yml` inside the `lila-ci-runner` container. Key settings:
- `docker_host: "automount"` — mounts the host Docker socket into job containers
- `valid_volumes: ["/var/run/docker.sock"]` — allows the socket mount
- `privileged: true` — required for Docker access from job containers
- `options: "--group-add 989"` — adds the host's docker group (GID 989) to job containers
The runner command must explicitly reference the config file:
```yaml
command: '/bin/sh -c "sleep 5; forgejo-runner -c /data/config.yml daemon"'
```
### Deploy Cycle
Push to main → pipeline runs automatically (~2-5 min) → app is updated. No manual steps required.
To manually trigger a re-run: go to the repo's Actions tab, click on the latest run, and use the re-run button.
## Known Issues and Future Work
- **CI/CD**: Currently manual build-push-pull cycle. Plan: Forgejo Actions with a runner on the VPS building ARM images natively (eliminates QEMU cross-compilation)
- **Backups**: Offsite backup storage (Hetzner Object Storage or similar) should be added
- **Valkey**: Not in the production stack yet. Will be added when multiplayer requires session/room state
- **Monitoring/logging**: No centralized logging or uptime monitoring configured


@@ -24,15 +24,15 @@ https://docs.docker.com/go/credential-store/
### vps setup
- monitoring and logging (e.g. via chkrootkit or rkhunter, logwatch/monit => mails daily with summary)
- keep the vps clean (e.g. old docker images/containers)
- ~~keep the vps clean (e.g. old docker images/containers)~~ ✅ CI/CD pipeline runs `docker image prune -f` after deploy
### cd/ci pipeline
### ~~cd/ci pipeline~~ ✅ RESOLVED
forgejo actions? smth else? where docker registry, also forgejo?
Forgejo Actions with runner on VPS, Forgejo built-in container registry. See `deployment.md`.
### postgres backups
### ~~postgres backups~~ ✅ RESOLVED
how?
Daily pg_dump cron job, 7-day retention, dev laptop auto-sync via rsync. See `deployment.md`.
### try now option


@@ -91,7 +91,7 @@ Each phase produces a working increment. Nothing is built speculatively.
---
## Phase 3 — Auth
**Goal:** Users can log in via Google or GitHub and stay logged in.
**Done when:** Better Auth session is validated on protected routes; unauthenticated users are redirected to login; user row is created on first social login.
@@ -109,6 +109,68 @@ Each phase produces a working increment. Nothing is built speculatively.
---
## Phase 6 — Production Deployment ✅
**Goal:** App is live on Hetzner, accessible via HTTPS on all subdomains.
**Done when:** `https://lilastudy.com` loads; `https://api.lilastudy.com` responds; auth flow works end-to-end; CI/CD deploys on push to main.
_Note: Deployment was moved ahead of multiplayer — the app is useful without multiplayer but not without deployment._
### Infrastructure
- [x] Hetzner VPS provisioned (Debian 13, ARM64, 4GB RAM)
- [x] SSH hardening, ufw firewall, fail2ban
- [x] Docker + Docker Compose installed
- [x] Domain DNS: A record + wildcard `*.lilastudy.com` pointing to VPS
### Reverse proxy
- [x] Caddy container with automatic HTTPS (Let's Encrypt)
- [x] Subdomain routing: `lilastudy.com` → web, `api.lilastudy.com` → API, `git.lilastudy.com` → Forgejo
### Docker stack
- [x] Production `docker-compose.yml` with all services on shared network
- [x] No ports exposed on internal services — only Caddy (80/443) and Forgejo SSH (2222)
- [x] Production Dockerfile stages for API (runner) and frontend (nginx:alpine)
- [x] Monorepo package exports fixed for production (dist/src paths)
- [x] Production `.env` with env-driven CORS, auth URLs, cookie domain
### Git server + container registry
- [x] Forgejo running with built-in container registry
- [x] SSH on port 2222, dev laptop `~/.ssh/config` configured
- [x] Repository created, code pushed
### CI/CD
- [x] Forgejo Actions enabled
- [x] Forgejo Runner container on VPS with Docker socket access
- [x] `.forgejo/workflows/deploy.yml` — build, push, deploy via SSH on push to main
- [x] Registry and SSH secrets configured in Forgejo
### Database
- [x] Initial seed via pg_dump from dev laptop
- [x] Seeding script is idempotent (onConflictDoNothing) for future data additions
- [x] Schema migrations via Drizzle (migrate first, deploy second)
### OAuth
- [x] Google and GitHub OAuth redirect URIs configured for production
- [x] Cross-subdomain cookies via COOKIE_DOMAIN=.lilastudy.com
### Backups
- [x] Daily cron job (3 AM) with pg_dump, 7-day retention
- [x] Dev laptop auto-syncs backups on login via rsync
### Documentation
- [x] `deployment.md` covering full infrastructure setup
---
## Phase 4 — Multiplayer Lobby
**Goal:** Players can create and join rooms; the host sees all joined players in real time.
@@ -148,32 +210,21 @@ Each phase produces a working increment. Nothing is built speculatively.
---
## Phase 6 — Production Deployment
**Goal:** App is live on Hetzner, accessible via HTTPS on all subdomains.
**Done when:** `https://app.yourdomain.com` loads; `wss://api.yourdomain.com` connects; auth flow works end-to-end.
- [ ] `docker-compose.prod.yml`: all services + `nginx-proxy` + `acme-companion`
- [ ] Nginx config per container: `VIRTUAL_HOST` + `LETSENCRYPT_HOST`
- [ ] Production `.env` files on VPS
- [ ] Drizzle migration runs on `api` container start
- [ ] Seed production DB
- [ ] Smoke test: login → solo game → multiplayer game end-to-end
---
## Phase 7 — Polish & Hardening
**Goal:** Production-ready for real users.
- [x] CI/CD pipeline (Forgejo Actions → SSH deploy)
- [x] Database backups (cron → dev laptop sync)
- [ ] Rate limiting on API endpoints
- [ ] Graceful WS reconnect with exponential back-off
- [ ] React error boundaries
- [ ] `GET /users/me/stats` endpoint + profile page
- [ ] Accessibility pass (keyboard nav, ARIA on quiz buttons)
- [ ] Favicon, page titles, Open Graph meta
- [ ] CI/CD pipeline (GitHub Actions → SSH deploy)
- [ ] Database backups (cron → Hetzner Object Storage)
- [ ] Offsite backup storage (Hetzner Object Storage)
- [ ] Monitoring/logging (uptime, centralized logs)
- [ ] Valkey for game session store (replace in-memory)
---
@@ -183,9 +234,9 @@ Each phase produces a working increment. Nothing is built speculatively.
Phase 0 (Foundation) ✅
└── Phase 1 (Vocabulary Data + API) ✅
└── Phase 2 (Singleplayer UI) ✅
└── Phase 3 (Auth)
├── Phase 4 (Multiplayer Lobby)
│ └── Phase 5 (Multiplayer Game)
│ └── Phase 6 (Deployment)
├── Phase 3 (Auth) ✅
│ └── Phase 6 (Deployment + CI/CD) ✅
└── Phase 4 (Multiplayer Lobby)
└── Phase 5 (Multiplayer Game)
└── Phase 7 (Hardening)
```


@@ -63,9 +63,9 @@ These are not deleted from the plan — they are deferred. The architecture is a
## 4. Technology Stack
The monorepo structure and tooling are already set up. This is the full stack — the MVP uses a subset of it.
The monorepo structure and tooling are already set up. This is the full stack.
| Layer | Technology | MVP? |
| Layer | Technology | Status |
| ------------ | ------------------------------ | ----------- |
| Monorepo | pnpm workspaces | ✅ |
| Frontend | React 18, Vite, TypeScript | ✅ |
@@ -77,10 +77,11 @@ The monorepo structure and tooling are already set up. This is the full stack
| Database | PostgreSQL + Drizzle ORM | ✅ |
| Validation | Zod (shared schemas) | ✅ |
| Testing | Vitest, supertest | ✅ |
| Auth | Better Auth (Google + GitHub) | ❌ post-MVP |
| Auth | Better Auth (Google + GitHub) | ✅ |
| Deployment | Docker Compose, Caddy, Hetzner | ✅ |
| CI/CD | Forgejo Actions | ✅ |
| Realtime | WebSockets (`ws` library) | ❌ post-MVP |
| Cache | Valkey | ❌ post-MVP |
| Deployment | Docker Compose, Hetzner, Nginx | ❌ post-MVP |
---
@@ -88,14 +89,20 @@ The monorepo structure and tooling are already set up. This is the full stack
```text
vocab-trainer/
├── .forgejo/
│ └── workflows/
│ └── deploy.yml — CI/CD pipeline (build, push, deploy)
├── apps/
│ ├── api/
│ │ └── src/
│ │ ├── app.ts — createApp() factory, express.json(), error middleware
│ │ ├── app.ts — createApp() factory, CORS, auth handler, error middleware
│ │ ├── server.ts — starts server on PORT
│ │ ├── errors/
│ │ │ └── AppError.ts — AppError, ValidationError, NotFoundError
│ │ ├── lib/
│ │ │ └── auth.ts — Better Auth config (Google + GitHub providers)
│ │ ├── middleware/
│ │ │ ├── authMiddleware.ts — session validation for protected routes
│ │ │ └── errorHandler.ts — central error middleware
│ │ ├── routes/
│ │ │ ├── apiRouter.ts — mounts /health and /game routers
@@ -111,10 +118,17 @@ vocab-trainer/
│ │ ├── InMemoryGameSessionStore.ts
│ │ └── index.ts
│ └── web/
│ ├── Dockerfile — multi-stage: dev + production (nginx:alpine)
│ ├── nginx.conf — SPA fallback routing
│ └── src/
│ ├── routes/
│ │ ├── index.tsx — landing page
│ │ └── play.tsx — the quiz
│ │ ├── play.tsx — the quiz
│ │ ├── login.tsx — Google + GitHub login buttons
│ │ ├── about.tsx
│ │ └── __root.tsx
│ ├── lib/
│ │ └── auth-client.ts — Better Auth React client
│ ├── components/
│ │ └── game/
│ │ ├── GameSetup.tsx — settings UI
@@ -131,7 +145,7 @@ vocab-trainer/
│ └── db/
│ ├── drizzle/ — migration SQL files
│ └── src/
│ ├── db/schema.ts — Drizzle schema
│ ├── db/schema.ts — Drizzle schema (terms, translations, auth tables)
│ ├── models/termModel.ts — getGameTerms(), getDistractors()
│ ├── seeding-datafiles.ts — seeds terms + translations from JSON
│ ├── seeding-cefr-levels.ts — enriches translations with CEFR data
@@ -139,7 +153,9 @@ vocab-trainer/
│ └── index.ts
├── scripts/ — Python extraction/comparison/merge scripts
├── documentation/ — project docs
├── docker-compose.yml
├── docker-compose.yml — local dev stack
├── docker-compose.prod.yml — production config reference
├── Caddyfile — reverse proxy routing
└── pnpm-workspace.yaml
```
@@ -178,13 +194,28 @@ HTTP Request
**Key principle:** all database code lives in `packages/db`. `apps/api` never imports `drizzle-orm` for queries — it only calls functions exported from `packages/db`.
### Production Infrastructure
```text
Internet → Caddy (HTTPS termination)
├── lilastudy.com → web container (nginx, static files)
├── api.lilastudy.com → api container (Express, port 3000)
└── git.lilastudy.com → forgejo container (git + registry, port 3000)
SSH (port 2222) → forgejo container (git push/pull)
```
All containers communicate over an internal Docker network. Only Caddy (80/443) and Forgejo SSH (2222) are exposed to the internet.
---
## 7. Data Model (Current State)
Words are modelled as language-neutral concepts (terms) separate from learning curricula (decks). Adding a new language pair requires no schema changes — only new rows in `translations` and `decks`.
**Core tables:** `terms`, `translations`, `term_glosses`, `decks`, `deck_terms`, `categories`, `term_categories`
**Core tables:** `terms`, `translations`, `term_glosses`, `decks`, `deck_terms`, `topics`, `term_topics`
**Auth tables (managed by Better Auth):** `user`, `session`, `account`, `verification`
Key columns on `terms`: `id` (uuid), `pos` (CHECK-constrained), `source`, `source_id` (unique pair for idempotent imports)
@@ -201,9 +232,10 @@ Full schema is in `packages/db/src/db/schema.ts`.
### Endpoints
```text
POST /api/v1/game/start GameRequest → GameSession
POST /api/v1/game/answer AnswerSubmission → AnswerResult
GET /api/v1/health Health check
POST /api/v1/game/start GameRequest → GameSession (requires auth)
POST /api/v1/game/answer AnswerSubmission → AnswerResult (requires auth)
GET /api/v1/health Health check (public)
ALL /api/auth/* Better Auth handlers (public)
```
### Schemas (packages/shared)
@@ -235,7 +267,7 @@ Typed error classes (`AppError` base, `ValidationError` 400, `NotFoundError` 404
- **Session length**: 3 or 10 questions (configurable)
- **Scoring**: +1 per correct answer (no speed bonus for MVP)
- **Timer**: none in singleplayer MVP
- **No auth required**: anonymous users
- **Auth required**: users must log in via Google or GitHub
- **Submit-before-send**: user selects, then confirms (prevents misclicks)
---
@@ -258,14 +290,15 @@ After completing a task: share the code, ask what to refactor and why. The LLM s
## 11. Post-MVP Ladder
| Phase | What it adds |
| ----------------- | -------------------------------------------------------------- |
| Auth | Better Auth (Google + GitHub), embedded in Express API, user rows in DB |
| User Stats | Games played, score history, profile page |
| Multiplayer Lobby | Room creation, join by code, WebSocket connection |
| Multiplayer Game | Simultaneous answers, server timer, live scores, winner screen |
| Deployment | Docker Compose prod config, Nginx, Let's Encrypt, Hetzner VPS |
| Hardening | Rate limiting, error boundaries, CI/CD, DB backups |
| Phase | What it adds | Status |
| ----------------- | ------------------------------------------------------------------------------- | ------ |
| Auth | Better Auth (Google + GitHub), embedded in Express API, user rows in DB | ✅ |
| Deployment | Docker Compose, Caddy, Forgejo, CI/CD, Hetzner VPS | ✅ |
| Hardening (partial) | CI/CD pipeline, DB backups | ✅ |
| User Stats | Games played, score history, profile page | ❌ |
| Multiplayer Lobby | Room creation, join by code, WebSocket connection | ❌ |
| Multiplayer Game | Simultaneous answers, server timer, live scores, winner screen | ❌ |
| Hardening (rest) | Rate limiting, error boundaries, monitoring, accessibility | ❌ |
### Future Data Model Extensions (deferred, additive)
@@ -285,11 +318,16 @@ All are new tables referencing existing `terms` rows via FK. No existing schema
- Game mechanic: simultaneous answers, 15-second server timer, all players see same question
- Valkey for ephemeral room state, PostgreSQL for durable records
### Infrastructure (deferred)
### Infrastructure (current)
- `app.yourdomain.com` → React frontend
- `api.yourdomain.com` → Express API + WebSocket + Better Auth
- Docker Compose with `nginx-proxy` + `acme-companion` for automatic SSL
- `lilastudy.com` → React frontend (nginx serving static files)
- `api.lilastudy.com` → Express API + Better Auth
- `git.lilastudy.com` → Forgejo (git server + container registry)
- Docker Compose with Caddy for automatic HTTPS via Let's Encrypt
- CI/CD via Forgejo Actions (build on push to main, deploy via SSH)
- Daily DB backups with cron, synced to dev laptop
See `deployment.md` for full infrastructure documentation.
---
@@ -312,13 +350,13 @@ See `roadmap.md` for the full roadmap with task-level checkboxes.
### Dependency Graph
```text
Phase 0 (Foundation)
└── Phase 1 (Vocabulary Data + API)
└── Phase 2 (Singleplayer UI)
└── Phase 3 (Auth)
├── Phase 4 (Room Lobby)
│ └── Phase 5 (Multiplayer Game)
│ └── Phase 6 (Deployment)
Phase 0 (Foundation)
└── Phase 1 (Vocabulary Data + API)
└── Phase 2 (Singleplayer UI)
├── Phase 3 (Auth) ✅
│ └── Phase 6 (Deployment + CI/CD) ✅
└── Phase 4 (Multiplayer Lobby)
└── Phase 5 (Multiplayer Game)
└── Phase 7 (Hardening)
```