# Lila — Feature & Startup Strategy Roadmap

> **Context for any LLM reading this:** Lila is a language learning/vocabulary app with two core differentiators: (1) **media-based practice** — users learn vocabulary extracted from real media they love (e.g., a Shakira song, the first chapter of _Harry Potter_, or an episode of _Breaking Bad_), and (2) **multiplayer modes** — users practice vocabulary together or competitively in real-time sessions.
>
> The app is currently at an early MVP stage. The existing MVP was built around OpenWordNet, which is being replaced because it produces unreliable translations (sense-disambiguation issues). The team is migrating the data pipeline to **Kaikki**, which structures entries per word sense and links translations to specific senses rather than vague general concepts. This migration is the current technical priority.
>
> The project is a TypeScript monorepo (pnpm workspaces) with an Express/WebSocket API (`apps/api`), a React frontend using TanStack Router (`apps/web`), a data ingestion pipeline (`data-pipeline`) backed by SQLite/Drizzle, shared packages (`packages/db`, `packages/shared`), and Docker-based deployment orchestrated with Caddy. Documentation restructuring (human-readable vs. AI-optimized docs) is being handled in a separate parallel workstream.

---

## Current State (Ground Truth — 2026-05-15)

### What Works Today ✅

- **Singleplayer quiz** — Duolingo-style, 5 languages (en↔it/de/es/fr), 3 or 10 rounds, POS + difficulty filters
- **Multiplayer** — Create/join lobby by room code, 2–4 players, simultaneous answers, 15s server timer, live scoring, winner screen
- **Auth** — Google + GitHub via Better Auth
- **Deployment** — Live at lilastudy.com, Hetzner VPS, Caddy HTTPS, Docker Compose, CI/CD via Forgejo Actions
- **Database** — PostgreSQL with Drizzle ORM, daily backups

### What's In Progress / Blocked 🚧

- **Kaikki data pipeline migration** — Stage 1 (extract) and Stage 2 (reverse link) complete on sample data.
  Stage 3 (enrich) being rewritten for sub-stage architecture. Stages 4–6 not started.
- **Guest play** — No try-before-signup flow yet. Auth required for all game routes.
- **Game session store** — Still in-memory. Valkey container exists locally but not wired up.
- **Media ingestion** — Not started. No pipeline for subtitles/lyrics → vocab extraction yet.

### The Strategic Gap

The app is currently a **generic vocabulary quiz**. The media-based practice feature (the differentiator) does not exist yet. It depends on:

1. Kaikki pipeline reaching production (fixes translation quality)
2. A media ingestion prototype (subtitles/lyrics → text → vocab extraction → quiz)

---

## Stream 1: Documentation Restructure (Parallel Track)

**Status:** ✅ Complete. Human-readable branch (README, STATUS, ARCHITECTURE, BACKLOG, DECISIONS, DEPLOYMENT, DATA_PIPELINE, MODEL_STRATEGY, LLM_SETUP, design/GAME_MODES) and AI-context branch (00–06, prompts/meta.md, 99-current-task.md) are live in `documentation/`.

---

## Stream 2: Feature Roadmap (Three Lanes)

### Lane A — Attract & Keep Users

**Goal:** A user lands on Lila, understands the value in 10 seconds, and completes a satisfying vocabulary practice session in under 2 minutes.

**Current Reality Check:**

- Singleplayer and multiplayer quizzes are **already working and deployed**.
- The app is functional but **not differentiated** — it's a generic vocabulary quiz right now.
- The "wow" moment requires the **media-based practice feature**, which does not exist yet.

**Must-Haves for First Users:**

1. **Guest Play (Zero-Friction Onboarding)** `[in backlog next]`
   - No signup required for first session.
   - Capture email or OAuth only after the user experiences value.
   - Critical for viral loops and investor demos.
   - **Status:** Planned in BACKLOG.md. Not yet implemented.
2. **One Polished Media Demo** `[not started]`
   - Pick **ONE** piece of media and make it flawless end-to-end: subtitles/lyrics → Kaikki-based vocab extraction with sense-disambiguated translations → playable quiz with timestamps/context.
   - Candidates: _Breaking Bad S01E01_, a Shakira song, or _Harry Potter and the Sorcerer's Stone Ch. 1_.
   - This is the primary "wow" moment and the feature that sets Lila apart from generic vocabulary apps.
   - **Blocker:** Requires (a) Kaikki pipeline in production, and (b) a media ingestion prototype.
3. **One Additional Multiplayer Mode** `[design exists, not implemented]`
   - Proves the mode-agnostic lobby architecture works and adds variety beyond the current simultaneous-answer flow.
   - **Recommended first mode:** Race to the Top (target score, no round limit) — simplest to implement, changes only scoring logic.
   - Alternative: TV Quiz Show (buzzer — first to press answers) — most visually distinct, but requires new answer flow.
   - **Status:** Lobby infrastructure is mode-agnostic. Each mode adds game logic only. See `design/GAME_MODES.md` for full designs.
   - **Why it matters:** Neither Duolingo nor Anki offers real-time multiplayer. Real-time modes are a genuine differentiator even without media.
4. **Social Proof / Shareable Output** `[not started]`
   - Post-game card: "I learned 12 words from _La Tortura_ — can you beat my score?"
   - Image export or copy-paste text for Reddit, Discord, Twitter.
   - This is the organic growth engine.
   - **Blocker:** Requires media demo to exist first.
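
The front half of the media demo flow in item 2 (subtitles → text → vocabulary candidates with timestamps) could be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the planned pipeline: it assumes SRT-formatted subtitles, and `parseSrt`, `extractVocab`, `formatTimestamp`, the `Cue` and `VocabCandidate` types, and the stop-word list are all hypothetical names that do not come from the Lila codebase. Kaikki sense-matching and quiz generation are deliberately left out.

```typescript
// Hypothetical sketch: none of these names exist in the Lila codebase.

interface Cue {
  startMs: number; // cue start time in milliseconds
  text: string; // subtitle text with inline markup removed
}

interface VocabCandidate {
  word: string; // lowercased surface form (no lemmatization here)
  firstSeenMs: number; // start time of the first cue containing the word
  count: number; // occurrences across all cues
}

// Convert an SRT timestamp such as "00:04:23,500" to milliseconds.
function srtTimeToMs(stamp: string): number {
  const m = /^(\d+):(\d{2}):(\d{2})[,.](\d{3})$/.exec(stamp.trim());
  if (m === null) throw new Error("Bad SRT timestamp: " + stamp);
  const hours = Number(m[1]);
  const minutes = Number(m[2]);
  const seconds = Number(m[3]);
  return ((hours * 60 + minutes) * 60 + seconds) * 1000 + Number(m[4]);
}

// Parse SRT text. Each block is: index line, "start --> end" line, text lines.
function parseSrt(srt: string): Cue[] {
  const cues: Cue[] = [];
  for (const block of srt.replace(/\r/g, "").split(/\n\s*\n/)) {
    const lines = block.split("\n");
    let timingIdx = -1;
    for (let i = 0; i < lines.length; i++) {
      if ((lines[i] ?? "").indexOf("-->") !== -1) {
        timingIdx = i;
        break;
      }
    }
    if (timingIdx === -1) continue; // not a cue block
    const start = (lines[timingIdx] ?? "").split("-->")[0] ?? "";
    const text = lines
      .slice(timingIdx + 1)
      .join(" ")
      .replace(/<[^>]+>/g, "") // drop inline tags such as <i>
      .trim();
    if (text !== "") cues.push({ startMs: srtTimeToMs(start), text });
  }
  return cues;
}

// Basic Latin plus Latin-1 accented letters (an assumption that covers en/it/de/es/fr).
const WORD_RE = /[A-Za-zÀ-ÖØ-öø-ÿ]+/g;

// Count non-stop-word tokens and remember where each first appears.
function extractVocab(cues: Cue[], stopWords: string[]): VocabCandidate[] {
  const byWord: { [word: string]: VocabCandidate } = Object.create(null);
  for (const cue of cues) {
    const tokens = cue.text.toLowerCase().match(WORD_RE) ?? [];
    for (const word of tokens) {
      if (stopWords.indexOf(word) !== -1) continue;
      const seen = byWord[word];
      if (seen !== undefined) seen.count += 1;
      else byWord[word] = { word, firstSeenMs: cue.startMs, count: 1 };
    }
  }
  // Most frequent first; ties broken by first appearance, then alphabetically.
  return Object.keys(byWord)
    .map((w) => byWord[w] as VocabCandidate)
    .sort(
      (a, b) =>
        b.count - a.count ||
        a.firstSeenMs - b.firstSeenMs ||
        (a.word < b.word ? -1 : 1),
    );
}

// Render milliseconds as "hh:mm:ss" for quiz context lines.
function formatTimestamp(ms: number): string {
  const total = Math.floor(ms / 1000);
  const pad = (n: number): string => (n < 10 ? "0" : "") + n;
  return (
    pad(Math.floor(total / 3600)) + ":" +
    pad(Math.floor((total % 3600) / 60)) + ":" +
    pad(total % 60)
  );
}

// Usage with made-up subtitle text and a made-up Spanish stop-word list.
const demoSrt = [
  "1",
  "00:00:01,000 --> 00:00:03,000",
  "Hola, amigo. La casa es grande.",
  "",
  "2",
  "00:04:23,500 --> 00:04:25,000",
  "<i>La casa es roja.</i>",
].join("\n");

for (const c of extractVocab(parseSrt(demoSrt), ["la", "es"])) {
  console.log(c.word + " x" + c.count + ", first at " + formatTimestamp(c.firstSeenMs));
}
// First line printed: casa x2, first at 00:00:01
```

A fuller version would likely lemmatize tokens before any Kaikki lookup and use per-language stop words; both are omitted here to keep the sketch short.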

**Already Shipped (Don't Rebuild):**

- ✅ Singleplayer quiz (5 languages, POS/difficulty filters)
- ✅ Multiplayer lobby + real-time game (2–4 players, simultaneous answers, 15s timer, scoring)
- ✅ Auth (Google + GitHub)
- ✅ Live deployment with CI/CD

**Nice-to-Haves (Post-Launch):**

- Additional multiplayer modes (Chain Link, Elimination Round, Cooperative Challenge)
- Leaderboards
- Spaced repetition review queue
- Additional game modes (see `design/GAME_MODES.md`)

---

### Lane B — Investor-Ready

**Goal:** Walk into a pitch with engagement metrics and a defensibility story tied to Lila's unique data pipeline.

**Checklist:**

1. **Metrics Instrumentation** `[not started]`
   - Track: DAU/MAU, session length, quiz completion rate, multiplayer match completion rate, Day 1 / Day 7 retention.
   - Tool: PostHog, Mixpanel, or Plausible (self-hosted).
   - Need 4–6 weeks of real-user data.
   - **Note:** The app is live but has no analytics. This is a prerequisite for any investor conversation.
2. **Growth Mechanic** `[not started]`
   - The shareable card (Lane A.4) must be live and instrumented.
   - Measure k-factor (viral coefficient). Even 0.3 is a story.
   - **Blocker:** Requires media demo.
3. **Defensibility Story** `[partially true, not yet proven]`
   - **Data moat:** Lila's Kaikki → media mapping pipeline produces sense-disambiguated vocabulary tied to specific media timestamps. Competitors using generic word lists or OpenWordNet-style dumps cannot match the precision.
   - **Current reality:** The Kaikki pipeline exists but is not in production. The media mapping pipeline does not exist yet.
   - **What investors would ask:** "You have a quiz app. Where's the media feature you pitched?"
   - **Requirement:** Media demo + Kaikki production data must be live before investor conversations.
4. **Monetization Hypothesis** `[not tested]`
   - Pick ONE model to test:
     - **Freemium:** Free media, premium for user uploads / unlimited multiplayer / advanced analytics.
     - **B2B:** Schools and language institutes buy group licenses.
     - **Affiliate:** Deep-link to streaming services, books, or music platforms.
   - Don't implement yet, but explain LTV/CAC math and pricing assumptions.

**Investor Timeline:**

- **Now → Month 2:** Finish Kaikki pipeline + ship media demo + add metrics.
- **Month 2–3:** Soft launch to 100 strangers, gather retention data.
- **Month 3+:** Investor-ready if retention curves look good.

---

### Lane C — Co-Founder-Ready

**Goal:** A potential co-founder looks at Lila and thinks, "This person can build, and there's a real product here."

**Checklist:**

1. **Clean Codebase + Documentation** `[in progress]`
   - Documentation restructure is complete.
   - README must get a new dev from `git clone` to `docker compose up` in < 5 minutes.
   - **Status:** Docs are done. Code cleanliness is ongoing (BACKLOG.md `next`/`later` items).
2. **Live Demo with Real Users** `[partially done]`
   - App is live at lilastudy.com with real auth and multiplayer.
   - **Gap:** No real users yet. The current app is a generic quiz — not compelling enough for strangers to stick around.
   - **Requirement:** Media demo must be live before pitching to potential co-founders.
3. **Clear Vision Doc** `[not written]`
   - 1-page: What Lila is, what it isn't, and the 18-month arc.
   - Include: target languages, target media types, target user persona, and what "success" looks like at 6 / 12 / 18 months.

---

## Stream 3: Building the Startup (Technical Founder Journey)

### Phase 1 — Differentiate the MVP (Now → Month 2–3)

**Duration:** 2–3 months

**Rule:** Do NOT look for a co-founder yet.

**Why:** The MVP is already built and deployed. What's missing is the **differentiating feature** (media-based practice). A co-founder won't help you build this faster — it's a technical/data problem. Also, you need leverage: "I built the MVP AND the media pipeline" is stronger than "I have an idea for a media pipeline."

**Tasks:**

1. **Finish Kaikki pipeline** (Stages 3–6)
   - Complete enrich sub-stage rewrite
   - Run full sample, validate quality
   - Production sync to PostgreSQL
   - **Timeline:** 2–4 weeks
2. **Build media ingestion prototype**
   - Pick ONE media piece (Breaking Bad S01E01, Shakira song, or Harry Potter Ch. 1)
   - Pipeline: subtitles/lyrics → text extraction → vocabulary identification → Kaikki sense-matching → quiz generation
   - UI: media selection → quiz with context ("This word appears at 00:04:23")
   - **Timeline:** 2–4 weeks (parallel with Kaikki pipeline)
3. **Ship guest play**
   - Make auth optional on game routes
   - "Try without account" button on landing page
   - Capture email/OAuth after first session
   - **Timeline:** 1 week
4. **Add metrics instrumentation**
   - PostHog or Plausible
   - Track: signups, quiz starts, completions, multiplayer matches, retention
   - **Timeline:** 1 week
5. **Soft launch to 100 strangers**
   - Reddit (r/languagelearning, r/Anki), language-learning Discords, Hacker News Show HN
   - Collect qualitative feedback
   - **Timeline:** 1 week (after media demo is live)

---

### Phase 2 — Validate & Measure (Month 2–4)

**Goal:** Prove that the media feature resonates and that retention curves exist.

**Tasks:**

- Analyze metrics: Do users who try media-based practice return more than singleplayer-only users?
- Iterate on media selection and quiz UX based on feedback
- Polish shareable output (social cards)
- Fix hardening items from BACKLOG.md Phase 7

**Decision gate:** If 100 users show positive retention signals (Day 1 > 30%, Day 7 > 10%), proceed to Phase 3. If not, iterate on media feature or pivot.

---

### Phase 3 — Define the Gap (Month 3–5)

**Goal:** Identify exactly what you suck at or hate doing.

**The wrong approach:** "I need an MBA."

**The right approach:** "I need someone who has done [specific thing] before."

**Common gaps for technical founders:**

| Gap | Profile | What They Do |
|-----|---------|--------------|
| Fundraising | Former founder who raised Seed/Series A | Writes deck, runs investor meetings, handles term sheets |
| Monetization | Product/Growth PM from ed-tech | Designs pricing, runs experiments, builds B2B pipeline |
| Partnerships | BD person from media/streaming | Negotiates content deals, affiliate partnerships |
| Operations | COO-type | Runs hiring, finance, legal, day-to-day ops |
| Marketing | Growth marketer | Runs paid/organic acquisition, community building |

**Your job:** After 100 users, the gap becomes obvious. If no one converts to signup, you need growth/marketing. If schools email asking for licenses, you need BD/monetization. If investors ask questions you can't answer, you need a fundraising co-founder.

---

### Phase 4 — Co-Founder Search (Targeted, Month 4–6)

**Goal:** Find 2–3 candidates, work with them on a trial basis.

**Where to look:**

- **Founder dating events:** YC Co-Founder Matching, Indie Hackers meetups, local accelerators.
- **Angel investor intros:** Ask any angel you meet for intros to founders they backed who might want a new project.
- **Industry communities:** Ed-tech Slack/Discord groups, language-learning subreddits (look for people complaining about Duolingo — they care).
- **LinkedIn outbound:** Search "former PM at Duolingo," "former growth at Babbel." Cold DM with the Lila demo, not a resume.

**Trial period (4–8 weeks):**

- Work together on a concrete project (e.g., "Design and test a monetization experiment").
- No equity commitment. Pay as a contractor if needed.
- Evaluate: Do they deliver? Do you communicate well? Do they care about the mission?

---

### Phase 5 — Formalize (Only After Trial, Month 6+)

**Goal:** Legal structure, equity split, roles.

**Equity mindset:**

- **50/50 is dangerous** unless you truly could not build without them from day one. You already built the Lila MVP alone — you have leverage.
- **Suggested:** 60/40 or 65/35 with 4-year vesting and a 1-year cliff for both.
- **Vesting is non-negotiable.** If they leave in 6 months, they keep nothing.

**Roles:**

- You: CTO / Product (own technical vision, architecture, data pipeline).
- Them: CEO / COO / CMO depending on profile (own business side, fundraising, partnerships).
- Decision-making: You retain veto on technical/product decisions; they lead business/fundraising.

**Legal:**

- Incorporate properly (C-Corp if US, Ltd if UK/EU).
- IP assignment agreement: everything built so far belongs to the company.
- Founder agreement: roles, vesting, termination, dispute resolution.

---

## Suggested Execution Order

### Month 1 (Now)

- **Week 1–2:** Finish Kaikki Stage 3 enrich sub-stage rewrite. Run full sample, validate quality. Start first additional multiplayer mode (Race to the Top recommended).
- **Week 3:** Ship guest play. Make auth optional on game routes.
- **Week 4:** Start media ingestion prototype (parallel). Pick one media piece, get text extraction working.

### Month 2

- **Week 1–2:** Complete media ingestion prototype. End-to-end: media → quiz. Complete first additional multiplayer mode.
- **Week 3:** Add metrics (PostHog/Plausible). Polish shareable output (social cards).
- **Week 4:** Integrate media demo with multiplayer mode. Test combined flow.

### Month 3

- **Week 1:** Soft launch to 100 strangers. Gather feedback.
- **Week 2–3:** Iterate based on feedback. Fix hardening items from BACKLOG.md.
- **Week 4:** Analyze metrics. Decision gate: proceed or iterate?

### Month 4–6

- If metrics are positive: start co-founder search (Phase 4).
- If metrics are weak: iterate on media feature or pivot the value proposition.

---

## Open Questions to Refine This Roadmap

Answer these to make the roadmap more specific:

### Product Reality Check

- [ ] What is the exact blocker on Kaikki Stage 3? (Is it the sub-stage rewrite, or something else?)
- [ ] Which media piece should be the first demo?
  (Breaking Bad, Shakira, or Harry Potter?)
- [ ] Do you have subtitle/lyrics data for the chosen media piece?

### Target Audience

- [ ] Who is the ideal first user? (e.g., "German intermediate learner who watches Netflix")
- [ ] What language pair should the media demo target?

### Business Model

- [ ] Do you have a monetization hypothesis? (freemium / B2B / affiliate)
- [ ] Any thoughts on unit economics?

### Competitive Landscape

- [ ] Who do you see as direct competitors? (LingQ? FluentU? Duolingo? Anki?)
- [ ] What do they do poorly that Lila fixes?

### Runway & Constraints

- [ ] Is this full-time or nights-and-weekends?
- [ ] Do you have any funding, savings runway, or revenue?
- [ ] What's your hard deadline for "showable to users"?

### Co-Founder Search

- [ ] Local or remote-first?
- [ ] What do you want them to _do_? (fundraise, partnerships, monetization, operations?)
- [ ] Do you already know candidates, or are you starting from zero?
- [ ] Equity mindset: 50/50, or majority control for yourself?