# Lila — Feature & Startup Strategy Roadmap

> **Context for any LLM reading this:** Lila is a language learning/vocabulary app with two core differentiators: (1) **media-based practice** — users learn vocabulary extracted from real media they love (e.g., a Shakira song, the first chapter of _Harry Potter_, or an episode of _Breaking Bad_), and (2) **multiplayer modes** — users practice vocabulary together or competitively in real-time sessions. The app is currently at an early MVP stage. The existing MVP was built around OpenWordNet, which is being replaced because it produces unreliable translations (sense-disambiguation issues). The team is migrating the data pipeline to **Kaikki**, which structures entries per word sense and links translations to specific senses rather than vague general concepts. This migration is the current technical priority. The project is a TypeScript monorepo (pnpm workspaces) with an Express/WebSocket API (`apps/api`), a React frontend using TanStack Router (`apps/web`), a data ingestion pipeline (`data-pipeline`) backed by SQLite/Drizzle, shared packages (`packages/db`, `packages/shared`), and Docker-based deployment orchestrated with Caddy. Documentation restructuring (human-readable vs. AI-optimized docs) is being handled in a separate parallel workstream.
---

## Current State (Ground Truth — 2026-05-15)

### What Works Today ✅

- **Singleplayer quiz** — Duolingo-style, 5 languages (en↔it/de/es/fr), 3 or 10 rounds, POS + difficulty filters
- **Multiplayer** — Create/join lobby by room code, 2–4 players, simultaneous answers, 15s server timer, live scoring, winner screen
- **Auth** — Google + GitHub via Better Auth
- **Deployment** — Live at lilastudy.com, Hetzner VPS, Caddy HTTPS, Docker Compose, CI/CD via Forgejo Actions
- **Database** — PostgreSQL with Drizzle ORM, daily backups

### What's In Progress / Blocked 🚧

- **Kaikki data pipeline migration** — Stage 1 (extract) and Stage 2 (reverse link) complete on sample data. Stage 3 (enrich) being rewritten for sub-stage architecture. Stages 4–6 not started.
- **Guest play** — No try-before-signup flow yet. Auth required for all game routes.
- **Game session store** — Still in-memory. Valkey container exists locally but not wired up.
- **Media ingestion** — Not started. No pipeline for subtitles/lyrics → vocab extraction yet.

### The Strategic Gap

The app is currently a **generic vocabulary quiz**. The media-based practice feature (the differentiator) does not exist yet. It depends on:

1. Kaikki pipeline reaching production (fixes translation quality)
2. A media ingestion prototype (subtitles/lyrics → text → vocab extraction → quiz)
---

## Stream 1: Documentation Restructure (Parallel Track)

**Status:** ✅ Complete. Human-readable branch (README, STATUS, ARCHITECTURE, BACKLOG, DECISIONS, DEPLOYMENT, DATA_PIPELINE, MODEL_STRATEGY, LLM_SETUP, design/GAME_MODES) and AI-context branch (00–06, prompts/meta.md, 99-current-task.md) are live in `documentation/`.

---

## Stream 2: Feature Roadmap (Three Lanes)

### Lane A — Attract & Keep Users

**Goal:** A user lands on Lila, understands the value in 10 seconds, and completes a satisfying vocabulary practice session in under 2 minutes.

**Current Reality Check:**

- Singleplayer and multiplayer quizzes are **already working and deployed**.
- The app is functional but **not differentiated** — it's a generic vocabulary quiz right now.
- The "wow" moment requires the **media-based practice feature**, which does not exist yet.

**Must-Haves for First Users:**

1. **Guest Play (Zero-Friction Onboarding)** `[in backlog next]`
   - No signup required for first session.
   - Capture email or OAuth only after the user experiences value.
   - Critical for viral loops and investor demos.
   - **Status:** Planned in BACKLOG.md. Not yet implemented.

2. **One Polished Media Demo** `[not started]`
   - Pick **ONE** piece of media and make it flawless end-to-end: subtitles/lyrics → Kaikki-based vocab extraction with sense-disambiguated translations → playable quiz with timestamps/context.
   - Candidates: _Breaking Bad S01E01_, a Shakira song, or _Harry Potter and the Sorcerer's Stone Ch. 1_.
   - This is the primary "wow" moment. Differentiates Lila from all other vocabulary apps.
   - **Blocker:** Requires (a) Kaikki pipeline in production, and (b) a media ingestion prototype.

3. **One Additional Multiplayer Mode** `[design exists, not implemented]`
   - Proves the mode-agnostic lobby architecture works and adds variety beyond the current simultaneous-answer flow.
   - **Recommended first mode:** Race to the Top (target score, no round limit) — simplest to implement, changes only scoring logic.
   - Alternative: TV Quiz Show (buzzer — first to press answers) — most visually distinct, but requires new answer flow.
   - **Status:** Lobby infrastructure is mode-agnostic. Each mode adds game logic only. See `design/GAME_MODES.md` for full designs.
   - **Why it matters:** Duolingo has no multiplayer. Anki has no multiplayer. Real-time modes are a genuine differentiator even without media.

4. **Social Proof / Shareable Output** `[not started]`
   - Post-game card: "I learned 12 words from _La Tortura_ — can you beat my score?"
   - Image export or copy-paste text for Reddit, Discord, Twitter.
   - This is the organic growth engine.
   - **Blocker:** Requires media demo to exist first.

**Already Shipped (Don't Rebuild):**

- ✅ Singleplayer quiz (5 languages, POS/difficulty filters)
- ✅ Multiplayer lobby + real-time game (2–4 players, simultaneous answers, 15s timer, scoring)
- ✅ Auth (Google + GitHub)
- ✅ Live deployment with CI/CD

**Nice-to-Haves (Post-Launch):**

- Additional multiplayer modes (Chain Link, Elimination Round, Cooperative Challenge)
- Leaderboards
- Spaced repetition review queue
- Additional game modes (see `design/GAME_MODES.md`)
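
The Race to the Top mode recommended under Must-Haves changes only the scoring logic, so it can be sketched as a pure function over per-round results. This is a minimal TypeScript sketch, not the actual Lila game code: `RaceState`, `AnswerResult`, `createRace`, and `applyRound` are hypothetical names, and the tie rule (keep playing until exactly one player leads at or above the target) is an assumption that `design/GAME_MODES.md` may resolve differently. Keeping the state immutable means the existing lobby could, in principle, call `applyRound` once per 15s round without caring which mode is active.

```typescript
// Hypothetical types for illustration; not taken from the Lila codebase.
type PlayerId = string;

interface AnswerResult {
  playerId: PlayerId;
  correct: boolean;
}

interface RaceState {
  targetScore: number;
  scores: Record<PlayerId, number>;
  winner: PlayerId | null;
}

function createRace(players: PlayerId[], targetScore: number): RaceState {
  const scores: Record<PlayerId, number> = {};
  for (const p of players) scores[p] = 0;
  return { targetScore, scores, winner: null };
}

// Apply one round of simultaneous answers and return the next state.
// Assumptions: one point per correct answer; the game ends only when exactly
// one player holds the top score at or above the target, so a tie at the top
// leads to another round.
function applyRound(state: RaceState, results: AnswerResult[]): RaceState {
  if (state.winner !== null) return state; // game already decided

  const scores: Record<PlayerId, number> = { ...state.scores };
  for (const r of results) {
    const current = scores[r.playerId];
    // Ignore answers from players who are not part of this game.
    if (r.correct && typeof current === "number") scores[r.playerId] = current + 1;
  }

  const ids = Object.keys(scores);
  const best = Math.max(...ids.map((id) => scores[id] ?? 0));
  const leaders = ids.filter((id) => scores[id] === best);
  const decided = best >= state.targetScore && leaders.length === 1;
  const winner = decided ? (leaders[0] ?? null) : null;

  return { ...state, scores, winner };
}
```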
---

### Lane B — Investor-Ready

**Goal:** Walk into a pitch with engagement metrics and a defensibility story tied to Lila's unique data pipeline.

**Checklist:**

1. **Metrics Instrumentation** `[not started]`
   - Track: DAU/MAU, session length, quiz completion rate, multiplayer match completion rate, Day 1 / Day 7 retention.
   - Tool: PostHog, Mixpanel, or Plausible (self-hosted).
   - Need 4–6 weeks of real-user data.
   - **Note:** The app is live but has no analytics. This is a prerequisite for any investor conversation.

2. **Growth Mechanic** `[not started]`
   - The shareable card (Lane A.4) must be live and instrumented.
   - Measure k-factor (viral coefficient). Even 0.3 is a story.
   - **Blocker:** Requires media demo.

3. **Defensibility Story** `[partially true, not yet proven]`
   - **Data moat:** Lila's Kaikki → media mapping pipeline produces sense-disambiguated vocabulary tied to specific media timestamps. Competitors using generic word lists or OpenWordNet-style dumps cannot match the precision.
   - **Current reality:** The Kaikki pipeline exists but is not in production. The media mapping pipeline does not exist yet.
   - **What investors would ask:** "You have a quiz app. Where's the media feature you pitched?"
   - **Requirement:** Media demo + Kaikki production data must be live before investor conversations.

4. **Monetization Hypothesis** `[not tested]`
   - Pick ONE model to test:
     - **Freemium:** Free media, premium for user uploads / unlimited multiplayer / advanced analytics.
     - **B2B:** Schools and language institutes buy group licenses.
     - **Affiliate:** Deep-link to streaming services, books, or music platforms.
   - Don't implement yet, but explain LTV/CAC math and pricing assumptions.

**Investor Timeline:**

- **Now → Month 2:** Finish Kaikki pipeline + ship media demo + add metrics.
- **Month 2–3:** Soft launch to 100 strangers, gather retention data.
- **Month 3+:** Investor-ready if retention curves look good.
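
The k-factor named in the Growth Mechanic item is usually computed as shares per active user multiplied by the conversion rate per share. The sketch below shows that arithmetic for the shareable card; `ShareStats` and its field names are hypothetical, and attributing a signup to a specific card is assumed to be solved by whichever analytics tool is chosen. For example, 100 active users who share 60 cards that bring in 30 signups give 0.6 shares per user at a 0.5 conversion rate, which is the 0.3 mentioned in the checklist.

```typescript
// Hypothetical event counts for one measurement window (e.g. one week).
interface ShareStats {
  activeUsers: number;      // users who finished at least one session
  cardsShared: number;      // post-game cards shared by those users
  signupsFromCards: number; // new users attributed to a shared card
}

// Viral coefficient: how many new users each existing user brings in.
// k = (shares per user) * (signups per share)
function kFactor(stats: ShareStats): number {
  if (stats.activeUsers === 0 || stats.cardsShared === 0) return 0;
  const sharesPerUser = stats.cardsShared / stats.activeUsers;
  const signupsPerShare = stats.signupsFromCards / stats.cardsShared;
  return sharesPerUser * signupsPerShare;
}

const exampleWeek: ShareStats = { activeUsers: 100, cardsShared: 60, signupsFromCards: 30 };
const exampleK = kFactor(exampleWeek); // roughly 0.3 for these made-up numbers
```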
---

### Lane C — Co-Founder-Ready

**Goal:** A potential co-founder looks at Lila and thinks, "This person can build, and there's a real product here."

**Checklist:**

1. **Clean Codebase + Documentation** `[in progress]`
   - Documentation restructure is complete.
   - README must get a new dev from `git clone` to `docker compose up` in < 5 minutes.
   - **Status:** Docs are done. Code cleanliness is ongoing (BACKLOG.md `next`/`later` items).

2. **Live Demo with Real Users** `[partially done]`
   - App is live at lilastudy.com with real auth and multiplayer.
   - **Gap:** No real users yet. The current app is a generic quiz — not compelling enough for strangers to stick around.
   - **Requirement:** Media demo must be live before pitching to potential co-founders.

3. **Clear Vision Doc** `[not written]`
   - 1-page: What Lila is, what it isn't, and the 18-month arc.
   - Include: target languages, target media types, target user persona, and what "success" looks like at 6 / 12 / 18 months.
---

## Stream 3: Building the Startup (Technical Founder Journey)

### Phase 1 — Differentiate the MVP (Now → Month 2–3)

**Duration:** 2–3 months

**Rule:** Do NOT look for a co-founder yet.

**Why:** The MVP is already built and deployed. What's missing is the **differentiating feature** (media-based practice). A co-founder won't help you build this faster — it's a technical/data problem. Also, you need leverage: "I built the MVP AND the media pipeline" is stronger than "I have an idea for a media pipeline."

**Tasks:**

1. **Finish Kaikki pipeline** (Stages 3–6)
   - Complete enrich sub-stage rewrite
   - Run full sample, validate quality
   - Production sync to PostgreSQL
   - **Timeline:** 2–4 weeks

2. **Build media ingestion prototype**
   - Pick ONE media piece (Breaking Bad S01E01, Shakira song, or Harry Potter Ch. 1)
   - Pipeline: subtitles/lyrics → text extraction → vocabulary identification → Kaikki sense-matching → quiz generation
   - UI: media selection → quiz with context ("This word appears at 00:04:23")
   - **Timeline:** 2–4 weeks (parallel with Kaikki pipeline)

3. **Ship guest play**
   - Make auth optional on game routes
   - "Try without account" button on landing page
   - Capture email/OAuth after first session
   - **Timeline:** 1 week

4. **Add metrics instrumentation**
   - PostHog or Plausible
   - Track: signups, quiz starts, completions, multiplayer matches, retention
   - **Timeline:** 1 week

5. **Soft launch to 100 strangers**
   - Reddit (r/languagelearning, r/Anki), language-learning Discords, Hacker News Show HN
   - Collect qualitative feedback
   - **Timeline:** 1 week (after media demo is live)
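
The first two steps of the media ingestion prototype in task 2 (subtitles → text extraction → vocabulary identification) could look roughly like the sketch below for an SRT subtitle file. It is a sketch under stated assumptions rather than the planned pipeline: all type and function names are hypothetical, tokenization is a naive split on runs of Unicode letters with no lemmatization, and the Kaikki sense-matching and quiz generation steps are deliberately left out. `formatTimestamp` produces the kind of "This word appears at 00:04:23" context string described for the UI.

```typescript
// Hypothetical shapes for illustration; not taken from the Lila codebase.
interface Cue {
  startMs: number; // cue start time in milliseconds
  text: string;
}

interface VocabCandidate {
  word: string;
  firstSeenMs: number; // start of the first cue containing the word
  count: number;
}

// Convert an SRT timestamp such as "00:04:23,500" to milliseconds.
function parseTimestamp(ts: string): number {
  const m = /^(\d{2}):(\d{2}):(\d{2})[,.](\d{3})$/.exec(ts.trim());
  if (!m) throw new Error("Bad timestamp: " + ts);
  return ((Number(m[1]) * 60 + Number(m[2])) * 60 + Number(m[3])) * 1000 + Number(m[4]);
}

// Split an SRT file into cues. Blocks without a "-->" timing line are skipped.
function parseSrt(srt: string): Cue[] {
  const cues: Cue[] = [];
  for (const block of srt.replace(/\r/g, "").split(/\n{2,}/)) {
    const lines = block.split("\n").filter((l) => l.trim() !== "");
    for (let i = 0; i < lines.length; i++) {
      const line = lines[i] ?? "";
      if (line.indexOf("-->") === -1) continue;
      const start = line.split("-->")[0] ?? "";
      cues.push({ startMs: parseTimestamp(start), text: lines.slice(i + 1).join(" ") });
      break;
    }
  }
  return cues;
}

// Collect lowercase word forms with the timestamp of their first appearance.
function extractCandidates(cues: Cue[], minLength = 3): VocabCandidate[] {
  const letters = new RegExp("\\p{L}+", "gu"); // runs of Unicode letters
  const byWord: Record<string, VocabCandidate | undefined> = Object.create(null);
  const ordered: VocabCandidate[] = [];
  for (const cue of cues) {
    for (const word of cue.text.toLowerCase().match(letters) ?? []) {
      if (word.length < minLength) continue;
      const seen = byWord[word];
      if (seen) {
        seen.count += 1;
      } else {
        const fresh: VocabCandidate = { word, firstSeenMs: cue.startMs, count: 1 };
        byWord[word] = fresh;
        ordered.push(fresh);
      }
    }
  }
  return ordered;
}

// Format milliseconds as "HH:MM:SS" for quiz context lines.
function formatTimestamp(ms: number): string {
  const total = Math.floor(ms / 1000);
  const pad = (n: number): string => (n < 10 ? "0" : "") + n;
  return pad(Math.floor(total / 3600)) + ":" + pad(Math.floor((total % 3600) / 60)) + ":" + pad(total % 60);
}
```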
---

### Phase 2 — Validate & Measure (Month 2–4)

**Goal:** Prove that the media feature resonates and that retention curves exist.

**Tasks:**

- Analyze metrics: Do users who try media-based practice return more than singleplayer-only users?
- Iterate on media selection and quiz UX based on feedback
- Polish shareable output (social cards)
- Fix hardening items from BACKLOG.md Phase 7

**Decision gate:** If 100 users show positive retention signals (Day 1 > 30%, Day 7 > 10%), proceed to Phase 3. If not, iterate on media feature or pivot.

---

### Phase 3 — Define the Gap (Month 3–5)

**Goal:** Identify exactly what you suck at or hate doing.

**The wrong approach:** "I need an MBA."

**The right approach:** "I need someone who has done [specific thing] before."

**Common gaps for technical founders:**

| Gap | Profile | What They Do |
|-----|---------|--------------|
| Fundraising | Former founder who raised Seed/Series A | Writes deck, runs investor meetings, handles term sheets |
| Monetization | Product/Growth PM from ed-tech | Designs pricing, runs experiments, builds B2B pipeline |
| Partnerships | BD person from media/streaming | Negotiates content deals, affiliate partnerships |
| Operations | COO-type | Runs hiring, finance, legal, day-to-day ops |
| Marketing | Growth marketer | Runs paid/organic acquisition, community building |

**Your job:** After 100 users, the gap becomes obvious. If no one converts to signup, you need growth/marketing. If schools email asking for licenses, you need BD/monetization. If investors ask questions you can't answer, you need a fundraising co-founder.

---

### Phase 4 — Co-Founder Search (Targeted, Month 4–6)

**Goal:** Find 2–3 candidates, work with them on a trial basis.

**Where to look:**

- **Founder dating events:** YC Co-Founder Matching, Indie Hackers meetups, local accelerators.
- **Angel investor intros:** Ask any angel you meet for intros to founders they backed who might want a new project.
- **Industry communities:** Ed-tech Slack/Discord groups, language-learning subreddits (look for people complaining about Duolingo — they care).
- **LinkedIn outbound:** Search "former PM at Duolingo," "former growth at Babbel." Cold DM with the Lila demo, not a resume.

**Trial period (4–8 weeks):**

- Work together on a concrete project (e.g., "Design and test a monetization experiment").
- No equity commitment. Pay as a contractor if needed.
- Evaluate: Do they deliver? Do you communicate well? Do they care about the mission?

---

### Phase 5 — Formalize (Only After Trial, Month 6+)

**Goal:** Legal structure, equity split, roles.

**Equity mindset:**

- **50/50 is dangerous** unless you truly could not build without them from day one. You already built the Lila MVP alone — you have leverage.
- **Suggested:** 60/40 or 65/35 with 4-year vesting and a 1-year cliff for both.
- **Vesting is non-negotiable.** If they leave in 6 months, they keep nothing.

**Roles:**

- You: CTO / Product (own technical vision, architecture, data pipeline).
- Them: CEO / COO / CMO depending on profile (own business side, fundraising, partnerships).
- Decision-making: You retain veto on technical/product decisions; they lead business/fundraising.

**Legal:**

- Incorporate properly (C-Corp in the US, Ltd in the UK, or the equivalent limited company form in your EU country, e.g. a GmbH in Germany).
- IP assignment agreement: everything built so far belongs to the company.
- Founder agreement: roles, vesting, termination, dispute resolution.
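
The vesting terms suggested under Equity mindset (4 years, 1-year cliff) reduce to simple arithmetic, sketched below. The function names are hypothetical, and linear monthly vesting after the cliff is an assumption; the real schedule is whatever the founder agreement specifies. Under these assumptions a co-founder who leaves after 6 months has vested nothing, one who leaves at exactly 12 months has vested a quarter of their grant, and a 40% stake held for 24 months corresponds to about 20% of the company.

```typescript
// Fraction of a founder's grant that has vested after `monthsServed` months.
// Assumptions: nothing vests before the cliff, the first year's share vests
// all at once when the cliff is reached, then vesting is linear per full month.
function vestedFraction(monthsServed: number, totalMonths = 48, cliffMonths = 12): number {
  if (monthsServed < cliffMonths) return 0;
  const fullMonths = Math.min(Math.floor(monthsServed), totalMonths);
  return fullMonths / totalMonths;
}

// Share of the whole company a founder keeps, given their equity stake (0..1).
function vestedStake(stake: number, monthsServed: number): number {
  return stake * vestedFraction(monthsServed);
}
```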
---

## Suggested Execution Order

### Month 1 (Now)

- **Week 1–2:** Finish Kaikki Stage 3 enrich sub-stage rewrite. Run full sample, validate quality. Start first additional multiplayer mode (Race to the Top recommended).
- **Week 3:** Ship guest play. Make auth optional on game routes.
- **Week 4:** Start media ingestion prototype (parallel). Pick one media piece, get text extraction working.
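
Making auth optional on game routes (Week 3) mostly comes down to one decision per request: use the signed-in user if there is a session, otherwise treat the caller as a guest. The sketch below isolates that decision and wraps it in an Express-style middleware. Everything here is hypothetical: the `Player` and `GameRequest` shapes, the `x-guest-id` header, and the `lookupSession` callback, which stands in for whatever the auth layer exposes and is not a real Better Auth call. The request type is declared locally so the sketch runs without Express installed.

```typescript
// Hypothetical shapes; not the actual Lila or Better Auth types.
type Player =
  | { kind: "user"; userId: string }
  | { kind: "guest"; guestId: string };

interface Session {
  userId: string;
}

// Pure decision step: prefer a real session, otherwise reuse the guest id the
// client sent, or mint a new one for a first-time visitor.
function resolvePlayer(
  session: Session | null,
  guestIdHeader: string | undefined,
  newGuestId: () => string
): Player {
  if (session) return { kind: "user", userId: session.userId };
  return { kind: "guest", guestId: guestIdHeader || newGuestId() };
}

// Minimal request shape so the sketch does not depend on Express typings.
interface GameRequest {
  headers: Record<string, string | undefined>;
  player?: Player;
}

// Express-style middleware factory: never rejects, only annotates the request.
// Routes that still need a real account can check `req.player.kind === "user"`.
function optionalAuth(
  lookupSession: (req: GameRequest) => Promise<Session | null>,
  newGuestId: () => string
) {
  return async (req: GameRequest, _res: unknown, next: () => void): Promise<void> => {
    const session = await lookupSession(req);
    req.player = resolvePlayer(session, req.headers["x-guest-id"], newGuestId);
    next();
  };
}
```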
### Month 2

- **Week 1–2:** Complete media ingestion prototype. End-to-end: media → quiz. Complete first additional multiplayer mode.
- **Week 3:** Add metrics (PostHog/Plausible). Polish shareable output (social cards).
- **Week 4:** Integrate media demo with multiplayer mode. Test combined flow.

### Month 3

- **Week 1:** Soft launch to 100 strangers. Gather feedback.
- **Week 2–3:** Iterate based on feedback. Fix hardening items from BACKLOG.md.
- **Week 4:** Analyze metrics. Decision gate: proceed or iterate?
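
The Week 4 decision gate uses the Phase 2 thresholds (Day 1 > 30%, Day 7 > 10%), which can be checked mechanically once activity data exists. The sketch below assumes the classic definition of Day-N retention (a user counts only if they were active exactly N days after signup) and a hypothetical `UserActivity` shape exported from the analytics tool; PostHog and similar tools may define retention windows differently, so the numbers should be compared against the tool's own report.

```typescript
// Hypothetical per-user export from the analytics tool.
interface UserActivity {
  signupDay: number;    // day index of signup, e.g. days since launch
  activeDays: number[]; // day indices on which the user played
}

// Day-N retention: share of users who were active exactly N days after signup.
function retention(users: UserActivity[], n: number): number {
  if (users.length === 0) return 0;
  const retained = users.filter((u) => u.activeDays.indexOf(u.signupDay + n) !== -1);
  return retained.length / users.length;
}

// Decision gate from Phase 2: Day 1 above 30% and Day 7 above 10%.
function passesDecisionGate(users: UserActivity[]): boolean {
  return retention(users, 1) > 0.3 && retention(users, 7) > 0.1;
}
```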
### Month 4–6

- If metrics are positive: start co-founder search (Phase 4).
- If metrics are weak: iterate on media feature or pivot the value proposition.

---

## Open Questions to Refine This Roadmap

Answer these to make the roadmap more specific:

### Product Reality Check

- [ ] What is the exact blocker on Kaikki Stage 3? (Is it the sub-stage rewrite, or something else?)
- [ ] Which media piece should be the first demo? (Breaking Bad, Shakira, or Harry Potter?)
- [ ] Do you have subtitle/lyrics data for the chosen media piece?

### Target Audience

- [ ] Who is the ideal first user? (e.g., "German intermediate learner who watches Netflix")
- [ ] What language pair should the media demo target?

### Business Model

- [ ] Do you have a monetization hypothesis? (freemium / B2B / affiliate)
- [ ] Any thoughts on unit economics?

### Competitive Landscape

- [ ] Who do you see as direct competitors? (LingQ? FluentU? Duolingo? Anki?)
- [ ] What do they do poorly that Lila fixes?

### Runway & Constraints

- [ ] Is this full-time or nights-and-weekends?
- [ ] Do you have any funding, savings runway, or revenue?
- [ ] What's your hard deadline for "showable to users"?

### Co-Founder Search

- [ ] Local or remote-first?
- [ ] What do you want them to _do_? (fundraise, partnerships, monetization, operations?)
- [ ] Do you already know candidates, or starting from zero?
- [ ] Equity mindset: 50/50, or majority control for yourself?