Capture Live (c) active · schema v2 foundation landed
Capture now, make sense later.
m2c kickoff 15-05-2026 ✅ — tasks 1–4 closed. Schema-refactor foundation landed end-to-end as a single coordinated change. Migration 010 (010_schema_v2_multimodal.sql) applied: new events.media (photos + audio siblings, FK to raw_inputs ON DELETE CASCADE) + new events.location_pings (parallel stream, not FK) + capture_mode discriminator added to raw_inputs (text NOT NULL DEFAULT 'live' CHECK (...IN ('live', 'test'))) + existing 78 photo paths backfilled into media + content_photo dropped from raw_inputs + review RPC reshaped (photo_path via media LEFT JOIN; capture_mode projected). Bundled in one migration because Postgres refuses to drop a column referenced by an active function. Capture App (capture-app/public/app.js) writes the new shape — raw_inputs insert with pre-generated row UUIDs (so we can FK the sibling media rows without a return round-trip) + follow-up bulk insert into events.media. Review frontend types + 6 components renamed (content_photo → photo_path); local preview against the live Supabase corpus verified all six surfaces render clean (84 rows, 78 photos, 0 console errors). Production deploys (Capture App + review-frontend) pending Freek's explicit authorization — auto-mode classifier blocks deploys without explicit OK. Three new decisions.md 15-05-2026 entries (capture_mode CHECK; FK cascade on media; single coordinated change). Full m2c kickoff history in project-management/projectlog.md 15-05-2026 entry.
Active milestone
Capture Live (c) — Multi-Modal Capture (active since 15-05-2026; schema v2 foundation landed)
What's next
- AI Enrichment (prototype) — scoped 12-05-2026, replaces former "Automated Loop"
Previous milestone
Capture Live (b) — Review Frontend Phase 1 (complete 14-05-2026)
Tracks · Owner
Capture Freek
Milestone timeline
Ten named moments where something demonstrably works. Numbered for ordering, not for ranking — they can be reordered, inserted, or deferred without renaming anything else.
Recent updates
What landed in the past few sessions. Pulled live from STATUS.md.
events.media + events.location_pings sibling tables; capture_mode column on raw_inputs (CHECK-constrained TEXT, 'live' default); 78 photo paths backfilled into media; content_photo dropped; review RPC reshaped. Capture App + review-frontend updated to new shape; both production deploys executed by Freek mid-session. Three decision entries logged on the schema (capture_mode = CHECK; FK ON DELETE CASCADE on media; single coordinated change vs additive-with-transition). Permissions hygiene: docs/claude-permissions.md register seeded with 3-monthly review cadence; user-level Family-era entries pruned (7 stale rules removed). Cowork brief (New instructions/) retracted by Freek before processing.@cf/openai/whisper-large-v3-turbo). Native stack fit (already on Cloudflare for capture-app + review-frontend deploys); ~$0.031/hour beyond the free tier; 10k Neurons/day = ~214 free audio-minutes/day covers prototype volume; self-hosted Whisper documented as the migration path. Sensitive captures will bypass this provider per ADR-003 §2.1. ADR-003 §2.4 closed.MediaRecorder with MIME-priority fallback; pulsing recording indicator + timer; audio tiles in the queue with native <audio controls> + duration + remove × + "upload pending — m2c task 8" badge. Mid-recording mode switches finalise rather than orphan the recording. Live at https://app.remoir.app/ (deploy 43808b8b). Real-device iOS Safari mic-grant validation pending Freek.transcribe.remoir.app. New transcribe-worker/ directory: TypeScript Cloudflare Worker, Workers AI binding to @cf/openai/whisper-large-v3-turbo per the task 5 decision. POST /transcribe with multipart/form-data (audio field), origin allow-list (app.remoir.app + localhost:3100) + optional X-Capture-Token shared secret, returns {text, language, duration_seconds} JSON. Migration 011 applied: new remoir-raw-audio Supabase Storage bucket (20 MB cap; MIME allow-list audio/webm + audio/mp4 + audio/aac + audio/mpeg + audio/ogg) parallel to remoir-raw-photos. End-to-end smoke test passed: macOS say → 16kHz mono WAV → curl POST → 200 OK in 2.27s with a clean transcript. Lesson logged in lessonsClaude.md 15-05-2026 — Workers AI Whisper requires audio as a base64-encoded string, not a number array (first deploy hit AiError 5006; redeploy with base64 worked first time).'audio' added to events.raw_inputs.type CHECK; new transcript text column on events.media; review RPC reshaped to surface audio_path + transcript via a second LEFT JOIN on kind='audio'. Capture App handleSend refactored to a two-phase flow — photos first (existing), then audios in parallel (upload to remoir-raw-audio Storage → POST to transcribe.remoir.app/transcribe → write raw_inputs + events.media with transcript inline). content_text mirrors transcript on audio rows so existing gallery/table/audit-log surfaces handle them without schema-aware code paths. Review-frontend types + 7 files updated for 'audio' rendering: types.ts (RawInputType, ALL_TYPES, RawInput); new audios.ts parallel of photos.ts for audio signed-URL batching; Card.tsx (new AudioPlayer export + 🎙️ icon + purple TYPE_BADGE); DetailView.tsx (audio block, dedicated audio sign-batch); ListView.tsx (parallel audio sign-batch alongside photos); MapView.tsx (purple TYPE_COLOR); Filters.tsx + Dashboard.tsx (audio icon + tile border); AuditLog.tsx ('(audio)' content-hint fallback); anomalies.ts (sparse-row exempts audio_path). End-to-end verification: TypeScript build clean; capture-app preview confirms badge gone + UI wired; wire test (curl upload → transcribe → INSERT raw_inputs + media) all 201/200/clean; review-frontend preview renders the test row with purple AUDIO badge + native audio player + transcript. Test row cleaned up (storage object orphan is m2c task 14 cleanup-script territory). Both deploys live (capture 3088cdbb, review 3d571fca).audio/mp4; codecs=mp4a.40.2; Supabase Storage validates whole-string MIME equality so the codec params caused 415 rejection. Fixed with audioBaseMime(mime) helper stripping at ; (also used for the persisted events.media.mime); new durable lesson in lessonsClaude.md. (2) Block-scoping regression — setStatus(rows.length ...) line referenced rows after my task-8 photo/audio split wrapped the photo flow in if (photos.length > 0); replaced with photos.length + audios.length. This same bug also prevented resetForm from firing (it was inside the post-status setTimeout), explaining why the UI never cleared after Send. (3) Cloudflare zone remoir.app has Browser Cache TTL overriding origin Cache-Control, so app.remoir.app served a 4-hour-cached app.js even after deploys. Workaround: pull-to-refresh or deployment-specific URL. Structural fix logged in maybelater.md 15-05-2026 — dashboard click to set Browser Cache TTL = Respect Existing Headers. Real-device validation completed on 8d73e0d8.remoir-capture.pages.dev (which bypasses the zone): capture → upload → transcribe → row + playback in review.remoir.app → UI resets cleanly. m2c tasks 6 + 8 both fully closed including iOS real-device validation.Six development tracks
Tracks are permanent parallel workstreams. They apply across the whole platform and never finish — they mature over time. A single milestone typically advances two or three tracks at once.
Capture
Making input as frictionless as possible. WhatsApp, email, Apple Shortcuts, free-form text, voice. The moment of capture is the highest-friction point and the most important to solve. If capture feels like work, nothing else matters.
Knowledge Graph
The structured collection layer. Objects, links, templates, Capacities (for Remoir Events), Supabase schemas (for Remoir Family). Where user value is created through organisation and retrieval.
Data Pipeline
The backend plumbing. Webhooks, automation (Make, n8n, custom), OCR, geocoding, storage, error handling. What moves data from capture to knowledge graph reliably.
Raw Data
Storing all inputs in their original unprocessed form with full metadata from day one. The foundation for AI parsing and the user's personal data asset. User-owned and controlled from the first line of code. Non-negotiable.
AI Enrichment
Parsing, tagging, linking, summarising using AI. Reduces manual effort. Eventually replaces structured input entirely with free-form natural language. Trained on the Raw Data track. m3 (AI Enrichment prototype, scoped 12-05-2026) lands first on this track — a lightweight Claude-API prototype over raw inputs that generates the empirical signal needed to scope m6 (AI Parsing, production-grade).
Sovereignty
User ownership and control of all data and metadata. Group governance. Selective sharing with commercial parties on user terms. Long-term vision — informs architecture decisions now without being built now. Includes the federated ownership layer (see vision/future-layers.md).