This is the revised Project Task Book optimized for the MacBook + iPhone ecosystem.

Developing on a Mac removes most of the toolchain friction for iOS work, so you can focus on the AI and app logic. This plan assumes macOS Sequoia (or later) and a current Xcode release (Xcode 16 or later).

🍎 Project Task Book: The AI Autobiography Agent (iOS Native)

Project Goal: Create a voice-first, AI-driven iOS app that interviews the user, structures their life story, and autonomously writes a biography.
Tech Stack: Flutter (iOS), Firebase (Functions, Firestore, Storage), Vertex AI (Gemini Hybrid Strategy).

🟢 Phase 0: The Apple Ecosystem Setup (Days 1-2)

Objective: Configure the Mac for professional iOS development and connect the Cloud backend.

0.1 Development Environment

Install Xcode: Download from App Store. Open it to install command-line tools.
Install Homebrew: Run the standard curl script in Terminal.

Install Flutter & Tools:

brew install --cask flutter
brew install cocoapods # or: sudo gem install cocoapods (use one method, not both, to avoid conflicting installs)

Verify Setup: Run flutter doctor. Fix any signing issues in Xcode.
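Physical Device Setup: Connect the iPhone to the Mac via cable and trust the computer, then enable "Developer Mode" on the iPhone (Settings > Privacy & Security > Developer Mode). The option appears only after the device has been connected to Xcode.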

0.2 Cloud Architecture (Firebase Console)

Create Project: Initialize a new Firebase project (Blaze Plan required).
Enable Google Cloud APIs:
- Vertex AI API
- Cloud Storage API
Database Setup: Initialize Firestore.
- Composite Index: Collection memories -> Fields: userId (Asc) + estimatedDate (Asc).
- Vector Index: Collection memories -> Field: embedding (Vector, 768 dims).

🟡 Phase 1: The iOS Capture Engine (MVP) (Days 3-6)

Objective: Record high-fidelity audio on iPhone, upload to Cloud, and process with Gemini.

1.1 iOS Permissions & Configuration

Info.plist Configuration: Add keys for NSMicrophoneUsageDescription and UIBackgroundModes (audio).
Runner Setup: In Xcode, go to Signing & Capabilities -> Add Capability -> "Background Modes" -> Check "Audio, AirPlay, and Picture in Picture".

1.2 Flutter Recorder Logic

Dependency: flutter_sound and permission_handler.
Audio Session: Configure the iOS Audio Session mode. Use measurement to minimize system signal processing (including automatic gain control) if you want raw input, or voiceChat to let iOS apply its voice processing to the microphone signal.
Codec: Hardcode Codec.aacMP4 (Native iOS format).
Upload Logic: Use firebase_storage with putFile and retry logic for resiliency.
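The retry logic can be sketched independently of the upload call. Below is a minimal Python sketch of exponential backoff with jitter. The app itself would implement this in Dart around putFile, and the function name, attempt count, and delays are illustrative assumptions, not values from this plan.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    operation: Callable[[], T],
    max_attempts: int = 5,      # assumed default; must be >= 1
    base_delay: float = 1.0,    # seconds before the first retry
    max_delay: float = 30.0,    # upper bound for any single wait
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run operation, retrying on any exception with exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error to the caller
            # The delay doubles per attempt (1s, 2s, 4s, ...) up to max_delay.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            # Jitter: wait between 50% and 100% of the computed delay.
            sleep(delay * random.uniform(0.5, 1.0))
    raise AssertionError("unreachable when max_attempts >= 1")
```

The sleep parameter is injectable so the logic can be tested without waiting. In practice you would likely retry only on network-type errors rather than on every exception.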

1.3 The "Fast" AI Layer (Gemini 2.0 Flash)

Cloud Function: onObjectFinalized trigger.
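Model: gemini-2.0-flash-exp (low latency, low cost). The -exp suffix marks an experimental build, so check the Vertex AI model list for the current stable Flash model ID before shipping.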
Prompt: "Transcribe verbatim. Extract: Sentiment, People, Date, Location. Return JSON."
Test: Record on iPhone -> Check Firestore for JSON.
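Model replies that are asked to "Return JSON" sometimes arrive wrapped in a Markdown code fence or with fields missing, so it helps to validate before writing to Firestore. The following is a minimal sketch of that check. The field names are hypothetical, since the prompt above only names what to extract, not the JSON keys.

```python
import json
from typing import Any

FENCE = "`" * 3  # Markdown code-fence marker

# Hypothetical key names for the extraction; adjust to the real prompt.
REQUIRED_FIELDS = ("transcript", "sentiment", "people", "date", "location")


def parse_memory_json(raw: str) -> dict[str, Any]:
    """Parse the model's reply into a dict, tolerating a surrounding code fence."""
    text = raw.strip()
    if text.startswith(FENCE):
        # Drop the opening fence line, then everything from the closing fence on.
        first_newline = text.find("\n")
        text = text[first_newline + 1:] if first_newline != -1 else ""
        text = text.rsplit(FENCE, 1)[0]
    data = json.loads(text)  # raises ValueError (JSONDecodeError) on bad JSON
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    missing = [name for name in REQUIRED_FIELDS if name not in data]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    return data
```

Rejecting incomplete replies at this point keeps malformed documents out of the memories collection, where the timeline query would otherwise have to handle them.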

🟠 Phase 2: The Visual Timeline & Search (Days 7-12)

Objective: A scrollable "Life Map" that identifies gaps in your history.

2.1 The Timeline UI

Data Fetching: Query memories filtered by userId and ordered by estimatedDate (served by the composite index from 0.2).
Visuals: Use a CustomPainter or the timelines_plus package. Draw a line connecting nodes.
The "Fog of War":
- Logic: if (dateB - dateA) > 2 years -> Render a blurry "Missing Era" block.
- Action: Tapping the block triggers the "Gap Interviewer" (Phase 3).
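The gap rule above can be sketched as a small pure function. This is an illustrative Python version (the app would port it to Dart). It assumes each memory's estimatedDate is already parsed into a date and approximates "2 years" as 730 days.

```python
from datetime import date

GAP_THRESHOLD_DAYS = 2 * 365  # "2 years", ignoring leap days (an assumption)


def find_missing_eras(dates: list[date]) -> list[tuple[date, date]]:
    """Return (start, end) pairs for consecutive memories more than 2 years apart."""
    ordered = sorted(dates)
    gaps = []
    # Compare each memory with the next one on the timeline.
    for earlier, later in zip(ordered, ordered[1:]):
        if (later - earlier).days > GAP_THRESHOLD_DAYS:
            gaps.append((earlier, later))
    return gaps
```

Each returned pair marks where a "Missing Era" block would be drawn, and its start and end dates can be passed to the Gap Interviewer as the period to ask about.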

2.2 Vector Memory (The Brain)

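Embedding Trigger: onDocumentCreated function on the memories collection (onCreate in the 1st-gen Functions API).
Model: Vertex AI text-embedding-004.
Action: Convert the memory's summary (not the whole transcript) into a 768-dimension vector and store it in the embedding field. The 1.3 prompt does not request a summary, so add a Summary field to that extraction.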

🔴 Phase 3: The "Active Listener" Agent (Days 13-19)

Objective: The App wakes up and asks YOU questions.

3.1 The "Shake to Ask" (Quick Question)

Sensor Integration: Use sensors_plus. It has no built-in shake event, so detect a shake by thresholding the accelerometer magnitude.
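Because sensors_plus delivers raw accelerometer samples, shake detection comes down to a threshold plus a debounce. A minimal Python sketch of that logic follows (the app would port it to Dart and feed it from the accelerometer stream). The threshold and debounce values are assumed starting points for tuning, not figures from this plan.

```python
import math
from typing import Optional

GRAVITY = 9.81           # m/s^2
SHAKE_THRESHOLD_G = 2.7  # assumed tuning value
MIN_INTERVAL_S = 1.0     # assumed debounce window after a detected shake


class ShakeDetector:
    def __init__(self) -> None:
        self._last_shake_time: Optional[float] = None

    def on_sample(self, x: float, y: float, z: float, timestamp: float) -> bool:
        """Return True when this accelerometer sample (m/s^2) counts as a new shake."""
        g_force = math.sqrt(x * x + y * y + z * z) / GRAVITY
        if g_force < SHAKE_THRESHOLD_G:
            return False  # a phone at rest reads about 1 g
        if (self._last_shake_time is not None
                and timestamp - self._last_shake_time < MIN_INTERVAL_S):
            return False  # still inside the debounce window
        self._last_shake_time = timestamp
        return True
```

The debounce matters because a single physical shake produces many samples above the threshold, and each one would otherwise trigger a new question.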
Logic:
1. Pick Random Theme (e.g., "First Love").
2. Vector Search: "Did user talk about First Love?"
3. Gemini 2.0 Flash generates a question based on the missing info.
TTS: Use an AI Voice API (Google or ElevenLabs) to speak the question.
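The "Did user talk about First Love?" check in the logic above amounts to comparing a theme embedding against stored memory embeddings. Here is a minimal in-memory sketch. In the real app, Firestore's vector index would run the nearest-neighbor query, and the 0.75 cutoff and function names are assumptions for illustration. The sketch also filters out covered themes before choosing at random, a slight reordering of steps 1 and 2.

```python
import math
import random
from typing import Mapping, Optional, Sequence

COVERAGE_THRESHOLD = 0.75  # assumed cutoff; tune against real embeddings


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def pick_uncovered_theme(
    theme_vectors: Mapping[str, Sequence[float]],
    memory_vectors: Sequence[Sequence[float]],
    rng=random,
) -> Optional[str]:
    """Return a random theme that no stored memory matches closely, or None."""
    uncovered = [
        theme
        for theme, vec in theme_vectors.items()
        if all(cosine_similarity(vec, m) < COVERAGE_THRESHOLD for m in memory_vectors)
    ]
    return rng.choice(uncovered) if uncovered else None
```

The chosen theme would then be handed to Gemini 2.0 Flash as the topic for the generated question.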

3.2 The "Deep Review" (Critical Analysis)

UI: "Review my Childhood" button.
Logic: Aggregate all transcripts from that era.
Model: Gemini 3 Pro (The "Reasoning" Model).
Prompt: "Analyze these 20 pages of transcript. Find the psychological contradictions. Where is the user lying to themselves? Generate 3 hard questions."
Output: Save to a "Pending Questions" inbox.
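The aggregation step can be sketched as a date filter plus a size budget. This minimal Python sketch assumes each memory is a dict with an estimatedDate (already a date) and a transcript field. The transcript key and the character budget are hypothetical and should be matched to the real schema and the model's context window.

```python
from datetime import date

MAX_CHARS = 200_000  # assumed budget, not a documented model limit


def build_era_context(memories: list[dict], start: date, end: date) -> str:
    """Join transcripts dated within [start, end], oldest first, up to MAX_CHARS."""
    in_era = sorted(
        (m for m in memories if start <= m["estimatedDate"] <= end),
        key=lambda m: m["estimatedDate"],
    )
    parts: list[str] = []
    used = 0
    for m in in_era:
        entry = f"[{m['estimatedDate'].isoformat()}] {m['transcript']}"
        if used + len(entry) > MAX_CHARS:
            break  # stop before exceeding the budget
        parts.append(entry)
        used += len(entry)
    return "\n\n".join(parts)
```

Prefixing each transcript with its date gives the reasoning model the chronology it needs to spot contradictions between accounts.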

🟣 Phase 4: The Book Writer (Days 20-25)

Objective: Transform oral history into written prose.

4.1 Narrative Synthesis

UI: "Generate Chapter 1".
Context Caching: (Crucial Step) Upload the user's "Character Sheet" and "Glossary" to Vertex AI Context Cache to save money.
Model: Gemini 3 Pro.
Prompt: "Write a 3,000-word chapter based on these specific memories. Use a [User Selected Style] tone. Use the first person."

4.2 PDF Export

Layout: Use pdf package. Add Title Page, Table of Contents.
Review: Allow user to edit the text in-app before finalizing (Simple Text Editor).

🏁 Phase 5: Polish & Deployment (Days 26-30)

On-Device Testing: Run the app unplugged. Test walking, background recording, and interruption handling (phone calls).
Biometrics: Add Face ID (local_auth) to lock the diary.
TestFlight: Archive the build in Xcode and upload to App Store Connect (TestFlight) to send to close friends/family for beta testing.

💡 Critical "Missing Info" & Suggestions

Since you are using the Apple ecosystem, you have access to specific tools that can enhance this project. Here is what you should consider adding:

1. The "Photo Memory" Feature (Multimodal)

The Idea: Sometimes you can't describe a memory, but you have a photo of it.
iOS Implementation: Allow the user to upload a photo from their Camera Roll.
AI Action: Send the Image + Audio to Gemini 2.0 Flash.
- Prompt: "The user is holding this photo and talking about it. Combine the visual details in the photo with their story."
- Result: The AI writes: "I held the faded photo of the 1998 Ford Escort. You could see the rust on the bumper..." (It sees what you didn't say).

2. Apple Watch Companion

The Idea: The best dictation tool is the one on your wrist.
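Implementation: A simple watchOS app with a big "Record" button. Flutter does not target watchOS, so this would be a native Swift target in the same Xcode project.
Sync: It records locally on the watch, then transfers the file to the iPhone when connected (for example via WatchConnectivity file transfer), and the iPhone uploads it to the Cloud.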
Why: "Walking and talking" is the most natural way to do oral history.

3. Local "On-Device" AI (Privacy)

The Idea: In 2026, iPhones have powerful NPUs (Neural Processing Units).
Implementation: You could use Google MediaPipe for LLMs (running a small Gemma 2 model on the phone).
Use Case: Use the on-device model to generate the "Quick Titles" or "Tags" for memories instantly, without waiting for the Cloud. This makes the app feel snappier.

4. "Glossary" Management

The Problem: Gemini will misspell names. "Kaitlyn" vs "Caitlin".
Solution: Add a settings page: "My People & Places."
- Data: User adds: Mom = Sarah, Hometown = Poughkeepsie.
- Injection: Pass this JSON map into the System Instructions of every Gemini call.
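The injection step can be sketched as a small helper that renders the glossary into the system instruction text. This is a minimal Python sketch. The function name and the instruction wording are illustrative assumptions, and the returned string would be passed as the system instruction of each Gemini call.

```python
import json


def build_system_instruction(glossary: dict[str, str]) -> str:
    """Render the user's "My People & Places" map as system-instruction text."""
    if not glossary:
        return "Transcribe names and places as spoken."
    return (
        "Use these exact spellings for the user's people and places. "
        # sort_keys gives a stable string, which helps if the text is cached.
        "Glossary (JSON): " + json.dumps(glossary, ensure_ascii=False, sort_keys=True)
    )
```

For example, build_system_instruction({"Mom": "Sarah", "Hometown": "Poughkeepsie"}) yields an instruction that embeds both entries as a JSON object.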

5. Background Audio "Keep-Alive"

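The Risk: iOS aggressively suspends background apps. With the audio background mode from 1.1 and an active recording session, the app should keep running through long silences, but an interruption (a phone call, Siri, or another audio app) can deactivate the session and end the recording.
Solution: Handle audio session interruptions and resume recording when the interruption ends. A "Silent Audio Loop" (playing a 0-volume sound file on repeat while recording, so iOS treats the app as a music player) is a commonly cited workaround, but App Review may reject apps that use background audio only to stay alive, so treat it as a last resort.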