AI Model Comparison

The Study Results (January 2026)

Based on the latest model landscape as of January 2026, the recommendation in the original PDF (Gemini 1.5 Pro) is now partially outdated.

You have a choice between Capacity (Context Size) and Intelligence (Reasoning). For a high-quality autobiography, you should use a Hybrid Strategy.

Here is the breakdown of the models available to you right now:

1. The "Daily Driver": Gemini 2.0 Flash

Best For: The "Interviewer" mode (talking to the app daily).
Why:
- Speed: It is near-instant, making the conversation feel natural.
- Cost: It is ~10-12x cheaper than the Pro models.
- Context: It has a 1 Million Token window. This is roughly 700,000 words (or about 10 standard books). For daily interactions, this is more than enough "memory."
- Audio: Native audio input is standard.

2. The "Deep Reviewer": Gemini 3 Pro

Best For: The "Sunday Review" and "Chapter Generation."
Why:
- Reasoning: It significantly outperforms 1.5 and 2.0 in "connecting the dots." If you tell a story about fear in childhood and anxiety in your career, Gemini 3 Pro is the model most likely to say: "There is a pattern here."
- Context: It also features a 1 Million Token standard window.
- The Catch: It is expensive ($2.00/1M input tokens vs $0.10 for Flash). Use it sparingly—only when you need deep analysis.
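
To make the price gap concrete, here is a minimal sketch that estimates the input cost of one request from the per-million-token prices quoted above. The prices are simply the figures from this page, the helper name `inputCostUsd` is hypothetical, and output tokens are ignored, so treat the result as a rough estimate rather than a bill.

```typescript
// Input prices in USD per 1M tokens, copied from the figures quoted above.
// Assumption: these may not match current pricing; check before relying on them.
const INPUT_PRICE_PER_MILLION_USD = {
  flash: 0.1,
  pro: 2.0,
} as const;

type PriceTier = keyof typeof INPUT_PRICE_PER_MILLION_USD;

// Estimated input cost of a single request (output tokens are not counted).
function inputCostUsd(tier: PriceTier, inputTokens: number): number {
  return (inputTokens / 1000000) * INPUT_PRICE_PER_MILLION_USD[tier];
}

// Example: sending a 200,000-token life story once to each model.
const flashCost = inputCostUsd("flash", 200000);
const proCost = inputCostUsd("pro", 200000);
console.log(`Flash: $${flashCost.toFixed(2)}, Pro: $${proCost.toFixed(2)}`);
```

At these prices the same 200,000-token request costs about $0.02 on Flash and about $0.40 on Pro, which is why the Pro model is reserved for occasional deep analysis.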

3. The "Legacy King": Gemini 1.5 Pro (002)

Best For: Extreme Long-Term Memory (Only if you exceed 1 Million Tokens).
Why: This model remains unique because it supports a 2 Million Token context window.
Verdict: You probably don't need this yet.
- Math: An average spoken hour is ~8,000 words (~10k tokens). At that rate, 1 Million Tokens divided by ~10k tokens per hour is about 100 hours of audio before you hit the limit of the newer Gemini 2.0 and Gemini 3 models.
- Strategy: Start with Gemini 3 Pro. If you record more than 100 hours of content (a massive audiobook), switch the "Memory" engine back to 1.5 Pro.
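
The arithmetic behind that threshold can be sketched as follows. The ~10k tokens per spoken hour comes from the estimate above; the helper names and the model ID string `gemini-1.5-pro-002` are assumptions for illustration, and the calculation ignores prompt overhead and output tokens.

```typescript
// Rough token budget per recorded hour, from the estimate above (~10k tokens).
const TOKENS_PER_SPOKEN_HOUR = 10000;

// Context window sizes discussed in this comparison.
const ONE_MILLION_TOKENS = 1000000;

// How many recorded hours fit into a given context window.
function hoursUntilLimit(
  contextWindowTokens: number,
  tokensPerHour: number = TOKENS_PER_SPOKEN_HOUR
): number {
  return contextWindowTokens / tokensPerHour;
}

// Pick the "Memory" engine: stay on Gemini 3 Pro until the recorded material
// no longer fits in a 1M-token window, then fall back to the 2M-token model.
// Assumption: both model ID strings are illustrative and may differ in your project.
function pickMemoryModel(recordedHours: number): string {
  const estimatedTokens = recordedHours * TOKENS_PER_SPOKEN_HOUR;
  return estimatedTokens <= ONE_MILLION_TOKENS
    ? "gemini-3-pro-preview"
    : "gemini-1.5-pro-002";
}

console.log(hoursUntilLimit(ONE_MILLION_TOKENS)); // hours that fit in a 1M window
console.log(pickMemoryModel(150)); // which model to use after 150 recorded hours
```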

The Recommended Architecture for Your App (2026 Edition)

Don't pick just one. Configure your Cloud Functions to use the right tool for the job.

A. The "Chat" Function (Fast & Cheap)
- Model: gemini-2.0-flash-exp
- Task: Handling the "Shake to Ask" feature, daily prompts, and recording transcription.
- Context: Feeds in the last ~20 recent memories + User Profile.

B. The "Biographer" Function (Smart & Deep)
- Model: gemini-3-pro-preview
- Task: The "Review Mode" where it reads your stories and generates critical questions.
- Context: Uses Context Caching (see below) to load your entire life story so far.
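
One way to express this split in code is a small routing table that maps each task to a model and a context strategy. This is only a sketch: the task names, the `MemoryEntry` shape, and the helpers `routeTask` and `recentMemories` are hypothetical and not part of any SDK; only the two model IDs come from the text above.

```typescript
// Hypothetical task names covering the two functions described above.
type AppTask =
  | "shakeToAsk"
  | "dailyPrompt"
  | "transcription"
  | "reviewMode"
  | "chapterGeneration";

interface ModelRoute {
  model: string;
  context: "recentMemories" | "cachedLifeStory";
}

const CHAT_ROUTE: ModelRoute = { model: "gemini-2.0-flash-exp", context: "recentMemories" };
const BIOGRAPHER_ROUTE: ModelRoute = { model: "gemini-3-pro-preview", context: "cachedLifeStory" };

// Every task must appear here, so adding a task without a route is a compile error.
const ROUTES: Record<AppTask, ModelRoute> = {
  shakeToAsk: CHAT_ROUTE,
  dailyPrompt: CHAT_ROUTE,
  transcription: CHAT_ROUTE,
  reviewMode: BIOGRAPHER_ROUTE,
  chapterGeneration: BIOGRAPHER_ROUTE,
};

function routeTask(task: AppTask): ModelRoute {
  return ROUTES[task];
}

// Hypothetical shape of a stored memory.
interface MemoryEntry {
  id: string;
  recordedAt: number; // Unix timestamp in milliseconds
  text: string;
}

// Context for the "Chat" function: the most recent memories, newest first.
// Copies the array before sorting so the caller's list is left untouched.
function recentMemories(all: MemoryEntry[], limit: number = 20): MemoryEntry[] {
  return [...all].sort((a, b) => b.recordedAt - a.recordedAt).slice(0, limit);
}
```

Keeping the mapping in one table means a later model swap touches a single constant instead of every Cloud Function.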

Critical Feature to Enable: "Context Caching"

Since you are sending the same "Life Story" to the AI over and over, you should enable Vertex AI Context Caching.
- Without Cache: You pay to upload your whole life story every time you ask a question.
- With Cache: You upload the life story once (cached for 60 mins or more). Subsequent questions cost ~90% less.
- Implementation: In your Cloud Function, create a cachedContent object for the user's autobiography timeline and query against that ID.
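
The savings can be estimated with a short calculation. The sketch below models only the arithmetic described above (one full-price upload, then ~90% off for later questions); it does not call the caching API, it ignores any cache storage fees and the tokens in the questions themselves, and the function names are hypothetical.

```typescript
// Cost of re-sending the whole life story with every question.
function uncachedCostUsd(
  storyTokens: number,
  questions: number,
  pricePerMillionUsd: number
): number {
  return (storyTokens / 1000000) * pricePerMillionUsd * questions;
}

// Cost when the story is uploaded once at full price and later questions
// read it from the cache at a discount (~90% per the estimate above).
// Assumption: storage fees for keeping the cache alive are not modeled.
function cachedCostUsd(
  storyTokens: number,
  questions: number,
  pricePerMillionUsd: number,
  cachedDiscount: number = 0.9
): number {
  if (questions <= 0) return 0;
  const fullPrice = (storyTokens / 1000000) * pricePerMillionUsd;
  const discountedPrice = fullPrice * (1 - cachedDiscount);
  return fullPrice + discountedPrice * (questions - 1);
}

// Example: a 500,000-token life story, 10 questions, $2.00 per 1M input tokens.
const withoutCache = uncachedCostUsd(500000, 10, 2.0);
const withCache = cachedCostUsd(500000, 10, 2.0);
console.log(`Without cache: $${withoutCache.toFixed(2)}, with cache: $${withCache.toFixed(2)}`);
```

Under these assumptions the ten questions cost roughly $10.00 without caching and roughly $1.90 with it; the gap grows with every additional question asked inside the cache lifetime.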

Final Code Update (Node.js)

Update your functions/src/index.ts to use this hybrid approach:

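import { VertexAI } from "@google-cloud/vertexai";

// Assumption: skip this client setup if index.ts already creates vertexAI;
// the project ID and region below are placeholders for your deployment.
const vertexAI = new VertexAI({
    project: process.env.GCLOUD_PROJECT ?? "your-project-id",
    location: "us-central1"
});

// For fast, daily interactions (Transcription, Quick Questions)
const fastModel = vertexAI.getGenerativeModel({
    model: "gemini-2.0-flash-exp"
});

// For deep analysis (Generating Chapters, finding psychological patterns)
const smartModel = vertexAI.getGenerativeModel({
    model: "gemini-3-pro-preview"
});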