
Product Statement and Technical Stack for Personal AI Companion

Here is the fully updated Product Statement and Technical Stack for your Personal AI Companion, re-architected around FalkorDB for real-time memory evolution.


1. Product Statement

For privacy-conscious individuals seeking a digital companion with true biographical continuity, "The Constant" (formerly "The Witness") is a local-first, real-time stateful AI that builds a structured, evolving "World Model" of the user’s life as they speak. Unlike standard local chatbots that rely on fuzzy vector search (which invites hallucinations) or nightly processing batches (which leave memory lagging), "The Constant" uses FalkorDB to map entities and relationships into a low-latency graph database as soon as they are mentioned. This lets the AI learn facts, correct misconceptions, and update relationship statuses in real time, without ever sending a single byte of data to the cloud. It is designed to provide sub-second recall of facts and absolute data sovereignty on your M1 Max.


2. Core Pillars (The "Why")

  1. Instantaneous "Write" Memory: The move to FalkorDB eliminates the "Day/Night" cycle. When you tell the AI "I just adopted a cat named Luna," the graph is updated (MERGE (:User)-[:OWNS]->(:Pet {name: 'Luna'})) before the AI even finishes its reply.
  2. Hard-Fact Consistency: By separating "Vibes" (Vector) from "Facts" (Graph), the AI stops guessing. It knows exactly who your sister is, not because it "feels" like it based on text probability, but because a database node exists defining that relationship.
  3. Dynamic Evolution (The "Breakup" Protocol): Unlike vector systems that struggle to forget, FalkorDB allows for explicit relationship modification. If a life circumstance changes, the AI modifies the graph structure immediately, preventing painful "Zombie Memories" (recalling things that are no longer true).
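The "Breakup" Protocol in pillar 3 can be sketched as a small query builder. This is a minimal illustration, not a fixed design: the function name retire_relationship, the WAS_ prefix for past relationships, the ended property, and the example date are all assumptions made for the example. It only builds the query text and parameters; passing them to a graph client is left out.

```python
import re
from datetime import date

# Cypher cannot take a relationship type as a query parameter, so the type is
# checked against a strict pattern before it is placed into the query text.
_REL_TYPE = re.compile(r"[A-Z][A-Z0-9_]*")


def retire_relationship(rel_type: str, person_name: str, ended: date):
    """Build a (query, params) pair that replaces a current relationship
    with a past-tense one. Labels and property names are hypothetical."""
    if not _REL_TYPE.fullmatch(rel_type):
        raise ValueError(f"unsafe relationship type: {rel_type!r}")
    query = (
        f"MATCH (u:User)-[r:{rel_type}]->(p:Person {{name: $name}}) "
        f"MERGE (u)-[e:WAS_{rel_type}]->(p) "
        "SET e.ended = $ended "
        "DELETE r"
    )
    params = {"name": person_name, "ended": ended.isoformat()}
    return query, params


query, params = retire_relationship("MARRIED_TO", "Sarah", date(2025, 1, 15))
print(query)
print(params)
```

Keeping a dated past-tense edge instead of deleting the fact outright is one possible choice: the Persona can still acknowledge history without treating it as current.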

3. The Tech Stack (M1 Max Optimized)

This architecture utilizes your 64GB RAM to run two distinct database engines and two LLMs simultaneously for a lag-free experience.

A. Infrastructure

  • Host: MacBook Pro M1 Max (64GB).
  • Containerization: Docker Desktop (for FalkorDB).
  • Orchestration: Python 3.11 + LangChain (GraphCypherQAChain).

B. The Intelligence Layer (Dual-Model)

  • The Persona (The Mouth): Llama-3.1-70B-Instruct (4-bit Quantized).
    • Role: Generates the empathetic, conversational response using context provided by the databases.
  • The Scribe (The Ears): Llama-3.2-3B-Instruct.
    • Role: Runs in the background. It analyzes the user's input and translates it into Cypher Queries to update FalkorDB. It never speaks to the user; it only listens and categorizes.
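A rough check that both models fit in 64GB can be done with simple arithmetic on the weights alone. The sketch below uses nominal parameter counts and assumes the Scribe is kept at 16-bit precision, which the stack above does not specify. It ignores the KV cache, quantization metadata, and runtime overhead, so real usage will be higher.

```python
def weight_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights in GB (10**9 bytes)."""
    return params_billion * bits_per_weight / 8


persona = weight_gb(70, 4)   # 4-bit quantization, as stated for the Persona
scribe = weight_gb(3, 16)    # assumption: Scribe kept at 16-bit precision
total = persona + scribe

print(f"Persona ~ {persona:.0f} GB, Scribe ~ {scribe:.0f} GB, total ~ {total:.0f} GB")
```

Under these assumptions the weights take about 41 GB, which leaves roughly 23 GB for context caches, both databases, the speech models, and the operating system.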

C. The Memory Layer (Hybrid)

  • The Fact Store: FalkorDB (Redis-based Graph).
    • Content: Entities (People, Pets, Places), Relationships, Dates, Preferences.
    • Speed: <5ms query time.
  • The Vibe Store: ChromaDB (Vector).
    • Content: Raw transcript logs.
    • Role: Used to recall specific quotes, style, and tone ("What was that joke I made last week?").

D. The Sensory Layer

  • Input: Whisper (Large-v3) on CoreML.
  • Output: Coqui XTTS_v2 (Clonable, emotive voice).

4. The Mechanism (The Real-Time Workflow)

This flow happens in the milliseconds after the user stops speaking.

Step 1: Ingestion & Parallel Dispatch

  • User: "My boss Steve was awful today, but at least my wife Sarah bought me pizza."
  • Whisper: Transcribes text.
  • Action: The text is sent to Thread A (Response) and Thread B (Memory) simultaneously.
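The parallel dispatch in Step 1 could look roughly like the following. This sketch uses asyncio tasks rather than OS threads, and both workers are stand-ins (generate_reply and update_memory are hypothetical names) that only sleep briefly. Real model inference is blocking, so an actual implementation would likely need to push it into a worker thread or separate process.

```python
import asyncio


async def generate_reply(text: str) -> str:
    # Stand-in for Thread A (retrieval + Persona model).
    await asyncio.sleep(0.02)
    return f"reply to: {text}"


async def update_memory(text: str) -> str:
    # Stand-in for Thread B (Scribe model + graph write).
    await asyncio.sleep(0.01)
    return f"memory updated from: {text}"


async def dispatch(text: str) -> tuple[str, str]:
    # Both coroutines start immediately; gather returns results in call order.
    reply, memory = await asyncio.gather(generate_reply(text), update_memory(text))
    return reply, memory


reply, memory = asyncio.run(dispatch("My wife Sarah bought me pizza."))
print(reply)
print(memory)
```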

Thread A: The Response Generation (The "Persona")

  1. Retrieval (Hybrid):
    • Graph Query: "Who is Steve? Who is Sarah?" -> Returns: Steve (Boss, Attributes: Negative), Sarah (Wife, Attributes: Supportive).
    • Vector Query: "Last 5 conversations about work stress."
  2. Synthesis: Llama-70B receives the text + the specific Graph context.
  3. Output: "I'm sorry to hear Steve is at it again. That man never learns! But Sarah sounds like a lifesaver—what kind of pizza?"
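The synthesis step in Thread A amounts to merging the two retrieval results into one prompt. Below is a minimal sketch of that formatting step only. The function name build_prompt and the prompt wording are assumptions, and the graph facts and vector snippets are passed in as plain strings rather than fetched from FalkorDB or ChromaDB.

```python
def build_prompt(user_text: str, graph_facts: list[str], vector_snippets: list[str]) -> str:
    """Combine graph facts and transcript snippets into one prompt string."""
    facts = "\n".join(f"- {fact}" for fact in graph_facts) or "- (none)"
    snippets = "\n".join(f"- {s}" for s in vector_snippets) or "- (none)"
    return (
        "Known facts (treat as authoritative):\n"
        f"{facts}\n\n"
        "Related past conversation (tone and style only):\n"
        f"{snippets}\n\n"
        f"User: {user_text}\n"
        "Companion:"
    )


prompt = build_prompt(
    "My boss Steve was awful today, but at least my wife Sarah bought me pizza.",
    ["Steve is the user's boss (sentiment: Negative)", "Sarah is the user's wife (Supportive)"],
    [],
)
print(prompt)
```

Labeling the graph section as authoritative and the vector section as tone-only mirrors the "Facts" versus "Vibes" split described in the Core Pillars.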

Thread B: The Memory Update (The "Scribe")

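Happens silently in the background while Thread A is processing.

  1. Analysis: Llama-3B receives the text.
  2. Prompt: "Extract entities and relationships into Cypher."
  3. Execution (each node is merged on its own before its relationship, because a MERGE on a whole pattern creates every part of the pattern, including a duplicate User node, whenever the full pattern is not already present):

MERGE (u:User)
MERGE (b:Person {name: 'Steve'})
SET b.sentiment = 'Negative'
MERGE (u)-[:HAS_BOSS]->(b)
MERGE (w:Person {name: 'Sarah'})
MERGE (u)-[:MARRIED_TO]->(w)
MERGE (f:Food {name: 'Pizza'})
MERGE (u)-[:ATE]->(f)

  4. Commit: FalkorDB updates the world model instantly.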
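Because the Scribe's Cypher comes from a small language model, it may be worth checking each statement before it is committed. The following is a deliberately conservative sketch of such a guard, not part of the original design: the function name is_safe_write and the keyword list are assumptions, and the check is a simple keyword scan rather than a real Cypher parser.

```python
import re

# Keywords the Scribe has no reason to emit when it only records new facts.
FORBIDDEN = {"DELETE", "DETACH", "REMOVE", "DROP", "CALL", "LOAD"}
# Quoted strings (with backslash escapes) are blanked out before scanning.
_STRING_LITERAL = re.compile(r"'(?:[^'\\]|\\.)*'|\"(?:[^\"\\]|\\.)*\"")
_WORD = re.compile(r"[A-Za-z_]+")


def is_safe_write(cypher: str) -> bool:
    """True if the statement starts with MERGE or MATCH and contains
    no forbidden keyword outside of quoted strings."""
    stripped = _STRING_LITERAL.sub("''", cypher)
    words = [w.upper() for w in _WORD.findall(stripped)]
    if not words or words[0] not in {"MERGE", "MATCH"}:
        return False
    return FORBIDDEN.isdisjoint(words)


print(is_safe_write("MERGE (u:User) MERGE (w:Person {name: 'Sarah'}) MERGE (u)-[:MARRIED_TO]->(w)"))
print(is_safe_write("MATCH (n) DETACH DELETE n"))
```

The scan will also reject harmless statements that happen to use a forbidden word as a label or property name. That trade-off favors protecting the graph over accepting every valid query.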


5. Development Roadmap (FalkorDB Focused)

  1. Week 1: The Graph Skeleton

    • Install Docker and run falkordb/falkordb.
    • Write the Python script using LangChain to connect to FalkorDB.
    • Test basic Cypher queries (Create/Match).
  2. Week 2: The Scribe (Llama 3B Integration)

    • Create a prompt for Llama-3B that reliably outputs Cypher code based on English input.
    • Test the "Write" loop: type a sentence, then check whether the graph was updated (for example, by running a MATCH query against FalkorDB).
  3. Week 3: The Hybrid Retrieval

    • Build the function that queries both ChromaDB (Vector) and FalkorDB (Graph) and formats the results into a prompt for the 70B model.
  4. Week 4: Voice & Polish

    • Connect Whisper and Coqui TTS.
    • Implement "Latency Masking" (The Reflex trick) if the 70B model takes >3 seconds to generate.
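The roadmap does not define "Latency Masking", so the sketch below is only one possible reading: if the reply is not ready within a threshold, a short filler line is produced first and the full reply follows once it arrives. The function name reply_with_mask, the filler phrase, and the stand-in slow_model are all hypothetical, and the demo uses a tiny threshold in place of the 3 seconds mentioned above.

```python
import asyncio


async def reply_with_mask(slow_reply, threshold_s: float = 3.0,
                          filler: str = "Hmm, let me think..."):
    """Return (filler_or_None, reply). The filler is produced only if the
    reply is not ready within threshold_s seconds."""
    task = asyncio.create_task(slow_reply)
    # wait() returns after the timeout without cancelling the task.
    done, _ = await asyncio.wait({task}, timeout=threshold_s)
    spoken_filler = None if done else filler  # a real system would send this to TTS
    return spoken_filler, await task


async def slow_model() -> str:
    await asyncio.sleep(0.05)  # stand-in for a slow 70B generation
    return "Here is the full answer."


print(asyncio.run(reply_with_mask(slow_model(), threshold_s=0.01)))
```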

This stack transforms your M1 Max into a living, thinking entity that "grows up" with you in real-time.