FalkorDB Updated PS and TS for The Professor
Here is the updated Product Statement and Technical Stack for your Tutorial AI, "The Professor," re-architected to leverage FalkorDB for dynamic curriculum management.
This version moves away from the static nature of Microsoft GraphRAG, replacing it with FalkorDB to create a "Living Syllabus" that adapts physically to the student's performance in real-time.
1. Product Statement¶
For rigorous autodidacts and researchers who demand structural mastery over simple information retrieval, "The Professor" is a Local-First, Adaptive Pedagogical Engine That converts static libraries (PDFs/Textbooks) into a gamified, dependency-based knowledge graph. Unlike "Chat with PDF" tools that randomly fetch paragraphs, or static courses that force a linear path, "The Professor" utilizes FalkorDB to maintain a "Living Syllabus." It maps concepts as a dependency tree (Prerequisites -> Advanced Topics) and tracks the student's mastery of each node in real-time. If the student fails a "Feynman Test," the graph automatically expands to insert remedial sub-nodes, ensuring no foundational gaps remain. Guaranteed to provide a university-level Socratic tutoring experience, fully offline and private on your M1 Max.
2. Core Pillars¶
- The Dependency Graph: We treat knowledge like a technology tree in a video game. You cannot unlock "Quantum Entanglement" until you have mastered "Wave Function." FalkorDB enforces this hierarchy, preventing the AI from hallucinating advanced answers before the basics are established.
- The "Feynman" Feedback Loop: The system does not just explain; it demands you explain it back. The AI grades your explanation against the source text stored in the graph.
- Self-Healing Curriculum: Because FalkorDB is a transactional database, we can write to it instantly. If you struggle with a concept, the AI physically refactors the graph structure, inserting new "Bridge Nodes" (remedial lessons) into your syllabus on the fly.
3. The Technical Stack (FalkorDB Edition)¶
Hardware Target: MacBook Pro M1 Max (64GB RAM) Environment: Docker (FalkorDB), Python 3.11, Ollama, Streamlit.
A. The Database (The Living Syllabus)¶
- Engine: FalkorDB (Docker Container).
- Data Model (The Schema):
- Static Nodes:
SourceMaterial(The Book),Concept(The Topic),Chunk(Raw Text + Vector). - Dynamic Nodes:
Student(User Profile),Session(Log). - Edges:
(:Concept)-[:PREREQUISITE]->(:Concept),(:Student)-[:MASTERED {score: 95}]->(:Concept).
- Static Nodes:
- Vector Indexing: Enabled on
Chunknodes for semantic retrieval.
B. The Intelligence Layer¶
- 1. The Teacher (The Brain): Qwen2.5-72B-Instruct (4-bit).
- Why: It currently holds the crown for Open Source STEM/Logic/Coding tasks. It generates the lecture scripts and grades the user's answers.
- 2. The Architect (The Graph Builder): Qwen2.5-14B-Instruct.
- Role: Used during ingestion. It reads the raw PDF text and structures it into Cypher queries to build the initial Dependency Graph.
- 3. The Illustrator (The Blackboard): Qwen2.5-Coder-7B.
- Role: Specialized in generating Mermaid.js diagrams and LaTeX formulas to visually explain the concepts.
C. The Interface Layer¶
- UI: Streamlit (Split View).
- Left: Chat/Voice Interface.
- Right: The Blackboard (Renders Markdown/Mermaid) + The Map (PyVis visualization of the FalkorDB Syllabus graph).
- I/O: Whisper (Input) + Coqui TTS (Output).
4. The Mechanism (The Pedagogical Loop)¶
This workflow replaces the standard "Chat" loop with a "Teaching" loop.
Phase 1: Ingestion (The "Curriculum Builder")¶
- One-time process when adding a book.
- Input: PDF Textbook.
- Qwen-14B: Scans Table of Contents and Chapter Summaries.
- Action: Executes Cypher to build the tree:
- Embedder: Chunks the text, vectorizes it, and links chunks to the Concepts.
Phase 2: The Lesson (The Interaction)¶
- State Check: Python script queries FalkorDB: "Find the first Concept where (:Student)-[:MASTERED]->(:Concept) does NOT exist, but all PREREQUISITES are mastered."
- Retrieval: Fetches the text chunks + vectors associated with that specific Concept.
- Generation (Teacher & Illustrator):
- Teacher: Explains the concept via TTS.
- Illustrator: Generates a Mermaid Flowchart shown on the Blackboard.
Phase 3: The Feynman Test (The Assessment)¶
- AI: "Now, explain the relationship between [Concept A] and [Concept B] in your own words."
- User: "Well, A causes B because..."
- Grading: Qwen-72B compares User Audio (Text) vs. Source Chunks.
- Pass: Run
CREATE (:Student)-[:MASTERED]->(:Concept). The UI Map turns that node Gold. - Fail: The AI queries the graph for the sub-components of the concept and starts a Remedial Session.
- Pass: Run
5. Development Roadmap¶
-
Week 1: The Graph Structure
- Design the Cypher Schema for a "Syllabus."
- Manually insert a few nodes (e.g., "Math 101" -> "Math 102") into FalkorDB to test the "Unlock" logic.
-
Week 2: The Ingestion Pipeline
- Use
PyMuPDFto read a PDF. - Use Qwen-14B to output JSON representing the hierarchy.
- Write the Python script to push this hierarchy into FalkorDB.
- Use
-
Week 3: The Blackboard (Streamlit)
- Build the Streamlit UI.
- Get
streamlit-mermaidworking so you can render diagrams generated by the LLM.
-
Week 4: The Loop
- Connect Qwen-72B.
- Write the prompt for the "Grader" (comparing User Input vs Database Truth).
- Connect the "Pass" signal to the database update query.