
Evaluating a FalkorDB-Based Summary Against the Original Summary

This is a classic "Reconstruction Problem." You have turned a cow (the Book) into a hamburger (the Graph), and now you want to turn the hamburger back into a cow.

The accuracy depends entirely on how you query the graph.

If you query FalkorDB well, the resulting summary will be strong on logic and structure (in some ways better than the original book) but weak on narrative and style: it will read like a textbook summary, not a novel.

Here is the technical workflow to extract a "Whole Book Summary" from FalkorDB and an honest assessment of its accuracy.


Part 1: The "Spine Walk" Method (How to Ask)

You cannot just say "Summarize the database." You need to feed the LLM a structured Skeleton of the graph.

Since we built a "Dependency Tree" (Prerequisites), we can leverage that structure to tell a coherent story.

Step 1: The Cypher Query (The Retrieval)

We need to find the "Main Characters" of the book (the Major Concepts) and the flow between them. We don't want every tiny detail.

The Strategy: Retrieve the 20 most connected concepts (the query below ranks them by degree centrality; PageRank is an alternative) OR traverse the graph starting from the root prerequisites.

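# python script
from falkordb import FalkorDB

# Host, port, and graph name are placeholders; adjust them to your setup.
graph = FalkorDB(host="localhost", port=6379).select_graph("book")

def get_book_skeleton():
    # 1. Rank concepts by degree (relationship count) and keep the top 20.
    # 2. Collect them into a list so membership can be tested with IN.
    # 3. Keep only PREREQUISITE_FOR links whose target is also a main topic;
    #    OPTIONAL MATCH keeps main topics with no such link (leads_to is null).
    query = """
    MATCH (c:Concept)
    OPTIONAL MATCH (c)-[r]-()
    WITH c, count(r) AS connections
    ORDER BY connections DESC
    LIMIT 20
    WITH collect(c) AS main_topics
    UNWIND main_topics AS c
    OPTIONAL MATCH (c)-[:PREREQUISITE_FOR]->(neighbor)
    WHERE neighbor IN main_topics
    RETURN c.name, c.summary, neighbor.name AS leads_to
    """
    # result_set is a list of rows: [name, summary, leads_to]
    return graph.query(query).result_set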
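
Because the links are prerequisites, the returned rows can be reordered so that foundational concepts come before the topics they lead to. The following is a minimal sketch, assuming each row has the form (name, summary, leads_to) as in the RETURN clause above. The function name order_spine and the sample rows are illustrative only, and the fallback for circular prerequisites is one possible choice, not the only one.

```python
from graphlib import TopologicalSorter, CycleError  # standard library, Python 3.9+


def order_spine(rows):
    """Return concept names so every prerequisite precedes what it leads to.

    rows: list of (name, summary, leads_to); leads_to may be None.
    """
    sorter = TopologicalSorter()
    for name, _summary, leads_to in rows:
        sorter.add(name)                # register the concept itself
        if leads_to is not None:
            sorter.add(leads_to, name)  # name must come before leads_to
    try:
        return list(sorter.static_order())
    except CycleError:
        # Circular prerequisites: fall back to first-seen order.
        seen = []
        for name, _summary, leads_to in rows:
            for item in (name, leads_to):
                if item is not None and item not in seen:
                    seen.append(item)
        return seen


# Hypothetical sample rows, deliberately out of order.
sample_rows = [
    ("Operators", "Observables are represented by operators.", "Measurement"),
    ("Wave Function", "Describes the state of a system.", "Operators"),
    ("Measurement", "Measuring yields an eigenvalue.", None),
]
spine = order_spine(sample_rows)
print(spine)
```

Feeding concepts to the LLM in this order gives it the dependency chain in reading order instead of in degree-ranked order.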

Step 2: The Reconstruction Prompt

You feed that structured list into your 72B Model.

SUMMARY_PROMPT = f"""
You are an editor summarizing a textbook based on its Knowledge Graph.

Here is the Logic Skeleton of the book (Major Concepts and their flow):
{graph_data_from_cypher}

TASK:
Write a Comprehensive Executive Summary of this book.
1. Start with the foundational concepts.
2. Explain how they lead to the advanced topics (follow the 'leads_to' links).
3. Synthesize the core argument of the book.

Tone: Academic and cohesive.
"""
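
The prompt expects graph_data_from_cypher to be readable text, not raw result rows. Here is one way the rows might be rendered, again assuming rows of the form (name, summary, leads_to). The function name format_skeleton and the line layout are assumptions made for illustration.

```python
def format_skeleton(rows):
    """Render (name, summary, leads_to) rows as one text line per concept."""
    concepts = {}  # name -> {"summary": str, "leads_to": [names]}
    for name, summary, leads_to in rows:
        entry = concepts.setdefault(
            name, {"summary": summary or "", "leads_to": []}
        )
        # A concept appears once per outgoing link, so merge its targets.
        if leads_to is not None and leads_to not in entry["leads_to"]:
            entry["leads_to"].append(leads_to)

    lines = []
    for name, entry in concepts.items():
        targets = ", ".join(entry["leads_to"]) or "(none)"
        lines.append(f"- {name}: {entry['summary']} | leads_to: {targets}")
    return "\n".join(lines)


# Hypothetical rows for illustration.
demo_rows = [
    ("A", "first", "B"),
    ("A", "first", "C"),
    ("B", "second", None),
]
graph_data_from_cypher = format_skeleton(demo_rows)
print(graph_data_from_cypher)
```

Merging the rows per concept keeps each summary from being repeated once per link, which saves prompt tokens.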

Part 2: The Accuracy Analysis (The "Lossiness")

How close is this to the original PDF?

1. Structural Accuracy: 95% (Superior to the Book)

  • Why: Books are linear and often rambling. Authors go on tangents. The Graph distills the logical dependency.
  • Result: The summary will explain the logic of the subject better than the book did. It will clearly say "A causes B causes C," even if the author took 50 pages to say that vaguely.

2. Factual Detail: 70% (Lossy)

  • Why: During ingestion, your 14B model compressed paragraphs into 1-sentence summary properties on the nodes.
  • The Loss: Specific dates, minor anecdotes, and side-examples are likely gone unless they warranted their own Node.
  • Result: You get the "What" and the "Why," but you lose the "Color."

3. Narrative Voice: 20% (Robotic)

  • Why: The Graph stores data, not prose.
  • The Loss: If the author had a funny writing style or used beautiful metaphors, those are stripped away. The reconstruction will sound like Qwen-72B, not the original author.
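
To put a rough number on the "lossiness" for your own book, you could compare the reconstructed summary against a reference summary. The sketch below computes clipped unigram recall, which is similar in spirit to ROUGE-1 recall. It is only a crude proxy for factual coverage and says nothing about structure or voice. The function name unigram_recall and the sample strings are illustrative.

```python
import re
from collections import Counter


def unigram_recall(reference, candidate):
    """Fraction of reference word occurrences that also appear in candidate."""
    def tokenize(text):
        return re.findall(r"[a-z0-9]+", text.lower())

    ref_counts = Counter(tokenize(reference))
    cand_counts = Counter(tokenize(candidate))
    total = sum(ref_counts.values())
    if total == 0:
        return 0.0
    # Clip each word's credit to how often it occurs in the candidate.
    overlap = sum(min(n, cand_counts[word]) for word, n in ref_counts.items())
    return overlap / total


score = unigram_recall("the cat sat on the mat", "the cat sat")
print(score)
```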

Part 3: How to boost accuracy to 99% (The "Meta-Node" Trick)

If you want a "Perfect Summary" that retains the author's voice, you must plan for it during Ingestion. You cannot reconstruct it perfectly from the granular nodes later.

The Fix: Create a specific "BookSummary" Node during Phase 1.

  1. Modify Ingestion Script: As you iterate through chapters, have the LLM update a running "Global Summary" (using a Map-Reduce approach).
  2. Save it to FalkorDB:
    CREATE (:SourceMaterial {
        title: "Quantum Physics 101",
        full_summary: "In this book, the author argues that..."
    })
    
  3. The Retrieval: When you ask for a summary, don't try to reconstruct it from the graph. Just query the SourceMaterial node.
    MATCH (s:SourceMaterial {title: "Quantum Physics 101"}) RETURN s.full_summary
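
The ingestion-side change in steps 1 and 2 could look roughly like the sketch below. It implements the running (iterative refine) variant of the global summary, not a strict map-reduce. The summarize callable is a hypothetical wrapper around your LLM, and graph is assumed to be a FalkorDB graph handle whose query method accepts a parameter dictionary. MERGE is used here instead of CREATE so that re-running ingestion does not create duplicate SourceMaterial nodes.

```python
def build_global_summary(chapters, summarize):
    """Fold chapter texts into one running summary.

    summarize(prompt) -> str is a hypothetical wrapper around your LLM.
    """
    running = ""
    for number, text in enumerate(chapters, start=1):
        prompt = (
            "Current summary of the book so far:\n"
            f"{running or '(empty)'}\n\n"
            f"Chapter {number}:\n{text}\n\n"
            "Rewrite the summary so that it also covers this chapter."
        )
        running = summarize(prompt)
    return running


SAVE_QUERY = """
MERGE (s:SourceMaterial {title: $title})
SET s.full_summary = $summary
"""


def save_summary(graph, title, summary):
    # Parameters avoid quoting problems when the summary contains quotes.
    return graph.query(SAVE_QUERY, {"title": title, "summary": summary})


# Usage with stand-ins, so the sketch runs without an LLM or a database.
calls = []

def fake_summarize(prompt):
    calls.append(prompt)
    return f"summary-{len(calls)}"

class FakeGraph:
    def __init__(self):
        self.sent = []
    def query(self, q, params=None):
        self.sent.append((q, params))

fake_graph = FakeGraph()
final = build_global_summary(["ch1 text", "ch2 text", "ch3 text"], fake_summarize)
save_summary(fake_graph, "Quantum Physics 101", final)
```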
    

Recommendation: For "The Professor," rely on Part 1 (The Spine Walk). The goal of the app is Mastery, not Book reporting. A structurally accurate summary (showing how concepts connect) is actually more useful for learning than a narrative summary that repeats the author's anecdotes.