Prompt for Gemini: A Complete App Framework

This is an excellent final step for your prototype. Because you're using a specialized local stack (llama-cpp-python with your GGUF model, Faiss, and a local TTS engine such as MeloTTS), you shouldn't ask a general-purpose AI like Gemini to write the full script: it has no way to verify compatibility with your specific local installations and library versions.

However, you can ask Gemini to write the structure and logic of the RAG orchestration, which is the hardest part.

Here is an effective prompt to give Gemini (or any capable LLM). The template it returns will serve as the backbone of your final Python script.

Prompt for Gemini (or your preferred LLM)

Please write a well-structured Python script that orchestrates a complete **Local RAG (Retrieval-Augmented Generation) and Audio Output** pipeline.

**The script must be designed to run in a controlled WSL2 environment and use external services via function calls.** The final goal is to take a text question, retrieve context, generate an answer, and play it back as audio.

**The script must include the following five distinct sections, clearly marked with function definitions:**

### **Section 1: Configuration & Setup**
*   Define placeholder variables for all external dependencies (e.g., Vector DB path, LLM API endpoint, TTS output file).

### **Section 2: The RAG Retrieval Function**
*   Define a function named `retrieve_context(question: str) -> str`.
*   Inside the function, **simulate** a Faiss/ChromaDB query by returning a realistic, hardcoded, multi-line string of **"Retrieved Context"**. (We will replace this with the real vector DB query later).

### **Section 3: The LLM Inference Function**
*   Define a function named `generate_answer(prompt: str) -> str`.
*   This function must implement an **HTTP POST request** to a placeholder LLM API (e.g., `http://192.168.1.105:8000/v1/completions`) using the `requests` library.
*   The request body should correctly mimic the standard **OpenAI/llama-cpp-python JSON format** for a text completion request.
*   It should return the generated text response.

### **Section 4: The Text-to-Speech (TTS) & Playback Function**
*   Define a function named `speak_answer(text: str) -> None`.
*   This function should have **placeholders** for a local TTS library (like MeloTTS/Coqui).
*   It should **print a message** indicating that it is "Simulating TTS conversion to `output.wav`" and then "Simulating audio playback."

### **Section 5: The Main Execution Flow (The Orchestrator)**
*   This section will contain the complete, sequential logic:
    1.  Define a sample `user_question`.
    2.  Call `retrieve_context()`.
    3.  Create the final, **enriched prompt** (Context + Question).
    4.  Call `generate_answer()` with the enriched prompt.
    5.  Call `speak_answer()` with the LLM's response.

**Output the final code in a single, well-commented Python block.**
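
(End of prompt.) For reference, the template this prompt asks for might look roughly like the sketch below. It is only an illustration of the five sections, not the script Gemini will actually return. Several details are assumptions that do not appear in the prompt: the `build_prompt` helper, the optional `post` parameter on `generate_answer` (added so the HTTP call can be swapped out when no server is running), the `max_tokens` and `temperature` values, the file paths, and the placeholder context text. The response parsing assumes the OpenAI-style completion shape (`choices[0].text`) that the llama-cpp-python server mimics.

```python
"""Skeleton of a local RAG + audio pipeline. All values are placeholders."""
from typing import Callable, Optional

# --- Section 1: Configuration & Setup (placeholder values) ---
VECTOR_DB_PATH = "./vector_store/index.faiss"  # hypothetical path
LLM_API_URL = "http://192.168.1.105:8000/v1/completions"
TTS_OUTPUT_FILE = "output.wav"


# --- Section 2: RAG retrieval (simulated) ---
def retrieve_context(question: str) -> str:
    """Return hardcoded context; a real Faiss/ChromaDB query would go here."""
    return (
        "Retrieved Context:\n"
        "- Placeholder passage 1 related to the question.\n"
        "- Placeholder passage 2 with supporting detail."
    )


def build_prompt(context: str, question: str) -> str:
    """Combine context and question into one enriched prompt (hypothetical helper)."""
    return (
        "Answer the question using only the context below.\n\n"
        f"{context}\n\nQuestion: {question}\nAnswer:"
    )


# --- Section 3: LLM inference over HTTP ---
def generate_answer(prompt: str, post: Optional[Callable] = None) -> str:
    """POST an OpenAI-style completion request and return the generated text.

    `post` is an assumption of this sketch: it defaults to `requests.post`
    and exists only so the network call can be replaced in a dry run.
    """
    if post is None:
        import requests  # imported lazily so the sketch loads without it
        post = requests.post
    payload = {"prompt": prompt, "max_tokens": 256, "temperature": 0.2}
    response = post(LLM_API_URL, json=payload, timeout=60)
    response.raise_for_status()
    return response.json()["choices"][0]["text"].strip()


# --- Section 4: TTS and playback (simulated) ---
def speak_answer(text: str) -> None:
    """Print placeholders where a local TTS engine and audio player would run."""
    print(f"Simulating TTS conversion to {TTS_OUTPUT_FILE}")
    print("Simulating audio playback.")


# --- Section 5: Orchestrator ---
def main(post: Optional[Callable] = None) -> str:
    user_question = "What does the retrieved document say?"
    context = retrieve_context(user_question)
    enriched_prompt = build_prompt(context, user_question)
    answer = generate_answer(enriched_prompt, post=post)
    speak_answer(answer)
    return answer
```

Calling `main()` with no arguments sends the request to `LLM_API_URL`, so it needs the `requests` package and a running server. Passing a stand-in `post` callable lets you check the orchestration flow before the LLM endpoint is up.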