Ollama LLM Local Setup
Yes, absolutely. This is one of the most common and effective ways to put an Apple Silicon machine to work for local AI development and everyday use.
The Apple Silicon architecture (M1, M2, M3, etc.) is well suited to running LLMs because its unified memory lets the GPU use a large share of system RAM for model weights, making it a great choice for the "server" component.
The easiest and most robust tool for this setup is Ollama.
Here is the step-by-step process for how you would set this up:
The Setup Using Ollama
Ollama is a tool that simplifies running open-source LLMs on macOS and automatically exposes a REST API (plain HTTP on a network port) that your Windows 11 machine can connect to.
Step 1: Set Up the LLM Server (Apple Silicon Mac)
- Install Ollama: Download and install Ollama for macOS.
- Pull an LLM: Open the Terminal and download a model, for example with ollama pull llama3:8b. Since you have plenty of memory, you could try a larger, high-quality model (e.g., Llama 3 8B, Mistral 7B, or even a quantized version of Llama 3 70B if your RAM allows).
- Configure for Network Access (Crucial Step): By default, Ollama only listens on the Mac's loopback address (127.0.0.1 or localhost), which other machines cannot reach. You need to tell it to listen on the local network so the Windows PC can find it.
  - Set the OLLAMA_HOST environment variable to 0.0.0.0 before running Ollama.
  - 0.0.0.0 tells Ollama to listen on all available network interfaces, including your Wi-Fi or Ethernet connection. The default port is 11434.
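The network-access step can be sketched in Python as a small launcher that sets OLLAMA_HOST before starting the server. This is only an illustration: the helper names build_ollama_env and start_ollama_server are hypothetical, and the sketch assumes the ollama command is on your PATH and that no other Ollama instance (such as the menu-bar app) is already using the port.

```python
import os
import subprocess


def build_ollama_env(host="0.0.0.0", port=11434, base_env=None):
    """Return a copy of the environment with OLLAMA_HOST set to host:port."""
    env = dict(os.environ if base_env is None else base_env)
    env["OLLAMA_HOST"] = f"{host}:{port}"
    return env


def start_ollama_server(host="0.0.0.0", port=11434):
    """Launch `ollama serve` as a child process bound to the given address."""
    return subprocess.Popen(["ollama", "serve"], env=build_ollama_env(host, port))


# Example (run on the Mac):
#   server = start_ollama_server()
#   server.wait()
```

Setting the variable only for the child process keeps the change local to this launch, so nothing in your shell profile has to be edited while you experiment.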
Step 2: Configure the Network
- Find the Mac's Local IP: On your Mac, go to System Settings > Wi-Fi or Ethernet to find the computer's local network IP address (e.g., 192.168.1.105).
- Firewall Check: Ensure your Mac's Firewall is configured to allow incoming connections to port 11434 (or whatever port Ollama is using).
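Both network checks can also be approximated from Python using only the standard library. The sketch below is a rough aid, not a replacement for System Settings: guess_local_ip and port_is_open are hypothetical helper names, and the IP guess assumes the Mac has a default network route.

```python
import socket


def guess_local_ip():
    """Best-effort guess of this machine's LAN address.

    Connecting a UDP socket sends no packets; it only asks the OS which
    local interface it would use to reach the given address.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.connect(("192.0.2.1", 80))  # documentation-only address, never contacted
        return s.getsockname()[0]


def port_is_open(host, port=11434, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# On the Mac:         print(guess_local_ip())
# On the Windows PC:  print(port_is_open("192.168.1.105"))
```

If port_is_open returns False from the Windows PC but True on the Mac itself, the likely culprits are the firewall or Ollama still listening only on 127.0.0.1.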
Step 3: Chat from the Windows 11 PC
The Windows 11 machine can now connect to the LLM running on your Mac using the Mac's IP address and port 11434.
You have a few ways to chat from the Windows 11 PC:
Option A: Using a Python Script (Simple)
You can write a simple Python script using the requests library to send an API call to the Mac's Ollama server.
import requests

# Replace this with your Mac's actual IP address
OLLAMA_SERVER = "http://192.168.1.105:11434/api/generate"

payload = {
    "model": "llama3:8b",
    "prompt": "What are the three most popular tourist attractions in Rome?",
    "stream": False,  # return one JSON object instead of a stream of chunks
}

# json= serializes the payload and sets the Content-Type header
response = requests.post(OLLAMA_SERVER, json=payload, timeout=120)
response.raise_for_status()  # fail loudly on HTTP errors
print(response.json()["response"])
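The script above sets "stream" to False. With "stream" set to True, the server instead replies with newline-delimited JSON objects, each carrying a piece of the answer in its "response" field and a "done" flag on the last one. A minimal sketch of handling that format follows; the function names are hypothetical, and it uses the standard library's urllib rather than requests purely to stay dependency-free.

```python
import json
import urllib.request


def join_stream_chunks(lines):
    """Concatenate the "response" text from newline-delimited JSON chunks."""
    parts = []
    for raw in lines:
        if isinstance(raw, bytes):
            raw = raw.decode("utf-8")
        raw = raw.strip()
        if not raw:
            continue  # skip blank keep-alive lines
        chunk = json.loads(raw)
        parts.append(chunk.get("response", ""))
        if chunk.get("done"):
            break  # the final chunk marks the end of the answer
    return "".join(parts)


def generate_streaming(url, model, prompt, timeout=120):
    """POST a streaming request and return the assembled answer."""
    body = json.dumps({"model": model, "prompt": prompt, "stream": True}).encode("utf-8")
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return join_stream_chunks(response)  # the response iterates line by line


# Example:
#   generate_streaming("http://192.168.1.105:11434/api/generate", "llama3:8b", "Hello")
```

In a real chat client you would print each chunk as it arrives rather than joining them at the end; the join here just keeps the sketch short.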
Option B: Using a Third-Party Desktop Client (Easiest)
There are several open-source and commercial chat applications for Windows that allow you to specify a custom API endpoint (like Ollama's). You would simply enter your Mac's IP and port (192.168.1.105:11434) into the client's settings.
Option C: Using curl (Quick Test)
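You can test the connection directly from the Windows 11 Command Prompt or PowerShell. For example, running curl.exe http://192.168.1.105:11434/api/tags should return a JSON list of the models installed on the Mac if the server is reachable. In PowerShell, type curl.exe rather than curl, since curl can be an alias for Invoke-WebRequest.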
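If curl is not available, a similar quick check can be sketched in Python by asking the server which models it has installed. This assumes Ollama's /api/tags endpoint and its "models" list of entries with a "name" field; the helper names model_names and list_remote_models are hypothetical.

```python
import json
import urllib.request


def model_names(tags_payload):
    """Extract model names from a decoded /api/tags response."""
    return [entry["name"] for entry in tags_payload.get("models", [])]


def list_remote_models(base_url, timeout=5):
    """Fetch and return the model names installed on the Ollama server."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as response:
        return model_names(json.load(response))


# Example (run on the Windows PC):
#   print(list_remote_models("http://192.168.1.105:11434"))
```

The returned names are also a handy way to confirm that the "model" value used in your requests matches something actually pulled on the Mac.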