What you'll have at the end
- A local AI chat server that speaks any language — running 100% on your Mac
- Separate accounts for every family member or team member
- AI image generation with FLUX.1 (DALL-E 3 quality)
- Access from iPhone, iPad, or any device on your network
- Automatic model routing — fast or smart, depending on the question
- €60–90/month in saved subscriptions
Before You Start — Do These First
Creating these accounts takes only a few minutes now, but skipping them will block you mid-setup:
- Hugging Face account (free) — go to huggingface.co, sign up, then go to FLUX.1-schnell → accept terms. Then Settings → Access Tokens → New Token → copy it. You'll need this later.
- Docker Hub account (optional) — hub.docker.com. Not required for this guide; skip it unless you want it for later.
Hardware Requirements
| Spec | Minimum | Recommended |
|---|---|---|
| Chip | Apple Silicon M1/M2 | M3 Max / M4 Max |
| RAM (Unified Memory) | 16 GB | 64 GB+ |
| Storage (free) | 100 GB | 500 GB+ |
| macOS | Ventura 13+ | Sequoia 15+ |
The Three Tools — What Each One Does
Before diving into setup, here's how the three components relate to each other:
LM Studio — The engine. It downloads AI models (Qwen, Llama, Mistral) into your Mac's RAM and runs them locally. Instead of sending your prompts to ChatGPT, you send them to your own model on your own hardware. It also runs as an API server on port 1234. Think of it as the car's engine — nothing moves without it.
Open WebUI — The steering wheel. A ChatGPT-like browser interface that connects to LM Studio in the background. Gives you chat history, system prompts, separate accounts, knowledge bases — everything that makes it feel like ChatGPT but entirely local. Without LM Studio it does nothing. Without Open WebUI, LM Studio has no face.
ComfyUI — The image studio. Completely separate from the chat stack. Loads Stable Diffusion / FLUX models and generates images and video through a node-based interface. Run it independently — it does a completely different job.
Mac (Apple Silicon)
│
├── LM Studio ← runs AI models (LLM)
│     └── API port 1234
│
├── Docker Desktop
│     └── Open WebUI ← ChatGPT-style interface (port 3000)
│           ├── account: admin (you)
│           ├── account: user1
│           └── account: user2, user3
│
└── ComfyUI ← images / video (port 8188)
| Subscription replaced | Cost/month | Replaced by |
|---|---|---|
| ChatGPT Plus | €20 | Open WebUI + local model |
| Claude Pro | €19 | Open WebUI + Qwen 3.5 35B |
| Perplexity Pro | €20 | Open WebUI + web search |
| DALL-E / Midjourney | €10–30 | ComfyUI + FLUX.1 |
| Total | €59–89 | €0/month with the stack above |
Step 1 — Install Docker Desktop
Docker runs Open WebUI in a contained environment. Think of it as a box that keeps everything organized. You don't need to understand it — just install it like any Mac app.
- Download from docker.com/products/docker-desktop — Apple Silicon version.
- Install normally. Accept all permission requests.
After installing, two critical settings:
- Settings → General: Enable "Start Docker Desktop when you sign in." Without this, Open WebUI won't load after a reboot.
- Settings → Resources: Disable "Resource Saver." If left on, Docker pauses after inactivity and appears broken.
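Before moving on, you can confirm the daemon is actually up by asking the Docker CLI for the server version. This is a quick sanity check, and the fallback message is just illustrative text, not Docker output:

```shell
# Prints the Docker daemon version if Docker Desktop is running,
# otherwise falls back to a reminder (the fallback text is our own).
docker version --format '{{.Server.Version}}' 2>/dev/null \
  || echo "Docker Desktop is not running yet"
```

If this hangs rather than printing anything, Resource Saver has likely paused the daemon.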
Step 2 — Install LM Studio and Download Models
Download from lmstudio.ai and install. Open it — you'll see a Discover tab (like an app store for AI models).
If you have 64 GB+ RAM
| Model | Size | Use |
|---|---|---|
| Qwen3.5-35B-A3B-Uncensored (Q6_K) | ~29 GB | Main model. Multilingual, excellent reasoning. Does everything. |
| Devstral-Small-2-24B (Q4_K_XL) | ~16 GB | Code specialist. Skip if you don't write code. |
| Meltemi-7B-v1 (Q8_0) | ~7 GB | Fast, lightweight. For quick simple questions. |
If you have 16–32 GB RAM
| Model | Size | Use |
|---|---|---|
| Meltemi-7B-v1 (Q8_0) | ~7 GB | Fast and multilingual — best option for less RAM |
| Qwen2.5-14B (Q4_K_M) | ~9 GB | General purpose, strong multilingual |
Search each model name in the Discover tab, hit Download, and wait (they're large — download overnight on slow connections).
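If you're unsure which tier you fall into, a rough rule of thumb (a ballpark assumption, not an official formula) is that the quantized file size plus a few GB of runtime overhead should stay well under your total unified memory, since macOS and other apps need room too:

```python
# Rough fit check: quantized model file size + overhead vs unified memory.
# The 75% memory budget and 4 GB overhead are ballpark assumptions.
def fits_in_ram(model_size_gb, total_ram_gb, overhead_gb=4, budget=0.75):
    return model_size_gb + overhead_gb <= total_ram_gb * budget

for name, size_gb in [("Qwen3.5-35B Q6_K", 29), ("Meltemi-7B Q8_0", 7)]:
    print(name, "fits in 32 GB:", fits_in_ram(size_gb, 32))
```

By this estimate, the 29 GB Qwen quant needs the 64 GB+ tier, while the 7 GB models fit comfortably in 16 GB.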
Once downloaded: go to the Developer tab → Start Server. You'll see Server running on port 1234. Then enable "Serve on Local Network" so other devices on your WiFi can access it.
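The LM Studio server exposes an OpenAI-compatible API, which is why Open WebUI (and any OpenAI client) can talk to it. A minimal sketch of calling it directly, where the model name is an example and LM Studio ignores the API key:

```python
import json
import urllib.request

# Minimal client for LM Studio's OpenAI-compatible chat endpoint.
# Base URL and model name are assumptions; match them to your setup.
def ask(prompt, model="qwen3.5-35b", base_url="http://localhost:1234/v1"):
    payload = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    req = urllib.request.Request(
        base_url + "/chat/completions",
        data=payload,
        headers={"Content-Type": "application/json",
                 "Authorization": "Bearer lm-studio"},  # key is not checked
    )
    try:
        with urllib.request.urlopen(req, timeout=30) as resp:
            return json.load(resp)["choices"][0]["message"]["content"]
    except OSError:
        return None  # server not running or not reachable

reply = ask("Say hello in one short sentence.")
print(reply if reply is not None else "LM Studio server is not reachable on port 1234")
```

Run it with `python3 -c` or save it as a script; if the server tab shows "running" but this returns nothing, check that a model is actually loaded.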
Step 3 — Install Open WebUI
Open Terminal (find it with Spotlight — Cmd+Space → "Terminal") and run this single command:
docker run -d -p 3000:8080 \
--add-host=host.docker.internal:host-gateway \
-e OLLAMA_BASE_URL="" \
-e OPENAI_API_BASE_URL="http://host.docker.internal:1234/v1" \
-e OPENAI_API_KEY="lm-studio" \
-v open-webui:/app/backend/data \
--name open-webui \
--restart always \
ghcr.io/open-webui/open-webui:main
What this does: tells Docker to download Open WebUI, run it on port 3000, connect it to LM Studio on port 1234, and auto-restart if anything crashes.
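If you prefer Docker Compose over one long `docker run` line, the same setup can be sketched as a `docker-compose.yml` with equivalent settings (same ports, environment, and volume name as the command above):

```yaml
# Equivalent sketch of the docker run command above as a Compose file.
services:
  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    ports:
      - "3000:8080"
    extra_hosts:
      - "host.docker.internal:host-gateway"
    environment:
      OLLAMA_BASE_URL: ""
      OPENAI_API_BASE_URL: "http://host.docker.internal:1234/v1"
      OPENAI_API_KEY: "lm-studio"
    volumes:
      - open-webui:/app/backend/data
    restart: always

volumes:
  open-webui:
```

Start it with `docker compose up -d` from the folder containing the file.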
Wait 2–3 minutes. Then open your browser and go to http://localhost:3000. You'll see a ChatGPT-style interface. Create your first account — the first user automatically becomes Administrator.
If port 3000 is already in use on your Mac, change -p 3000:8080 to -p 3001:8080 and use http://localhost:3001 instead.
Step 4 — Set Up Accounts
Each person gets their own account with separate chat history. Go to Settings (gear icon) → Admin Panel → Users → + Add User:
| Account | Role | Who |
|---|---|---|
| admin@home.local | Admin | You — can see and change everything |
| user1@home.local | User | Partner, colleague, etc. |
| user2@home.local | User | Second user |
Step 5 — Create Model Profiles
Instead of exposing raw model names like qwen3.5-35b-a3b-uncensored to users, create profiles with friendly names. Go to Workspace → Models → + New Model:
- 🤖 Assistant — Base: Qwen 3.5 35B. System prompt: "You are a helpful assistant. Answer clearly, concisely, and accurately."
- 💻 Code — Base: Devstral Small 24B. System prompt: "You are an expert developer. Write clean code and explain your reasoning."
- ⚡ Quick — Base: Meltemi 7B. System prompt: "Answer briefly and directly. No unnecessary explanation."
Set 🤖 Assistant as the default in Admin Panel → Settings → Default Model. Users open the app, type, and get an answer — no model-picking required.
Step 6 — iPhone and iPad Access
Every device on your network can use the AI. First, find your Mac's local IP:
ipconfig getifaddr en0
You'll get something like 192.168.1.100. On any iPhone or iPad on the same WiFi, open Safari and go to http://192.168.1.100:3000.
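If `en0` comes back empty (for example, because your Mac is on Ethernet instead of Wi-Fi), you can check the first few interfaces in one go; which interface carries your address varies by Mac:

```shell
# Print the IPv4 address (if any) assigned to each of the first interfaces.
for iface in en0 en1 en2; do
  echo "$iface: $(ipconfig getifaddr "$iface" 2>/dev/null)"
done
```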
Make it a proper app: tap the Share button → "Add to Home Screen." It opens fullscreen, exactly like a native app. Nobody will know it's running locally on your Mac.
Step 7 — ComfyUI for Image Generation
Skip this step if you only need chat. ComfyUI is a separate tool that handles AI image and video generation — completely independent from the chat stack.
Open Terminal and run these commands in order:
brew install python@3.11
(If brew isn't installed, go to brew.sh first.)
cd ~ && git clone https://github.com/comfyanonymous/ComfyUI.git
cd ~/ComfyUI && pip3.11 install -r requirements.txt
(If pip refuses with an "externally-managed-environment" error, create a virtual environment first with python3.11 -m venv venv and use its pip and python for the remaining commands.)
cd ~/ComfyUI && python3.11 main.py --force-fp16
Look for Device: mps in the output — this confirms it's using your Apple Silicon GPU. Then open http://localhost:8188.
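You can also confirm the GPU backend directly from Python. This is a hedged check, since PyTorch won't be importable until the requirements install has succeeded:

```python
# Report whether PyTorch can see the Apple Silicon GPU (the "mps" device).
def mps_status():
    try:
        import torch
        return f"MPS available: {torch.backends.mps.is_available()}"
    except ImportError:
        return "PyTorch not installed yet; run the requirements install first"

print(mps_status())
```

On Apple Silicon with a working install this should report that MPS is available; if it reports False, ComfyUI will fall back to the much slower CPU path.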
Step 8 — Download FLUX.1 (Image Model)
Log in to Hugging Face (use the access token you created in "Before You Start"):
python3.11 -c "from huggingface_hub import login; login('YOUR_TOKEN_HERE')"
Download the quantized FLUX.1 model (~7 GB instead of ~24 GB, with barely noticeable quality loss):
cd ~/ComfyUI/models/checkpoints
python3.11 -c "
from huggingface_hub import hf_hub_download
hf_hub_download(
repo_id='city96/FLUX.1-schnell-gguf',
filename='flux1-schnell-Q4_K_S.gguf',
local_dir='.'
)
print('Done!')
"
Then in the ComfyUI interface, open the "1.1 Starter – Text to Image" template → click "See Errors" → "Download all." This auto-downloads ~8 GB of helper files (text encoders + VAE).
What You're Saving
| Service | Monthly | Yearly |
|---|---|---|
| ChatGPT Plus (×2) | €40 | €480 |
| Claude Pro | €19 | €228 |
| Perplexity Pro | €20 | €240 |
| Midjourney / DALL-E | €10 | €120 |
| Total | €89/month | €1,068/year |
The hardware (Mac) pays for itself in under two years in saved subscriptions alone — before accounting for privacy, speed, and no rate limits.
What's Next (Phase 2)
- Video generation: Wan2.1 or CogVideoX-5B via ComfyUI
- Music generation: MusicGen Large (Meta) via ComfyUI
- Browser automation: OpenClaw + local LLM for automated tasks
- Workflow automation: n8n for morning digests, alerts, and automations
- RAG on your own notes: Ask the AI about your own documents via Open WebUI RAG