5 Self-Hosted ChatGPT Alternatives You Can Run Today (2026)
ChatGPT uninstalls are up 295%. Here are 5 open-source alternatives you can run on your own hardware — no API keys, no Pentagon deals, no data leaving your machine.
Why Everyone's Leaving ChatGPT
Let's address the elephant in the room. OpenAI signed a deal with the Pentagon — the Department of Defense, rebranded as the "Department of War" under the current administration. ChatGPT uninstalls spiked 295% overnight. One-star reviews jumped 775%. Claude hit #1 on the App Store.
But here's the thing — switching from ChatGPT to Claude is just switching landlords. You're still sending every conversation, every document, every half-baked idea to someone else's servers. Today it's OpenAI's Pentagon deal. Tomorrow it might be Anthropic's turn to make a choice you don't like.
The actual solution? Run it yourself.
A year ago, self-hosted AI was a hobby for people with $10K GPU rigs and a tolerance for command-line pain. In 2026, you can run models that rival ChatGPT on a gaming laptop. The Qwen 3.5 9B model — released just days ago — beats GPT-5 Nano on vision benchmarks and runs on 8GB of VRAM.
No accounts. No API keys. No terms of service. No one reading your prompts. Your machine, your model, your data.
Here are five ways to do it, from "I've never touched a terminal" to "I run Kubernetes for fun."
What You Actually Need to Run Local AI
Before we dive in, let's kill the myth that you need a $3,000 GPU. Here's what actually matters:
💚 Minimum (Usable)
- RAM: 8GB
- GPU: None (CPU-only works)
- Storage: 10GB free
- Models: Qwen 3.5 0.8B–2B
- Speed: Slow but functional
Any laptop from the last 5 years
💙 Sweet Spot
- RAM: 16GB
- GPU: 8GB VRAM (RTX 3060/4060)
- Storage: 30GB free
- Models: Qwen 3.5 4B–9B, Llama 3.3 8B
- Speed: Feels like ChatGPT
A decent gaming PC or M1/M2 Mac
💜 Overkill (In a Good Way)
- RAM: 32GB+
- GPU: 16–24GB VRAM (RTX 4080/4090)
- Storage: 100GB free
- Models: Anything up to 70B quantized
- Speed: Faster than cloud APIs
M2 Pro/Max Mac or serious GPU rig
Apple Silicon users: You're in luck. M1/M2/M3/M4 Macs use unified memory, meaning your 16GB or 32GB of RAM doubles as VRAM. A base M2 MacBook Air can comfortably run 7-9B models. It's the best local AI hardware per dollar right now.
1. Ollama — The One Everyone Should Start With
If you install one thing from this list, make it Ollama. It's the Docker of local AI — it handles downloading models, managing memory, running inference, and serving an API, all with commands simple enough for a text message.
# macOS / Linux — one line, done
curl -fsSL https://ollama.com/install.sh | sh
# Pull a model (Qwen 3.5 9B is the current sweet spot)
ollama pull qwen3.5:9b
# Start chatting
ollama run qwen3.5:9b
That's it. Three commands. You now have a local AI that works offline, keeps everything on your machine, and responds in your terminal.
Why Ollama wins
- Model library: 200+ models, one command to download — `ollama pull llama3.3`
- Runs everywhere: macOS, Linux, Windows, even Raspberry Pi
- API compatible: exposes an OpenAI-compatible API at `localhost:11434` — anything that works with ChatGPT's API works with Ollama
- Memory management: automatically loads/unloads models based on available RAM
- 110K+ monthly searches — the biggest name in local AI for a reason
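Because the API is OpenAI-compatible, any OpenAI-style client can talk to it. Here's a minimal sketch using only Python's standard library — it assumes Ollama is running locally with `qwen3.5:9b` already pulled, so the actual network call is left commented out:

```python
import json
from urllib import request

# Ollama's OpenAI-compatible chat endpoint (default port 11434)
OLLAMA_URL = "http://localhost:11434/v1/chat/completions"

def build_chat_request(model: str, prompt: str) -> request.Request:
    """Build an OpenAI-style chat completion request for a local Ollama server."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_chat_request("qwen3.5:9b", "Explain RAG in one sentence.")

# Requires a running Ollama instance; uncomment to actually send:
# with request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```

Swap in any model tag you've pulled; nothing here is Ollama-specific beyond the URL, which is exactly the point of the compatible API.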
The limitation
Ollama is a backend. The chat interface is your terminal. For a proper ChatGPT-like experience with conversation history, file uploads, and a pretty UI — you'll want to pair it with Open WebUI (next on the list). We also have a detailed Ollama page with more setup details.
Difficulty: ⭐ (copy-paste one command)
Cost: Free, open source
2. Open WebUI — ChatGPT's Interface, Your Hardware
Open WebUI is the answer to "I want ChatGPT but on my own machine." It looks almost identical to ChatGPT's interface — conversation threads, model switching, file uploads, web search, even image generation. But everything runs locally through your Ollama backend.
# One command. That's it.
docker run -d -p 3000:8080 \
--add-host=host.docker.internal:host-gateway \
-v open-webui:/app/backend/data \
--name open-webui \
--restart always \
ghcr.io/open-webui/open-webui:main
# Open http://localhost:3000
# Connect to your Ollama instance
# You now have your own ChatGPT
What makes it special
- Multi-user support: Create accounts for family/team members, each with their own chat history
- RAG built in: Upload PDFs, docs, and text files — ask questions about them
- Web search: Can search the web and cite sources (like ChatGPT's Browse)
- Model management: Download, delete, and switch models from the UI
- Mobile-friendly: Works great on phone browsers — add to home screen for an app-like experience
- Plugin system: Community plugins for image generation, TTS, code execution
The honest take
It's the closest thing to a full ChatGPT replacement you'll find. The UI is polished, updates are frequent (weekly releases), and the community is massive. The only downside is you need Docker, which adds a small learning curve if you've never used it. Learn more about Open WebUI on our tools page.
Difficulty: ⭐⭐ (need Docker, but it's one command)
Cost: Free, open source
3. Jan — The "Just Works" Desktop App
If the words "Docker" and "terminal" make you nervous, Jan is your answer. It's a desktop app. You download it, open it, pick a model, and start chatting. That's the entire setup process.
# Download from jan.ai — available for:
# macOS (Apple Silicon + Intel)
# Windows (x64)
# Linux (AppImage + deb)
# Or via Homebrew on macOS:
brew install --cask jan
Why people love it
- Zero config: No terminal, no Docker, no API keys — download and chat
- Built-in model hub: Browse and download models from inside the app
- Offline first: Designed to work completely without internet
- Clean UI: Minimal, fast, doesn't try to do everything
- Cross-platform: macOS, Windows, Linux
- Local API: Also exposes an OpenAI-compatible API if you want to connect other tools
The trade-off
Jan is intentionally simple. No multi-user support, no RAG, no web search. It's a chat app for local models — and it does that one thing really well. If you need more features, look at Open WebUI. If you want something that "just works" for personal use, Jan is hard to beat.
Difficulty: ⭐ (download and open)
Cost: Free, open source
4. LibreChat — Multi-Provider Power User Setup
LibreChat is for people who want ONE interface for everything — local models via Ollama, cloud APIs from OpenAI/Anthropic/Google, and custom endpoints. It's the Swiss Army knife.
# Clone and run
git clone https://github.com/danny-avila/LibreChat.git
cd LibreChat
cp .env.example .env
# Edit .env to add your API keys (optional — works with Ollama too)
docker compose up -d
# Open http://localhost:3080
# Supports: Ollama, OpenAI, Anthropic, Google, Azure — all in one UI
The power user features
- Multi-provider: Use Ollama for private stuff, Claude for complex reasoning, GPT for coding — all in one window
- Presets: Save model + system prompt combinations for different use cases
- Plugins: Web browsing, DALL-E, code interpreter, Wolfram Alpha
- Multi-user: Full user management with admin controls
- Conversation forking: Branch conversations to try different approaches
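To wire a local Ollama instance into LibreChat, you add a custom endpoint to its `librechat.yaml`. The sketch below shows the general shape of that config — field names are from LibreChat's custom-endpoint schema as I recall it, so verify them against the `librechat.example.yaml` in the repo before relying on this:

```yaml
# librechat.yaml — sketch of a custom endpoint for a local Ollama server.
# Verify field names against the project's librechat.example.yaml.
endpoints:
  custom:
    - name: "Ollama"
      apiKey: "ollama"      # placeholder; Ollama doesn't check API keys
      baseURL: "http://host.docker.internal:11434/v1"
      models:
        default: ["qwen3.5:9b"]
        fetch: true         # ask Ollama for its installed model list
```

The `host.docker.internal` hostname lets LibreChat's container reach Ollama running on your host machine, the same trick the Open WebUI command above uses.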
When to use this
If you want to use local models for privacy-sensitive work but still have access to cloud APIs when you need more power, LibreChat is the bridge. It's also great for teams who want to share a single AI interface with different model access levels.
Difficulty: ⭐⭐⭐ (Docker + config file)
Cost: Free, open source (cloud API costs separate)
5. AnythingLLM — When You Need RAG
AnythingLLM's superpower is document intelligence. Upload your company docs, research papers, legal contracts, or codebase — then ask questions about them. It chunks, embeds, and retrieves relevant context automatically.
# Desktop app — download from anythingllm.com
# Or self-host with Docker:
docker pull mintplexlabs/anythingllm
docker run -d -p 3001:3001 \
--cap-add SYS_ADMIN \
-v anythingllm_storage:/app/server/storage \
mintplexlabs/anythingllm
# Open http://localhost:3001
# Upload documents, create workspaces, RAG out of the box
What sets it apart
- Workspaces: Organize documents into separate knowledge bases
- Multi-format: PDF, DOCX, TXT, CSV, code files, web pages, YouTube transcripts
- Embedding models: Choose your own embedding model for better retrieval
- Agent mode: Can browse the web, run code, and use tools
- Desktop app available: No Docker needed if you prefer a native app
The use case
If "chat with your documents" is what you actually need — not just chatting — AnythingLLM is the pick. Lawyers, researchers, and anyone drowning in PDFs will love it. For general chatting, it's overkill; use Open WebUI instead.
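To demystify the "chunks, embeds, and retrieves" loop that AnythingLLM automates, here's a toy version in Python. It uses bag-of-words overlap as a stand-in for a real embedding model — an illustration of the retrieval idea, not AnythingLLM's actual pipeline:

```python
from collections import Counter
import math

def embed(text: str) -> Counter:
    """Toy 'embedding': word counts. Real RAG uses a neural embedding model."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, chunks: list[str], k: int = 1) -> list[str]:
    """Rank document chunks by similarity to the query; return the top k."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]

chunks = [
    "The lease terminates on December 31 unless renewed in writing.",
    "Payment is due on the first business day of each month.",
    "The tenant is responsible for utilities and minor repairs.",
]
print(retrieve("when does the lease end", chunks))
```

The retrieved chunk gets pasted into the model's context before your question — that's the whole trick. Production tools add smarter chunking, vector databases, and learned embeddings, but the shape of the loop is the same.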
Difficulty: ⭐⭐ (desktop app is easy, Docker is standard)
Cost: Free, open source (cloud version available)
Which Model Should You Run?
The tool is only half the equation. Here's what model to put in it:
| Your Hardware | Best Model | Why |
|---|---|---|
| 8GB RAM, no GPU | Qwen 3.5 2B | Runs on anything, surprisingly capable for its size |
| 16GB RAM or 8GB VRAM | Qwen 3.5 9B ⭐ | The sweet spot. Beats GPT-5 Nano on benchmarks. 262K context. |
| 32GB RAM / M2+ Mac | Llama 3.3 8B or Qwen 3.5 9B | Fast inference, great for coding and reasoning |
| 24GB VRAM (RTX 4090) | Qwen 3 30B or Mistral Large | Near-Claude-level quality for complex tasks |
| Phone / Raspberry Pi | Qwen 3.5 0.8B | It works. Don't expect miracles, but it works. |
My recommendation: Start with `ollama pull qwen3.5:9b`. It's the current king of the "fits on normal hardware" category. Released March 2, 2026 — two days ago — and already the most talked-about local model on r/LocalLLM. For a full walkthrough, check our Qwen 3.5 setup guide.
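If you want to sanity-check whether a model fits your hardware, a rule of thumb is: memory ≈ parameters × bits-per-weight ÷ 8, plus a buffer for the KV cache and runtime overhead. The ~20% overhead below is a rough assumption, not a measured constant:

```python
def approx_model_memory_gb(params_billions: float, bits_per_weight: int,
                           overhead: float = 0.2) -> float:
    """Back-of-envelope memory estimate: weights plus a fudge factor
    for KV cache and runtime overhead (the 20% default is a guess)."""
    weight_bytes = params_billions * 1e9 * bits_per_weight / 8
    return weight_bytes * (1 + overhead) / 1e9  # decimal GB

# Why a 9B model fits in 8GB of VRAM at 4-bit quantization:
print(f"9B @ 4-bit:  ~{approx_model_memory_gb(9, 4):.1f} GB")   # ~5.4 GB
print(f"9B @ 16-bit: ~{approx_model_memory_gb(9, 16):.1f} GB")  # ~21.6 GB
print(f"70B @ 4-bit: ~{approx_model_memory_gb(70, 4):.1f} GB")  # ~42.0 GB
```

This is also why quantization matters so much: the same 9B model that needs a workstation GPU at full 16-bit precision fits on a mid-range card at 4-bit, usually with only a modest quality hit.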
Head-to-Head Comparison
| Feature | Ollama | Open WebUI | Jan | LibreChat | AnythingLLM |
|---|---|---|---|---|---|
| Setup difficulty | ⭐ | ⭐⭐ | ⭐ | ⭐⭐⭐ | ⭐⭐ |
| Pretty UI | ❌ (terminal) | ✅✅ | ✅ | ✅✅ | ✅ |
| Document upload/RAG | ❌ | ✅ | ❌ | ✅ | ✅✅✅ |
| Multi-user | ❌ | ✅ | ❌ | ✅ | ✅ |
| Cloud API support | ❌ | ✅ | ✅ | ✅✅✅ | ✅ |
| Works offline | ✅ | ✅ | ✅ | Partial | ✅ |
| Mobile support | ❌ | ✅ (PWA) | ❌ | ✅ (web) | ❌ |
Real Talk: Can Local AI Replace ChatGPT?
Honestly? For 80% of what most people use ChatGPT for — writing emails, brainstorming, summarizing text, answering questions, basic coding — yes. A Qwen 3.5 9B running locally will handle all of that without breaking a sweat.
For the other 20% — complex multi-step reasoning, advanced code generation, image generation, real-time web browsing — cloud models still have an edge. But that edge is shrinking every month.
The real question isn't "is local AI as good?" It's "is the privacy trade-off worth a small quality gap?"
After a week of 295% uninstall spikes and Pentagon deals, I think a lot of people just answered that question for themselves. If you're also looking to degoogle your phone, pair this with our GrapheneOS setup guide for full digital sovereignty. Concerned about AI agent security? Read our guide on vetting AI agent plugins.
Get Started in 5 Minutes
Install Ollama → pull Qwen 3.5 9B → run Open WebUI. Three commands, and you've got your own private ChatGPT. No account, no API key, no data leaving your machine.
Your conversations are yours. That shouldn't be a radical idea.