5 Self-Hosted ChatGPT Alternatives You Can Run Today (2026)
ChatGPT uninstalls are up 295%. Here are 5 open-source alternatives you can run on your own hardware — no API keys, no Pentagon deals, no data leaving your machine.
Why Everyone's Leaving ChatGPT
Let's address the elephant in the room. OpenAI signed a deal with the Pentagon — the Department of Defense, rebranded as the "Department of War" under the current administration. ChatGPT uninstalls spiked 295% overnight. One-star reviews jumped 775%. Claude hit #1 on the App Store.
But here's the thing — switching from ChatGPT to Claude is just switching landlords. You're still sending every conversation, every document, every half-baked idea to someone else's servers. Today it's OpenAI's Pentagon deal. Tomorrow it might be Anthropic's turn to make a choice you don't like.
The actual solution? Run it yourself.
A year ago, self-hosted AI was a hobby for people with $10K GPU rigs and a tolerance for command-line pain. In 2026, you can run models that rival ChatGPT on a gaming laptop. The Qwen 3.5 9B model — released just days ago — beats GPT-5 Nano on vision benchmarks and runs on 8GB of VRAM.
No accounts. No API keys. No terms of service. No one reading your prompts. Your machine, your model, your data.
Here are five ways to do it, from "I've never touched a terminal" to "I run Kubernetes for fun."
What You Actually Need to Run Local AI
Before we dive in, let's kill the myth that you need a $3,000 GPU. Here's what actually matters:
💚 Minimum (Usable)
- RAM: 8GB
- GPU: None (CPU-only works)
- Storage: 10GB free
- Models: Qwen 3.5 0.8B–2B
- Speed: Slow but functional
Any laptop from the last 5 years
💙 Sweet Spot
- RAM: 16GB
- GPU: 8GB VRAM (RTX 3060/4060)
- Storage: 30GB free
- Models: Qwen 3.5 4B–9B, Llama 3.3 8B
- Speed: Feels like ChatGPT
A decent gaming PC or M1/M2 Mac
💜 Overkill (In a Good Way)
- RAM: 32GB+
- GPU: 16–24GB VRAM (RTX 4080/4090)
- Storage: 100GB free
- Models: Anything up to 70B quantized
- Speed: Faster than cloud APIs
M2 Pro/Max Mac or serious GPU rig
Apple Silicon users: You're in luck. M1/M2/M3/M4 Macs use unified memory, meaning your 16GB or 32GB of RAM doubles as VRAM. A base M2 MacBook Air can comfortably run 7-9B models. It's the best local AI hardware per dollar right now.
1. Ollama — The One Everyone Should Start With
If you install one thing from this list, make it Ollama. It's the Docker of local AI — it handles downloading models, managing memory, running inference, and serving an API, all with commands simple enough for a text message.
# macOS / Linux — one line, done
curl -fsSL https://ollama.com/install.sh | sh
# Pull a model (Qwen 3.5 9B is the current sweet spot)
ollama pull qwen3.5:9b
# Start chatting
ollama run qwen3.5:9b
That's it. Three commands. You now have a local AI that works offline, keeps everything on your machine, and responds in your terminal.
Why Ollama wins
- Model library: 200+ models, one command to download — `ollama pull llama3.3`
- Runs everywhere: macOS, Linux, Windows, even Raspberry Pi
- API compatible: exposes an OpenAI-compatible API at `localhost:11434` — anything that works with ChatGPT's API works with Ollama
- Memory management: automatically loads/unloads models based on available RAM
- 110K+ monthly searches — the biggest name in local AI for a reason
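Because the API is OpenAI-compatible, any OpenAI-style client can talk to it. Here's a minimal sketch using only Python's standard library — it assumes Ollama is running locally with `qwen3.5:9b` already pulled, so the actual network call is left commented out:

```python
import json
from urllib import request

# Ollama's OpenAI-compatible chat endpoint (default port 11434)
OLLAMA_URL = "http://localhost:11434/v1/chat/completions"

def build_chat_request(model: str, prompt: str) -> request.Request:
    """Build an OpenAI-style chat completion request for a local Ollama server."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_chat_request("qwen3.5:9b", "Explain RAG in one sentence.")

# Requires a running Ollama instance; uncomment to actually send:
# with request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```

Swap in any model tag you've pulled; nothing here is Ollama-specific beyond the URL, which is exactly the point of the compatible API.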
The limitation
Ollama is a backend. The chat interface is your terminal. For a proper ChatGPT-like experience with conversation history, file uploads, and a pretty UI — you'll want to pair it with Open WebUI (next on the list). We also have a detailed Ollama page with more setup details.
Difficulty: ⭐ (copy-paste one command)
Cost: Free, open source
2. Open WebUI — ChatGPT's Interface, Your Hardware
Open WebUI is the answer to "I want ChatGPT but on my own machine." It looks almost identical to ChatGPT's interface — conversation threads, model switching, file uploads, web search, even image generation. But everything runs locally through your Ollama backend.
# One command. That's it.
docker run -d -p 3000:8080 \
--add-host=host.docker.internal:host-gateway \
-v open-webui:/app/backend/data \
--name open-webui \
--restart always \
ghcr.io/open-webui/open-webui:main
# Open http://localhost:3000
# Connect to your Ollama instance
# You now have your own ChatGPT
What makes it special
- Multi-user support: Create accounts for family/team members, each with their own chat history
- RAG built in: Upload PDFs, docs, and text files — ask questions about them
- Web search: Can search the web and cite sources (like ChatGPT's Browse)
- Model management: Download, delete, and switch models from the UI
- Mobile-friendly: Works great on phone browsers — add to home screen for an app-like experience
- Plugin system: Community plugins for image generation, TTS, code execution
The honest take
It's the closest thing to a full ChatGPT replacement you'll find. The UI is polished, updates are frequent (weekly releases), and the community is massive. The only downside is you need Docker, which adds a small learning curve if you've never used it. Learn more about Open WebUI on our tools page.
Difficulty: ⭐⭐ (need Docker, but it's one command)
Cost: Free, open source
3. Jan — The "Just Works" Desktop App
If the words "Docker" and "terminal" make you nervous, Jan is your answer. It's a desktop app. You download it, open it, pick a model, and start chatting. That's the entire setup process.
# Download from jan.ai — available for:
# macOS (Apple Silicon + Intel)
# Windows (x64)
# Linux (AppImage + deb)
# Or via Homebrew on macOS:
brew install --cask jan
Why people love it
- Zero config: No terminal, no Docker, no API keys — download and chat
- Built-in model hub: Browse and download models from inside the app
- Offline first: Designed to work completely without internet
- Clean UI: Minimal, fast, doesn't try to do everything
- Cross-platform: macOS, Windows, Linux
- Local API: Also exposes an OpenAI-compatible API if you want to connect other tools
The trade-off
Jan is intentionally simple. No multi-user support, no RAG, no web search. It's a chat app for local models — and it does that one thing really well. If you need more features, look at Open WebUI. If you want something that "just works" for personal use, Jan is hard to beat.
Difficulty: ⭐ (download and open)
Cost: Free, open source
4. LibreChat — Multi-Provider Power User Setup
LibreChat is for people who want ONE interface for everything — local models via Ollama, cloud APIs from OpenAI/Anthropic/Google, and custom endpoints. It's the Swiss Army knife.
# Clone and run
git clone https://github.com/danny-avila/LibreChat.git
cd LibreChat
cp .env.example .env
# Edit .env to add your API keys (optional — works with Ollama too)
docker compose up -d
# Open http://localhost:3080
# Supports: Ollama, OpenAI, Anthropic, Google, Azure — all in one UI
The power user features
- Multi-provider: Use Ollama for private stuff, Claude for complex reasoning, GPT for coding — all in one window
- Presets: Save model + system prompt combinations for different use cases
- Plugins: Web browsing, DALL-E, code interpreter, Wolfram Alpha
- Multi-user: Full user management with admin controls
- Conversation forking: Branch conversations to try different approaches
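To wire a local Ollama instance into LibreChat, you add a custom endpoint to its `librechat.yaml`. The sketch below shows the general shape of that config — field names are from LibreChat's custom-endpoint schema as I recall it, so verify them against the `librechat.example.yaml` in the repo before relying on this:

```yaml
# librechat.yaml — sketch of a custom endpoint for a local Ollama server.
# Verify field names against the project's librechat.example.yaml.
endpoints:
  custom:
    - name: "Ollama"
      apiKey: "ollama"      # placeholder; Ollama doesn't check API keys
      baseURL: "http://host.docker.internal:11434/v1"
      models:
        default: ["qwen3.5:9b"]
        fetch: true         # ask Ollama for its installed model list
```

The `host.docker.internal` hostname lets LibreChat's container reach Ollama running on your host machine, the same trick the Open WebUI command above uses.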
When to use this
If you want to use local models for privacy-sensitive work but still have access to cloud APIs when you need more power, LibreChat is the bridge. It's also great for teams who want to share a single AI interface with different model access levels.
Difficulty: ⭐⭐⭐ (Docker + config file)
Cost: Free, open source (cloud API costs separate)
5. AnythingLLM — When You Need RAG
AnythingLLM's superpower is document intelligence. Upload your company docs, research papers, legal contracts, or codebase — then ask questions about them. It chunks, embeds, and retrieves relevant context automatically.
# Desktop app — download from anythingllm.com
# Or self-host with Docker:
docker pull mintplexlabs/anythingllm
docker run -d -p 3001:3001 \
--cap-add SYS_ADMIN \
-v anythingllm_storage:/app/server/storage \
mintplexlabs/anythingllm
# Open http://localhost:3001
# Upload documents, create workspaces, RAG out of the box
What sets it apart
- Workspaces: Organize documents into separate knowledge bases
- Multi-format: PDF, DOCX, TXT, CSV, code files, web pages, YouTube transcripts
- Embedding models: Choose your own embedding model for better retrieval
- Agent mode: Can browse the web, run code, and use tools
- Desktop app available: No Docker needed if you prefer a native app
The use case
If "chat with your documents" is what you actually need — not just chatting — AnythingLLM is the pick. Lawyers, researchers, and anyone drowning in PDFs will love it. For general chatting, it's overkill; use Open WebUI instead.
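To demystify the "chunks, embeds, and retrieves" loop that AnythingLLM automates, here's a toy version in Python. It uses bag-of-words overlap as a stand-in for a real embedding model — an illustration of the retrieval idea, not AnythingLLM's actual pipeline:

```python
from collections import Counter
import math

def embed(text: str) -> Counter:
    """Toy 'embedding': word counts. Real RAG uses a neural embedding model."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, chunks: list[str], k: int = 1) -> list[str]:
    """Rank document chunks by similarity to the query; return the top k."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]

chunks = [
    "The lease terminates on December 31 unless renewed in writing.",
    "Payment is due on the first business day of each month.",
    "The tenant is responsible for utilities and minor repairs.",
]
print(retrieve("when does the lease end", chunks))
```

The retrieved chunk gets pasted into the model's context before your question — that's the whole trick. Production tools add smarter chunking, vector databases, and learned embeddings, but the shape of the loop is the same.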
Difficulty: ⭐⭐ (desktop app is easy, Docker is standard)
Cost: Free, open source (cloud version available)
Which Model Should You Run?
The tool is only half the equation. Here's what model to put in it:
| Your Hardware | Best Model | Why |
|---|---|---|
| 8GB RAM, no GPU | Qwen 3.5 2B | Runs on anything, surprisingly capable for its size |
| 16GB RAM or 8GB VRAM | Qwen 3.5 9B ⭐ | The sweet spot. Beats GPT-5 Nano on benchmarks. 262K context. |
| 32GB RAM / M2+ Mac | Llama 3.3 8B or Qwen 3.5 9B | Fast inference, great for coding and reasoning |
| 24GB VRAM (RTX 4090) | Qwen 3 30B or Mistral Large | Near-Claude-level quality for complex tasks |
| Phone / Raspberry Pi | Qwen 3.5 0.8B | It works. Don't expect miracles, but it works. |
My recommendation: Start with `ollama pull qwen3.5:9b`. It's the current king of the "fits on normal hardware" category. Released March 2, 2026 — two days ago — and already the most talked-about local model on r/LocalLLM. For a full walkthrough, check our Qwen 3.5 setup guide.
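If you want to sanity-check whether a model fits your hardware, a rule of thumb is: memory ≈ parameters × bits-per-weight ÷ 8, plus a buffer for the KV cache and runtime overhead. The ~20% overhead below is a rough assumption, not a measured constant:

```python
def approx_model_memory_gb(params_billions: float, bits_per_weight: int,
                           overhead: float = 0.2) -> float:
    """Back-of-envelope memory estimate: weights plus a fudge factor
    for KV cache and runtime overhead (the 20% default is a guess)."""
    weight_bytes = params_billions * 1e9 * bits_per_weight / 8
    return weight_bytes * (1 + overhead) / 1e9  # decimal GB

# Why a 9B model fits in 8GB of VRAM at 4-bit quantization:
print(f"9B @ 4-bit:  ~{approx_model_memory_gb(9, 4):.1f} GB")   # ~5.4 GB
print(f"9B @ 16-bit: ~{approx_model_memory_gb(9, 16):.1f} GB")  # ~21.6 GB
print(f"70B @ 4-bit: ~{approx_model_memory_gb(70, 4):.1f} GB")  # ~42.0 GB
```

This is also why quantization matters so much: the same 9B model that needs a workstation GPU at full 16-bit precision fits on a mid-range card at 4-bit, usually with only a modest quality hit.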
Head-to-Head Comparison
| Feature | Ollama | Open WebUI | Jan | LibreChat | AnythingLLM |
|---|---|---|---|---|---|
| Setup difficulty | ⭐ | ⭐⭐ | ⭐ | ⭐⭐⭐ | ⭐⭐ |
| Pretty UI | ❌ (terminal) | ✅✅ | ✅ | ✅✅ | ✅ |
| Document upload/RAG | ❌ | ✅ | ❌ | ✅ | ✅✅✅ |
| Multi-user | ❌ | ✅ | ❌ | ✅ | ✅ |
| Cloud API support | ❌ | ✅ | ✅ | ✅✅✅ | ✅ |
| Works offline | ✅ | ✅ | ✅ | Partial | ✅ |
| Mobile support | ❌ | ✅ (PWA) | ❌ | ✅ (web) | ❌ |
Real Talk: Can Local AI Replace ChatGPT?
Honestly? For 80% of what most people use ChatGPT for — writing emails, brainstorming, summarizing text, answering questions, basic coding — yes. A Qwen 3.5 9B running locally will handle all of that without breaking a sweat.
For the other 20% — complex multi-step reasoning, advanced code generation, image generation, real-time web browsing — cloud models still have an edge. But that edge is shrinking every month.
The real question isn't "is local AI as good?" It's "is the privacy trade-off worth a small quality gap?"
After a week of 295% uninstall spikes and Pentagon deals, I think a lot of people just answered that question for themselves. If you're also looking to degoogle your phone, pair this with our GrapheneOS setup guide for full digital sovereignty. Concerned about AI agent security? Read our guide on vetting AI agent plugins.
Get Started in 5 Minutes
Install Ollama → pull Qwen 3.5 9B → run Open WebUI. Three commands, and you've got your own private ChatGPT. No account, no API key, no data leaving your machine.
Your conversations are yours. That shouldn't be a radical idea.