Setting Up Ollama with Open WebUI on a VPS Using Docker Compose
Running a large language model on a VPS gives you a persistent, always-on AI endpoint that you can reach from any device — your laptop, your phone, or your home server. I've been doing exactly this for the past several months, and the combination of Ollama (for model serving) and Open WebUI (for the browser-based chat interface) is genuinely the cleanest self-hosted AI stack I've used. This tutorial walks you through the complete Docker Compose setup, from a fresh VPS to a working chat UI with a real model loaded.
Choosing the Right VPS
This is the part most tutorials skip, and it matters a lot. Ollama needs RAM more than it needs CPU cores. A model like llama3.2:3b comfortably fits in 4 GB of RAM. For mistral:7b or llama3.1:8b, you want at least 8–10 GB. For anything in the 13B range, you're looking at 16 GB minimum and inference will be slow without a GPU.
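As a rough pre-flight check, the sizing guidance above can be turned into a small script. This is only a sketch: the function name suggest_model_tier is made up for illustration, the thresholds simply restate the figures in this section, and it assumes a Linux host that exposes /proc/meminfo.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Ollama): maps total RAM in whole GB
# to the model sizes discussed above.
suggest_model_tier() {
  local ram_gb=$1
  if [ "$ram_gb" -ge 16 ]; then
    echo "up to 13B (slow without a GPU)"
  elif [ "$ram_gb" -ge 8 ]; then
    echo "7B-8B, e.g. mistral:7b or llama3.1:8b"
  elif [ "$ram_gb" -ge 4 ]; then
    echo "3B, e.g. llama3.2:3b"
  else
    echo "under 4 GB: too small for the models covered here"
  fi
}

# MemTotal is reported in kB. Round to the nearest GB because an "8 GB"
# VPS usually reports slightly less than 8 GB of usable memory.
total_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo 2>/dev/null)
total_gb=$(( (${total_kb:-0} + 524288) / 1048576 ))
echo "Detected about ${total_gb} GB RAM: $(suggest_model_tier "$total_gb")"
```

Treat the result as a starting point rather than a guarantee, since the operating system and Open WebUI also need memory.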
I recommend starting with a DigitalOcean Droplet. Their Premium AMD Droplets pair recent AMD CPUs with fast NVMe storage, which makes a real difference for model loading times. A 4 vCPU / 8 GB RAM Droplet runs around $48/month at the time of writing and handles 7B models fine for personal use.
Whatever provider you choose, make sure you're running Ubuntu 22.04 or 24.04 LTS. The rest of this tutorial assumes that. You'll also want Docker and Docker Compose installed — if you haven't done that yet, run:
curl -fsSL https://get.docker.com | sh
sudo usermod -aG docker $USER
newgrp docker
The newgrp command applies the new group to your current shell only. Log out and back in so the membership takes effect in every session.
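Before moving on, it can help to confirm that the install worked and that your shell can use Docker without sudo. A minimal sketch: require_cmd is a hypothetical helper name, while the docker and id commands are standard.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds only if the named command is on PATH.
require_cmd() {
  command -v "$1" >/dev/null 2>&1
}

if require_cmd docker; then
  docker --version          # Docker Engine version
  docker compose version    # Compose v2 plugin (two words, no hyphen)
  # Prints "docker" if the current shell already carries the group.
  id -nG | tr ' ' '\n' | grep -x docker \
    || echo "current shell is not in the docker group yet"
else
  echo "docker not found on PATH; re-run the install script"
fi
```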
The Docker Compose File
Here's the complete docker-compose.yml I use. I prefer to keep Ollama and Open WebUI on an isolated Docker network so they can talk to each other without exposing Ollama's API port (11434) to the internet. Open WebUI is the only thing that needs to be reachable from outside.
mkdir -p ~/ollama-stack && cd ~/ollama-stack
nano docker-compose.yml
Paste the following into that file:
networks:
  ai-net:
    driver: bridge
volumes:
  ollama_data:
  openwebui_data:
services:
  ollama:
    image: ollama/ollama:latest
    container_name: ollama
    restart: unless-stopped
    networks:
      - ai-net
    volumes:
      - ollama_data:/root/.ollama
    # Publish the API on localhost only; do NOT bind to 0.0.0.0
    ports:
      - "127.0.0.1:11434:11434"
    environment:
      - OLLAMA_KEEP_ALIVE=24h
    deploy:
      resources:
        limits:
          # Adjust so the cap sits below your VPS's total RAM
          memory: 10g
  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    container_name: open-webui
    restart: unless-stopped
    networks:
      - ai-net
    volumes:
      - openwebui_data:/app/backend/data
    ports:
      - "3000:8080"
    environment:
      - OLLAMA_BASE_URL=http://ollama:11434
      - WEBUI_SECRET_KEY=change_this_to_a_random_string
    depends_on:
      - ollama
Replace change_this_to_a_random_string in WEBUI_SECRET_KEY with an actual random value before you deploy. You can generate one with openssl rand -hex 32. This key signs session tokens, so if you leave it as a placeholder, anyone who knows the default can forge sessions.
Notice that I bind Ollama only to 127.0.0.1:11434, not 0.0.0.0:11434. Ollama's API has no authentication by default. If you bind it to all interfaces, anyone who finds port 11434 open can pull models, run inference, and burn through your server's resources. Open WebUI reaches Ollama by container name (ollama:11434) over the internal ai-net network, so the localhost binding doesn't affect that communication.
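If you'd rather not paste the WEBUI_SECRET_KEY value by hand, the placeholder swap can be scripted. This is a sketch, not the only way to do it: set_secret_key is a hypothetical helper, and it assumes GNU sed (as shipped with Ubuntu) and that the placeholder string appears exactly as written in the compose file.

```shell
#!/usr/bin/env bash
# Hypothetical helper: replace the placeholder secret in a compose file
# with 32 random bytes, hex-encoded (64 characters).
set_secret_key() {
  local file=$1 secret
  secret=$(openssl rand -hex 32) || return 1
  # GNU sed in-place edit. Hex output contains no characters that are
  # special to sed, so a plain substitution is safe here.
  sed -i "s/change_this_to_a_random_string/${secret}/" "$file"
}

if [ -f docker-compose.yml ]; then
  set_secret_key docker-compose.yml
  if grep -q change_this_to_a_random_string docker-compose.yml; then
    echo "placeholder is still present"
  else
    echo "placeholder replaced"
  fi
fi
```

Run it from ~/ollama-stack before the first docker compose up, so no sessions are ever signed with the placeholder.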
Starting the Stack and Pulling a Model
Bring everything up in detached mode:
cd ~/ollama-stack
docker compose up -d
# Watch logs to confirm both containers start cleanly
docker compose logs -f
Once Ollama is running, pull a model. I'd suggest starting with llama3.2:3b if you're on an 8 GB droplet — it's fast and surprisingly capable:
# Pull into the running Ollama container
docker exec -it ollama ollama pull llama3.2:3b
# Or if you want the larger Mistral 7B model
docker exec -it ollama ollama pull mistral:7b
# Confirm the model is available
docker exec -it ollama ollama list
The pull will take a few minutes depending on your VPS's network speed. Ollama stores the model weights in the ollama_data Docker volume, so they'll persist across container restarts.
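Once the pull finishes, a quick request against Ollama's HTTP API confirms the model actually loads and answers. The sketch below assumes the localhost port binding from the compose file. build_generate_body and ask_ollama are hypothetical helper names, and the body builder does no JSON escaping, so keep the prompt free of quotes and backslashes.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build the JSON body for Ollama's /api/generate.
# Assumes model and prompt contain no quotes or backslashes.
build_generate_body() {
  printf '{"model":"%s","prompt":"%s","stream":false}' "$1" "$2"
}

# Hypothetical helper: send one non-streaming prompt to the local API.
ask_ollama() {
  curl -fsS http://127.0.0.1:11434/api/generate \
    -H 'Content-Type: application/json' \
    -d "$(build_generate_body "$1" "$2")"
}

# Usage, run on the VPS itself (the port is bound to localhost only):
#   ask_ollama llama3.2:3b "Reply with the single word: ready"
```

The reply is a JSON object whose response field holds the generated text. The first call may be slow while the model loads into memory.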
Opening Up the Firewall
You need port 3000 open for Open WebUI. If you're using UFW (which I recommend on any VPS):
sudo ufw allow 3000/tcp comment "Open WebUI"
sudo ufw status
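Do not open port 11434 in UFW. It should stay closed to external traffic. Keep in mind that ports published by Docker are wired up through iptables rules that bypass UFW, so the 127.0.0.1 binding above is what actually keeps 11434 private, and port 3000 becomes reachable as soon as the container publishes it.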
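Since the localhost binding is what keeps port 11434 private, a quick look at how each container's port is published can catch a mistake early. A sketch, assuming the container names from the compose file: bind_scope is a hypothetical helper, and the parsing relies on the line format that docker port prints, shown in the comment.

```shell
#!/usr/bin/env bash
# Hypothetical helper: classify a host binding such as "127.0.0.1:11434".
bind_scope() {
  case "$1" in
    127.*|"[::1]"*|localhost:*) echo "loopback" ;;
    *) echo "public" ;;
  esac
}

if command -v docker >/dev/null 2>&1; then
  for name in ollama open-webui; do
    # "docker port" prints lines like: 11434/tcp -> 127.0.0.1:11434
    docker port "$name" 2>/dev/null | while read -r proto arrow addr; do
      echo "$name $addr $(bind_scope "$addr")"
    done
  done
fi
```

With the compose file from this tutorial, ollama should show up as loopback and open-webui as public.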
With Caddy as the reverse proxy, a Caddyfile containing your.domain.com { reverse_proxy localhost:3000 } gets you automatic Let's Encrypt TLS in about 60 seconds. Check out our Caddy vs Nginx vs Traefik comparison if you haven't picked a reverse proxy yet.
Logging In to Open WebUI
Navigate to http://your-vps-ip:3000 in your browser. The first account you create becomes the admin account — do this immediately after deploying so a stranger doesn't claim it first. After logging in, go to Settings → Models and you should see the model you pulled listed and ready to use.
Open WebUI has a solid feature set out of the box: conversation history, multiple model switching, system prompt templates, RAG document uploads, and a basic API proxy. For personal use, it genuinely replaces most of what I used to pay for with commercial chat frontends.
Keeping Everything Up to Date with Watchtower
Both Ollama and Open WebUI ship updates frequently. I add Watchtower to the same compose file to handle automatic image updates:
  watchtower:
    image: containrrr/watchtower:latest
    container_name: watchtower
    restart: unless-stopped
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
    environment:
      - WATCHTOWER_CLEANUP=true
      # Six-field cron: second minute hour day-of-month month day-of-week
      - WATCHTOWER_SCHEDULE=0 0 4 * * *
This runs at 4 AM every day, pulls new image versions if available, restarts affected containers, and cleans up old images. Add this block to your services: section in docker-compose.yml and run docker compose up -d watchtower to start it without restarting the other containers.
Resource Monitoring and Troubleshooting
If inference feels slow or the container is getting OOM-killed, check resource usage:
# Live resource usage for all containers
docker stats
# Check if Ollama OOM'd
docker inspect ollama | grep -i oom
# Check Open WebUI logs for connection errors
docker compose logs open-webui --tail=50
The most common issue I see is open-webui logging Connection refused when trying to reach Ollama. This almost always means the OLLAMA_BASE_URL environment variable is wrong or Ollama hasn't finished starting yet. Give Ollama 10–15 seconds after docker compose up before expecting Open WebUI to be fully functional. The depends_on entry in the compose file only controls start order; it doesn't wait for Ollama to be ready.
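Instead of guessing how long to wait, you can poll Ollama until it answers. The following is a minimal sketch: wait_for_url is a hypothetical helper, and it assumes curl is installed on the host and that Ollama is published on 127.0.0.1:11434 as in the compose file.

```shell
#!/usr/bin/env bash
# Hypothetical helper: poll a URL about once per second until it answers
# with HTTP success or the timeout (in attempts/seconds) runs out.
wait_for_url() {
  local url=$1 timeout=${2:-30} waited=0
  until curl -fsS --max-time 2 -o /dev/null "$url" 2>/dev/null; do
    waited=$((waited + 1))
    if [ "$waited" -ge "$timeout" ]; then
      return 1
    fi
    sleep 1
  done
}

# Usage on the VPS: block until Ollama's version endpoint responds.
#   wait_for_url http://127.0.0.1:11434/api/version 30 && echo "Ollama is ready"
```

If the call times out, check docker compose logs ollama before looking at Open WebUI.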
Wrapping Up
With this setup you have a persistent, private LLM endpoint on a VPS that you fully control. No usage caps, no data leaving your server, and no subscription fees beyond the VPS cost itself. The next logical step is putting a real domain and HTTPS in front of it — I'd point you toward the VPS hardening guide and the Caddy reverse proxy tutorial on this site. Once you've got TLS sorted, consider locking down Open WebUI further with Authelia if you want SSO and MFA in front of the login screen.