Ollama & Stable Diffusion AI Guide#

1 HP1GPU - Docker (Ollama Model Inventory)#

1.1 llama3.2:latest (2.0 GB)#

  • Consuming VM: HP1Docker
  • Services:
    • karakeep (Docker)
    • paperless (Docker)
    • n8n (Docker)

1.2 qwen3-vl:8b (6.1 GB)#

  • Consuming VM: HP1Docker
  • Service: paperless-ai (Docker)

1.3 gemma3:12b (8.1 GB)#

  • Consuming VM: HP1GPU
  • Service:
    • open-webui (Docker)

1.4 llama3.2-vision:latest (7.8 GB)#

  • Consuming VM: HP1Docker
  • Service: paperless
  • Status: Inactive (Commented out in compose)

1.5 qwen2.5vl:3b (3.2 GB)#

  • Consuming VM: HP1Docker
  • Service: paperless (Testing)
  • Status: Inactive (Commented out in compose)

1.6 glm-ocr:latest (2.2 GB)#

1.7 gemma4:e4b (9.6 GB)#

1.8 nomic-embed-text:latest (274 MB)#

  • Service: Internal Embeddings / RAG
  • Status: Active (Implicit)
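
A minimal sketch of how a consuming service requests an embedding from this model over the Ollama API (the prompt text here is only a placeholder):

# Request an embedding vector from nomic-embed-text
curl http://localhost:11434/api/embeddings -d '{"model": "nomic-embed-text", "prompt": "example text to embed"}'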

2 Ollama Installation & Network Setup#

2.1 Basic Installation#

# Open firewall port
ufw allow 11434/tcp
# Official Install Script
curl -fsSL https://ollama.com/install.sh | sh

2.2 Enable Local Network Access#

To allow other devices on the network to use the Ollama API, you must change the bind address from 127.0.0.1 to 0.0.0.0.

# Edit service configuration
systemctl edit ollama.service

Add the following override block:

[Service]
Environment="OLLAMA_HOST=0.0.0.0:11434"
Environment="OLLAMA_ORIGINS=*"
Environment="OLLAMA_MODELS=/mnt/ollama/"

Then apply the change:

# Reload and Restart
systemctl daemon-reload && systemctl restart ollama
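
To confirm the server is now reachable over the LAN, the tags endpoint should return the installed model list (replace <server-ip> with this host's address):

# From another device on the network
curl http://<server-ip>:11434/api/tags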

3 Update#

# Re-run the official install script to update the Ollama binary in place
curl -fsSL https://ollama.com/install.sh | sh
# Reload systemd and restart the service so the updated binary is used
sudo systemctl daemon-reload
sudo systemctl restart ollama
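
To confirm the update took effect after the restart:

# Print the installed Ollama version
ollama --version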

4 Multi-GPU Configuration (Tesla T4 & P4)#

If running two different GPUs, you can run two separate Ollama instances on different ports.

4.1 Instance 1: Tesla T4 (16GB) (Port 11434)#

Set CUDA_VISIBLE_DEVICES=0 in the main service.
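
A minimal sketch of that override, assuming the T4 enumerates as GPU index 0 (check the ordering with nvidia-smi):

# Add to the drop-in from section 2.2 (systemctl edit ollama.service)
[Service]
Environment="CUDA_VISIBLE_DEVICES=0"
# Apply the change
systemctl daemon-reload && systemctl restart ollama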

4.2 Instance 2: Tesla P4 (8GB) (Port 11435)#

Create a second service:

sudo nano /etc/systemd/system/ollama-p4.service
[Unit]
Description=Ollama Service (Tesla P4)
After=network-online.target

[Service]
ExecStart=/usr/local/bin/ollama serve
User=ollama
Group=ollama
Restart=always
RestartSec=3
Environment="OLLAMA_HOST=0.0.0.0:11435"
Environment="OLLAMA_ORIGINS=*"
Environment="OLLAMA_MODELS=/mnt/ollama/"
Environment="CUDA_VISIBLE_DEVICES=1"

[Install]
WantedBy=default.target

# Reload systemd and enable the second instance
sudo systemctl daemon-reload
sudo systemctl enable --now ollama-p4

4.3 Verify status#

journalctl -u ollama.service -n 50 --no-pager
journalctl -u ollama-p4.service -n 50 --no-pager
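
Each instance should also answer on its own port, and nvidia-smi shows which GPU a loaded model is running on:

# Each port should return a version response
curl -s http://localhost:11434/api/version
curl -s http://localhost:11435/api/version
# Confirm which GPU each instance uses once a model is loaded
nvidia-smi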

5 LLM Management (Ollama CLI)#

5.1 Basic Commands#

# Pull and Run models
ollama pull llama3.2
ollama run llama3.2
# List and Remove
ollama ls
ollama rm gemma:7b
# Verify GPU usage during a chat
ollama ps
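
The same models are also reachable over the HTTP API, which is how the Docker services above typically connect; a minimal non-streaming example:

# One-shot generation via the REST API
curl http://localhost:11434/api/generate -d '{"model": "llama3.2", "prompt": "Say hello", "stream": false}'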

5.2 Models Library#

  • Lightweight: llama3.2, gemma3:4b, ministral-3:3b
  • Performance: gemma3:12b, qwen3:8b, mistral:7b
  • Coding: qwen2.5-coder:7b

6 Custom Model Storage (External Disk)#

6.1 Format and Mount Partition#

# Format disk with 'ollama' label
mkfs.ext4 /dev/sdb1 -L "ollama"
mkdir -p /mnt/ollama
mount /dev/sdb1 /mnt/ollama
chown -R ollama:ollama /mnt/ollama/
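
A quick sanity check that the mount and ownership are in place before pointing OLLAMA_MODELS at it:

# Verify filesystem, mount point and ownership
df -h /mnt/ollama
ls -ld /mnt/ollama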

6.2 Persistence (/etc/fstab)#

UUID=7fee698e-0940-4b26-8faf-3bf764f8a643 /mnt/ollama ext4 defaults,nofail 0 2
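
The UUID for this entry can be read with blkid, and the entry can be tested without a reboot (assuming the disk is /dev/sdb1 as above; stop the ollama service first if it is already serving models from this mount):

# Get the partition UUID for /etc/fstab
blkid /dev/sdb1
# Test the fstab entry without rebooting
umount /mnt/ollama && mount -a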

6.3 Optimization#

# Set reserved blocks to 0%
tune2fs -m 0 /dev/sdb1

7 Stable Diffusion WebUI Installation#

Requires NVIDIA drivers 550+ and ~10GB disk space.
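
A quick pre-flight check of the driver version and free disk space:

# Driver version must be 550 or newer
nvidia-smi --query-gpu=driver_version --format=csv,noheader
# Confirm ~10GB free where the WebUI will be installed
df -h ~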

7.1 Install Dependencies & Python 3.10#

apt install -y make build-essential libssl-dev zlib1g-dev libbz2-dev libreadline-dev libsqlite3-dev wget curl llvm libncurses5-dev libncursesw5-dev xz-utils tk-dev libffi-dev liblzma-dev git google-perftools
# Setup Pyenv for specific Python version
curl https://pyenv.run | bash
# (Update .bashrc with export paths provided by script)
pyenv install 3.10
pyenv global 3.10
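
For reference, the installer prints ~/.bashrc init lines roughly like the following (use the exact lines from the installer's own output); after reloading the shell, the active interpreter can be confirmed:

# Typical pyenv init for ~/.bashrc (verify against the installer output)
export PYENV_ROOT="$HOME/.pyenv"
export PATH="$PYENV_ROOT/bin:$PATH"
eval "$(pyenv init -)"
# In a new shell, confirm Python 3.10 is active
python --version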

7.2 Download & Start#

mkdir ~/stablediffusion && cd ~/stablediffusion
wget -q https://raw.githubusercontent.com/AUTOMATIC1111/stable-diffusion-webui/master/webui.sh
chmod +x webui.sh
./webui.sh
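
The first run downloads the stable-diffusion-webui repository and its Python dependencies into ~/stablediffusion/stable-diffusion-webui; once it is up, the WebUI listens on port 7860 by default:

# Check that the WebUI is answering (default port 7860)
curl -sI http://localhost:7860 | head -n 1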

7.3 Systemd Service Setup#

nano /usr/lib/systemd/system/stablediffusion.service

[Unit]
Description=Stable Diffusion Webui Service
After=network-online.target

[Service]
ExecStart=/home/marc/stablediffusion/stable-diffusion-webui/webui.sh --listen --api
User=marc
Restart=always
RestartSec=3

[Install]
WantedBy=default.target
systemctl daemon-reload
systemctl enable --now stablediffusion
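
To confirm the service started (the first start can take a while as the virtual environment loads):

# Check service state and recent logs
systemctl status stablediffusion --no-pager
journalctl -u stablediffusion.service -n 50 --no-pager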

7.4 Add Checkpoint (Model)#

cd ~/stablediffusion/stable-diffusion-webui/models/Stable-diffusion/
wget https://huggingface.co/stabilityai/stable-diffusion-2-inpainting/resolve/main/512-inpainting-ema.ckpt
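
After the download, restart the service (or refresh the checkpoint list in the UI) so the new checkpoint appears; since the unit starts the WebUI with --api, the known checkpoints can also be listed over the API:

# Restart to rescan the models directory
systemctl restart stablediffusion
# List checkpoints known to the WebUI
curl -s http://localhost:7860/sdapi/v1/sd-models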

7.5 Allow Obsidian to Access Ollama (Mac Studio)#

launchctl setenv OLLAMA_ORIGINS "app://obsidian.md*"

Restart the Ollama application (quit it from the menu bar and reopen it) for the setting to take effect.
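
The variable can be verified afterwards:

# Should print the allowed origin
launchctl getenv OLLAMA_ORIGINS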