The Ultimate Proxmox AI Appliance: A Step-by-Step Guide to Ollama in an LXC Container
Welcome, Proxmox users and homelab enthusiasts! If you’re looking to dive into the world of self-hosted AI without the overhead of a full virtual machine, you’ve come to the right place. This guide provides a detailed walkthrough for setting up an efficient, low-overhead AI server using the popular Ollama platform inside a Proxmox LXC (Linux Container). We’ll cover everything from basic setup to the more advanced topic of GPU passthrough for maximum performance.
Running Ollama in an LXC container is an excellent way to manage resources efficiently. LXCs share the host kernel, resulting in significantly less memory and CPU usage compared to a traditional VM, making it perfect for always-on homelab services.
Prerequisites
Before we begin, please ensure you have the following:
- A stable, running Proxmox VE installation.
- Basic familiarity with the Proxmox web UI and shell.
- An Ubuntu or Debian LXC template downloaded in your Proxmox storage. We recommend Ubuntu 22.04 for this guide.
- (Optional but highly recommended) An NVIDIA GPU installed in your Proxmox host if you plan to use GPU acceleration.
Step 1: Creating the LXC Container
Our first step is to create a dedicated, unprivileged LXC container that will house our Ollama installation. This provides a secure and isolated environment.
- Create the CT: In the Proxmox UI, click “Create CT”.
- General: Assign a hostname (e.g., `ollama-lxc`) and a secure password. Leave the “Unprivileged container” box checked; uncheck it only if you are an advanced user and understand the security implications. For GPU passthrough, an unprivileged container is more secure and is the recommended method.
- Template: Select the Ubuntu or Debian template you downloaded earlier.
- Disks: Allocate at least 20GB of disk space. Remember that AI models can be several gigabytes each, so plan accordingly.
- CPU: Assign at least 2-4 cores.
- Memory: A minimum of 4GB of RAM is recommended. If you plan on running larger models, 8GB or 16GB is better.
- Network: Configure a static IP address or use DHCP as per your network setup.
- Features: This is a critical step. Under the “Features” tab, enable both Nesting and keyctl. Nesting is generally needed for systemd-based services such as Ollama to run correctly inside an unprivileged container, and keyctl is required if you later run Docker or other containerized applications inside the LXC.
Once you’ve reviewed the summary, finish the creation process and start your new container.
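If you prefer the command line or want a repeatable setup, the same container can be created from the Proxmox host shell with `pct`. This is a minimal sketch; the CT ID (200), storage names, and template filename below are assumptions — adjust them to match your environment:

```shell
# Create an unprivileged Ubuntu 22.04 container with nesting and keyctl
# enabled (CT ID, storage, and template name are examples)
pct create 200 local:vztmpl/ubuntu-22.04-standard_22.04-1_amd64.tar.zst \
  --hostname ollama-lxc \
  --unprivileged 1 \
  --features nesting=1,keyctl=1 \
  --cores 4 \
  --memory 8192 \
  --rootfs local-lvm:20 \
  --net0 name=eth0,bridge=vmbr0,ip=dhcp

# Start the container
pct start 200
```

Either route produces the same result; the UI simply fills in these options for you.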
Step 2: Configuring GPU Passthrough (Optional but Recommended)
To get the best performance, you’ll want to pass your host’s GPU through to the LXC container. This allows Ollama to use the GPU for hardware acceleration, drastically speeding up model inference.
Note: This process can be complex and may vary based on your specific hardware.
On the Proxmox Host
First, we need to tell Proxmox to make the GPU devices available to the container. Open a shell on your Proxmox host and edit the LXC’s configuration file, located at `/etc/pve/lxc/YOUR_CT_ID.conf` (replace YOUR_CT_ID with your container’s numeric ID). Add the following lines to the end of the file to map the NVIDIA devices and the rendering device into the container:
# For NVIDIA GPU Passthrough
lxc.cgroup2.devices.add: c 195:* rwm
lxc.cgroup2.devices.add: c 226:* rwm
lxc.mount.entry: /dev/nvidia0 dev/nvidia0 none bind,optional,create=file
lxc.mount.entry: /dev/nvidiactl dev/nvidiactl none bind,optional,create=file
lxc.mount.entry: /dev/nvidia-uvm dev/nvidia-uvm none bind,optional,create=file
lxc.mount.entry: /dev/nvidia-uvm-tools dev/nvidia-uvm-tools none bind,optional,create=file
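The major numbers 195 and 226 correspond to the NVIDIA character devices and the DRM render devices on most systems, but it’s worth confirming them on your own host before relying on the rules above:

```shell
# List the NVIDIA and DRI device nodes on the Proxmox host; the number
# before the comma in the output is the major number used in the
# lxc.cgroup2.devices.add rules
ls -l /dev/nvidia* /dev/dri/*
```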
Inside the LXC Container
With the device passed through, you now need to install the appropriate NVIDIA drivers *inside* the container so that Ollama can use them. The exact drivers will depend on the GPU you have.
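One common approach (a sketch, not the only way) is to install the same driver version as the host inside the container, skipping the kernel module build since the container shares the host’s kernel. The version number and download URL below are examples; the version must match the driver already running on your Proxmox host (check with `nvidia-smi` on the host):

```shell
# Inside the container: install only the userspace driver components.
# 550.54.14 is an example version; it must match the host's driver.
wget https://us.download.nvidia.com/XFree86/Linux-x86_64/550.54.14/NVIDIA-Linux-x86_64-550.54.14.run
chmod +x NVIDIA-Linux-x86_64-550.54.14.run

# --no-kernel-module skips building the kernel driver, which the
# container cannot (and should not) load
./NVIDIA-Linux-x86_64-550.54.14.run --no-kernel-module

# Verify the container can see the GPU
nvidia-smi
```

If `nvidia-smi` inside the container shows your GPU, the passthrough is working and Ollama will be able to use it.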
Step 3: Installing Ollama in the LXC
Now for the main event! Open the console for your newly created LXC container from the Proxmox UI.
First, let’s update the system to ensure all packages are current:
apt update && apt upgrade -y
Next, install Ollama using their official one-line installation script. This command downloads and executes the script, which handles the installation of the Ollama binary and sets up a systemd service to run it automatically.
curl -fsSL https://ollama.com/install.sh | sh
After the installation completes, the Ollama service will be running in the background.
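You can confirm the service and binary are working before pulling any models:

```shell
# Check that the systemd service is active
systemctl status ollama --no-pager

# Confirm the binary responds
ollama --version

# The API should answer on localhost (default port 11434)
curl http://localhost:11434/api/version
```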
Step 4: Running Your First AI Model
With Ollama installed, you’re ready to download and interact with a large language model. It’s as simple as a single command.
Let’s download and run Meta’s Llama 3, a powerful and popular model.
# This command will download the model (this may take some time)
ollama pull llama3
# After the download is complete, you can run it interactively
ollama run llama3
You can now chat with the AI directly from your container’s command line! To exit, type `/bye`.
Exposing Ollama to Your Network
By default, the Ollama API is only accessible from within the container (localhost). To allow other applications on your network (like a web UI) to connect to it, you need to configure it to listen on all network interfaces.
Edit the systemd service to set the OLLAMA_HOST environment variable:
systemctl edit ollama.service
Add the following lines, save the file, and exit the editor:
[Service]
Environment="OLLAMA_HOST=0.0.0.0"
Then, reload the systemd daemon and restart the Ollama service to apply the changes:
systemctl daemon-reload
systemctl restart ollama
Your Ollama API is now accessible on port 11434 of your LXC container’s IP address.
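Other machines on your network can now query the API directly. As a quick sketch (the IP address 192.168.1.50 is an assumption; substitute your container’s actual address):

```shell
# One-shot, non-streaming generation request against the Ollama REST API
curl http://192.168.1.50:11434/api/generate -d '{
  "model": "llama3",
  "prompt": "Why is the sky blue?",
  "stream": false
}'
```

This is the same endpoint that web front-ends such as Open WebUI use under the hood.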
Conclusion
Congratulations! You now have an incredibly efficient, self-hosted AI appliance running in a Proxmox LXC container. This setup provides a powerful and scalable foundation for exploring the exciting world of local large language models. From here, you can install a web interface like Open WebUI, integrate Ollama with home automation systems, or start developing your own AI-powered applications.