Qwen3 Omni Local AI Setup Guide



Qwen3 Omni is an exciting new multimodal LLM that handles audio, video, text, and images, and it can run fully locally on a modest AI rig. Here are the commands to get you up and running. This is a work in progress, so it will be filled in with more details later, but I'm publishing it as soon as I have the basic commands working on a local server. We'll run a Gradio interface so you can fully experience what the model offers; Qwen3 Omni does not yet fully support Open WebUI, so Gradio it is. We will also be using Transformers rather than vLLM to run the model, as vLLM doesn't support audio generation.

Accompanying Video

This guide assumes you have followed along with the part 1 and 2 guides to build a minimally viable local AI base system backup that can be deployed fast. This is a guide in our series on setting up a local AI homelab with Proxmox 9. YOU HAVE TO FOLLOW THE GUIDES SEQUENTIALLY, at least through Step 2, the Llama.cpp and Unsloth setup guide, to follow along. At the end of that guide we create a backup you will see me using in ALMOST EVERY future guide, and you need it too, as it makes rapid deployment easy.

Those guides here:

Step 1 Ollama and OpenWEBUI setup guide

Step 2 Llama.cpp and Unsloth setup guide


Restore from your backup to create a new LXC. Set a NEW IP address before you start the LXC container; otherwise the address copied from the backup will likely conflict with an existing IP on your network and cause issues.
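If you prefer the Proxmox host shell over the GUI, the address can be set before first boot with pct. A minimal sketch; the container ID 105, bridge vmbr0, and the 192.168.1.x addresses are placeholders for your own values:

```shell
# Run on the Proxmox host, NOT inside the container.
# Assumptions: VMID 105, bridge vmbr0, 192.168.1.71 is free on your LAN.
pct set 105 -net0 name=eth0,bridge=vmbr0,ip=192.168.1.71/24,gw=192.168.1.1
```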

Enter the shell of the container and run

apt update && apt upgrade -y && apt install ffmpeg git pipenv -y

Clone the Qwen3 Omni vLLM repository

git clone -b qwen3_omni https://github.com/wangxiongts/vllm.git
cd vllm

Create an env for Qwen3 Omni
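We installed pipenv above, but any virtual environment works here. A minimal sketch using the standard library's venv module; the environment name and path are my own choices, not required:

```shell
# Create and activate a virtual environment for the install steps below.
# The name "qwen3-omni" and its location are arbitrary.
python3 -m venv ~/qwen3-omni
source ~/qwen3-omni/bin/activate
```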

Install Requirements

pip install -r requirements/build.txt
pip install -r requirements/cuda.txt

Build vLLM

export VLLM_PRECOMPILED_WHEEL_LOCATION=https://wheels.vllm.ai/a5dd03c1ebc5e4f56f3c9d3dc0436e9c582c978f/vllm-0.9.2-cp38-abi3-manylinux1_x86_64.whl
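The environment variable above only takes effect when vLLM is actually installed from the checkout. Assuming the editable-install flow from the vLLM build docs, run this from inside the vllm directory:

```shell
# Installs vLLM using the precompiled wheel pointed to by
# VLLM_PRECOMPILED_WHEEL_LOCATION instead of compiling CUDA kernels locally.
pip install -e .
```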

pip install git+https://github.com/huggingface/transformers
pip install accelerate
pip install qwen-omni-utils -U
pip install -U flash-attn --no-build-isolation
pip install gradio==5.44.1 gradio_client==1.12.1 soundfile==0.13.1

vLLM is now installed, but next we will clone a repo that provides the Gradio interface.

cd ..

git clone https://github.com/QwenLM/Qwen3-Omni.git

cd Qwen3-Omni

Now we will edit the Gradio demo script, web_demo.py, so we can access the interface from our local network as a service. Open it in your editor of choice (I'm using nano).

Arrow down to the bottom of the file and look for the following line

qwen3 omni gradio settings

Change 127.0.0.1 to 0.0.0.0, then press Ctrl+X, Y, and Enter to save. This binds the server to your LXC's IP address, so make sure you have a unique IP address as mentioned above. You can leave the port as-is or change it if you want.
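If you'd rather skip the manual edit, the same change can be made non-interactively; a sketch, assuming the address appears literally as 127.0.0.1 in web_demo.py:

```shell
# Rewrite the bind address in place so Gradio listens on all interfaces.
sed -i 's/127\.0\.0\.1/0.0.0.0/' web_demo.py
```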

Run the gradio interface.

python web_demo.py -c Qwen/Qwen3-Omni-30B-A3B-Instruct --use-transformers --generate-audio --flash-attn2

Browse to your server's address and port, and make sure you are using plain http, not https. Mine, for example, is http://192.168.1.71:8901

Next we will adjust a browser setting to enable some functionality. I am using Brave, so this will likely differ in other browsers.

Enable webcamera and microphone in Brave

Open up a new tab and go here

brave://flags/#unsafely-treat-insecure-origin-as-secure

Enable "Insecure origins treated as secure", then enter http://IP:PORT in the text box.

You will have to restart your browser now, but once you do, clicking the "allow microphone" button in the Gradio interface will work. There will also be an annoying nag bar at the top of the browser on startup now; I don't know how to disable that, sorry, but drop comments on the video PLEASE!

A quick sample of the quality, given the following image and the prompt "please explain the diagram I uploaded":


Server Rack Redo Diagram