Multimodal AI · 100% Local

Qwen 3.5 Vision

Run a multimodal vision-language model entirely in your browser. No server, no API keys — powered by Transformers.js and WebGPU.
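
As a rough illustration of what running on WebGPU involves, a page might first check that the browser exposes a usable GPU adapter and otherwise fall back to WebAssembly. The sketch below is an assumption about how such a check could look, not the demo's actual code; `pickDevice`, `NavigatorLike`, and `GpuLike` are hypothetical names introduced here.

```typescript
// Hypothetical helper: illustrative only, not taken from the demo's source.
type Device = "webgpu" | "wasm";

// Minimal shape of the WebGPU entry point (navigator.gpu) that this sketch needs.
interface GpuLike {
  requestAdapter(): Promise<object | null>;
}

interface NavigatorLike {
  gpu?: GpuLike;
}

// Returns "webgpu" only when an adapter can actually be obtained;
// in every other case it falls back to "wasm".
async function pickDevice(nav: NavigatorLike): Promise<Device> {
  if (!nav.gpu) return "wasm"; // browser does not expose WebGPU at all
  try {
    const adapter = await nav.gpu.requestAdapter();
    return adapter ? "webgpu" : "wasm"; // null means no suitable GPU was found
  } catch {
    return "wasm"; // treat any failure as "WebGPU unavailable"
  }
}
```

In a page, the browser's `navigator` object would be passed in, and the result could then be used as the `device` option when loading the model with Transformers.js. Taking the navigator as a parameter keeps the check easy to exercise outside a browser.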

Vision + Language: Unified Multimodal
201 Languages: Global Coverage
Reasoning: Code · Agents · Visual
Initializing model…
Model weights are cached for future visits.
Ready on WebGPU

Start a conversation

Optionally attach an image, then type your message.
The model runs entirely in your browser.