Qwen3 WebGPU

A hybrid reasoning model that runs locally in your browser with WebGPU acceleration.
You are about to load Qwen3-0.6B, a 0.6-billion-parameter hybrid reasoning LLM optimized for in-browser inference. Everything runs entirely in your browser using 🤗 Transformers.js and ONNX Runtime Web, so no data is ever sent to a server. Once the model has been downloaded, it can even be used offline. The source code for the demo is available on GitHub.


Disclaimer: Generated content may be inaccurate or false.