A hybrid reasoning model that runs locally in your browser with WebGPU acceleration.
You are about to load Qwen3-0.6B, a 0.6-billion-parameter hybrid reasoning LLM optimized for in-browser inference. Everything runs entirely in your browser using 🤗 Transformers.js and ONNX Runtime Web, so no data is ever sent to a server. Once the model has been downloaded, it can even be used offline. The source code for the demo is available on GitHub.
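Loading a model like this with Transformers.js boils down to creating a text-generation pipeline targeting the WebGPU backend. The sketch below illustrates the general pattern; the exact model id (`onnx-community/Qwen3-0.6B-ONNX`), quantization setting, and generation parameters are assumptions for illustration, not necessarily what this demo uses.

```javascript
import { pipeline, TextStreamer } from "@huggingface/transformers";

// Create a text-generation pipeline that runs on WebGPU.
// Model id and dtype are illustrative assumptions.
const generator = await pipeline(
  "text-generation",
  "onnx-community/Qwen3-0.6B-ONNX",
  { device: "webgpu", dtype: "q4f16" },
);

// Chat-style input; the pipeline applies the model's chat template.
const messages = [{ role: "user", content: "Why is the sky blue?" }];

// Stream tokens to the console as they are generated.
const output = await generator(messages, {
  max_new_tokens: 512,
  streamer: new TextStreamer(generator.tokenizer, { skip_prompt: true }),
});

console.log(output[0].generated_text.at(-1).content);
```

Because the model weights are fetched once and cached by the browser, subsequent runs of a pipeline like this work without a network connection.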
Reason
Disclaimer: Generated content may be inaccurate or false.