Moondream WebGPU

A private and powerful multimodal AI chatbot that runs locally in your browser.

You are about to load moondream2, a 1.86-billion-parameter vision-language model (VLM) optimized for inference on the web. Once downloaded, the model (1.8 GB) is cached and reused when you revisit the page.
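
For readers curious how a "download once, reuse on later visits" flow can work, here is a minimal cache-first sketch in TypeScript. It illustrates the general pattern only and is not the page's actual loading code. The names `fetchWithCache`, `CacheLike`, `ResponseLike` and `FetchLike` are hypothetical. The small interfaces mirror the parts of the browser Cache API (`match`, `put`) and `Response` (`ok`, `clone`) that the sketch relies on.

```typescript
// Hypothetical structural types covering only what this sketch needs.
// In a browser, the object returned by `caches.open(...)` and the global
// `fetch` are meant to fit these shapes.
interface ResponseLike {
  ok: boolean;
  clone(): ResponseLike;
}

interface CacheLike {
  match(url: string): Promise<ResponseLike | undefined>;
  put(url: string, response: ResponseLike): Promise<void>;
}

type FetchLike = (url: string) => Promise<ResponseLike>;

// Cache-first lookup: return the stored copy if there is one, otherwise
// download the file and store a clone for the next visit.
async function fetchWithCache(
  url: string,
  cache: CacheLike,
  fetchFn: FetchLike,
): Promise<{ response: ResponseLike; fromCache: boolean }> {
  const cached = await cache.match(url);
  if (cached) {
    return { response: cached, fromCache: true };
  }

  const response = await fetchFn(url);
  if (response.ok) {
    // A response body can be read only once, so a clone goes into the
    // cache and the original goes back to the caller.
    await cache.put(url, response.clone());
  }
  return { response, fromCache: false };
}
```

In a browser, a call could look like `fetchWithCache(modelUrl, await caches.open("model-cache"), fetch)`, where `modelUrl` and the cache name `"model-cache"` are placeholders rather than values used by this page.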

Everything runs directly in your browser using 🤗 Transformers.js and ONNX Runtime Web, meaning your conversations aren't sent to a server. You can even disconnect from the internet after the model has loaded!
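
Because the model runs in the browser on WebGPU, a page like this one may want to confirm that WebGPU is available before starting a 1.8 GB download. The sketch below assumes the standard `navigator.gpu.requestAdapter()` entry point, which resolves to `null` when no suitable adapter is found. `canUseWebGPU`, `GpuLike` and `NavigatorLike` are hypothetical names introduced for illustration and are not part of this page's code.

```typescript
// Hypothetical minimal shapes so the check does not depend on WebGPU
// type definitions being installed.
interface GpuLike {
  requestAdapter(): Promise<unknown>;
}

interface NavigatorLike {
  gpu?: GpuLike;
}

// Resolves to true only if the browser exposes WebGPU and can hand back
// an adapter; any error is treated as "not available".
async function canUseWebGPU(nav: NavigatorLike): Promise<boolean> {
  if (!nav.gpu) {
    return false; // WebGPU is not exposed in this browser or context.
  }
  try {
    const adapter = await nav.gpu.requestAdapter();
    return adapter != null; // null means no suitable adapter was found.
  } catch {
    return false;
  }
}
```

In a browser, this could be called as `canUseWebGPU(navigator as NavigatorLike)`, with the result deciding whether to offer the download or show an "unsupported browser" message.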

Disclaimer: Generated content may be inaccurate or false.