Camera Classifier
Point your camera at things and let AI tell you what it sees
Inspired by Silicon Valley's iconic "Not Hotdog" app, this experiment uses your device camera and a machine learning model to classify objects in real-time. All processing happens locally in your browser using 🤗 Transformers.js.
How It Works
- Camera Access: Uses the MediaDevices API to access your camera
- Image Classification: Runs a Vision Transformer (ViT) model to identify objects
- Real-time Processing: Analyzes frames from your camera feed
- Privacy First: Everything runs locally - no images are uploaded
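The pieces above fit together in a short capture-and-classify loop. Here's a minimal sketch (function names and the CDN URL are illustrative, not this experiment's actual source); the browser-only parts are kept inside one function so the pure helper works anywhere:

```javascript
// Pure helper: keep the top-k predictions from a classifier's output.
function topK(predictions, k = 3) {
  return [...predictions].sort((a, b) => b.score - a.score).slice(0, k);
}

// Browser-only sketch: open the camera and classify frames with Transformers.js.
async function startClassifier(video, onResult) {
  // MediaDevices API: prefer the rear camera when one exists.
  const stream = await navigator.mediaDevices.getUserMedia({
    video: { facingMode: "environment" },
  });
  video.srcObject = stream;
  await video.play();

  // Transformers.js pipeline; the model is downloaded once, then cached.
  const { pipeline } = await import(
    "https://cdn.jsdelivr.net/npm/@huggingface/transformers"
  );
  const classify = await pipeline("image-classification");

  // Copy each frame to a canvas and feed it to the model as an image.
  const canvas = document.createElement("canvas");
  const loop = async () => {
    canvas.width = video.videoWidth;
    canvas.height = video.videoHeight;
    canvas.getContext("2d").drawImage(video, 0, 0);
    const results = await classify(canvas.toDataURL("image/jpeg"));
    onResult(topK(results));
    requestAnimationFrame(loop);
  };
  loop();
}
```

No frames or pixels ever leave the page: `getUserMedia` hands the stream straight to a canvas, and inference runs in the same tab.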
Live Demo
The Silicon Valley Reference
In the HBO show Silicon Valley, Jian-Yang creates "SeeFood", an app that was supposed to identify different types of food but could only distinguish between hot dogs and "not hot dogs." It's played for laughs, but it neatly demonstrates how image classification works in practice!
This experiment uses a more capable model (a Vision Transformer) that can identify a thousand different object categories, but the concept is the same: point, classify, and get results instantly.
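In that spirit, a "Not Hotdog" verdict is just a filter over the general classifier's output. A hypothetical sketch (the label text follows the ImageNet convention of comma-separated synonyms, e.g. "hotdog, hot dog, red hot"):

```javascript
// Hypothetical reduction of a many-class prediction list to a binary verdict.
function isHotdog(predictions) {
  return predictions.some((p) => {
    const label = p.label.toLowerCase();
    return label.includes("hotdog") || label.includes("hot dog");
  });
}
```

Any n-way classifier can be collapsed this way: pick the classes you care about and treat everything else as "not".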
Technical Details
- Model: Vision Transformer (ViT) - image classification
- Framework: Transformers.js with ONNX Runtime
- Camera API: MediaDevices.getUserMedia()
- Processing: Client-side inference using WebGPU / WASM
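The WebGPU / WASM choice can be made with a feature check: browsers that support WebGPU expose `navigator.gpu`, and everything else falls back to the WASM backend. A small sketch (the `pipeline` options comment reflects Transformers.js's `device` option; the helper name is made up):

```javascript
// Prefer WebGPU when the browser exposes it, otherwise fall back to WASM.
// Takes the navigator object as a parameter so it can be tested outside a browser.
function pickDevice(nav = globalThis.navigator) {
  return nav && "gpu" in nav ? "webgpu" : "wasm";
}

// Browser usage (not run here):
// const classify = await pipeline("image-classification", null, {
//   device: pickDevice(),
// });
```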
Try It Yourself
Here's the source of this experiment; you can fork it and try it yourself (or copy and paste the code and run it locally):
Last updated: 2026-03-20T13:45:54.632Z