NanoVDR: Visual Document Retrieval Demo
How it works: Type a text query below. A small 69M-parameter DistilBERT encoder, running on CPU,
maps your query into the embedding space of a 2B-parameter VLM teacher (Qwen3-VL-Embedding-2B).
The document page embeddings were pre-computed offline by the teacher.
Retrieval is a simple dot product between the query embedding and each page embedding, so no vision model runs at query time.
Corpus: 1,360 pages from ViDoRe v3 Computer Science (academic papers, slides, diagrams).
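
The dot-product retrieval step described above can be sketched roughly as follows. This is a minimal illustration, not the demo's actual code. The function name `retrieve_top_k`, the embedding dimension of 64, and the random unit-normalized vectors are all assumptions that stand in for the real student query embedding and the teacher's pre-computed page embeddings.

```python
import numpy as np

def retrieve_top_k(query_emb, page_embs, k=5):
    """Rank pages by dot product with the query embedding (hypothetical helper)."""
    scores = page_embs @ query_emb        # one score per page, shape (num_pages,)
    k = min(k, scores.shape[0])
    top = np.argsort(-scores)[:k]         # indices of the k highest scores, best first
    return top, scores[top]

# Stand-in data: random vectors replace the real embeddings.
rng = np.random.default_rng(0)
num_pages, dim = 1360, 64                 # 1,360 pages as in the corpus; dim is an assumption
page_embs = rng.standard_normal((num_pages, dim)).astype(np.float32)
page_embs /= np.linalg.norm(page_embs, axis=1, keepdims=True)

# Simulate a query whose embedding lands very close to page 42.
query_emb = page_embs[42] + 0.01 * rng.standard_normal(dim).astype(np.float32)
query_emb /= np.linalg.norm(query_emb)

idx, scores = retrieve_top_k(query_emb, page_embs, k=5)
print(idx[0])  # 42, the page the query was built from
```

Because the page matrix is fixed ahead of time, the only work at query time is one text-encoder forward pass and one matrix-vector product, which is why a CPU suffices for a corpus of this size.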