track — project series

Build the SQLite of vector databases

Every app wants on-device semantic search, photo similarity, local RAG — AI features without a network call. There is no good embedded vector database. We are going to build one, from scratch, and understand every decision along the way.

Server-side vector databases are designed for a different world — abundant RAM, persistent network connections, dedicated hardware. The moment you embed one in a mobile app, every assumption breaks. This track is about building for the environment that actually exists on a user's device.

17 lessons

intermediate → advanced

Python → C

~7 hours

This track is being built

You need to be comfortable writing code and know basic linear algebra (dot products, what a vector is). No prior database, mobile, or ML knowledge required.

The environment you are building for

RAM budget

256 MB – 1 GB

float32 vectors don't fit — quantization is not optional

Network

none

every query is local — no round trip, no fallback

Process model

embedded library

no server, no daemon — linked directly into the app

Cold start

must be fast

queries must work before the index is fully in memory

Distribution

single file

the database ships with your app, like a SQLite file

Writers

one, rarely

no distributed consensus — reads dominate everything

These constraints are not limitations to apologise for. They are the source of every interesting design decision in this track. You will learn more about memory, storage, and query execution by building for a phone than you ever would building for a server.

What you will enable

on-device photo search · local semantic search · offline RAG · face similarity · document similarity · on-device recommendations · private AI features

What you will build

Part I

Naive foundation

Flat array, linear scan, exact cosine distance. Correct, slow, your baseline forever.

Part II

Storage & mmap

Binary layout, single-file format, mmap for zero-copy reads and fast cold starts.

Part III

Quantization

Scalar and product quantization. Fitting a million vectors into the RAM you actually have.

Part IV

Filtering

Inverted indexes, compiled predicates, pre- vs post-filter tradeoffs at mobile scale.

Part V

Approximate search

HNSW built from scratch, tuned for read-heavy embedded workloads.

Part VI

The C library

Rewrite in C. Clean API. Something you could actually ship inside a mobile app.

The final artifact is a C library with a clean API — something you could actually ship inside a mobile app. Not a demo. Not a proof of concept. Something real.

vdb_t *db = vdb_open("photos.vdb");
vdb_result_t *r = vdb_search(db, embedding, 10);

Lessons

Part I — Naive Foundation

lesson 01 · Python

The problem

What similarity search is, why you can't use a B-tree, and what makes every existing vector database the wrong tool for an embedded app.

locked

10 min

lesson 02 · Python

Vector distance from scratch

Dot product, cosine similarity, L2. Implement all three. Understand what each one measures and when to pick it.

locked

12 min
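
As a preview of what this lesson covers, here is a minimal sketch of the three measures in plain Python. The function names are illustrative, not the track's final API, and the cosine version assumes neither vector is all zeros.

```python
import math


def dot(a, b):
    """Dot product: large when the vectors point the same way and are long."""
    return sum(x * y for x, y in zip(a, b))


def l2_distance(a, b):
    """Euclidean (L2) distance: straight-line distance between the two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_similarity(a, b):
    """Cosine of the angle between a and b; ignores vector length."""
    norm_a = math.sqrt(dot(a, a))
    norm_b = math.sqrt(dot(b, b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot(a, b) / (norm_a * norm_b)
```

Scaling a vector changes its dot product and L2 distance but not its cosine similarity, which is one reason to pick cosine when only direction matters.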

lesson 03 · Python

Brute-force search

O(n·d) scan — check every vector. This is your correctness baseline. Everything built later is measured against it.

locked

9 min
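
A rough sketch of the linear scan, assuming cosine distance (1 minus cosine similarity) and non-zero vectors. `brute_force_search` is a hypothetical name; the point is only that every stored vector is scored once per query.

```python
import heapq
import math


def cosine_distance(a, b):
    """1 - cosine similarity; assumes neither vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def brute_force_search(vectors, query, k):
    """Score all n vectors (O(n*d) work) and return the k closest as (distance, index)."""
    scored = ((cosine_distance(vec, query), i) for i, vec in enumerate(vectors))
    return heapq.nsmallest(k, scored)
```

Because it checks everything, its answer is exact, which is what makes it usable as the reference when measuring the recall of faster methods.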

Part II — Storage & mmap

lesson 04 · Python

Binary layout

When the schema is known at open time, records are fixed-size, so the memory layout is computed once rather than on every read. How this eliminates parsing overhead entirely.

locked

13 min
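
To make the fixed-size idea concrete, here is a small sketch using Python's `struct` module. The record shape (a 64-bit id followed by `DIM` float32 values) and the names `RECORD`, `pack_records`, and `read_record` are assumptions for illustration, not the track's actual format.

```python
import struct

DIM = 4  # hypothetical vector dimension for this sketch

# Little-endian, no padding: one uint64 id followed by DIM float32 components.
RECORD = struct.Struct(f"<Q{DIM}f")


def pack_records(records):
    """Serialize (id, vector) pairs into one contiguous bytes object."""
    return b"".join(RECORD.pack(rid, *vec) for rid, vec in records)


def read_record(buf, index):
    """Read record `index` directly: its offset is index * RECORD.size."""
    rid, *vec = RECORD.unpack_from(buf, index * RECORD.size)
    return rid, vec
```

The offset of any record is a single multiplication, so nothing has to be scanned or parsed to reach it.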

lesson 05 · Python

Single-file format

File header, append-only record segments, the index as a contiguous block. A format you can hand someone as a file attachment and have work immediately.

locked

14 min

lesson 06 · C

Memory-mapped files

mmap the index. Zero-copy reads. What the OS actually does when you fault a page in. Why this is the right model for a cold-start database.

locked

15 min
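
The lesson itself is in C, but the idea can be sketched with Python's `mmap` module. This assumes a hypothetical file that holds nothing but `DIM` float32 values per vector in native byte order; the file name and helper names are illustrative.

```python
import mmap
import os
import struct
import tempfile

DIM = 4  # hypothetical vector dimension for this sketch
VEC = struct.Struct(f"={DIM}f")  # DIM float32 values, native byte order, no padding


def write_vectors(path, vectors):
    """Write vectors back to back as raw float32 data."""
    with open(path, "wb") as f:
        for vec in vectors:
            f.write(VEC.pack(*vec))


def read_vector(path, index):
    """Map the file and read one vector; only the pages touched are faulted in."""
    with open(path, "rb") as f, mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
        # cast() reinterprets the mapped bytes as float32 without copying the file.
        with memoryview(mm) as raw, raw.cast("f") as floats:
            start = index * DIM
            return [floats[i] for i in range(start, start + DIM)]


def roundtrip_demo():
    """Write two vectors to a temporary file and read the second one back."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "vectors.bin")
        write_vectors(path, [[1.0, 2.0, 3.0, 4.0], [0.5, 0.25, 0.0, -1.0]])
        return read_vector(path, 1)
```

Opening the mapping does not read the file. The operating system loads a page only when an address inside it is first accessed, which is why a query can run before the whole index is resident.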

Part III — Quantization

lesson 07 · Python

Why float32 doesn't fit

1M vectors × 512 dimensions × 4 bytes ≈ 2 GB. The memory math that makes quantization unavoidable on every device that isn't a server.

locked

10 min
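
The arithmetic is short enough to check by hand or in a few lines. `index_bytes` is a hypothetical helper, and the count covers raw vector data only, with no ids, metadata, or index structures.

```python
def index_bytes(n_vectors, dim, bytes_per_component):
    """Raw storage for the vectors alone."""
    return n_vectors * dim * bytes_per_component


float32_bytes = index_bytes(1_000_000, 512, 4)  # 2,048,000,000 bytes
int8_bytes = index_bytes(1_000_000, 512, 1)     # 512,000,000 bytes

print(f"float32: {float32_bytes / 1e9:.2f} GB")
print(f"int8:    {int8_bytes / 1e9:.2f} GB")
```

Against a 256 MB to 1 GB budget, the float32 figure is over the limit before the app has allocated anything else.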

lesson 08 · Python

Scalar quantization

Compress float32 to int8. Calibration, scale factors, reconstruction error. 4× memory reduction with a small, measurable recall cost.

locked

14 min
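
One common way to do this is symmetric quantization with a single scale factor, sketched below. This is an assumption about the approach, not necessarily the scheme the lesson settles on: the scale is calibrated from the largest absolute component in a sample, and each component is rounded to the nearest int8 step.

```python
from array import array


def calibrate_scale(vectors):
    """Pick a scale so the largest absolute component maps to 127."""
    max_abs = max(abs(x) for vec in vectors for x in vec)
    return max_abs / 127.0 if max_abs > 0.0 else 1.0


def quantize(vec, scale):
    """Round each component to the nearest step and clamp to [-127, 127]."""
    return array("b", (max(-127, min(127, round(x / scale))) for x in vec))


def dequantize(codes, scale):
    """Approximate reconstruction; error per component is at most scale / 2."""
    return [c * scale for c in codes]
```

Each component now takes one byte instead of four, and the reconstruction error for in-range values is bounded by half a quantization step.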

lesson 09 · Python

Product quantization

Split vectors into subspaces, quantize each independently. Build a codebook. How PQ codes get you to 32× compression, and how asymmetric distance computation compares an uncompressed query against them.

locked

18 min
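
A toy sketch of encoding and asymmetric distance computation, assuming the codebooks already exist. Real codebooks are typically trained with k-means and hold 256 centroids per subspace so each code fits in one byte; the tiny hand-written `CODEBOOKS` here are purely illustrative. As one example configuration (an assumption, not stated by the track), a 512-dimensional float32 vector is 2,048 bytes, and 64 one-byte codes would be 64 bytes, which is a 32× reduction.

```python
def sq_l2(a, b):
    """Squared Euclidean distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


# Toy setup: 4-dim vectors, 2 subspaces of 2 dims, 2 centroids per subspace.
CODEBOOKS = [
    [[0.0, 0.0], [1.0, 1.0]],
    [[0.0, 1.0], [1.0, 0.0]],
]


def pq_encode(vec, codebooks):
    """Replace each sub-vector with the index of its nearest centroid."""
    sub = len(vec) // len(codebooks)
    codes = []
    for m, centroids in enumerate(codebooks):
        chunk = vec[m * sub:(m + 1) * sub]
        codes.append(min(range(len(centroids)), key=lambda k: sq_l2(chunk, centroids[k])))
    return codes


def adc_table(query, codebooks):
    """Per subspace, the distance from the raw query chunk to every centroid."""
    sub = len(query) // len(codebooks)
    return [
        [sq_l2(query[m * sub:(m + 1) * sub], c) for c in centroids]
        for m, centroids in enumerate(codebooks)
    ]


def adc_distance(codes, table):
    """Approximate squared distance: one table lookup per subspace."""
    return sum(table[m][code] for m, code in enumerate(codes))
```

The query stays uncompressed and only the stored vectors are codes, hence "asymmetric". The table is built once per query, after which each stored vector costs a handful of lookups and additions.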

Part IV — Filtering

lesson 10 · Python

Pre-filter vs post-filter

Filter before the vector search or after it. The tradeoff is not obvious — selectivity determines the winner. How to measure which strategy to use.

locked

11 min
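
The two strategies can be sketched on top of a brute-force scan. The single `tags` field and both function names are hypothetical; real filters would be richer, and the search underneath would not be a linear scan.

```python
import heapq


def sq_l2(a, b):
    """Squared Euclidean distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def pre_filter_search(vectors, tags, query, k, wanted):
    """Filter first, then rank only the records that pass."""
    candidates = (i for i, tag in enumerate(tags) if tag == wanted)
    return heapq.nsmallest(k, candidates, key=lambda i: sq_l2(vectors[i], query))


def post_filter_search(vectors, tags, query, k, wanted):
    """Rank everything, take the top k, then drop records that fail the filter."""
    top_k = heapq.nsmallest(k, range(len(vectors)), key=lambda i: sq_l2(vectors[i], query))
    return [i for i in top_k if tags[i] == wanted]
```

Post-filtering can return fewer than k results, or none, when the nearest vectors all fail the filter. Pre-filtering always ranks the full matching set but pays for evaluating the filter on every record.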

lesson 11 · C

Inverted indexes & compiled predicates

Build an inverted index for scalar fields. When field offsets are known at schema time, filters become direct struct member reads — no lookup tables at query time.

locked

14 min
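
The lesson is in C, but the inverted-index half can be sketched in Python. This assumes record ids are assigned in insertion order, so each posting list comes out sorted; `build_inverted_index` and `intersect` are illustrative names, and the compiled-predicate half is not shown.

```python
from collections import defaultdict


def build_inverted_index(field_values):
    """Map each distinct field value to the sorted list of record ids holding it."""
    index = defaultdict(list)
    for record_id, value in enumerate(field_values):
        index[value].append(record_id)  # ids ascend, so lists stay sorted
    return dict(index)


def intersect(a, b):
    """Merge-intersect two sorted posting lists (an AND of two filters)."""
    out = []
    i = j = 0
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return out
```

A filter on one field becomes a dictionary lookup, and a conjunction of filters becomes a linear merge of the matching lists.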

Part V — Approximate Search

lesson 12 · Python

Why exact search fails at scale

The recall vs. latency tradeoff. Why every production system — even on-device — eventually trades exactness for speed, and how to measure that tradeoff honestly.

locked

10 min

lesson 13 · Python

HNSW: the idea

Hierarchical navigable small world graphs. How layered greedy search achieves sub-linear query time. The intuition before the code.

locked

18 min
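
The greedy step on a single layer can be sketched as follows: start somewhere, move to whichever neighbor is closest to the query, and stop when no neighbor improves. The adjacency-dict representation and the name `greedy_search` are assumptions for illustration. In HNSW this walk is repeated layer by layer, with the result on one layer serving as the entry point for the layer below.

```python
def sq_l2(a, b):
    """Squared Euclidean distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def greedy_search(neighbors, vectors, query, entry):
    """Walk the graph toward the query until no neighbor is closer."""
    current = entry
    best = sq_l2(vectors[current], query)
    while True:
        candidate, candidate_dist = current, best
        for n in neighbors[current]:
            d = sq_l2(vectors[n], query)
            if d < candidate_dist:
                candidate, candidate_dist = n, d
        if candidate == current:
            return current  # local minimum: no neighbor improves
        current, best = candidate, candidate_dist
```

The walk only visits nodes along its path and their neighbors, not the whole dataset, which is where the speedup over a linear scan comes from. It can also stop at a local minimum, which is why the result is approximate.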

lesson 14 · Python

HNSW: implementation

Build it. Graph construction, node insertion, layer selection, greedy search. Tuned for read-heavy workloads where writes are rare and cold-start latency matters.

locked

22 min
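
Layer selection is the smallest piece to preview. The sketch below uses the level-assignment rule from the original HNSW paper, floor(-ln(u) · mL) with mL = 1 / ln(M); whether the lesson uses exactly this rule and which value of `m` it picks are assumptions here.

```python
import math
import random


def random_level(m, rng):
    """Draw the top layer for a new node; higher layers are exponentially rarer."""
    ml = 1.0 / math.log(m)   # normalization factor mL
    u = 1.0 - rng.random()   # in (0, 1], so log(u) is always defined
    return int(-math.log(u) * ml)


rng = random.Random(0)
levels = [random_level(16, rng) for _ in range(10_000)]
print("share of nodes that live only on layer 0:", levels.count(0) / len(levels))
```

With `m = 16`, a node reaches layer 1 or above with probability 1/16, so the upper layers stay sparse and cheap to traverse.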

Part VI — The C Library

lesson 15 · C

Rewriting in C

Port the Python to C. Every data structure you built in Python becomes a struct. Every loop becomes a reason to think about cache lines.

locked

20 min

lesson 16 · C

SIMD distance

AVX2 on x86, NEON on ARM. Rewrite the inner distance loop with intrinsics. Measure the speedup against scalar C. Understand what auto-vectorization misses and why.

locked

17 min

lesson 17 · C

The finished library

Clean public API. vdb_open, vdb_search, vdb_insert, vdb_close. Benchmark against the Python baseline and against brute-force. Ship it.

locked

20 min