Self-hosted vector DB shootout: pgvector, Qdrant, LanceDB

Three open-source vector stores cover most of the self-hosted RAG surface area in 2025. Worth being concrete about which one fits which workload, because the trade-offs matter and the docs won't tell you.


The self-hosted vector store market converged faster than it had any right to. As of mid-April 2025 there are roughly three picks that cover most of the actual production use cases (pgvector, Qdrant, and LanceDB), and the choice between them is meaningful enough to be worth being explicit about. The vendor docs all describe their own product as the obvious choice; that's not particularly useful when you're trying to pick.

This is the comparison I wish I'd had when I was setting up the first self-hosted RAG stack on the homelab. Three picks, what each is actually for, the trade-offs that matter, and where they break down.

[Figure: comparison matrix of the three self-hosted vector stores (pgvector, Qdrant, and LanceDB) across seven dimensions: operational simplicity, vector performance, metadata filtering, write-heavy workloads, multi-tenant production, local-first capability, and ecosystem maturity.]

What these three actually are

pgvector is a Postgres extension. You install it into a Postgres database you're probably already running, you get a vector column type, and your existing relational data and your embeddings live in the same place. Indexes via HNSW or IVFFlat. Query through SQL. Operationally, it's "Postgres", which means anyone on the team who understands Postgres understands the operational surface.
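Concretely, the surface is ordinary SQL. A minimal sketch, assuming pgvector 0.5+ (for HNSW support) and a hypothetical 1536-dimension schema; `$1` stands in for the query embedding:

```sql
-- Table and column names are illustrative.
CREATE EXTENSION IF NOT EXISTS vector;

CREATE TABLE documents (
    id        bigserial PRIMARY KEY,
    body      text,
    embedding vector(1536)
);

-- HNSW index over cosine distance.
CREATE INDEX documents_embedding_idx
    ON documents USING hnsw (embedding vector_cosine_ops);

-- Top-10 nearest neighbours; <=> is pgvector's cosine-distance operator.
SELECT id, body
FROM documents
ORDER BY embedding <=> $1
LIMIT 10;
```

Swap `vector_cosine_ops` and `<=>` for `vector_l2_ops` and `<->` if your embedding model calls for Euclidean distance.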

Qdrant is a purpose-built vector database. Written in Rust, runs as its own service, has its own API surface. The data model is collections of points with vectors and payload metadata. The query model is vector similarity with optional metadata filters. Indexes via HNSW with a few tunable knobs. Cloud-hosted version exists; self-hosted is the default story.

LanceDB is a columnar-format vector store built on the Lance file format. The interesting bit: it's embedded by default. Your application opens a database directory on local disk, queries it like SQLite, and gets vector search without standing up a separate service. There's a server mode for shared deployment, but the default ergonomics are local-first.

These are different products. They are not three implementations of the same idea.

When each one fits

The decision is mostly about what shape of system you have around the embeddings, not about which has the fastest benchmark.

pgvector is the right pick when you already have Postgres and your embeddings are joining relational data. If your retrieval pattern is "find the top-k vectors that match this query, then filter by user/tenant/permission/date that lives in another table," doing that in SQL with a single database is dramatically simpler than running two systems and stitching them together. The performance is good enough for most applications. It's not the fastest at very high scales, but at the scales where most production RAG systems actually live (millions of vectors, not billions), it's fine. The operational story is the killer feature: backups, replication, monitoring, security policies all already exist for your Postgres instance.
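That single-database retrieval pattern can be sketched as one query. Table and column names here are hypothetical; `$1` is the query embedding and `$2` the tenant:

```sql
-- Filter on relational metadata and rank by vector similarity in one
-- statement -- no second system, no application-side stitching.
SELECT c.id, c.body
FROM chunks c
JOIN documents d ON d.id = c.document_id
WHERE d.tenant_id = $2
  AND d.published_at >= now() - interval '90 days'
ORDER BY c.embedding <=> $1
LIMIT 10;
```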

Qdrant is the right pick when the vector workload is the dominant workload and you want the best per-vector performance. If you're indexing tens or hundreds of millions of vectors and the query rate is high, the purpose-built engine wins on latency and on memory efficiency under realistic load. The metadata filtering is genuinely good. Qdrant indexes the payload fields and can do filtered vector search efficiently in ways pgvector can't match at scale. The operational cost is real (it's a separate service to run, monitor, back up) but the product justifies it for vector-heavy workloads.

LanceDB is the right pick when the workload is local-first or embedded. Single-user applications, desktop tools, and agents that need a private vector store on the user's own machine don't need a network service. The Lance format also lets the same data be read by Pandas, DuckDB, and the analytics tooling without a separate ETL step, which is nice for the analytics-tied embedding workflows. The trade-off is operational maturity: it's the youngest of the three and the documentation gaps show.

The trade-offs that don't show up in benchmarks

A few things worth knowing that the comparison-table coverage usually misses:

pgvector's real cost is at write-heavy scale. Postgres handles index updates well at moderate write rates; at high embedding-update volume the HNSW index becomes a real bottleneck, because updates and deletes bloat the index over time and a full rebuild is expensive. If your application is constantly re-embedding documents (because the embedding model changes, or the source content changes), pgvector's write profile is worse than Qdrant's. For mostly-read workloads this doesn't matter.
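When a rebuild does become necessary, Postgres can at least do it without taking the table offline. A sketch, assuming an index named `documents_embedding_idx`:

```sql
-- Rebuilds the index in the background while reads and writes continue
-- (requires Postgres 12+). Still CPU- and memory-hungry for HNSW.
REINDEX INDEX CONCURRENTLY documents_embedding_idx;
```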

Qdrant's filtering syntax is its own thing. It's powerful and well-designed, but it's not SQL. If your team is going to query the system from places other than the AI application code (for analytics, for audit, for ad-hoc investigation), you're learning a new query language. That's a real cost in a small team.
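For a sense of the shape, here is roughly what a Qdrant payload filter looks like; field names and values are illustrative:

```json
{
  "must": [
    { "key": "tenant_id", "match": { "value": "acme" } },
    { "key": "published_ts", "range": { "gte": 1704067200 } }
  ]
}
```

This corresponds loosely to `WHERE tenant_id = 'acme' AND published_ts >= 1704067200` in SQL: perfectly learnable, but it's a second query language for anyone who also touches the relational side.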

LanceDB's embedded model is incompatible with multi-tenant production. If your application has multiple users sharing the same backend, LanceDB's local-first model doesn't fit. The server mode exists but you're running a separate service at that point, and the per-vector-cost advantages versus the alternatives narrow significantly.

All three handle metadata filtering, but with different cost profiles. Filtering before vector search is the right pattern for highly selective filters; filtering after is the right pattern for low-selectivity filters. pgvector lets you express both naturally in SQL; Qdrant has explicit pre-filter and post-filter modes; LanceDB defaults to one approach and the other requires effort. If your filters are complex, this matters.
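The pre-filter/post-filter distinction is easy to see with a toy brute-force search; all data here is synthetic and the helper names are made up for illustration:

```python
import math

def cosine(a, b):
    # Cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def top_k(query, items, k):
    # Brute-force nearest-neighbour search by descending similarity.
    return sorted(items, key=lambda it: -cosine(query, it["vec"]))[:k]

docs = [
    {"id": 1, "tenant": "acme", "vec": [1.0, 0.0]},
    {"id": 2, "tenant": "acme", "vec": [0.9, 0.1]},
    {"id": 3, "tenant": "beta", "vec": [1.0, 0.05]},
    {"id": 4, "tenant": "beta", "vec": [0.0, 1.0]},
]
query = [1.0, 0.0]

# Pre-filter: restrict candidates first, then search. Right when the
# filter is highly selective -- the search runs over few vectors.
pre = top_k(query, [d for d in docs if d["tenant"] == "acme"], k=2)

# Post-filter: search first, then drop non-matching hits. Right when the
# filter passes most rows -- but unless k is over-fetched, matching
# results get crowded out by hits the filter later discards.
post = [d for d in top_k(query, docs, k=2) if d["tenant"] == "acme"]

print([d["id"] for d in pre])   # [1, 2]
print([d["id"] for d in post])  # [1] -- doc 2 was crowded out by doc 3
```

The post-filter path returning only one result despite two matches existing is exactly the over-fetch problem the purpose-built engines work around.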

What I'm actually running

For most of the self-hosted RAG work I've done in the last six months: pgvector when there's already a Postgres in the stack (which there usually is); Qdrant when the workload is vector-dominated and the embedding count is high enough that pgvector's index rebuild costs become a maintenance burden; LanceDB for the personal-AI tier, the local-first agents and tools that live on a single machine and don't need a service to run.

That's three different answers for three different shapes of system. The mistake to avoid is picking one as the universal answer and forcing the other two shapes through it. The friction shows up at the operational layer well before it shows up in benchmarks, which is why the vendor benchmarks all make the vendor's product look best: they're benchmarking the workload their product was built for.

For a home self-hosting setup, LanceDB is genuinely the cleanest fit; for a small-team production system the pgvector path is usually the right one even if it's not the highest-performance choice. For a Bedrock-style platform shop the question of running your own vector store at all is the real decision; once you've decided you're hosting it yourself, the three picks above cover the field.

What's likely to shift

Three things worth tracking through the rest of the year:

The first is pgvector's index improvements. The maintainers have been pushing hard on HNSW performance and on filtered vector search. If those land in the form they're prototyping, the gap between pgvector and Qdrant on high-scale workloads narrows enough that the "purpose-built database" advantage shrinks for a meaningful set of cases.

The second is Qdrant's hosted-tier pricing and feature pace. Their cloud product has been getting more competitive; if it keeps moving, the "self-host vs. managed" calculus changes for shops that don't want the operational burden.

The third is LanceDB's server-mode maturity. If they close the production-readiness gap, the embedded-and-server hybrid story becomes much more interesting than either pure approach. Right now you pick one; in a year you may pick both for different parts of the same system.

The thing that's not likely to shift: there's no single winner. The shape of your system determines which one fits, and the shape of your system changes more slowly than the vector-database market does. Pick for what you're actually building, not for what the benchmarks suggest.