Vector Databases Compared: pgvector, Qdrant, Pinecone, and When You Don't Need Any
Somewhere between "we need AI" and "we picked Pinecone" there is usually a meeting that skipped a few questions. How many vectors? How often do they change? Do you need metadata filtering? Do you already have a Postgres you are paying for? These questions matter more than the vendor comparison pages would have you believe.
I have shipped vector search on Postgres, on Qdrant, and on Pinecone. Each was the right call at the time, and each would have been the wrong call in a different context. This post is the practical comparison — honest about trade-offs, and honest about when you do not need any of them.
The Thing to Understand First
A vector database does three things: store embeddings, do approximate nearest neighbor search on them, and filter/combine with other query conditions. Everything else — replication, horizontal scale, hybrid search, multi-tenant isolation — is a layer above that core.
Approximate nearest neighbor (ANN) is the interesting part. Exact nearest neighbor is O(n) in the corpus size; at a million vectors it is already too slow for interactive use. ANN indexes — HNSW, IVF, ScaNN — trade a small amount of recall for orders-of-magnitude speed. All the mainstream databases use HNSW or a variant by default in 2026.
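To make "a small amount of recall" concrete: recall@k is the overlap between what the ANN index returns and the true top-k from a brute-force scan. A minimal NumPy sketch (the normalized corpus matrix and integer ids are assumptions for illustration):
```python
import numpy as np

def exact_top_k(corpus: np.ndarray, query: np.ndarray, k: int) -> set[int]:
    """Brute-force exact search: one dot product per corpus vector, O(n)."""
    # corpus is (n, d) with L2-normalized rows, query is a normalized (d,) vector,
    # so the dot product is cosine similarity.
    scores = corpus @ query
    return set(np.argsort(-scores)[:k].tolist())

def recall_at_k(approx_ids: set[int], exact_ids: set[int], k: int) -> float:
    """Fraction of the true top-k that the ANN index actually returned."""
    return len(approx_ids & exact_ids) / k
```
Every ANN index exposes a knob that moves you along this recall/latency curve (ef_search for HNSW, probe counts for IVF); defaults are usually sensible, but it is worth measuring once against a brute-force baseline on your own data.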
pgvector: The One That Shows Up for Free
pgvector is a Postgres extension that adds a vector column type, HNSW and IVFFlat indexes, and <=> / <-> / <#> distance operators. It is open source, runs wherever your Postgres runs, and is the default choice on every hosted Postgres I have used — RDS, Neon, Supabase, Aiven.
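Getting started is a migration, not a new deployment. A minimal setup sketch with psycopg (the connection string, table shape, and 1536-dimension choice are illustrative, not prescriptive):
```python
import psycopg

with psycopg.connect("postgresql://localhost/app") as conn:
    conn.execute("CREATE EXTENSION IF NOT EXISTS vector")
    conn.execute("""
        CREATE TABLE IF NOT EXISTS documents (
            id        bigserial PRIMARY KEY,
            tenant_id bigint NOT NULL,
            content   text NOT NULL,
            embedding vector(1536)  -- must match your embedding model's dimension
        )
    """)
    # HNSW index over cosine distance; vector_cosine_ops pairs with the <=> operator
    conn.execute("""
        CREATE INDEX IF NOT EXISTS documents_embedding_idx
        ON documents USING hnsw (embedding vector_cosine_ops)
    """)
```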
What it is good for:
- Small-to-medium corpora (under ~10M vectors) where the operational simplicity of "it is just a Postgres column" wins.
- Cases where you want transactional consistency between your vectors and your relational data. If a product row is deleted, the embedding goes with it in the same transaction.
- Teams that already have Postgres expertise and do not want to add another database to their operational surface.
What it is not good for:
- Very large corpora (hundreds of millions of vectors). With HNSW, performance holds up further than people assume, but an HNSW index really wants to sit in RAM, and memory pressure becomes the problem before raw query speed does.
- Workloads with heavy, independent vector scaling. When your vector workload grows very differently from your relational workload, co-locating them in one database is a tax.
The shape of a pgvector query:
```sql
SELECT id, content
FROM documents
WHERE tenant_id = $1
ORDER BY embedding <=> $2
LIMIT 20;
```
That is it. Metadata filtering ("WHERE tenant_id = $1") integrates naturally because the whole thing is SQL. Hybrid search — combining Postgres full-text search (ts_rank rather than true BM25, but close in practice) with vector similarity — is a few lines of SQL away.
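A sketch of a hybrid query from application code, assuming the documents table above and psycopg; the 50/50 weighting between the lexical and vector scores is an illustration, not a recommendation:
```python
import psycopg

HYBRID_SQL = """
    SELECT id, content,
           0.5 * ts_rank(to_tsvector('english', content),
                         plainto_tsquery('english', %(q)s))
         + 0.5 * (1 - (embedding <=> %(vec)s::vector)) AS score
    FROM documents
    WHERE tenant_id = %(tenant)s
    ORDER BY score DESC
    LIMIT 20
"""

def hybrid_search(
    conn: psycopg.Connection, tenant_id: int, query_text: str, query_embedding: list[float]
):
    # pgvector accepts the textual form '[0.1,0.2,...]' and casts it to vector
    vec = "[" + ",".join(str(x) for x in query_embedding) + "]"
    return conn.execute(
        HYBRID_SQL, {"q": query_text, "vec": vec, "tenant": tenant_id}
    ).fetchall()
```
At scale you would store the tsvector in its own column with a GIN index and pull top-N candidates from each ranking separately before merging, so each leg can use its own index; the single-query form is fine while the corpus is small.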
My heuristic: if your corpus fits in a single Postgres box's memory, start with pgvector. You can revisit when it stops holding up.
Qdrant: The Purpose-Built Open Source Pick
Qdrant is a standalone vector database written in Rust. It is open source, self-hostable, and has a managed cloud. Compared to pgvector it is a dedicated tool: the ANN implementation is tuned harder, the filtering engine is more sophisticated, and it scales horizontally out of the box.
What makes Qdrant notable in practice:
- Filter performance. Qdrant's filter engine can do pre-filtering without collapsing recall. This matters when your typical query is "find similar items where tenant_id = X." In pgvector, a highly selective filter either gets applied after the index scan, costing recall, or pushes the planner off the ANN index into a slow scan.
- Payload support. You store structured metadata alongside each vector and filter on it at query time. Close to what you get in pgvector, but the engine is optimized for filter-heavy workloads.
- Horizontal scale. Sharding and replication are built in. Not a trivial deployment, but a real option.
The API is a thin HTTP/gRPC surface:
```python
from qdrant_client import QdrantClient
from qdrant_client.models import (
    Distance, FieldCondition, Filter, MatchValue, PointStruct, VectorParams,
)

client = QdrantClient(url="https://qdrant.example.com")

client.create_collection(
    collection_name="docs",
    vectors_config=VectorParams(size=1536, distance=Distance.COSINE),
)

# embedding and query_embedding: lists of 1536 floats from your embedding model
client.upsert(
    collection_name="docs",
    points=[PointStruct(id=1, vector=embedding, payload={"tenant_id": 42})],
)

results = client.search(
    collection_name="docs",
    query_vector=query_embedding,
    query_filter=Filter(
        must=[FieldCondition(key="tenant_id", match=MatchValue(value=42))]
    ),
    limit=20,
)
```
When I reach for Qdrant: the corpus is larger than a single Postgres can hold comfortably, or the workload is dominated by filtered similarity queries where pgvector's recall would suffer. Being self-hostable is also a real feature in regulated environments.
Pinecone: The Fully Managed Default
Pinecone is a fully managed vector database. No servers, no replication to configure, no capacity planning beyond picking a tier. It is what a large fraction of teams actually ship on, because the operational load is close to zero.
What you trade for that:
- Cost. For serious workloads, Pinecone is meaningfully more expensive than self-hosted alternatives. The bill stops being negligible quickly.
- Data locality. Your vectors live in someone else's cloud. Compliance conversations change.
- Less flexibility. You cannot write custom indexes, run your own ANN tuning, or drop down to the raw engine. The tradeoffs are picked for you.
The shape:
```python
from pinecone import Pinecone

pc = Pinecone(api_key="...")
index = pc.Index("docs")

# (id, vector, metadata) tuples; embedding comes from your embedding model
index.upsert([("doc-1", embedding, {"tenant_id": 42})])

results = index.query(
    vector=query_embedding,
    top_k=20,
    filter={"tenant_id": 42},
    include_metadata=True,
)
```
Pinecone's managed experience is legitimately good. When operational overhead is the binding constraint and cost is not, it is the right pick.
The Decision I Actually Make
I ask three questions, in order:
1. Do I already have Postgres, and will this corpus plausibly stay under 10M vectors for the foreseeable future? Yes → pgvector. Ship it in a day, revisit if you outgrow it. No → continue.
2. Is operational simplicity — zero servers, zero ops work — worth a material per-month cost? Yes → Pinecone. Stop thinking about it. No → continue.
3. Will I benefit from tuning, custom indexes, or self-hosting in a regulated environment? Yes → Qdrant or Weaviate (pick Qdrant unless you specifically want GraphQL). No → go back and pick one of the first two.
I do not use "features" as a tiebreaker because everyone supports the features that matter — HNSW, filtering, hybrid search, metadata — with different levels of polish. The tiebreaker is what operational posture fits your team.
When You Do Not Need a Vector Database at All
The uncomfortable observation: many "RAG" systems would be better served by classical information retrieval.
If your corpus is under ~10,000 chunks and updates rarely, a simple in-memory index (NumPy + cosine similarity) is faster than any database, has zero infrastructure, and fits in a single file. I have shipped one that is literally a pickle file loaded at process start.
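The whole "index" can be this small, a sketch assuming you already have the chunk texts and a matrix of their embeddings:
```python
import numpy as np

class TinyIndex:
    """Exact cosine search over a small, mostly static corpus."""

    def __init__(self, embeddings: np.ndarray, chunks: list[str]):
        # Normalize once so each search is a single matrix-vector product
        self.vectors = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
        self.chunks = chunks

    def search(self, query_embedding: np.ndarray, k: int = 5) -> list[tuple[str, float]]:
        q = query_embedding / np.linalg.norm(query_embedding)
        scores = self.vectors @ q
        top = np.argsort(-scores)[:k]
        return [(self.chunks[i], float(scores[i])) for i in top]
```
Persist it however you like (pickle, np.save), load it at startup, rebuild it when the corpus changes; at 10,000 chunks a query costs less than the network round trip to any database would.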
If your queries are dominated by exact match or keyword search, a vector database is the wrong tool. BM25 via Elasticsearch or Meilisearch, or plain Postgres full-text search, will beat semantic search on the metrics you care about. Add vectors on top only if you need them.
If your user's query is already highly structured — "invoices from 2024 over $10,000" — vectors add nothing. You want SQL or a search engine with proper facets.
The pattern I see most often: a team decides to "add AI," reaches for the most AI-sounding infrastructure, and ends up with a fragile system that performs worse than the Elasticsearch setup they could have stood up in two hours. Vectors are great when the query is fuzzy and the answer space is semantic. They are a mismatch for anything else.
What Changed in 2026
A few things that are worth knowing:
- pgvector got materially better. The HNSW implementation caught up to purpose-built stores for many workloads. The "you must pick a specialist vector DB" argument is weaker than it was two years ago.
- Hybrid search is table stakes. Every serious vector DB now supports BM25-style sparse retrieval and a merge step (see the sketch after this list). Do not accept a database that does not.
- Managed Postgres-with-pgvector is the overwhelming default for new projects I audit. RDS, Neon, and Supabase all expose it well.
- Self-hosted Elasticsearch-with-vector-plugins is the dark horse. For teams that already operate Elasticsearch, ES's vector support is mature enough to be the one-and-done choice.
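That merge step is rarely exotic; reciprocal rank fusion is a common choice. A rough sketch, not any particular database's implementation:
```python
def rrf_merge(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Reciprocal rank fusion: fold several best-first ranked ID lists into one.

    k dampens how much the very top ranks dominate; 60 is the conventional default.
    """
    scores: dict[str, float] = {}
    for ranked_ids in rankings:
        for rank, doc_id in enumerate(ranked_ids, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# e.g. rrf_merge([bm25_ids, vector_ids]) -> one fused ranking
```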
A Small Taxonomy
Where each pick sits, roughly:
- Prototypes, under 10K chunks → in-memory NumPy; do not bother with a database
- Small-medium scale with existing Postgres → pgvector
- Medium-large scale, zero-ops preference → Pinecone
- Medium-large scale, self-host preference → Qdrant (or Weaviate)
- Already on Elasticsearch → Elasticsearch vector fields
The wrong answer is not picking a worse database — it is picking the right database for the wrong workload. Ask the three questions. The answer is usually boring, and that is fine.