Lesson 10 of 23

RAG System Design

Vector Database Selection

5 min read

Choosing the right vector database is crucial for RAG system performance. This lesson covers the major options and how to select the best one for your use case.

Vector Database Landscape

┌─────────────────────────────────────────────────────────────┐
│                Vector Database Options                       │
├─────────────────────────────────────────────────────────────┤
│                                                              │
│  Managed Services          │    Self-Hosted                 │
│  ──────────────────────    │    ─────────────────────────   │
│  • Pinecone                │    • Milvus                    │
│  • Weaviate Cloud          │    • Qdrant                    │
│  • Zilliz Cloud            │    • Chroma                    │
│                            │    • Weaviate                  │
│  Database Extensions       │                                │
│  ──────────────────────    │    In-Memory                   │
│  • pgvector (PostgreSQL)   │    ─────────────────────────   │
│  • Atlas Vector (MongoDB)  │    • FAISS                     │
│                            │    • Annoy                     │
│                                                              │
└─────────────────────────────────────────────────────────────┘

Comparison Table

Database   Best For                   Scale       Filtering    Managed
Pinecone   Production, ease of use    Billions    Good         Yes
Qdrant     Filtering, self-hosted     Billions    Excellent    Optional
Milvus     High performance           Billions    Good         Optional
pgvector   PostgreSQL users           Millions    SQL-native   Via providers
Weaviate   GraphQL, hybrid search     Billions    Good         Optional
Chroma     Prototyping, small scale   Thousands   Basic        No

Pinecone

Strengths:

  • Fully managed, zero ops
  • Serverless pricing option
  • Simple API
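
The snippets in this lesson assume `embedding` and `query_embedding` already exist as vectors from an embedding model. A minimal sketch using the OpenAI embeddings API (text-embedding-3-small returns 1536-dimensional vectors, matching the index sizes used below; substitute whichever model your pipeline uses):

from openai import OpenAI

openai_client = OpenAI()  # reads OPENAI_API_KEY from the environment

def embed(text: str) -> list[float]:
    # One call per text here; pass a list as `input` to batch in production
    response = openai_client.embeddings.create(
        model="text-embedding-3-small",
        input=text,
    )
    return response.data[0].embedding

embedding = embed("Chapter 5 covers troubleshooting steps for the device.")
query_embedding = embed("How do I reset the device?")
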
from pinecone import Pinecone

# Initialize
pc = Pinecone(api_key="your-key")
index = pc.Index("documents")

# Upsert vectors
index.upsert(
    vectors=[
        {
            "id": "doc1",
            "values": embedding,
            "metadata": {"source": "manual.pdf", "page": 5}
        }
    ],
    namespace="product-docs"
)

# Query
results = index.query(
    vector=query_embedding,
    top_k=10,
    namespace="product-docs",
    filter={"source": {"$eq": "manual.pdf"}}
)

Considerations:

  • Vendor lock-in
  • Costs scale with vectors stored
  • Limited filtering compared to Qdrant

Qdrant

Strengths:

  • Excellent filtering capabilities
  • Open-source with cloud option
  • Rich payload support

from qdrant_client import QdrantClient
from qdrant_client.models import (
    Distance, VectorParams, PointStruct,
    Filter, FieldCondition, MatchValue, Range,
)

client = QdrantClient(url="http://localhost:6333")

# Create collection
client.create_collection(
    collection_name="documents",
    vectors_config=VectorParams(size=1536, distance=Distance.COSINE)
)

# Upsert
client.upsert(
    collection_name="documents",
    points=[
        PointStruct(
            id=1,
            vector=embedding,
            payload={
                "source": "manual.pdf",
                "page": 5,
                "category": "technical"
            }
        )
    ]
)

# Query with complex filtering
results = client.search(
    collection_name="documents",
    query_vector=query_embedding,
    query_filter=Filter(
        must=[
            FieldCondition(key="category", match=MatchValue(value="technical")),
            FieldCondition(key="page", range=Range(gte=1, lte=10)),
        ]
    ),
    limit=10
)

Considerations:

  • Self-hosted requires DevOps
  • Cloud option available but newer

pgvector

Strengths:

  • Familiar PostgreSQL
  • SQL joins with vector search
  • Existing infrastructure

import psycopg2
from pgvector.psycopg2 import register_vector

# Connect and enable the extension (DSN shown is a placeholder)
conn = psycopg2.connect("dbname=rag user=postgres")
cursor = conn.cursor()
cursor.execute("CREATE EXTENSION IF NOT EXISTS vector")
conn.commit()

# Register the vector type so embeddings (numpy arrays) can be bound as parameters
register_vector(conn)

# Create table
cursor.execute("""
    CREATE TABLE documents (
        id SERIAL PRIMARY KEY,
        content TEXT,
        embedding vector(1536),
        metadata JSONB
    )
""")

# Create index for faster search
cursor.execute("""
    CREATE INDEX ON documents
    USING ivfflat (embedding vector_cosine_ops)
    WITH (lists = 100)
""")

# Query (pass query_embedding as a numpy array so register_vector can adapt it)
cursor.execute("""
    SELECT id, content, metadata,
           1 - (embedding <=> %s) AS similarity
    FROM documents
    WHERE metadata->>'category' = 'technical'
    ORDER BY embedding <=> %s
    LIMIT 10
""", (query_embedding, query_embedding))

Considerations:

  • Limited scale (millions, not billions)
  • Index build time on large datasets
  • Great for small to medium applications

Selection Framework

Decision Tree

Start
  ├─ Need billions of vectors?
  │     ├─ Yes ──▶ Pinecone or Milvus
  │     └─ No ───▶ Continue
  ├─ Need complex filtering?
  │     ├─ Yes ──▶ Qdrant
  │     └─ No ───▶ Continue
  ├─ Already using PostgreSQL?
  │     ├─ Yes ──▶ pgvector
  │     └─ No ───▶ Continue
  ├─ Need zero ops?
  │     ├─ Yes ──▶ Pinecone
  │     └─ No ───▶ Qdrant or Milvus
  └─ Prototyping only?
        ├─ Yes ──▶ Chroma
        └─ No ───▶ Qdrant
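
If it helps to think about this in code, here is a rough sketch of the same decision tree as a Python function. The return values simply mirror the tree above and are simplifications, not hard rules:

def pick_vector_db(
    num_vectors: int,
    needs_complex_filtering: bool,
    uses_postgres: bool,
    needs_zero_ops: bool,
    prototyping: bool,
) -> str:
    # Each branch corresponds to one question in the decision tree
    if num_vectors >= 1_000_000_000:
        return "Pinecone or Milvus"
    if needs_complex_filtering:
        return "Qdrant"
    if uses_postgres:
        return "pgvector"
    if needs_zero_ops:
        return "Pinecone"
    if prototyping:
        return "Chroma"
    return "Qdrant or Milvus"

print(pick_vector_db(50_000_000, False, True, False, False))  # pgvector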

Cost Comparison (100M vectors)

Provider       Monthly Cost   Notes
Pinecone       $700-2000      Serverless or pod-based
Qdrant Cloud   $500-1500      Based on cluster size
Self-hosted    $200-500       Compute + storage
pgvector       $100-300       Existing DB may work
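
These figures shift with pricing models, but the main cost driver is memory and storage. A quick back-of-envelope sizing for the 100M-vector scenario, assuming 1536-dimensional float32 embeddings (indexes and metadata add overhead on top of this):

num_vectors = 100_000_000
dimensions = 1536
bytes_per_float = 4  # float32

raw_gb = num_vectors * dimensions * bytes_per_float / 1e9
print(f"Raw vector data: {raw_gb:.0f} GB")  # ~614 GB before index overhead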

Interview Tip

When discussing vector databases, always mention:

  1. Scale requirements - millions vs billions
  2. Filtering needs - metadata queries
  3. Operational complexity - managed vs self-hosted
  4. Cost at scale - show you understand economics

Next, we'll explore hybrid retrieval strategies that combine dense and sparse retrieval.
