Lesson 6 of 23

Embedding Models & Vector Databases

Vector Database Landscape

Vector databases are purpose-built for storing and searching high-dimensional vectors. Understanding their trade-offs helps you choose the right solution.
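To see what these databases optimize, here is a toy sketch of the exact (brute-force) search they replace: every stored vector is compared against the query, which is O(n) per query and is exactly why dedicated indexes matter at scale. The document IDs and 3-dimensional vectors below are made up for illustration; real embeddings have hundreds or thousands of dimensions.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def brute_force_search(query, vectors, top_k=2):
    """Exact nearest-neighbor search: compare the query to every vector."""
    scored = [(doc_id, cosine_similarity(query, vec))
              for doc_id, vec in vectors.items()]
    return sorted(scored, key=lambda s: s[1], reverse=True)[:top_k]

vectors = {
    "doc1": [0.9, 0.1, 0.0],
    "doc2": [0.0, 1.0, 0.0],
    "doc3": [0.8, 0.2, 0.1],
}
print(brute_force_search([1.0, 0.0, 0.0], vectors))
```

Vector databases return the same kind of ranked result, but use approximate nearest-neighbor indexes so the cost per query grows far slower than the number of stored vectors.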

Database Categories

| Category | Examples | Best For |
|---|---|---|
| Managed Cloud | Pinecone, Weaviate Cloud | Production, no ops |
| Self-Hosted | Qdrant, Milvus, Weaviate | Control, cost at scale |
| Embedded | Chroma, LanceDB | Development, edge |
| Extensions | pgvector | Existing Postgres infra |

Pinecone

Fully managed, serverless vector database:

from pinecone import Pinecone, ServerlessSpec

pc = Pinecone(api_key="your-api-key")

# Create a serverless index (dimension must match your embedding model)
pc.create_index(
    name="rag-index",
    dimension=1536,
    metric="cosine",
    spec=ServerlessSpec(cloud="aws", region="us-east-1")
)

index = pc.Index("rag-index")

# Upsert vectors
index.upsert(vectors=[
    {
        "id": "doc1",
        "values": embedding,
        "metadata": {"source": "manual.pdf", "page": 5}
    }
])

# Query
results = index.query(
    vector=query_embedding,
    top_k=5,
    include_metadata=True,
    filter={"source": {"$eq": "manual.pdf"}}
)

Strengths: Zero ops, auto-scaling, excellent metadata filtering
Considerations: Cost at scale, vendor lock-in

Qdrant

High-performance open-source database:

from qdrant_client import QdrantClient
from qdrant_client.models import (
    Distance, VectorParams, PointStruct,
    Filter, FieldCondition, MatchValue,
)

client = QdrantClient(host="localhost", port=6333)

# Create collection
client.create_collection(
    collection_name="documents",
    vectors_config=VectorParams(size=1536, distance=Distance.COSINE)
)

# Insert points
client.upsert(
    collection_name="documents",
    points=[
        PointStruct(
            id=1,
            vector=embedding,
            payload={"source": "manual.pdf", "page": 5}
        )
    ]
)

# Search with payload filtering
results = client.search(
    collection_name="documents",
    query_vector=query_embedding,
    limit=5,
    query_filter=Filter(
        must=[FieldCondition(key="source", match=MatchValue(value="manual.pdf"))]
    )
)

Strengths: Fast, excellent filtering, hybrid search support
Considerations: Self-hosting complexity

Weaviate

Schema-based vector database with GraphQL:

import weaviate
from weaviate.classes.config import Property, DataType

client = weaviate.connect_to_local()

# Create collection with schema
collection = client.collections.create(
    name="Document",
    properties=[
        Property(name="content", data_type=DataType.TEXT),
        Property(name="source", data_type=DataType.TEXT),
        Property(name="page", data_type=DataType.INT),
    ]
)

# Insert (auto-vectorized if the collection has a vectorizer configured)
collection.data.insert({
    "content": "Document text here",
    "source": "manual.pdf",
    "page": 5
})

# Search
results = collection.query.near_vector(
    near_vector=query_embedding,
    limit=5,
    filters=weaviate.classes.query.Filter.by_property("source").equal("manual.pdf")
)

Strengths: Built-in vectorization, GraphQL, multi-modal
Considerations: Learning curve, resource requirements

Chroma

Lightweight embedded database (great for development):

import chromadb
from chromadb.utils import embedding_functions

client = chromadb.PersistentClient(path="./chroma_db")

# Use OpenAI embeddings
openai_ef = embedding_functions.OpenAIEmbeddingFunction(
    api_key="your-api-key",
    model_name="text-embedding-3-small"
)

# Create collection
collection = client.get_or_create_collection(
    name="documents",
    embedding_function=openai_ef
)

# Add documents (auto-embeds)
collection.add(
    documents=["Document text here"],
    metadatas=[{"source": "manual.pdf", "page": 5}],
    ids=["doc1"]
)

# Query
results = collection.query(
    query_texts=["search query"],
    n_results=5,
    where={"source": "manual.pdf"}
)

Strengths: Simple API, no server needed, fast setup
Considerations: Not for production scale

pgvector (PostgreSQL Extension)

Add vectors to existing Postgres:

import psycopg2
from pgvector.psycopg2 import register_vector

conn = psycopg2.connect("postgresql://localhost/mydb")
register_vector(conn)

cur = conn.cursor()

# Create table with vector column
cur.execute("""
    CREATE TABLE documents (
        id SERIAL PRIMARY KEY,
        content TEXT,
        source TEXT,
        embedding vector(1536)
    )
""")

# Create ANN index for fast search
# (ivfflat builds better clusters when created after data is loaded)
cur.execute("""
    CREATE INDEX ON documents
    USING ivfflat (embedding vector_cosine_ops)
    WITH (lists = 100)
""")

# Insert
cur.execute(
    "INSERT INTO documents (content, source, embedding) VALUES (%s, %s, %s)",
    ("Document text", "manual.pdf", embedding)
)
conn.commit()

# Query
cur.execute("""
    SELECT content, source, 1 - (embedding <=> %s) as similarity
    FROM documents
    WHERE source = 'manual.pdf'
    ORDER BY embedding <=> %s
    LIMIT 5
""", (query_embedding, query_embedding))

Strengths: Use existing Postgres, ACID compliance, joins
Considerations: Not as fast as dedicated vector DBs

Comparison Matrix

| Feature | Pinecone | Qdrant | Weaviate | Chroma | pgvector |
|---|---|---|---|---|---|
| Hosting | Managed | Both | Both | Embedded | Self |
| Filtering | Excellent | Excellent | Good | Basic | SQL |
| Hybrid Search | Yes | Yes | Yes | No | Manual |
| Scale | Auto | Manual | Manual | Limited | Limited |
| Cost | $$ | Free/$ | Free/$ | Free | Free |
| Setup | Easy | Medium | Medium | Easy | Easy |

Decision Framework

START
├─ Need managed service with zero ops? → Pinecone
├─ Already using PostgreSQL? → pgvector
├─ Development/prototyping? → Chroma
├─ Need hybrid search + advanced filtering? → Qdrant or Weaviate
└─ Otherwise → Qdrant (best performance/features balance)

Migration Tip: Start with Chroma for development, then migrate to Qdrant or Pinecone for production. The LangChain abstraction makes switching relatively easy.
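The "switchable backend" idea behind that tip can be sketched without any framework: code the application against a small interface and implement it once per database. Everything below is hypothetical illustration; the in-memory store is a stand-in for a real client, and a `ChromaStore` or `QdrantStore` would implement the same two methods.

```python
from typing import Protocol

class VectorStore(Protocol):
    """Minimal interface the application codes against (hypothetical)."""
    def add(self, doc_id: str, vector: list[float], metadata: dict) -> None: ...
    def query(self, vector: list[float], top_k: int) -> list[str]: ...

class InMemoryStore:
    """Toy backend; a real backend would wrap a database client."""
    def __init__(self):
        self.items: dict[str, tuple[list[float], dict]] = {}

    def add(self, doc_id, vector, metadata):
        self.items[doc_id] = (vector, metadata)

    def query(self, vector, top_k):
        # Rank stored vectors by dot product with the query
        def dot(a, b):
            return sum(x * y for x, y in zip(a, b))
        ranked = sorted(self.items,
                        key=lambda d: dot(self.items[d][0], vector),
                        reverse=True)
        return ranked[:top_k]

store: VectorStore = InMemoryStore()
store.add("doc1", [1.0, 0.0], {"source": "manual.pdf"})
store.add("doc2", [0.0, 1.0], {"source": "guide.pdf"})
print(store.query([0.9, 0.1], top_k=1))
```

Swapping databases then becomes a one-line change at construction time, which is the same property LangChain's vector-store abstraction gives you.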

Next, let's explore indexing strategies that determine search speed and accuracy.
