# Embedding Models & Vector Databases

## Vector Database Landscape
Vector databases are purpose-built for storing and searching high-dimensional vectors. Understanding their trade-offs helps you choose the right solution.
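Under the hood, every option below answers the same question: which stored vectors are closest to a query vector? A minimal brute-force sketch (the `cosine_top_k` helper is our own, not any library's API) shows the core operation; dedicated databases replace this linear scan with approximate indexes such as HNSW or IVF to stay fast at millions of vectors.

```python
import numpy as np

def cosine_top_k(query: np.ndarray, vectors: np.ndarray, k: int = 5) -> list[int]:
    """Return indices of the k stored vectors most similar to the query."""
    # Normalize rows so that a dot product equals cosine similarity
    q = query / np.linalg.norm(query)
    v = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
    scores = v @ q
    # Sort by descending similarity and keep the top k
    return np.argsort(-scores)[:k].tolist()

vectors = np.array([[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]])
print(cosine_top_k(np.array([1.0, 0.1]), vectors, k=2))  # → [0, 2]
```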
### Database Categories
| Category | Examples | Best For |
|---|---|---|
| Managed Cloud | Pinecone, Weaviate Cloud | Production, no ops |
| Self-Hosted | Qdrant, Milvus, Weaviate | Control, cost at scale |
| Embedded | Chroma, LanceDB | Development, edge |
| Extensions | pgvector | Existing Postgres infra |
### Pinecone
Fully managed, serverless vector database:
```python
from pinecone import Pinecone, ServerlessSpec

pc = Pinecone(api_key="your-api-key")

# Create a serverless index
pc.create_index(
    name="rag-index",
    dimension=1536,
    metric="cosine",
    spec=ServerlessSpec(cloud="aws", region="us-east-1"),
)

index = pc.Index("rag-index")

# Upsert vectors
index.upsert(vectors=[
    {
        "id": "doc1",
        "values": embedding,
        "metadata": {"source": "manual.pdf", "page": 5},
    }
])

# Query with a metadata filter
results = index.query(
    vector=query_embedding,
    top_k=5,
    include_metadata=True,
    filter={"source": {"$eq": "manual.pdf"}},
)
```
**Strengths:** Zero ops, auto-scaling, excellent metadata filtering
**Considerations:** Cost at scale, vendor lock-in
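Large ingests are normally upserted in batches rather than one request per vector (Pinecone's guidance is on the order of a hundred vectors per call). A small generic helper keeps requests within limits — `chunked` is our own name, not part of the SDK:

```python
from itertools import islice

def chunked(iterable, size=100):
    """Yield successive lists of at most `size` items from any iterable."""
    it = iter(iterable)
    while batch := list(islice(it, size)):
        yield batch

print(list(chunked(range(5), size=2)))  # → [[0, 1], [2, 3], [4]]
```

With an index in hand, ingestion becomes `for batch in chunked(vectors): index.upsert(vectors=batch)`.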
### Qdrant
High-performance open-source database:
```python
from qdrant_client import QdrantClient
from qdrant_client.models import (
    Distance, VectorParams, PointStruct,
    Filter, FieldCondition, MatchValue,
)

client = QdrantClient(host="localhost", port=6333)

# Create collection
client.create_collection(
    collection_name="documents",
    vectors_config=VectorParams(size=1536, distance=Distance.COSINE),
)

# Insert points
client.upsert(
    collection_name="documents",
    points=[
        PointStruct(
            id=1,
            vector=embedding,
            payload={"source": "manual.pdf", "page": 5},
        )
    ],
)

# Search with payload filtering
results = client.search(
    collection_name="documents",
    query_vector=query_embedding,
    limit=5,
    query_filter=Filter(
        must=[FieldCondition(key="source", match=MatchValue(value="manual.pdf"))]
    ),
)
```
**Strengths:** Fast, excellent filtering, hybrid search support
**Considerations:** Self-hosting complexity
### Weaviate
Schema-based vector database with GraphQL:
```python
import weaviate
from weaviate.classes.config import Property, DataType
from weaviate.classes.query import Filter

client = weaviate.connect_to_local()

# Create collection with schema
collection = client.collections.create(
    name="Document",
    properties=[
        Property(name="content", data_type=DataType.TEXT),
        Property(name="source", data_type=DataType.TEXT),
        Property(name="page", data_type=DataType.INT),
    ],
)

# Insert (auto-vectorized only if a vectorizer module is configured)
collection.data.insert({
    "content": "Document text here",
    "source": "manual.pdf",
    "page": 5,
})

# Vector search with a property filter
results = collection.query.near_vector(
    near_vector=query_embedding,
    limit=5,
    filters=Filter.by_property("source").equal("manual.pdf"),
)
```
**Strengths:** Built-in vectorization, GraphQL, multi-modal
**Considerations:** Learning curve, resource requirements
### Chroma
Lightweight embedded database (great for development):
```python
import chromadb
from chromadb.utils import embedding_functions

client = chromadb.PersistentClient(path="./chroma_db")

# Use OpenAI embeddings
openai_ef = embedding_functions.OpenAIEmbeddingFunction(
    api_key="your-api-key",
    model_name="text-embedding-3-small",
)

# Create collection
collection = client.get_or_create_collection(
    name="documents",
    embedding_function=openai_ef,
)

# Add documents (embedded automatically)
collection.add(
    documents=["Document text here"],
    metadatas=[{"source": "manual.pdf", "page": 5}],
    ids=["doc1"],
)

# Query (query text is embedded with the same function)
results = collection.query(
    query_texts=["search query"],
    n_results=5,
    where={"source": "manual.pdf"},
)
```
**Strengths:** Simple API, no server needed, fast setup
**Considerations:** Not for production scale
### pgvector (PostgreSQL Extension)
Add vectors to existing Postgres:
```python
import psycopg2
from pgvector.psycopg2 import register_vector

conn = psycopg2.connect("postgresql://localhost/mydb")

# Enable the extension first, then register the vector type adapter
cur = conn.cursor()
cur.execute("CREATE EXTENSION IF NOT EXISTS vector")
register_vector(conn)

# Create table with a vector column
cur.execute("""
    CREATE TABLE documents (
        id SERIAL PRIMARY KEY,
        content TEXT,
        source TEXT,
        embedding vector(1536)
    )
""")

# Create an approximate index for fast search
cur.execute("""
    CREATE INDEX ON documents
    USING ivfflat (embedding vector_cosine_ops)
    WITH (lists = 100)
""")

# Insert
cur.execute(
    "INSERT INTO documents (content, source, embedding) VALUES (%s, %s, %s)",
    ("Document text", "manual.pdf", embedding),
)
conn.commit()

# Query: <=> is cosine distance, so 1 - distance gives similarity
cur.execute("""
    SELECT content, source, 1 - (embedding <=> %s) AS similarity
    FROM documents
    WHERE source = 'manual.pdf'
    ORDER BY embedding <=> %s
    LIMIT 5
""", (query_embedding, query_embedding))
rows = cur.fetchall()
```
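The `<=>` operator returns cosine *distance*, which is why the SELECT converts it with `1 - distance` to report similarity. A quick NumPy check of that identity (helper names are our own):

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def cosine_distance(a: np.ndarray, b: np.ndarray) -> float:
    # The quantity pgvector's <=> operator computes
    return 1.0 - cosine_similarity(a, b)

a, b = np.array([1.0, 0.0]), np.array([1.0, 1.0])
# Converting distance back to similarity recovers cos(45°) ≈ 0.7071
print(round(1.0 - cosine_distance(a, b), 4))  # → 0.7071
```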
**Strengths:** Use existing Postgres, ACID compliance, joins
**Considerations:** Not as fast as dedicated vector DBs
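The `lists = 100` in the index above is a tuning knob: more lists means finer partitioning but more partitions to probe. As starting points, the pgvector README suggests `lists ≈ rows / 1000` up to about 1M rows and `sqrt(rows)` beyond, with `probes ≈ sqrt(lists)` set at query time via `SET ivfflat.probes = N` (higher probes trades speed for recall). A sketch of that heuristic, with our own helper name:

```python
import math

def ivfflat_params(row_count: int) -> dict:
    """Starting-point heuristics from the pgvector README."""
    if row_count <= 1_000_000:
        lists = max(1, row_count // 1000)   # rows / 1000 for small tables
    else:
        lists = int(math.sqrt(row_count))   # sqrt(rows) beyond ~1M rows
    probes = max(1, int(math.sqrt(lists)))  # probes ≈ sqrt(lists)
    return {"lists": lists, "probes": probes}

print(ivfflat_params(100_000))  # → {'lists': 100, 'probes': 10}
```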
## Comparison Matrix
| Feature | Pinecone | Qdrant | Weaviate | Chroma | pgvector |
|---|---|---|---|---|---|
| Hosting | Managed | Both | Both | Embedded | Self |
| Filtering | Excellent | Excellent | Good | Basic | SQL |
| Hybrid Search | Yes | Yes | Yes | No | Manual |
| Scale | Auto | Manual | Manual | Limited | Limited |
| Cost | $$ | Free/$ | Free/$ | Free | Free |
| Setup | Easy | Medium | Medium | Easy | Easy |
## Decision Framework
```text
START
  │
  ▼
Need managed service with zero ops?
  ├─ YES → Pinecone
  └─ NO
      ▼
Already using PostgreSQL?
  ├─ YES → pgvector
  └─ NO
      ▼
Development/prototyping?
  ├─ YES → Chroma
  └─ NO
      ▼
Need hybrid search + advanced filtering?
  ├─ YES → Qdrant or Weaviate
  └─ NO
      ▼
Default → Qdrant (best performance/features balance)
```
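The same tree, sketched as a small Python helper (function name and parameter names are our own):

```python
def choose_vector_db(
    need_zero_ops: bool = False,
    using_postgres: bool = False,
    prototyping: bool = False,
    need_hybrid_search: bool = False,
) -> str:
    """Walk the decision tree above, top to bottom."""
    if need_zero_ops:
        return "Pinecone"
    if using_postgres:
        return "pgvector"
    if prototyping:
        return "Chroma"
    if need_hybrid_search:
        return "Qdrant or Weaviate"
    return "Qdrant"  # default: best performance/features balance

print(choose_vector_db(prototyping=True))  # → Chroma
```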
**Migration Tip:** Start with Chroma for development, then migrate to Qdrant or Pinecone for production. The LangChain abstraction makes switching relatively easy.
Next, let's explore indexing strategies that determine search speed and accuracy.

:::