Feature Stores & Feature Engineering
Tecton & Cloud Alternatives
2 min read
While Feast is excellent for getting started, managed feature stores offer enterprise features, better scalability, and reduced operational burden.
Managed Feature Store Landscape
```
┌─────────────────────────────────────────────────────────────────┐
│                      Feature Store Options                      │
├─────────────────┬─────────────────┬─────────────────────────────┤
│ Open Source     │ Managed         │ Cloud-Native                │
├─────────────────┼─────────────────┼─────────────────────────────┤
│ Feast           │ Tecton          │ AWS SageMaker FS            │
│                 │ Databricks FS   │ GCP Vertex AI FS            │
│                 │ Hopsworks       │ Azure Synapse               │
└─────────────────┴─────────────────┴─────────────────────────────┘
```
Tecton
Tecton is a fully managed feature platform built by the team that created Uber's Michelangelo ML platform.
Key Features
| Feature | Description |
|---|---|
| Real-time features | Sub-100ms latency |
| Feature monitoring | Drift detection built-in |
| Versioning | Automatic feature versioning |
| RBAC | Fine-grained access control |
| Notebooks | Interactive development |
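To make "drift detection built-in" concrete, here is a minimal sketch of the kind of statistic such a monitor computes: a population stability index (PSI) comparing a feature's serving distribution against its training baseline. This is purely illustrative and not Tecton's actual implementation.

```python
# Illustrative drift check: population stability index (PSI) between
# a baseline feature sample and a fresh serving-time sample.
# Rule of thumb: PSI < 0.1 stable, 0.1-0.25 moderate, > 0.25 significant drift.
import math
from collections import Counter

def psi(expected, actual, bins=10):
    """PSI between two numeric samples, using shared equal-width bins."""
    lo = min(min(expected), min(actual))
    hi = max(max(expected), max(actual))
    width = (hi - lo) / bins or 1.0

    def bucket_fractions(values):
        counts = Counter(min(int((v - lo) / width), bins - 1) for v in values)
        n = len(values)
        # Small epsilon avoids log(0) for empty buckets
        return [max(counts.get(b, 0) / n, 1e-6) for b in range(bins)]

    e, a = bucket_fractions(expected), bucket_fractions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))

baseline = [float(i % 100) for i in range(1000)]   # training distribution
shifted = [v + 50 for v in baseline]               # serving distribution drifted
assert psi(baseline, baseline) < 0.01   # identical samples: no drift
assert psi(baseline, shifted) > 0.25    # shifted sample: significant drift
```

A managed platform runs checks like this (and richer ones) on a schedule per feature; with Feast you would wire this up yourself.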
When to Choose Tecton
| Situation | Recommendation |
|---|---|
| Real-time ML at scale | Tecton |
| Strict SLA requirements | Tecton |
| Enterprise governance needs | Tecton |
| Learning/prototyping | Feast |
| Budget constraints | Feast |
Cloud-Native Options
AWS SageMaker Feature Store
```python
# AWS SageMaker Feature Store example
# Assumes sagemaker_session, bucket, and role_arn are already defined.
from sagemaker.feature_store.feature_group import FeatureGroup
from sagemaker.feature_store.feature_definition import (
    FeatureDefinition,
    FeatureTypeEnum,
)

# Create the feature group object
feature_group = FeatureGroup(
    name="customer-features",
    sagemaker_session=sagemaker_session,
)

# Define features (the SDK expects FeatureDefinition objects,
# including one for the event-time column)
feature_group.feature_definitions = [
    FeatureDefinition(feature_name="customer_id", feature_type=FeatureTypeEnum.STRING),
    FeatureDefinition(feature_name="total_spend", feature_type=FeatureTypeEnum.FRACTIONAL),
    FeatureDefinition(feature_name="order_count", feature_type=FeatureTypeEnum.INTEGRAL),
    FeatureDefinition(feature_name="event_time", feature_type=FeatureTypeEnum.STRING),
]

# Create the feature group; role_arn must allow SageMaker to
# read/write the offline store in S3
feature_group.create(
    s3_uri=f"s3://{bucket}/feature-store",
    record_identifier_name="customer_id",
    event_time_feature_name="event_time",
    role_arn=role_arn,
    enable_online_store=True,
)
```
Best for: Teams already on AWS with existing SageMaker workflows.
GCP Vertex AI Feature Store
```python
# GCP Vertex AI Feature Store example
from google.cloud import aiplatform

# Initialize the SDK
aiplatform.init(project="my-project", location="us-central1")

# Create the feature store (one fixed node for the online store)
feature_store = aiplatform.Featurestore.create(
    featurestore_id="my_featurestore",
    online_store_fixed_node_count=1,
)

# Create an entity type to group related features
customer_entity = feature_store.create_entity_type(
    entity_type_id="customer",
    description="Customer entity",
)

# Create the features themselves
customer_entity.batch_create_features(
    feature_configs={
        "total_spend": {"value_type": "DOUBLE"},
        "order_count": {"value_type": "INT64"},
    }
)
```
Best for: Teams on GCP with Vertex AI pipelines.
Databricks Feature Store
```python
# Databricks Feature Store example
# Assumes feature_df and labels_df are existing Spark DataFrames.
from databricks.feature_store import FeatureStoreClient, FeatureLookup

fs = FeatureStoreClient()

# Create a feature table keyed by customer_id
fs.create_table(
    name="customer_features",
    primary_keys=["customer_id"],
    df=feature_df,
    description="Customer aggregated features",
)

# Join features onto the labels to build a training set
training_set = fs.create_training_set(
    df=labels_df,
    feature_lookups=[
        FeatureLookup(
            table_name="customer_features",
            feature_names=["total_spend", "order_count"],
            lookup_key="customer_id",
        )
    ],
    label="churn",
)
```
Best for: Teams using Databricks for data engineering and ML.
Comparison Matrix
| Feature | Feast | Tecton | SageMaker | Vertex AI | Databricks |
|---|---|---|---|---|---|
| Open source | Yes | No | No | No | No |
| Real-time | Basic | Excellent | Good | Good | Good |
| Streaming | Limited | Native | Limited | Native | Spark |
| Monitoring | DIY | Built-in | Built-in | Built-in | Built-in |
| Cost | Infrastructure | Premium | Pay-per-use | Pay-per-use | Pay-per-use |
| Lock-in | None | Some | High | High | Medium |
Cost Considerations
| Solution | Typical Monthly Cost |
|---|---|
| Feast (self-hosted) | $500-2,000 (infrastructure) |
| Tecton | $5,000-50,000+ |
| SageMaker FS | $1,000-10,000 |
| Vertex AI FS | $1,000-10,000 |
| Databricks FS | Included in Databricks |
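When comparing these numbers, remember that self-hosted infrastructure spend is only part of the bill: engineer time to operate Feast is a real cost too. The sketch below makes that comparison explicit; every figure in it is an illustrative assumption, not vendor pricing.

```python
# Break-even sketch: managed price vs. self-hosted infra + ops time.
# All dollar figures are illustrative assumptions, not vendor pricing.

def monthly_cost_self_hosted(infra, ops_hours, hourly_rate):
    """Infrastructure spend plus the engineering time to keep it running."""
    return infra + ops_hours * hourly_rate

def cheaper_option(managed_price, infra, ops_hours, hourly_rate):
    self_hosted = monthly_cost_self_hosted(infra, ops_hours, hourly_rate)
    return "managed" if managed_price < self_hosted else "self-hosted"

# 40 ops-hours/month at $100/hr on top of $1,500 infra = $5,500,
# which makes a $5,000/month managed tier the cheaper choice
choice = cheaper_option(5_000, infra=1_500, ops_hours=40, hourly_rate=100)
assert choice == "managed"
```

The same arithmetic flips toward self-hosting when ops burden is low, which is why Feast remains the budget recommendation for small teams.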
Decision Framework
```
                Start
                  │
                  ▼
┌─────────────────────────────────────┐
│ Do you need real-time ML at scale?  │
└─────────────────┬───────────────────┘
                  │
        ┌─────────┴─────────┐
        │                   │
       Yes                  No
        │                   │
        ▼                   ▼
┌────────────────┐  ┌────────────────┐
│ Tecton or      │  │ Already on a   │
│ Cloud Provider │  │ cloud platform?│
└────────────────┘  └───────┬────────┘
                            │
                   ┌────────┴────────┐
                   │                 │
                  Yes                No
                   │                 │
                   ▼                 ▼
          ┌────────────────┐ ┌────────────────┐
          │ Use cloud-     │ │ Start with     │
          │ native option  │ │ Feast          │
          └────────────────┘ └────────────────┘
```
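The decision tree above boils down to two questions, which can be captured in a small helper (the return strings are just the diagram's labels):

```python
# The decision framework above, expressed as code. The two boolean
# inputs mirror the two questions in the diagram.

def recommend_feature_store(needs_realtime_at_scale: bool,
                            on_cloud_platform: bool) -> str:
    if needs_realtime_at_scale:
        return "Tecton or cloud provider"
    if on_cloud_platform:
        return "cloud-native option"
    return "Feast"

assert recommend_feature_store(True, on_cloud_platform=False) == "Tecton or cloud provider"
assert recommend_feature_store(False, on_cloud_platform=True) == "cloud-native option"
assert recommend_feature_store(False, on_cloud_platform=False) == "Feast"
```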
Migration Path
```
Phase 1: Feast (Local)
   │
   │ Prove value, define features
   │
   ▼
Phase 2: Feast (Cloud Storage)
   │
   │ Scale with S3/GCS backends
   │
   ▼
Phase 3: Managed Solution
   │
   │ Move to Tecton/Cloud-native
   │ when scale or SLA requires
   ▼
Phase 4: Enterprise Platform
```
Best Practices
| Practice | Why |
|---|---|
| Start simple | Don't over-engineer early |
| Define abstractions | Make migration easier |
| Monitor costs | Feature stores can be expensive |
| Evaluate regularly | Needs change over time |
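"Define abstractions" is what makes the migration path above cheap: hide the vendor SDK behind a minimal interface so swapping Feast for Tecton or a cloud store later only means writing a new adapter. A sketch, with interface and class names of our own invention rather than from any SDK:

```python
# Vendor-neutral feature store interface (hypothetical names).
# Application code depends on FeatureStore; each backend (Feast,
# Tecton, SageMaker, ...) gets its own adapter implementing it.
from abc import ABC, abstractmethod
from typing import Any

class FeatureStore(ABC):
    @abstractmethod
    def get_online_features(self, entity_id: str,
                            feature_names: list[str]) -> dict[str, Any]:
        """Low-latency feature lookup for online serving."""

class InMemoryFeatureStore(FeatureStore):
    """Stand-in backend, useful for tests and local development."""

    def __init__(self):
        self._rows: dict[str, dict[str, Any]] = {}

    def put(self, entity_id: str, features: dict[str, Any]) -> None:
        self._rows.setdefault(entity_id, {}).update(features)

    def get_online_features(self, entity_id, feature_names):
        row = self._rows.get(entity_id, {})
        return {name: row.get(name) for name in feature_names}

store = InMemoryFeatureStore()
store.put("c_123", {"total_spend": 412.5, "order_count": 7})
features = store.get_online_features("c_123", ["total_spend", "order_count"])
assert features == {"total_spend": 412.5, "order_count": 7}
```

Serving code written against `FeatureStore` never imports a vendor SDK directly, so moving from Phase 1 to Phase 3 touches only the adapter.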
Key insight: Start with Feast to learn feature store concepts and prove value. Migrate to managed solutions when real-time requirements, scale, or enterprise features justify the cost.
Next module: We'll explore model registries and serving with MLflow and BentoML.