AWS Storage Services: S3, EBS, EFS & FSx

Storage selection impacts performance, cost, and availability. Interviewers frequently ask about trade-offs between storage options.

S3: Object Storage

Storage Classes & Use Cases

Class	Retrieval Time	Use Case	Approx. Cost (per GB/month, us-east-1)
Standard	Immediate	Frequently accessed	$0.023
Intelligent-Tiering	Immediate	Unknown access patterns	$0.023 (frequent tier) + $0.0025 per 1,000 objects monitoring
Standard-IA	Immediate	Infrequent, quick access needed	$0.0125
One Zone-IA	Immediate	Reproducible data, cost-sensitive	$0.01
Glacier Instant	Milliseconds	Archive, immediate access	$0.004
Glacier Flexible	Minutes to 12 hours	Archive, flexible retrieval	$0.0036
Glacier Deep Archive	12-48 hours	Long-term archive	$0.00099
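
To make the price gaps concrete, here is a minimal sketch that multiplies the per-GB figures from the table by a data size. The dictionary and function names are illustrative, the prices are the table's values rather than live pricing, and the result covers storage only: retrieval fees, request fees, and minimum storage durations are deliberately left out. Intelligent-Tiering is omitted because its cost depends on the tier mix and the object count.

```python
# Per-GB monthly prices copied from the table above (illustrative, not live pricing).
# Intelligent-Tiering is omitted: its cost depends on tier mix and object count.
PRICE_PER_GB_MONTH = {
    "Standard": 0.023,
    "Standard-IA": 0.0125,
    "One Zone-IA": 0.01,
    "Glacier Instant": 0.004,
    "Glacier Flexible": 0.0036,
    "Glacier Deep Archive": 0.00099,
}


def monthly_storage_cost(size_gb: float, storage_class: str) -> float:
    """Return the storage-only monthly cost in USD, rounded to cents."""
    return round(size_gb * PRICE_PER_GB_MONTH[storage_class], 2)


if __name__ == "__main__":
    # Compare every class for 10,000 GB (roughly 10 TB) of data.
    for name in PRICE_PER_GB_MONTH:
        print(f"{name:22s} ${monthly_storage_cost(10_000, name):>8.2f}")
```

In an interview, the useful follow-up is that the cheaper classes shift cost from storage to retrieval, so the access pattern decides which row actually wins.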

Interview Question: S3 Performance

Q: "Your application needs to read 50,000 objects from S3 in under 60 seconds. How do you optimize?"

A: S3 performance optimization strategies:

Prefix parallelization: Distribute objects across multiple prefixes; each prefix supports 3,500 PUT/COPY/POST/DELETE and 5,500 GET/HEAD requests per second
S3 Transfer Acceleration: Route transfers through CloudFront edge locations when clients are far from the bucket's Region
Byte-range fetches: Download large objects in parallel chunks using the HTTP Range header
Request parallelization: Use concurrent connections (50,000 objects ÷ 60 s ≈ 833 requests/second, which is below the GET limit of a single prefix)
S3 Select: Query data within objects to reduce transfer
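
The request-parallelization arithmetic above can be sketched as follows. This is a rough illustration, not a tuned client: `required_concurrency` applies Little's law (in-flight requests = request rate × latency), and `fetch_all` fans requests out over a thread pool. Both function names are hypothetical, the 100 ms per-GET latency is an assumed figure to replace with a measured one, and `fetch_one` is injected so the sketch runs without AWS credentials.

```python
import math
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable


def required_concurrency(num_objects: int, deadline_s: float, latency_s: float) -> int:
    """Little's law: in-flight requests = request rate x per-request latency."""
    rate = num_objects / deadline_s
    return math.ceil(rate * latency_s)


def fetch_all(
    keys: Iterable[str],
    fetch_one: Callable[[str], bytes],
    max_workers: int,
) -> Dict[str, bytes]:
    """Fetch every key concurrently; pool.map preserves the input order."""
    key_list = list(keys)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(zip(key_list, pool.map(fetch_one, key_list)))


if __name__ == "__main__":
    # Assumed 100 ms per GET; measure the real latency for your object sizes.
    workers = required_concurrency(50_000, 60, 0.1)
    # In-memory stand-in for a bucket so the sketch runs offline.
    fake_store = {f"obj-{i}": b"payload" for i in range(1_000)}
    data = fetch_all(fake_store, fake_store.__getitem__, max_workers=workers)
    print(workers, len(data))
```

With boto3, `fetch_one` could be a small wrapper around `s3.get_object(Bucket=..., Key=...)["Body"].read()`. The client's connection pool would likely need raising to match the worker count, since its default is much smaller than the concurrency computed here.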

S3 Security Best Practices

Enable S3 Block Public Access at account level
Use bucket policies with least-privilege
Enable SSE-S3 or SSE-KMS encryption by default
Enable versioning for critical data
Use S3 Access Points for multi-tenant access control
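
Three of the practices above (Block Public Access, default encryption, versioning) map directly onto request payloads. The sketch below builds those payloads as plain dictionaries and applies them through boto3-style clients passed in by the caller. The helper name `apply_baseline` and the constant names are hypothetical; the account ID and bucket name are placeholders supplied by the caller.

```python
# Account-level Block Public Access: all four settings switched on.
PUBLIC_ACCESS_BLOCK = {
    "BlockPublicAcls": True,
    "IgnorePublicAcls": True,
    "BlockPublicPolicy": True,
    "RestrictPublicBuckets": True,
}

# Default bucket encryption with SSE-KMS (use "AES256" for SSE-S3 instead).
DEFAULT_ENCRYPTION = {
    "Rules": [
        {"ApplyServerSideEncryptionByDefault": {"SSEAlgorithm": "aws:kms"}}
    ]
}

# Versioning payload for critical buckets.
VERSIONING = {"Status": "Enabled"}


def apply_baseline(s3, s3control, account_id: str, bucket: str) -> None:
    """Apply the baseline via boto3-style 's3' and 's3control' clients."""
    s3control.put_public_access_block(
        AccountId=account_id,
        PublicAccessBlockConfiguration=PUBLIC_ACCESS_BLOCK,
    )
    s3.put_bucket_encryption(
        Bucket=bucket,
        ServerSideEncryptionConfiguration=DEFAULT_ENCRYPTION,
    )
    s3.put_bucket_versioning(Bucket=bucket, VersioningConfiguration=VERSIONING)
```

Passing the clients in, rather than creating them inside the function, keeps the sketch testable with stubs; in real use they would come from `boto3.client("s3")` and `boto3.client("s3control")`.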

EBS: Block Storage

Volume Types Comparison

Type	IOPS	Throughput	Use Case
gp3	Up to 16,000	1,000 MB/s	General workloads, boot volumes
gp2	Up to 16,000 (3 IOPS/GB; small volumes burst to 3,000)	250 MB/s	Legacy, burstable workloads
io2 Block Express	256,000	4,000 MB/s	Critical databases, SAP HANA
st1	N/A	500 MB/s	Big data, log processing
sc1	N/A	250 MB/s	Cold data, infrequent access
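
The gp2 row is easier to reason about with the sizing rule written out. The sketch below assumes the commonly documented gp2 model: 3 IOPS per GiB, a floor of 100 IOPS, a cap of 16,000 IOPS, and bursting to 3,000 IOPS for volumes whose baseline is lower than that. The function names are illustrative.

```python
import math


def gp2_baseline_iops(size_gib: int) -> int:
    """gp2 baseline: 3 IOPS per GiB, floored at 100 and capped at 16,000."""
    return max(100, min(3 * size_gib, 16_000))


def gp2_can_burst(size_gib: int) -> bool:
    """Volumes with a baseline below 3,000 IOPS can burst up to 3,000."""
    return gp2_baseline_iops(size_gib) < 3_000


def gp2_size_for_iops(target_iops: int) -> int:
    """Smallest gp2 size (GiB) whose baseline reaches target_iops."""
    if target_iops > 16_000:
        raise ValueError("gp2 tops out at 16,000 IOPS")
    return math.ceil(target_iops / 3)


if __name__ == "__main__":
    for size in (20, 100, 1_000, 6_000):
        print(size, gp2_baseline_iops(size), gp2_can_burst(size))
```

This is the main argument for gp3 in interviews: gp2 forces you to buy capacity to get IOPS, while gp3 lets you provision IOPS and throughput independently of size.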

Interview Question: EBS vs Instance Store

Q: "When would you use instance store instead of EBS?"

A: Instance store (ephemeral) suits:

Temporary data: Scratch space, buffers, caches
High IOPS needs: i3en instances offer up to 2 million random read IOPS at the largest size
Cost sensitivity: No additional storage charges
Distributed systems: Where data is replicated elsewhere (Cassandra, Kafka)

Critical: Data is lost when the instance stops, hibernates, or terminates, or when the underlying disk fails (it survives a reboot). Never use for durable storage.

EBS Optimization Pattern

Database Tier:
  - Primary: io2 (256K IOPS, Multi-Attach disabled)
  - Replicas: gp3 (16K IOPS, cost-optimized)

Application Tier:
  - Boot: gp3 (3,000 IOPS baseline)
  - Data: Based on workload

Archive:
  - EBS snapshots (stored in S3 by AWS; move rarely used ones to the EBS Snapshots Archive tier)

EFS: Elastic File System

Performance Modes

Mode	Use Case	Latency
General Purpose	Web serving, CMS, containers	Low (sub-ms reads, single-digit ms writes)
Max I/O	Big data, media processing	Higher (ms)

Throughput Modes

Mode	Behavior	Use Case
Bursting	Scales with storage size	Variable workloads
Provisioned	Fixed throughput	Consistent performance
Elastic	Auto-scales throughput	Unpredictable workloads
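
"Scales with storage size" in Bursting mode can be made concrete with a small calculation. The sketch below assumes the rates usually quoted for EFS Bursting mode: a baseline of 50 MiB/s per TiB stored, and bursts of 100 MiB/s per TiB with a 100 MiB/s minimum. Treat those constants as assumptions to check against current EFS documentation; the function names are hypothetical.

```python
from typing import Tuple

# Assumed Bursting-mode rates; verify against current EFS documentation.
BASELINE_MIBPS_PER_TIB = 50.0
BURST_MIBPS_PER_TIB = 100.0
MIN_BURST_MIBPS = 100.0


def bursting_throughput_mibps(stored_tib: float) -> Tuple[float, float]:
    """Return (baseline, burst) throughput in MiB/s for a given amount stored."""
    baseline = BASELINE_MIBPS_PER_TIB * stored_tib
    burst = max(MIN_BURST_MIBPS, BURST_MIBPS_PER_TIB * stored_tib)
    return baseline, burst


def bursting_is_enough(required_mibps: float, stored_tib: float) -> bool:
    """True if the sustained requirement fits within the Bursting baseline."""
    baseline, _ = bursting_throughput_mibps(stored_tib)
    return required_mibps <= baseline


if __name__ == "__main__":
    # A small file system with a steady 50 MiB/s need does not fit Bursting.
    print(bursting_throughput_mibps(0.1), bursting_is_enough(50, 0.1))
```

A small file system with a sustained throughput need is the classic case where Bursting falls short and Provisioned or Elastic is the better answer.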

Interview Question: EFS vs EBS

Q: "Your application runs on 5 EC2 instances and needs shared file storage. Compare EFS and EBS."

Factor	EFS	EBS Multi-Attach
Sharing	Thousands of instances	Up to 16 Nitro instances per io1/io2 volume
Protocol	NFS (POSIX-compliant)	Block-level (needs a cluster-aware file system)
Region scope	Multi-AZ	Single AZ
Use case	Shared content, CMS	Clustered databases
Cost	Higher per GB ($0.30/GB Standard)	Lower per GB ($0.125/GB for io2, plus provisioned IOPS)

Recommendation: EFS for true shared filesystem needs; EBS Multi-Attach only for specific cluster applications.

FSx: Managed File Systems

FSx Options

Service	File System	Use Case
FSx for Windows	NTFS	Windows workloads, AD integration
FSx for Lustre	Lustre	HPC, ML training, video rendering
FSx for NetApp ONTAP	ONTAP	Enterprise NAS replacement
FSx for OpenZFS	ZFS	Linux/NFS workloads

Interview Question: FSx for Lustre with S3

Q: "You need to process 10TB of data from S3 with high throughput for ML training. What architecture?"

A: Use FSx for Lustre with S3 integration:

Create FSx for Lustre linked to S3 bucket
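File metadata is imported up front; object contents are lazy-loaded from S3 on first access
Throughput scales with provisioned storage, up to hundreds of GB/s for ML workloads
Export results back to S3 (automatically if auto-export is configured on the data repository association)
Delete FSx after processing (pay only while the file system exists)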
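
A quick sizing estimate helps justify this architecture. The sketch below assumes that FSx for Lustre throughput is provisioned per TiB of storage, so aggregate throughput is capacity times the per-TiB rate. The 12 TiB capacity and the 1,000 MB/s per TiB tier in the example are assumed inputs, not recommendations, and the function names are illustrative. The estimate covers reads from the file system itself; the first access to each file is also bounded by how fast it loads from S3.

```python
def aggregate_throughput_mbps(storage_tib: float, per_tib_mbps: float) -> float:
    """Aggregate throughput = provisioned capacity x per-TiB throughput."""
    return storage_tib * per_tib_mbps


def full_read_seconds(dataset_tb: float, throughput_mbps: float) -> float:
    """Time to stream the whole dataset once at the given throughput."""
    dataset_mb = dataset_tb * 1_000_000  # decimal TB to MB
    return dataset_mb / throughput_mbps


if __name__ == "__main__":
    # Assumed: 12 TiB file system at 1,000 MB/s per TiB, 10 TB dataset.
    throughput = aggregate_throughput_mbps(12, 1_000)
    print(throughput, round(full_read_seconds(10, throughput)))
```

Running the same numbers with a lower per-TiB tier shows the trade-off directly: a cheaper file system means each training epoch spends longer reading data.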

Storage Decision Framework

Object storage (any size, web-accessible)? → S3
Block storage (single instance)?
  └── High IOPS (>16K)? → io2 Block Express
  └── General workload? → gp3
  └── Throughput-intensive? → st1
Shared file system?
  └── Windows? → FSx for Windows
  └── HPC/ML? → FSx for Lustre
  └── General Linux? → EFS or FSx for OpenZFS
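
The decision tree above can be expressed as a small function, which is a handy way to check that every branch has an answer. This is a direct transcription of the tree, not an official selection tool; the function name, parameter names, and the string values for `kind` and `workload` are all hypothetical.

```python
from typing import Optional


def choose_storage(
    kind: str,
    *,
    iops: int = 0,
    throughput_intensive: bool = False,
    workload: Optional[str] = None,
) -> str:
    """Walk the decision framework: kind is 'object', 'block', or 'file'."""
    if kind == "object":
        return "S3"
    if kind == "block":
        if iops > 16_000:
            return "io2 Block Express"
        if throughput_intensive:
            return "st1"
        return "gp3"
    if kind == "file":
        if workload == "windows":
            return "FSx for Windows"
        if workload == "hpc":
            return "FSx for Lustre"
        return "EFS or FSx for OpenZFS"
    raise ValueError(f"unknown storage kind: {kind!r}")


if __name__ == "__main__":
    print(choose_storage("block", iops=50_000))
    print(choose_storage("file", workload="hpc"))
```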

Cost Tip: Always consider S3 Intelligent-Tiering for data with unknown access patterns - it automatically optimizes costs.

Next, we'll cover AWS networking fundamentals.