AWS Architecture & Services Deep Dive

AWS Storage Services: S3, EBS, EFS & FSx

Storage selection impacts performance, cost, and availability. Interviewers frequently ask about trade-offs between storage options.

S3: Object Storage

Storage Classes & Use Cases

Class | Retrieval Time | Use Case | Cost (per GB/month)
Standard | Immediate | Frequently accessed | $0.023
Intelligent-Tiering | Immediate | Unknown access patterns | $0.023 in the frequent-access tier, plus a per-object monitoring fee
Standard-IA | Immediate | Infrequent, quick access needed | $0.0125
One Zone-IA | Immediate | Reproducible data, cost-sensitive | $0.01
Glacier Instant | Milliseconds | Archive, immediate access | $0.004
Glacier Flexible | Minutes to 12 hours | Archive, flexible retrieval | $0.0036
Glacier Deep Archive | 12-48 hours | Long-term archive | $0.00099
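
To make the price gaps concrete, here is a minimal sketch that applies the per-GB prices from the table above to a dataset size. It covers storage charges only; request, retrieval, and minimum-duration fees are ignored. Intelligent-Tiering is left out because its cost depends on the access tier and a per-object monitoring fee. The name `monthly_storage_cost` is illustrative, not an AWS API.

```python
# Per-GB monthly storage prices copied from the table above (USD).
PRICE_PER_GB = {
    "Standard": 0.023,
    "Standard-IA": 0.0125,
    "One Zone-IA": 0.01,
    "Glacier Instant": 0.004,
    "Glacier Flexible": 0.0036,
    "Glacier Deep Archive": 0.00099,
}


def monthly_storage_cost(size_gb: float, storage_class: str) -> float:
    """Return the storage-only monthly cost in USD for one storage class."""
    return round(size_gb * PRICE_PER_GB[storage_class], 2)


if __name__ == "__main__":
    # Compare every class for a 10 TB (10,000 GB) dataset.
    for name in PRICE_PER_GB:
        print(f"{name:22s} ${monthly_storage_cost(10_000, name):,.2f}")
```

In an interview, pair numbers like these with the retrieval time column: the cheapest class is only the right answer if the application can wait for the data.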

Interview Question: S3 Performance

Q: "Your application needs to read 50,000 objects from S3 in under 60 seconds. How do you optimize?"

A: S3 performance optimization strategies:

  1. Prefix parallelization: Distribute objects across multiple prefixes (each prefix supports 3,500 PUT and 5,500 GET requests per second)
  2. S3 Transfer Acceleration: Route transfers through CloudFront edge locations; this helps mainly when clients are geographically far from the bucket's Region
  3. Byte-range fetches: Download large objects in parallel chunks using ranged GET requests
  4. Request parallelization: Use concurrent connections (50K objects ÷ 60s = ~833 requests/second, well within a single prefix's GET limit)
  5. S3 Select: Query data within objects to reduce transfer
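
The arithmetic behind this answer can be sketched in a few lines. The example below is a rough sketch, not a production downloader: `fetch_object` is a hypothetical callable standing in for a real S3 GET, and the 50 ms average latency used in the usage note is an assumption for illustration.

```python
import math
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List

GET_LIMIT_PER_PREFIX = 5_500  # GET requests/second per prefix, from the list above


def required_rate(num_objects: int, deadline_s: float) -> float:
    """Requests per second needed to finish within the deadline."""
    return num_objects / deadline_s


def prefixes_needed(rate: float, limit: int = GET_LIMIT_PER_PREFIX) -> int:
    """Minimum number of prefixes so that no prefix exceeds its GET limit."""
    return max(1, math.ceil(rate / limit))


def workers_needed(rate: float, avg_latency_s: float) -> int:
    """Concurrent requests needed to sustain `rate` (rate x latency)."""
    return math.ceil(rate * avg_latency_s)


def fetch_all(keys: Iterable[str],
              fetch_object: Callable[[str], bytes],
              workers: int) -> List[bytes]:
    """Fetch every key concurrently; results come back in key order.

    `fetch_object` is supplied by the caller (hypothetical stand-in for an S3 GET).
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(fetch_object, keys))
```

For 50,000 objects in 60 seconds, `required_rate` gives roughly 833 requests/second, `prefixes_needed` returns 1, and at an assumed 50 ms per request `workers_needed` returns 42 concurrent connections. The point to make is that concurrency, not the per-prefix limit, is the bottleneck here.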

S3 Security Best Practices

  • Enable S3 Block Public Access at account level
  • Use bucket policies with least-privilege
  • Enable SSE-S3 or SSE-KMS encryption by default
  • Enable versioning for critical data
  • Use S3 Access Points for multi-tenant access control
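
As an illustration of the first two bullets, the sketch below builds the Block Public Access settings and a least-privilege bucket policy as plain Python dictionaries. The bucket name and role ARN are placeholders supplied by the caller, the helper name `read_only_policy` is made up for this example, and nothing here calls AWS; the dictionaries would still need to be applied through the console, CLI, or an SDK.

```python
import json

# The four settings that together make up S3 Block Public Access.
BLOCK_PUBLIC_ACCESS = {
    "BlockPublicAcls": True,
    "IgnorePublicAcls": True,
    "BlockPublicPolicy": True,
    "RestrictPublicBuckets": True,
}


def read_only_policy(bucket: str, role_arn: str) -> dict:
    """Build a bucket policy: one role may read objects, non-TLS access is denied."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowReadForOneRole",
                "Effect": "Allow",
                "Principal": {"AWS": role_arn},
                "Action": "s3:GetObject",
                "Resource": f"arn:aws:s3:::{bucket}/*",
            },
            {
                "Sid": "DenyInsecureTransport",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                "Resource": [f"arn:aws:s3:::{bucket}", f"arn:aws:s3:::{bucket}/*"],
                "Condition": {"Bool": {"aws:SecureTransport": "false"}},
            },
        ],
    }


if __name__ == "__main__":
    # Placeholder identifiers for illustration only.
    policy = read_only_policy("example-bucket", "arn:aws:iam::111122223333:role/example-reader")
    print(json.dumps(policy, indent=2))
```

Granting a single action to a single principal, then layering an explicit deny on top, is the pattern interviewers usually mean by "least privilege".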

EBS: Block Storage

Volume Types Comparison

Type | IOPS | Throughput | Use Case
gp3 | Up to 16,000 | 1,000 MB/s | General workloads, boot volumes
gp2 | 3 IOPS/GB, up to 16,000 (small volumes burst to 3,000) | 250 MB/s | Legacy, burstable workloads
io2 Block Express | 256,000 | 4,000 MB/s | Critical databases, SAP HANA
st1 | N/A (throughput-optimized HDD) | 500 MB/s | Big data, log processing
sc1 | N/A (cold HDD) | 250 MB/s | Cold data, infrequent access
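
The gp2 and gp3 rows differ mainly in how IOPS relate to volume size. As a rough sketch, assuming the published gp2 rule (3 IOPS per GiB, with a floor of 100 and a cap of 16,000, and bursting to 3,000 on smaller volumes), the baseline can be computed like this. The function names are illustrative.

```python
GP3_BASELINE_IOPS = 3_000  # gp3 baseline does not depend on volume size


def gp2_baseline_iops(size_gib: int) -> int:
    """gp2 baseline: 3 IOPS per GiB, floored at 100 and capped at 16,000."""
    return max(100, min(3 * size_gib, 16_000))


def gp2_can_burst(size_gib: int) -> bool:
    """Only volumes whose baseline is below 3,000 IOPS rely on burst credits."""
    return gp2_baseline_iops(size_gib) < 3_000


if __name__ == "__main__":
    for size in (20, 500, 1_000, 6_000):
        print(size, "GiB ->", gp2_baseline_iops(size), "baseline IOPS,",
              "bursts" if gp2_can_burst(size) else "no burst")
```

A 500 GiB gp2 volume has a 1,500 IOPS baseline, half of what any gp3 volume gets by default, which is the usual argument for migrating gp2 volumes to gp3.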

Interview Question: EBS vs Instance Store

Q: "When would you use instance store instead of EBS?"

A: Instance store (ephemeral) suits:

  • Temporary data: Scratch space, buffers, caches
  • High IOPS needs: Local NVMe avoids network hops; the largest i3en instance (i3en.24xlarge) is rated for up to 2 million random read IOPS
  • Cost sensitivity: No additional storage charges
  • Distributed systems: Where data is replicated elsewhere (Cassandra, Kafka)

Critical: Data is lost when the instance stops, hibernates, or terminates, or when the underlying disk fails (it survives a reboot). Never use for durable storage.

EBS Optimization Pattern

Database Tier:
  - Primary: io2 Block Express (up to 256K IOPS, Multi-Attach disabled)
  - Replicas: gp3 (16K IOPS, cost-optimized)

Application Tier:
  - Boot: gp3 (3,000 IOPS baseline)
  - Data: Based on workload

Archive:
  - EBS snapshots, moved to the EBS Snapshots Archive tier for long-term retention

EFS: Elastic File System

Performance Modes

Mode | Use Case | Latency
General Purpose | Web serving, CMS, containers | Low (sub-ms)
Max I/O | Big data, media processing | Higher (ms)

Throughput Modes

Mode | Behavior | Use Case
Bursting | Scales with storage size | Variable workloads
Provisioned | Fixed throughput | Consistent performance
Elastic | Auto-scales throughput | Unpredictable workloads
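
One way to reason about the throughput modes is to compare the steady throughput you need against what Bursting mode would give you for your storage size. The sketch below is a simple heuristic, not an AWS rule: it assumes a Bursting baseline of 50 MiB/s per TiB stored (50 KiB/s per GiB), and the function names are made up for illustration.

```python
BASELINE_MIBPS_PER_TIB = 50.0  # assumed Bursting-mode baseline per TiB stored


def bursting_baseline_mibps(storage_tib: float) -> float:
    """Baseline throughput Bursting mode provides for a given amount of stored data."""
    return storage_tib * BASELINE_MIBPS_PER_TIB


def suggest_throughput_mode(storage_tib: float, steady_mibps: float, predictable: bool) -> str:
    """Suggest a mode from the table above using a simple heuristic."""
    if not predictable:
        return "Elastic"      # unknown or spiky demand
    if steady_mibps <= bursting_baseline_mibps(storage_tib):
        return "Bursting"     # storage size already covers the steady need
    return "Provisioned"      # need more than storage size would earn


if __name__ == "__main__":
    print(suggest_throughput_mode(2.0, 80.0, True))   # 2 TiB stored, 80 MiB/s needed
    print(suggest_throughput_mode(0.1, 80.0, True))   # 100 GiB stored, 80 MiB/s needed
```

The second case is the classic trap: a small file system with a demanding workload exhausts its burst credits, so the throughput has to be provisioned or left to Elastic mode.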

Interview Question: EFS vs EBS

Q: "Your application runs on 5 EC2 instances and needs shared file storage. Compare EFS and EBS."

A:

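Factor | EFS | EBS Multi-Attach
Sharing | Thousands of instances | Up to 16 Nitro instances per io1/io2 volume
Protocol | NFS (POSIX-compliant) | Block-level
Availability scope | Multi-AZ (regional) | Single AZ
Use case | Shared content, CMS | Clustered databases
Cost | Higher (about $0.30/GB-month) | Lower (about $0.125/GB-month for io2, plus provisioned IOPS)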

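Recommendation: EFS for true shared filesystem needs; EBS Multi-Attach only for cluster applications that coordinate their own concurrent writes, since standard file systems such as ext4 or XFS are not safe to mount read-write from several instances at once.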

FSx: Managed File Systems

FSx Options

Service | File System | Use Case
FSx for Windows | NTFS | Windows workloads, AD integration
FSx for Lustre | Lustre | HPC, ML training, video rendering
FSx for NetApp ONTAP | ONTAP | Enterprise NAS replacement
FSx for OpenZFS | ZFS | Linux/NFS workloads

Interview Question: FSx for Lustre with S3

Q: "You need to process 10TB of data from S3 with high throughput for ML training. What architecture?"

A: Use FSx for Lustre with S3 integration:

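  1. Create an FSx for Lustre file system linked to the S3 bucket
  2. File metadata is imported up front; file contents are lazy-loaded from S3 on first access
  3. Throughput scales with provisioned capacity, reaching 100+ GB/s on large file systems
  4. Export results back to S3 (automatic export can be enabled on the S3 link)
  5. Delete the file system after processing (you pay only while it exists)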
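
Because Lustre throughput scales with provisioned capacity, it is worth estimating how long one pass over the dataset would take. The sketch below does that arithmetic; the 200 MB/s per TiB figure in the usage note is an assumed per-unit rate for illustration, since the real value depends on the deployment type you choose.

```python
def aggregate_throughput_mbps(capacity_tib: float, mbps_per_tib: float) -> float:
    """Total throughput of a file system whose throughput scales with capacity."""
    return capacity_tib * mbps_per_tib


def load_time_minutes(dataset_tb: float, throughput_mbps: float) -> float:
    """Minutes to stream the dataset once (1 TB counted as 1,000,000 MB)."""
    return dataset_tb * 1_000_000 / throughput_mbps / 60


if __name__ == "__main__":
    # Assumed: 12 TiB file system at 200 MB/s per TiB, 10 TB dataset.
    throughput = aggregate_throughput_mbps(12, 200)
    print(f"{throughput:.0f} MB/s, {load_time_minutes(10, throughput):.1f} minutes per pass")
```

With those assumed numbers the file system delivers 2,400 MB/s and a full pass over 10 TB takes roughly 69 minutes. If that is too slow for the training loop, provision more capacity or a higher per-TiB throughput tier.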

Storage Decision Framework

Object storage (any size, web-accessible)? → S3
Block storage (single instance)?
  ├── High IOPS (>16K)? → io2 Block Express
  ├── General workload? → gp3
  └── Throughput-intensive? → st1
Shared file system?
  ├── Windows? → FSx for Windows
  ├── HPC/ML? → FSx for Lustre
  └── General Linux? → EFS or FSx for OpenZFS
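
The decision tree above can also be written as a small function, which is a handy way to check that every branch has an answer. This is only a sketch of the framework in this lesson; the function name and parameters are invented for illustration.

```python
def choose_storage(kind: str, *, iops: int = 0, throughput_heavy: bool = False,
                   platform: str = "linux", hpc: bool = False) -> str:
    """Walk the decision framework and return the suggested service."""
    if kind == "object":
        return "S3"
    if kind == "block":
        if iops > 16_000:
            return "io2 Block Express"
        if throughput_heavy:
            return "st1"
        return "gp3"
    if kind == "file":
        if platform == "windows":
            return "FSx for Windows"
        if hpc:
            return "FSx for Lustre"
        return "EFS or FSx for OpenZFS"
    raise ValueError(f"unknown storage kind: {kind!r}")


if __name__ == "__main__":
    print(choose_storage("block", iops=40_000))
    print(choose_storage("file", hpc=True))
```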

Cost Tip: Consider S3 Intelligent-Tiering for data with unknown access patterns. It moves objects between access tiers automatically in exchange for a small per-object monitoring fee.

Next, we'll cover AWS networking fundamentals.
