flowchart TD A[[Temporary<br>Instance Store]] B[[Permanent<br>EBS]] C(Storage connected<br>to EC2) C --> A C --> B
05. Storage
👈 Back to: 📝 Blog | 💼 LinkedIn | ✍️ Medium
AWS Storage Types
Three storage types:
- Block storage: Splits data into fixed-size blocks with addresses → efficient random access; common for OS, DB volumes.
- File storage: Hierarchical directory tree (paths); each file has metadata (name, size, created date). More folders can add latency.
- Object storage: Flat namespace; each object = data + metadata + unique identifier; optimized for scale & throughput.
File Storage (What/When)
- Structure: Tree-like folders (like traditional file systems).
- Note: Every additional folder adds latency.
- Use cases: large content repositories, dev environments, user home directories.
Block Storage (What/Why)
- Concept: Files split into addressable blocks → efficient retrieval.
- Typically used where low-latency, frequent updates, or OS-level mount semantics are needed.

Object Storage (What/Why)
- Flat structure (no true hierarchy).
- Good for high throughput and massive scale.
- Updating part of an object generally means overwriting the whole object.
File vs Object (from your notes):
- File: hierarchy YES; low-latency R/W; partial edit implies overwrite file; example: Amazon EFS
- Object: flat; high throughput; partial edit implies overwrite object; example: Amazon S3
| File Storage | Object Storge |
|---|---|
| Hierarchy - YES; Folder or tree-like structure | Flat Structure. No hierarchy |
| Good for low-latency read-write | Good for high throughput |
| Edit a portion, you overwrite the whole file | Edit a portion, you overwrite the whole object |
| Amazon EFS | Amazon S3 |
EC2 Storage Options: Instance Store vs EBS
Two EC2 instance storage options
- Instance Store: ephemeral block storage; good for stateless workloads.
- EBS: persistent block storage; behaves like an external drive that can outlive the instance.
2.2 Attachment relationships (EC2 ↔︎ EBS)
1 EC2 to Many EBS Volumes
- An EBS volume (in the same AZ) can be detached from one EC2 and attached to another.
flowchart LR EC[EC2] EB1[[EBS 1]] EB2[[EBS 2]] EB3[[EBS 3]] EC --> EB1 EC --> EB2 EC --> EB3
1 EBS to 1 EC2 (typical)
flowchart LR EC[EC2] --> EB1[[EBS 1]]
1 EBS to Many EC2 (supported for some instances)
flowchart LR EB1[[EBS 1]] ECA[EC2 A] ECB[EC2 B] EB1 --> ECA EB1 --> ECB
Analogy/limits (from your notes):
- Like an external drive: if compute fails, data can still remain on EBS.
- Volumes have max size limits (scalability bounded per volume).
2.3 Scaling EBS volumes
Two main approaches:
- Increase volume size up to the maximum (16 TB per volume, per your notes).
- Attach multiple volumes to a single EC2 instance (one-to-many).
2.4 AMI types (Instance Store-backed vs EBS-backed
flowchart TD AMI(AMI) AMI1[[Instance Store<br>Backed AMI]] AMI2[[EBS-Volume<br>backed AMI<br>most common]] AMI --> AMI1 AMI --> AMI2
Key points:
- If an instance running on instance-store backed AMI is stopped, data is lost.
- Instance-store backed AMIs are useful for stateless apps.
- Reboot does not lose instance-store data (stop/hibernate/terminate does).
3) Latency vs Throughput (Choosing storage/perf tradeoffs)
Latency = time for one packet to reach destination (important for DB + web interactions)
flowchart LR W[Web Server] -- 1 packet sent<br> in 10 millisec --> C[Client]
Throughput = number of packets delivered per second (important for big data)
flowchart LR W[Web Server] -- 10 packets sent<br> in 1 sec --> C[Client]
EBS Volume Types (SSD vs HDD) + Fit-to-Workload
Rule of thumb (from your table):
- Provisioned IOPS SSD → very low latency (databases, payment systems)
- General Purpose SSD → low latency (web servers, general transactional workloads)
- Throughput Optimized HDD → very high throughput (big data)
- Cold HDD → infrequently accessed; can tolerate higher latency, still may need throughput for transfers
Key point: SSDs are faster and more expensive than HDDs.

EBS Snapshots (Backups)
- Incremental backups
- First snapshot stores full data
- Later snapshots store only changed blocks
Amazon S3 (Object Storage)
What S3 is
- S3 is object storage: flat structure, objects addressed via unique identifiers.
- Object = file + metadata (store as many as needed).
S3 URL/structure (image requested)

S3 Security
- Everything is private by default
- You can make buckets/folders/objects public, but typical best practice is granular access.
- Two main access controls:
- IAM policies
- S3 bucket policies
When to use S3 bucket policies (per your notes):
- Simple cross-account access without IAM roles
- IAM policy size limit constraints (bucket policies support larger size)
Bucket policies apply to buckets only, not folders/objects.
S3 Encryption
- Encryption in transit and at rest
- Server-side encryption: S3 encrypts before storing; decrypts on download
- Client-side encryption: you encrypt before upload and manage keys/tools
S3 Versioning
- Helps recover from accidental delete/overwrite
- Delete puts a delete marker (object not immediately removed); remove marker to restore
- Overwrite creates a new version; older versions remain accessible
Bucket states:
- Unversioned (default)
- Versioning-enabled
- Versioning-suspended (no new versions, old versions remain)
S3 Storage Classes (quick mapping)
- S3 Standard: frequent access; low latency/high throughput; higher cost; 11-nines durability
- S3 Intelligent-Tiering: unpredictable access; auto-moves between frequent/infrequent tiers; small overhead
- S3 Standard-IA: infrequent but rapid access; lower storage cost, higher retrieval cost
- S3 One Zone-IA: infrequent, single-AZ redundancy; cheaper; lower availability
- S3 Glacier: archival; minutes to hours retrieval; very low storage cost
- S3 Glacier Deep Archive: lowest cost; retrieval up to ~12 hours
- S3 Outposts: on-prem S3 for local residency/low latency needs
Lifecycle Management (Automate cost control)
Lifecycle policies can automate:
- Transition (move between storage classes)
- Expiration (permanent deletion)
Good candidates:
- Periodic logs (keep 1 week/month then delete)
- Data whose access frequency decreases over time (hot → warm → archive → delete)
Storage Services Recap
- EC2 Instance Store: ephemeral block storage; for stateless apps; persists through reboot, not through stop/hibernate/terminate.
- EBS: persistent; supports resizing + snapshots; SSD for I/O sensitive, HDD for throughput intensive.
- S3: object storage; pay-as-you-go; replicated across multiple AZs; not attached to compute.
- EFS / FSx: serverless file services; no upfront provisioning; pay for use.
Quiz Notes (Key Takeaways)
- Max single S3 object size: 5 TB (good for media/video hosting).
- EBS fits high-transaction relational DB storage layers.
- Instance store data persists on reboot, not on stop/hibernate/terminate.
- S3 Standard-IA vs Glacier Deep Archive:
- IA when rarely accessed but needs quick retrieval
- Deep Archive when rarely accessed and retrieval delay is acceptable (often for compliance/legal retention)
- Block storage is best when only a small portion of a file changes.
Resource link (as provided): Storage_quiz