Chapter 23: Kubernetes Storage - PersistentVolumes and StorageClasses
Table of Contents
- Introduction to Kubernetes Storage
- The Storage Challenge
- PersistentVolumes (PV)
- PersistentVolumeClaims (PVC)
- StorageClasses
- Using Storage in Pods
- Volume Snapshots
- Storage Best Practices
- Hands-on Lab
- Summary
Introduction to Kubernetes Storage
Container Storage is Ephemeral
By default, a container's filesystem is ephemeral. When a container restarts or is deleted, everything written to its writable layer is lost, which is a problem for any application that needs to persist data.

    CONTAINER STORAGE - EPHEMERAL

    Problem: container data is lost on restart

    Pod
      Container
        writes /data/file.txt ──► container filesystem (data in RAM or temp disk)

    Container restart ──► container filesystem is recreated

    Result: data lost!

Solution: Persistent Storage

    PERSISTENT STORAGE SOLUTION

    Solution: use PersistentVolumes

    Pod
      Container
        writes /data/file.txt ──► PersistentVolume (NFS, cloud storage)

    Container restart ──► the same PersistentVolume is mounted again

    Result: data persists!

The Storage Challenge
Storage in Containers vs VMs

    STORAGE CHALLENGES IN KUBERNETES

    Traditional storage (fixed storage mapping):

      Application (mounts /dev/sdb)
            │
            ▼
      Physical disk

    Kubernetes storage:

      Application ──► PVC ──► PV (PersistentVolume) ◄── Storage backend
                                                         - NFS
                                                         - AWS EBS
                                                         - GCP PD
                                                         - Ceph
                                                         - etc.

    Kubernetes abstracts storage away from applications.

PersistentVolumes (PV)
What is a PersistentVolume?
A PersistentVolume (PV) is a piece of storage in the cluster that has been provisioned either manually by an administrator or dynamically through a StorageClass. It exists independently of any individual Pod.
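As a rough sketch of an administrator-provisioned PV backed by network storage, the manifest below describes an NFS export. The PV name, server address, and export path are hypothetical placeholders, not values from this chapter; substitute the details of a real NFS server.

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: nfs-pv-example              # hypothetical name
spec:
  storageClassName: manual
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteMany                 # NFS can be mounted read-write by many nodes
  persistentVolumeReclaimPolicy: Retain
  nfs:
    server: nfs.example.internal    # hypothetical NFS server address
    path: /exports/data             # hypothetical export path
```

Because the PV is a cluster-scoped object, it has no namespace and stays in place when the Pods that use it come and go.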
PV YAML Example

    apiVersion: v1
    kind: PersistentVolume
    metadata:
      name: my-pv
      labels:
        type: local
    spec:
      storageClassName: manual
      capacity:
        storage: 10Gi
      accessModes:
        - ReadWriteOnce
      persistentVolumeReclaimPolicy: Retain
      # For local storage: a volume source is required alongside nodeAffinity
      local:
        path: /mnt/disks/vol1   # example path; must already exist on the node
      nodeAffinity:
        required:
          nodeSelectorTerms:
            - matchExpressions:
                - key: kubernetes.io/hostname
                  operator: In
                  values:
                    - node-01
      # Or use hostPath instead of local + nodeAffinity
      # hostPath:
      #   path: "/mnt/data"

PV Access Modes

    PERSISTENTVOLUME ACCESS MODES

    Access Mode     │ Abbreviation │ Description
    ────────────────┼──────────────┼───────────────────────────
    ReadWriteOnce   │ RWO          │ Single node read-write
    ReadOnlyMany    │ ROX          │ Multiple nodes read-only
    ReadWriteMany   │ RWX          │ Multiple nodes read-write

    Examples:

    ReadWriteOnce (RWO):  Node 1 can read/write; Node 2 cannot mount
    ReadOnlyMany (ROX):   Node 1 and Node 2 can both read
    ReadWriteMany (RWX):  Node 1 and Node 2 can both read/write

PV Reclaim Policy

    spec:
      # Options: Retain, Delete, Recycle (deprecated)
      persistentVolumeReclaimPolicy: Retain

- Retain: Manual reclamation; the data is preserved after the PVC is deleted
- Delete: Delete the PV and its backing storage when the PVC is deleted
- Recycle: Delete data and make available again (deprecated)
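The reclaim policy of an existing PV can also be changed after the fact. The fragment below is a minimal sketch of a patch file for that purpose, assuming a PV named my-pv as used in this chapter; the file name retain-patch.yaml is hypothetical.

```yaml
# retain-patch.yaml (hypothetical file name)
# One way to apply it: kubectl patch pv my-pv --patch-file retain-patch.yaml
spec:
  persistentVolumeReclaimPolicy: Retain
```

This matters most for dynamically provisioned volumes, which inherit their reclaim policy from the StorageClass and default to Delete when the class does not set one.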
PersistentVolumeClaims (PVC)
What is a PVC?
A PersistentVolumeClaim (PVC) is a request for storage by a user. It is similar to a Pod: Pods consume node resources (CPU, memory), while PVCs consume PV resources (a storage size and access modes).
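Beyond size and access modes, a claim can narrow the set of candidate PVs with a label selector. The sketch below assumes a PV labeled type: local with storageClassName manual already exists in the cluster; the claim name selective-pvc is hypothetical.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: selective-pvc          # hypothetical name
spec:
  storageClassName: manual     # assumed to match the PV's storageClassName
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
  selector:
    matchLabels:
      type: local              # only PVs carrying this label are candidates
```

A claim with a non-empty selector is only matched against existing PVs; it does not trigger dynamic provisioning.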
PVC YAML Example

    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: my-pvc
    spec:
      storageClassName: manual
      accessModes:
        - ReadWriteOnce
      resources:
        requests:
          storage: 5Gi

Binding PV to PVC

    PV AND PVC BINDING

    PVC request
      name: my-pvc
      storage: 5Gi
      storageClassName: manual
      accessModes: [ReadWriteOnce]
            │
            │  Kubernetes tries to find a matching PV
            ▼
    PV pool
      PV1: 10Gi, RWO, manual     ◄── MATCH!
      PV2: 20Gi, RWX, standard
      PV3: 5Gi,  RWO, standard
            │
            ▼
    Bound!
      PVC "my-pvc" is now bound to PV1
      The PV is no longer available for other PVCs

StorageClasses
What is a StorageClass?
A StorageClass provides a way for administrators to describe different “classes” of storage. Different classes might map to different quality-of-service levels, backup policies, or any other policy the cluster administrators define.
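One common use of a StorageClass is to act as the cluster default, so that claims which omit storageClassName still get a volume. The manifest below is a sketch under stated assumptions: it presumes the AWS EBS CSI driver is installed, and the class name standard-ssd is hypothetical.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: standard-ssd                     # hypothetical name
  annotations:
    # Marks this class as the default for PVCs without storageClassName
    storageclass.kubernetes.io/is-default-class: "true"
provisioner: ebs.csi.aws.com             # assumes the AWS EBS CSI driver is installed
parameters:
  type: gp3
reclaimPolicy: Delete
allowVolumeExpansion: true
volumeBindingMode: WaitForFirstConsumer  # delay provisioning until a Pod is scheduled
```

Only one class should carry the default annotation at a time; with several defaults, the outcome for claims without a class depends on the Kubernetes version.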
StorageClass Example

    apiVersion: storage.k8s.io/v1
    kind: StorageClass
    metadata:
      name: fast-storage
    # For GCE (legacy in-tree name; CSI-based clusters use pd.csi.storage.gke.io)
    provisioner: kubernetes.io/gce-pd
    parameters:
      type: pd-ssd
      replication-type: regional-pd
    reclaimPolicy: Retain
    allowVolumeExpansion: true
    volumeBindingMode: WaitForFirstConsumer

Cloud Provider StorageClasses

    STORAGE CLASS PROVISIONERS

    Storage backend │ Provisioner                   │ Example types
    ────────────────┼───────────────────────────────┼─────────────────────
    AWS EBS         │ kubernetes.io/aws-ebs         │ gp2, io1, st1
    GCP PD          │ kubernetes.io/gce-pd          │ pd-standard, pd-ssd
    Azure Disk      │ kubernetes.io/azure-disk      │ Standard_LRS
    Azure Files     │ kubernetes.io/azure-file      │
    NFS             │ nfs.provisioner.io            │
    Ceph RBD        │ ceph.com/rbd                  │
    CephFS          │ ceph.com/cephfs               │
    Local           │ kubernetes.io/no-provisioner  │

    Managed services (newer disk types):
      AWS:   gp3, io2 Block Express
      GCP:   pd-balanced, extreme
      Azure: premium-v2, standard-v2

The kubernetes.io/* cloud provisioners above are the legacy in-tree names. Current clusters typically use the matching CSI drivers instead, for example ebs.csi.aws.com, pd.csi.storage.gke.io, disk.csi.azure.com, and file.csi.azure.com.

Dynamic Provisioning

    DYNAMIC PROVISIONING

    Without StorageClass (static):
      1. Admin creates PVs in advance (PV1, PV2)
      2. User requests a PVC (PVC1)
      3. PVC1 is bound to one of the existing PVs

    With StorageClass (dynamic):
      1. Admin creates a StorageClass (name: fast, provisioner: aws-ebs, type: io1)
      2. User requests a PVC
      3. The provisioner creates a PV
      4. Result: auto-created PV + attached disk

Using Storage in Pods
Using PVC in Pod

    apiVersion: v1
    kind: Pod
    metadata:
      name: myapp-pod
    spec:
      containers:
        - name: myapp
          image: nginx:latest
          volumeMounts:
            - name: my-storage
              mountPath: /data
      volumes:
        - name: my-storage
          persistentVolumeClaim:
            claimName: my-pvc

Using emptyDir for Temporary Storage

    apiVersion: v1
    kind: Pod
    metadata:
      name: myapp-pod
    spec:
      containers:
        - name: myapp
          image: nginx:latest
          volumeMounts:
            - name: cache
              mountPath: /tmp/cache
            - name: shared-data
              mountPath: /usr/share/nginx/html
      volumes:
        - name: cache
          emptyDir:
            sizeLimit: 100Mi
        - name: shared-data
          emptyDir: {}

Volume Snapshots
Creating Volume Snapshots

    apiVersion: snapshot.storage.k8s.io/v1
    kind: VolumeSnapshotClass
    metadata:
      name: my-snapshot-class
    driver: pd.csi.storage.gke.io
    parameters:
      type: pd-standard
    deletionPolicy: Delete
    ---
    apiVersion: snapshot.storage.k8s.io/v1
    kind: VolumeSnapshot
    metadata:
      name: my-snapshot
    spec:
      volumeSnapshotClassName: my-snapshot-class
      source:
        persistentVolumeClaimName: my-pvc

Restoring from Snapshot

    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: restored-pvc
    spec:
      # The StorageClass should be served by the same CSI driver that took the snapshot
      storageClassName: fast-storage
      dataSource:
        name: my-snapshot
        kind: VolumeSnapshot
        apiGroup: snapshot.storage.k8s.io
      accessModes:
        - ReadWriteOnce
      resources:
        requests:
          storage: 5Gi

Storage Best Practices

    STORAGE BEST PRACTICES

    ✓ DO:
      • Use PVCs for all persistent data
      • Set appropriate storage requests
      • Use StorageClasses for dynamic provisioning
      • Enable volume snapshots for backup
      • Use ReadWriteMany (RWX) when multiple pods need access
      • Set proper access modes matching your needs

    ✗ DON'T:
      • Don't store data in container filesystem
      • Don't use hostPath in production
      • Don't use emptyDir for persistent data
      • Don't forget to set resource limits

    Storage Selection Guide:

    Workload          │ Recommended Storage
    ──────────────────┼────────────────────────
    Database (MySQL)  │ RWO, SSD (io1/gp3)
    Database (Mongo)  │ RWO, SSD (gp3)
    File Storage      │ RWX, NFS
    Cache/Temp        │ emptyDir, memory
    Logs              │ emptyDir or persistent

Hands-on Lab
Lab: Working with Persistent Storage
In this hands-on lab, we’ll create a PersistentVolume, claim it, and use it from a Pod.
Prerequisites
- A running Kubernetes cluster (minikube or kind)
Lab Steps

    # Step 1: Create a PersistentVolume (using hostPath for local testing)
    cat > pv.yaml << 'EOF'
    apiVersion: v1
    kind: PersistentVolume
    metadata:
      name: my-pv
      labels:
        type: local
    spec:
      storageClassName: manual
      capacity:
        storage: 1Gi
      accessModes:
        - ReadWriteOnce
      hostPath:
        path: "/mnt/data"
    EOF

    kubectl apply -f pv.yaml
    kubectl get pv

    # Step 2: Create a PersistentVolumeClaim
    cat > pvc.yaml << 'EOF'
    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: my-pvc
    spec:
      storageClassName: manual
      accessModes:
        - ReadWriteOnce
      resources:
        requests:
          storage: 500Mi
    EOF

    kubectl apply -f pvc.yaml
    kubectl get pvc

    # Step 3: Create a Pod using the PVC
    cat > pod-pvc.yaml << 'EOF'
    apiVersion: v1
    kind: Pod
    metadata:
      name: myapp-pod
    spec:
      containers:
        - name: myapp
          image: nginx:latest
          volumeMounts:
            - name: my-storage
              mountPath: /usr/share/nginx/html
      volumes:
        - name: my-storage
          persistentVolumeClaim:
            claimName: my-pvc
    EOF

    kubectl apply -f pod-pvc.yaml
    kubectl get pods
    # Wait until the container is running before using kubectl exec
    kubectl wait --for=condition=Ready pod/myapp-pod --timeout=120s

    # Step 4: Write data to the volume
    kubectl exec myapp-pod -- sh -c 'echo "Hello from Persistent Volume" > /usr/share/nginx/html/index.html'

    # Step 5: Delete and recreate pod to test persistence
    kubectl delete pod myapp-pod
    kubectl apply -f pod-pvc.yaml
    kubectl wait --for=condition=Ready pod/myapp-pod --timeout=120s

    # Step 6: Verify data persists
    kubectl exec myapp-pod -- cat /usr/share/nginx/html/index.html

    # Step 7: Clean up
    kubectl delete pod myapp-pod
    kubectl delete pvc my-pvc
    kubectl delete pv my-pv

Summary
Key Takeaways
- PersistentVolumes - Cluster-wide storage resources
- PersistentVolumeClaims - Requests for storage, made by users and consumed by Pods
- StorageClasses - Dynamic provisioning of storage
- Access Modes - RWO, ROX, RWX for different use cases
- Volume Snapshots - Backup and restore capabilities
Quick Reference

    # Create PV
    kubectl apply -f pv.yaml

    # Create PVC
    kubectl apply -f pvc.yaml

    # Use PVC in Pod
    # volumes:
    #   - name: my-storage
    #     persistentVolumeClaim:
    #       claimName: my-pvc

    # Get storage info
    kubectl get pv
    kubectl get pvc
    kubectl get storageclass

Next Steps
In the next chapter, we’ll explore Kubernetes Namespaces (Chapter 24), covering:
- Namespace isolation
- Resource quotas
- Network policies per namespace