Chapter 23: Kubernetes Storage - PersistentVolumes and StorageClasses

  1. Introduction to Kubernetes Storage
  2. The Storage Challenge
  3. PersistentVolumes (PV)
  4. PersistentVolumeClaims (PVC)
  5. StorageClasses
  6. Using Storage in Pods
  7. Volume Snapshots
  8. Storage Best Practices
  9. Hands-on Lab
  10. Summary

By default, containers have ephemeral storage. When a container restarts or is deleted, all data stored in the container’s filesystem is lost. This is a problem for applications that need to persist data.

┌─────────────────────────────────────────────────────────────────────────────┐
│ CONTAINER STORAGE - EPHEMERAL │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ Problem: Container data is lost on restart │
│ ──────────────────────────────────────── │
│ │
│ ┌─────────────────────────────────────────────────────────────────┐ │
│ │ POD │ │
│ │ │ │
│ │ ┌─────────────────────────┐ │ │
│ │ │ Container │ │ │
│ │ │ │ Write to /data │ │
│ │ │ /data/file.txt │──────────────────┐ │ │
│ │ │ │ │ │ │
│ │ │ Container Filesystem │ │ │ │
│ │ │ │ ▼ │ │
│ │ └─────────────────────────┘ ┌──────────────┐ │ │
│ │ │ Data in RAM │ │ │
│ │ Container Restart ───────────────►│ or Temp Disk │ │ │
│ │ └──────────────┘ │ │
│ │ │ │
│ └─────────────────────────────────────────────────────────────────┘ │
│ │
│ Result: Data Lost! │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────────────────┐
│ PERSISTENT STORAGE SOLUTION │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ Solution: Use PersistentVolumes │
│ │
│ ┌─────────────────────────────────────────────────────────────────┐ │
│ │ POD │ │
│ │ │ │
│ │ ┌─────────────────────────┐ │ │
│ │ │ Container │ │ │
│ │ │ │ Write to /data │ │
│ │ │ /data/file.txt │──────────────────┐ │ │
│ │ │ │ │ │ │
│ │ │ Container Filesystem │ │ │ │
│ │ │ │ ▼ │ │
│ │ └─────────────────────────┘ ┌──────────────┐ │ │
│ │ │ Persistent │ │ │
│ │ Container Restart ───────────────►│ Volume │ │ │
│ │ │ (NFS, Cloud│ │ │
│ │ │ Storage) │ │ │
│ │ └──────────────┘ │ │
│ │ │ │
│ └─────────────────────────────────────────────────────────────────┘ │
│ │
│ Result: Data Persists! │
│ │
└─────────────────────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────────────────────┐
│ STORAGE CHALLENGES IN KUBERNETES │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ Traditional Storage: │
│ ┌──────────────────┐ │
│ │ Application │ │
│ │ │ │
│ │ Mount /dev/sdb │ Fixed storage mapping │
│ │ │ │
│ └────────┬─────────┘ │
│ │ │
│ ▼ │
│ ┌──────────────────┐ │
│ │ Physical Disk │ │
│ └──────────────────┘ │
│ │
│ Kubernetes Storage: │
│ ┌──────────────────┐ │
│ │ Application │ │
│ │ │ │
│ │ PVC ──────────┼──────────────────────────────┐ │
│ │ │ │ │
│ └────────┬─────────┘ │ │
│ │ ▼ │
│ ┌────────┴─────────┐ ┌──────────────────┐ │
│ │ PV (Persistent │◄──────────────────│ Storage Backend │ │
│ │ Volume) │ │ - NFS │ │
│ │ │ │ - AWS EBS │ │
│ └──────────────────┘ │ - GCP PD │ │
│ │ - Ceph │ │
│ Kubernetes abstracts storage │ - etc. │ │
│ away from applications └──────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────────────────┘

A PersistentVolume (PV) is a piece of storage in the cluster that has been provisioned by an administrator or dynamically provisioned using StorageClasses. It exists independently of any individual Pod.

pv.yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: my-pv
  labels:
    type: local
spec:
  storageClassName: manual
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  # For local storage, the PV must be pinned to the node that holds the disk
  local:
    path: "/mnt/data"
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - node-01
  # Or, for testing only, use hostPath instead of local + nodeAffinity:
  # hostPath:
  #   path: "/mnt/data"
┌─────────────────────────────────────────────────────────────────────────────┐
│ PERSISTENTVOLUME ACCESS MODES │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ Access Mode │ Abbreviation │ Description │
│ ─────────────────┼──────────────┼───────────────────────────── │
│ ReadWriteOnce │ RWO │ Single node read-write │
│ ReadOnlyMany │ ROX │ Multiple nodes read-only │
│ ReadWriteMany │ RWX │ Multiple nodes read-write │
│ │
│ Examples: │
│ ┌─────────────────────────────────────────────────────────────────────┐ │
│ │ │ │
│ │ ReadWriteOnce (RWO): │ │
│ │ ┌──────────┐ │ │
│ │ │ Node 1 │ ◄── Can read/write │ │
│ │ │ PV │ │ │
│ │ │ │ ✗ Node 2 cannot mount │ │
│ │ └──────────┘ │ │
│ │ │ │
│ │ ReadOnlyMany (ROX): │ │
│ │ ┌──────────┐ │ │
│ │ │ Node 1 │ ◄── Can read │ │
│ │ │ PV │ ◄── Can read │ │
│ │ │ │ │ │
│ │ │ Node 2 │ │ │
│ │ └──────────┘ │ │
│ │ │ │
│ │ ReadWriteMany (RWX): │ │
│ │ ┌──────────┐ │ │
│ │ │ Node 1 │ ◄── Can read/write │ │
│ │ │ PV │ ◄── Can read/write │ │
│ │ │ │ │ │
│ │ │ Node 2 │ │ │
│ │ └──────────┘ │ │
│ │ │ │
│ └─────────────────────────────────────────────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
spec:
  # Options: Retain, Delete, Recycle (deprecated)
  persistentVolumeReclaimPolicy: Retain

  • Retain: Manual reclamation - the data is preserved after the PVC is deleted
  • Delete: Delete the PV and its underlying storage when the PVC is deleted
  • Recycle: Scrub the data and make the PV available again (deprecated)

A PersistentVolumeClaim (PVC) is a request for storage by a user. It is similar to a Pod: Pods consume node resources (CPU, RAM), while PVCs consume PV resources (storage size, IO).

pvc.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-pvc
spec:
  storageClassName: manual
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
┌─────────────────────────────────────────────────────────────────────────────┐
│ PV AND PVC BINDING │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ ┌─────────────────────────────────────────────────────────────────┐ │
│ │ PVC Request │ │
│ │ name: my-pvc │ │
│ │ storage: 5Gi │ │
│ │ storageClassName: manual │ │
│ │ accessModes: [ReadWriteOnce] │ │
│ └─────────────────────────────────────────────────────────────────┘ │
│ │ │
│ │ Kubernetes tries to find │
│ │ a matching PV │
│ ▼ │
│ ┌─────────────────────────────────────────────────────────────────┐ │
│ │ PV Pool │ │
│ │ │ │
│ │ PV1: 10Gi, RWO, manual ◄──── MATCH! │ │
│ │ PV2: 20Gi, RWX, standard │ │
│ │ PV3: 5Gi, RWO, standard │ │
│ │ │ │
│ └─────────────────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌─────────────────────────────────────────────────────────────────┐ │
│ │ Bound! │ │
│ │ PVC "my-pvc" is now bound to PV1 │ │
│ │ The PV is no longer available for other PVCs │ │
│ └─────────────────────────────────────────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────────────────┘

A StorageClass provides a way for administrators to describe different “classes” of storage. Different classes might map to different quality-of-service levels, backup policies, or arbitrary policies.

storage-class.yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-storage
provisioner: kubernetes.io/gce-pd  # For GCE
parameters:
  type: pd-ssd
  replication-type: regional-pd
reclaimPolicy: Retain
allowVolumeExpansion: true
volumeBindingMode: WaitForFirstConsumer
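The in-tree cloud provisioners such as `kubernetes.io/gce-pd` above have been superseded by CSI drivers in recent Kubernetes releases. As a sketch, an equivalent class using the AWS EBS CSI driver (assuming that driver is installed in your cluster) might look like:

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-csi
provisioner: ebs.csi.aws.com       # CSI driver, replaces kubernetes.io/aws-ebs
parameters:
  type: gp3                        # parameter names are driver-specific
  encrypted: "true"
reclaimPolicy: Delete
allowVolumeExpansion: true
volumeBindingMode: WaitForFirstConsumer
```

With WaitForFirstConsumer, volume creation is deferred until a Pod using the claim is scheduled, so the disk is provisioned in the same availability zone as that Pod's node.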
┌─────────────────────────────────────────────────────────────────────────────┐
│ STORAGE CLASS PROVISIONERS │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ Cloud Provider │ Provisioner │ Example │
│ ────────────────┼──────────────────────────────────┼──────────── │
│ AWS EBS │ kubernetes.io/aws-ebs │ gp2, io1, st1 │
│ GCP PD │ kubernetes.io/gce-pd │ pd-standard, pd-ssd│
│ Azure Disk │ kubernetes.io/azure-disk │ Standard_LRS │
│ Azure Files │ kubernetes.io/azure-file │ │
│ NFS │ nfs.provisioner.io │ │
│ Ceph RBD │ ceph.com/rbd │ │
│ CephFS │ ceph.com/cephfs │ │
│ Local │ kubernetes.io/no-provisioner │ │
│ │
│ Managed Services: │
│ ┌─────────────────────────────────────────────────────────────────┐ │
│ │ AWS: gp3, io2 Block Express │ │
│ │ GCP: pd-balanced, extreme │ │
│ │ Azure: premium-v2, standard-v2 │ │
│ └─────────────────────────────────────────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────────────────┐
│ DYNAMIC PROVISIONING │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ Without StorageClass With StorageClass │
│ ──────────────────────── ──────────────────────── │
│ │
│ Admin creates PVs Admin creates StorageClass │
│ ┌─────┐ ┌─────┐ ┌─────────────────────┐ │
│ │ PV1 │ │ PV2 │ │ StorageClass: fast │ │
│ └─────┘ └─────┘ │ Provisioner: aws-ebs│ │
│ ▲ ▲ │ Type: io1 │ │
│ │ │ └─────────────────────┘ │
│ User requests PVC │ │
│ ┌─────┐ ▼ │
│ │PVC1 │ User requests PVC │
│ └─────┘ │ │
│ ▼ │
│ ┌─────────────────────┐ │
│ │ PROVISIONER │ │
│ │ (creates PV) │ │
│ └──────────┬──────────┘ │
│ │ │
│ ▼ │
│ ┌─────────────────────┐ │
│ │ Auto-created PV │ │
│ │ + Attached Disk │ │
│ └─────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
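To trigger the dynamic path on the right-hand side, a PVC simply references a StorageClass by name; no pre-created PV is needed. A minimal sketch, assuming a `fast-storage` class like the one defined earlier exists in the cluster:

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: dynamic-pvc
spec:
  storageClassName: fast-storage   # the class's provisioner creates a matching PV
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 20Gi
```

Once bound, `kubectl get pv` will show an auto-created PV with a generated name of the form `pvc-<uid>`.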

pod-with-pvc.yaml
apiVersion: v1
kind: Pod
metadata:
  name: myapp-pod
spec:
  containers:
    - name: myapp
      image: nginx:latest
      volumeMounts:
        - name: my-storage
          mountPath: /data
  volumes:
    - name: my-storage
      persistentVolumeClaim:
        claimName: my-pvc
pod-with-emptydir.yaml
apiVersion: v1
kind: Pod
metadata:
  name: myapp-pod
spec:
  containers:
    - name: myapp
      image: nginx:latest
      volumeMounts:
        - name: cache
          mountPath: /tmp/cache
        - name: shared-data
          mountPath: /usr/share/nginx/html
  volumes:
    - name: cache
      emptyDir:
        sizeLimit: 100Mi
    - name: shared-data
      emptyDir: {}
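An emptyDir volume is normally backed by the node's disk. Setting `medium: Memory` backs it with tmpfs instead, which is useful for fast scratch space but counts against the Pod's memory. A sketch (the pod and volume names are illustrative):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: tmpfs-pod
spec:
  containers:
    - name: app
      image: nginx:latest
      volumeMounts:
        - name: scratch
          mountPath: /scratch
  volumes:
    - name: scratch
      emptyDir:
        medium: Memory   # tmpfs; usage counts toward the container's memory limit
        sizeLimit: 64Mi
```

Like any emptyDir, the contents are lost when the Pod is removed, so use it only for caches and temporary files.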

volume-snapshot-class.yaml
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
metadata:
  name: my-snapshot-class
driver: pd.csi.storage.gke.io
parameters:
  type: pd-standard
deletionPolicy: Delete
volume-snapshot.yaml
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: my-snapshot
spec:
  volumeSnapshotClassName: my-snapshot-class
  source:
    persistentVolumeClaimName: my-pvc
restore-pvc.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: restored-pvc
spec:
  storageClassName: fast-storage
  dataSource:
    name: my-snapshot
    kind: VolumeSnapshot
    apiGroup: snapshot.storage.k8s.io
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
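Besides restoring from a snapshot, a CSI driver that supports volume cloning can use an existing PVC directly as the `dataSource`. A sketch, assuming your driver supports cloning (not all do):

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: cloned-pvc
spec:
  storageClassName: fast-storage
  dataSource:
    name: my-pvc               # clone an existing claim in the same namespace
    kind: PersistentVolumeClaim
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi             # must be at least the source claim's size
```

Cloning skips the intermediate snapshot object, which makes it convenient for spinning up a test copy of a database volume.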

┌─────────────────────────────────────────────────────────────────────────────┐
│ STORAGE BEST PRACTICES │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ ✓ DO: │
│ ───────────────────────────────────────────────────────────────────── │
│ • Use PVCs for all persistent data │
│ • Set appropriate storage requests │
│ • Use StorageClasses for dynamic provisioning │
│ • Enable volume snapshots for backup │
│ • Use ReadWriteMany (RWX) when multiple pods need access │
│ • Set proper access modes matching your needs │
│ │
│ ✗ DON'T: │
│ ───────────────────────────────────────────────────────────────────── │
│ • Don't store data in container filesystem │
│ • Don't use hostPath in production │
│ • Don't use emptyDir for persistent data │
│ • Don't forget to set resource limits │
│ │
│ Storage Selection Guide: │
│ ──────────────────────── │
│ ┌─────────────────────────────────────────────────────────────────┐ │
│ │ Workload │ Recommended Storage │ │
│ │ ──────────────────────────────────────────────────────────── │ │
│ │ Database (MySQL) │ RWO, SSD (io1/gp3) │ │
│ │ Database (Mongo) │ RWO, SSD (gp3) │ │
│ │ File Storage │ RWX, NFS │ │
│ │ Cache/Temp │ emptyDir, memory │ │
│ │ Logs │ emptyDir or persistent │ │
│ └─────────────────────────────────────────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
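For workloads like the databases in the selection guide above, a StatefulSet can stamp out one PVC per replica via `volumeClaimTemplates`, so each pod keeps its own RWO volume across restarts and rescheduling. A hedged sketch (names and sizes are illustrative, and a real MySQL deployment would also need credentials configured):

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: db
spec:
  serviceName: db
  replicas: 2
  selector:
    matchLabels:
      app: db
  template:
    metadata:
      labels:
        app: db
    spec:
      containers:
        - name: mysql
          image: mysql:8.0
          volumeMounts:
            - name: data
              mountPath: /var/lib/mysql
  volumeClaimTemplates:        # creates claims data-db-0, data-db-1, ...
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 10Gi
```

Because every replica gets its own claim, the pods never need shared RWX storage, and deleting a pod does not delete its PVC.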

In this hands-on lab, we’ll create and use PersistentVolumes.

  • A running Kubernetes cluster (minikube or kind)
Terminal window
# Step 1: Create a PersistentVolume (using hostPath for local testing)
cat > pv.yaml << 'EOF'
apiVersion: v1
kind: PersistentVolume
metadata:
  name: my-pv
  labels:
    type: local
spec:
  storageClassName: manual
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: "/mnt/data"
EOF
kubectl apply -f pv.yaml
kubectl get pv

# Step 2: Create a PersistentVolumeClaim
cat > pvc.yaml << 'EOF'
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-pvc
spec:
  storageClassName: manual
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 500Mi
EOF
kubectl apply -f pvc.yaml
kubectl get pvc

# Step 3: Create a Pod using the PVC
cat > pod-pvc.yaml << 'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: myapp-pod
spec:
  containers:
    - name: myapp
      image: nginx:latest
      volumeMounts:
        - name: my-storage
          mountPath: /usr/share/nginx/html
  volumes:
    - name: my-storage
      persistentVolumeClaim:
        claimName: my-pvc
EOF
kubectl apply -f pod-pvc.yaml
kubectl get pods

# Step 4: Write data to the volume
kubectl exec myapp-pod -- sh -c 'echo "Hello from Persistent Volume" > /usr/share/nginx/html/index.html'

# Step 5: Delete and recreate the pod to test persistence
kubectl delete pod myapp-pod
kubectl apply -f pod-pvc.yaml

# Step 6: Verify the data persists
kubectl exec myapp-pod -- cat /usr/share/nginx/html/index.html

# Step 7: Clean up
kubectl delete pod myapp-pod
kubectl delete pvc my-pvc
kubectl delete pv my-pv

  1. PersistentVolumes - Cluster-wide storage resources
  2. PersistentVolumeClaims - Requests for storage by pods
  3. StorageClasses - Dynamic provisioning of storage
  4. Access Modes - RWO, ROX, RWX for different use cases
  5. Volume Snapshots - Backup and restore capabilities
Terminal window
# Create PV
kubectl apply -f pv.yaml

# Create PVC
kubectl apply -f pvc.yaml

# Use the PVC in a Pod spec:
# volumes:
#   - name: my-storage
#     persistentVolumeClaim:
#       claimName: my-pvc

# Get storage info
kubectl get pv
kubectl get pvc
kubectl get storageclass

In the next chapter, we’ll explore Kubernetes Namespaces (Chapter 24), covering:

  • Namespace isolation
  • Resource quotas
  • Network policies per namespace