Persistent Volumes and Claims: Making Kubernetes Work for Stateful Apps

Learn how Persistent Volumes, Persistent Volume Claims, and StorageClasses turn ephemeral Kubernetes clusters into platforms that can run databases, queues, and any stateful workload reliably.

June 2026 · 9 min read

Kubernetes isn’t built for stateful workloads by default. Its architecture assumes pods are ephemeral, cattle not pets. But the moment you run a database, a message queue, or any application that must survive a pod restart, you hit the storage wall. This article untangles how Persistent Volumes (PVs) and Persistent Volume Claims (PVCs) turn Kubernetes into a real platform for stateful apps.

Why Pods Forget Everything

Every pod starts with a clean slate. A container's writable filesystem is discarded whenever the container restarts, and everything in it is gone once the pod is deleted. For logging, caching, or any ephemeral data, that’s fine. But a PostgreSQL pod? Lose its data directory, lose the database.

Kubernetes solves this with a storage abstraction layer. You don’t attach a physical disk to a pod. Instead, you define what kind of storage you need and let the system figure out which physical resource to use.

Persistent Volumes: The Storage Backend

A Persistent Volume is a cluster-scoped resource that represents one piece of storage, either provisioned ahead of time by an administrator or created on demand. Its lifecycle is independent of any pod that uses it. PVs can be backed by NFS, cloud disks (AWS EBS, GCE PD), local SSDs, or distributed storage systems such as Ceph.

Example PV for a GCE Persistent Disk:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: my-pv
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
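  # In-tree GCE PD volume source. Recent Kubernetes releases deprecate it in
  # favor of the CSI driver, so treat this as an illustration of the PV shape.
  gcePersistentDisk:
    pdName: my-disk   # name of an existing disk
    fsType: ext4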

The accessModes field is critical: it declares how the volume may be mounted. ReadWriteOnce means a single node can mount it read-write (several pods on that node can still share it). ReadOnlyMany means many nodes can mount it read-only. ReadWriteMany means many nodes can mount it read-write, which requires shared storage such as NFS.
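As a rough illustration of the shared case, the sketch below declares an NFS-backed PV with ReadWriteMany. The PV name, server address, and export path are placeholders chosen for this example, not values from a real cluster.

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: shared-pv                    # hypothetical name
spec:
  capacity:
    storage: 50Gi
  accessModes:
    - ReadWriteMany                  # many nodes may mount it read-write
  persistentVolumeReclaimPolicy: Retain
  nfs:
    server: nfs.example.internal     # placeholder NFS server
    path: /exports/shared            # placeholder export path
```

Whether ReadWriteMany actually works depends on the backend: block devices such as a single cloud disk generally cannot offer it, while file-based backends like NFS can.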

Persistent Volume Claims: The Request

A PVC is a pod’s request for storage. You don’t reference the PV directly—you make a claim, and Kubernetes finds a matching PV or provisions one dynamically.

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-claim
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi

The claim is mounted into a pod like a volume:

apiVersion: v1
kind: Pod
metadata:
  name: my-pod
spec:
  containers:
  - name: my-container
    image: alpine
    # Keep the container running; alpine's default shell exits immediately.
    command: ["sleep", "3600"]
    volumeMounts:
    - mountPath: /data
      name: my-storage
  volumes:
  - name: my-storage
    persistentVolumeClaim:
      claimName: my-claim

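Kubernetes binds the claim to a PV that offers at least 5Gi, a matching access mode, and the same storage class. Binding is one-to-one, so a claim always gets the whole volume: if the 5Gi claim above binds to the 10Gi my-pv, it gets all 10Gi. If no suitable PV exists and dynamic provisioning is configured (via a StorageClass), Kubernetes creates one on the fly.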
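When you want a claim to land on one specific pre-created PV rather than whatever matches, one option is to pin it. The sketch below assumes the my-pv volume from the earlier example and uses a hypothetical claim name. The empty storageClassName asks Kubernetes not to apply a default StorageClass or provision a new disk.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-pinned-claim      # hypothetical name
spec:
  storageClassName: ""       # empty string: no default class, no dynamic provisioning
  volumeName: my-pv          # bind to this PV specifically
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
```

The claim stays Pending if my-pv is already bound to another claim or does not satisfy the requested size and access mode.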

Dynamic Provisioning: The Game Changer

Manually creating PVs for every pod doesn’t scale. That’s where StorageClasses come in. A StorageClass defines a provisioner—a plugin that creates storage on demand.

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-ssd
# In-tree provisioner name; newer clusters typically use a CSI driver instead.
provisioner: kubernetes.io/gce-pd
parameters:
  type: pd-ssd
  replication-type: none

Now a PVC that sets storageClassName: fast-ssd automatically gets a GCE PD-SSD provisioned for it, and the disk is attached to whichever node runs the pod that uses the claim. No manual disk creation.
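A claim against that class might look like the following minimal sketch. The claim name and the 20Gi size are arbitrary choices for illustration.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: fast-claim             # hypothetical name
spec:
  storageClassName: fast-ssd   # selects the StorageClass defined above
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 20Gi            # size of the disk to provision
```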

Most cloud clusters have a default StorageClass already defined. Check with kubectl get storageclass.
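If your cluster has no default, or you want a different one, a StorageClass can be marked as default with an annotation. The sketch below is an assumption-laden example: it uses the GCE PD CSI driver name (pd.csi.storage.gke.io) rather than the in-tree provisioner shown earlier, and the class name is made up. Confirm which provisioner your cluster actually runs before copying it.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: default-ssd                                       # hypothetical name
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"   # used by PVCs that name no class
provisioner: pd.csi.storage.gke.io                        # assumes the GCE PD CSI driver is installed
parameters:
  type: pd-ssd
reclaimPolicy: Delete                                     # delete the disk when the PVC is deleted
volumeBindingMode: WaitForFirstConsumer                   # provision only once a pod is scheduled
allowVolumeExpansion: true                                # allow PVCs to be resized later
```

WaitForFirstConsumer delays provisioning until a pod needs the volume, which helps the disk end up in the same zone as the node that will mount it.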

StatefulSets: Managing Identity for Stateful Workloads

Deployments are great for stateless apps. But stateful workloads often need stable network identities and ordered scaling. That’s the job of a StatefulSet.

A StatefulSet creates pods with predictable names (e.g., mysql-0, mysql-1, mysql-2). Each pod can have its own PVC, which follows it even if the pod is rescheduled. The PVCs are retained when the pod dies, so the new pod picks up the same data.

Example StatefulSet with storage:

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: web
spec:
  serviceName: "nginx"
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx
        volumeMounts:
        - name: www
          mountPath: /usr/share/nginx/html
  volumeClaimTemplates:
  - metadata:
      name: www
    spec:
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 1Gi

Notice volumeClaimTemplates. This is the magic—each replica gets its own unique PVC named www-web-0, www-web-1, etc. If web-0 is deleted and recreated, it reclaims the same PVC.
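The serviceName: "nginx" field in the StatefulSet refers to a headless Service that must exist for the pods to get stable DNS names. A minimal sketch of that Service, assuming the same app: nginx label and port 80, could be:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: nginx          # must match serviceName in the StatefulSet
  labels:
    app: nginx
spec:
  clusterIP: None      # headless: DNS resolves to individual pod IPs
  selector:
    app: nginx
  ports:
    - name: web
      port: 80
```

With this in place, each replica is reachable by a per-pod name such as web-0.nginx within the namespace.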

Common Pitfalls and Practical Tips

  • Access mode mismatch: A PVC with ReadWriteOnce can’t bind to a PV that only supports ReadOnlyMany. Always match.
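  • Reclaim policy: When you delete a PVC, what happens to the PV? It depends on the PV's reclaim policy. Manually created PVs default to Retain, meaning the disk sticks around (and you pay for it). Dynamically provisioned PVs inherit the policy of their StorageClass, which defaults to Delete, so deleting the claim also deletes the disk and its data. Use Delete for disposable workloads and Retain for data you can't afford to lose.
  • Local storage: Using hostPath volumes is tempting for testing, but the data lives on one node's filesystem. If the pod is rescheduled to another node, it finds an empty directory. Use it only for development.
  • StatefulSet scaling: Adding replicas creates new PVCs, but removing replicas does not delete PVCs by default. Clean them up manually or, on Kubernetes versions that support it, set the StatefulSet's persistentVolumeClaimRetentionPolicy.
  • Performance isolation: On cloud disks, performance often scales with disk size. A 10Gi gp2 EBS volume has a lower baseline IOPS than a 100Gi one. Right-size early.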
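Two of the pitfalls above come down to a single field each. The sketch below shows both under stated assumptions: a StorageClass with a hypothetical name that keeps disks after their claims are deleted, and a variant of the web StatefulSet that removes PVCs on scale-down. The persistentVolumeClaimRetentionPolicy field is only honored on Kubernetes versions where that feature is available, so check your cluster first.

```yaml
# StorageClass whose dynamically provisioned PVs survive PVC deletion.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-ssd-retain            # hypothetical name
provisioner: kubernetes.io/gce-pd  # same provisioner as the earlier example
parameters:
  type: pd-ssd
reclaimPolicy: Retain              # default for StorageClasses is Delete
---
# StatefulSet that cleans up PVCs when scaled down but keeps them on deletion.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: web
spec:
  serviceName: "nginx"
  replicas: 3
  persistentVolumeClaimRetentionPolicy:
    whenScaled: Delete             # remove the PVC of a replica that is scaled away
    whenDeleted: Retain            # keep PVCs if the StatefulSet itself is deleted
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx
        volumeMounts:
        - name: www
          mountPath: /usr/share/nginx/html
  volumeClaimTemplates:
  - metadata:
      name: www
    spec:
      storageClassName: fast-ssd-retain
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 1Gi
```

Note the interaction: with reclaimPolicy: Retain, deleting a PVC on scale-down still leaves the underlying disk in place until someone removes the released PV.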

When to Use What

Workload Type                    Best Approach
Stateless web app                Deployment, no PVCs
Cache (Redis, Memcached)         Deployment with emptyDir or local SSDs
Database (PostgreSQL, MySQL)     StatefulSet with dynamic provisioning
Shared storage (logs, assets)    PVC with ReadWriteMany (NFS, EFS)
CI/CD artifacts                  PVC from a StorageClass with the Delete reclaim policy
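For the cache row, a minimal sketch of a Deployment backed by emptyDir might look like this. The names and the 1Gi limit are illustrative assumptions. The scratch volume lives and dies with the pod, which is acceptable for data that can be rebuilt.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cache                  # hypothetical name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: cache
  template:
    metadata:
      labels:
        app: cache
    spec:
      containers:
      - name: redis
        image: redis
        volumeMounts:
        - name: scratch
          mountPath: /data     # working directory of the official redis image
      volumes:
      - name: scratch
        emptyDir:
          sizeLimit: 1Gi       # pod is evicted if usage exceeds this limit
```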

The Big Picture

Kubernetes storage isn’t magic—it’s a well-designed abstraction over real infrastructure. PVs and PVCs decouple storage management from pod lifecycle. StatefulSets add identity and ordering. Dynamic provisioning removes manual overhead.

Once you understand these pieces, running even a production database on Kubernetes becomes not just possible, but practical. You control where data lives, how it persists, and how it scales. And that’s the point of containers: not just to run code, but to run everything with the same operational discipline.
