CKA Hands-on Lab Exercises¶
These labs simulate real CKA exam scenarios. Practice these exercises to build hands-on skills for the performance-based exam.
Prerequisites¶
- Kubernetes cluster (kubeadm, minikube, or kind)
- kubectl configured
- Root/sudo access for cluster operations
Lab 1: Cluster Installation with kubeadm¶
Objective: Install a Kubernetes cluster using kubeadm
Tasks¶
- Initialize control plane:
# On control plane node
sudo kubeadm init --pod-network-cidr=10.244.0.0/16
# Set up kubeconfig
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
- Install CNI (Calico):
kubectl apply -f https://raw.githubusercontent.com/projectcalico/calico/v3.26.0/manifests/calico.yaml
- Join worker nodes:
# On worker nodes (use token from kubeadm init output)
sudo kubeadm join <control-plane-ip>:6443 --token <token> \
--discovery-token-ca-cert-hash sha256:<hash>
- Verify cluster:
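The commands for this step are missing here. A minimal sketch, assuming the usual checks are `kubectl get nodes` and `kubectl get pods -n kube-system`; `not_ready_nodes` and `unhealthy_system_pods` are hypothetical helper names that filter those listings and rely on kubectl's default column layout:

```shell
# Basic checks (run these directly):
#   kubectl get nodes -o wide
#   kubectl get pods -n kube-system

# Hypothetical helper: print nodes whose STATUS column is not exactly "Ready".
# A cordoned node ("Ready,SchedulingDisabled") is reported as well.
not_ready_nodes() {
  kubectl get nodes --no-headers | awk '$2 != "Ready" {print $1}'
}

# Hypothetical helper: print kube-system pods that are neither Running nor Completed.
unhealthy_system_pods() {
  kubectl get pods -n kube-system --no-headers \
    | awk '$3 != "Running" && $3 != "Completed" {print $1}'
}
```

Both helpers print nothing when every node and system pod is healthy.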
Expected Outcome
- Control plane initialized successfully
- CNI installed and running
- Worker nodes joined and Ready
- All system pods running
Lab 2: ETCD Backup and Restore¶
Objective: Backup and restore etcd cluster data
Tasks¶
- Create test data:
kubectl create namespace backup-test
kubectl create deployment nginx --image=nginx -n backup-test
kubectl get all -n backup-test
- Backup etcd:
# Find etcd pod and get certs location
kubectl describe pod -n kube-system etcd-<node-name>
# Backup etcd (root is needed to read the etcd client key)
sudo ETCDCTL_API=3 etcdctl snapshot save /tmp/etcd-backup.db \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key
# Verify backup
sudo ETCDCTL_API=3 etcdctl snapshot status /tmp/etcd-backup.db --write-out=table
- Simulate data loss:
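The commands for this step are missing here. A minimal sketch, assuming the intended data loss is deleting the `backup-test` namespace created in the first task (which also removes the nginx deployment inside it); `simulate_data_loss` is a hypothetical helper name:

```shell
# Hypothetical helper: delete the test namespace, then confirm it is gone.
simulate_data_loss() {
  kubectl delete namespace backup-test || return 1
  if kubectl get namespace backup-test >/dev/null 2>&1; then
    echo "backup-test still exists" >&2
    return 1
  fi
  echo "backup-test deleted"
}
```

Running the single command `kubectl delete namespace backup-test` by hand has the same effect; the wrapper only adds the confirmation.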
- Restore etcd:
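# Stop kube-apiserver by moving its static pod manifest out of the manifests directory
sudo mv /etc/kubernetes/manifests/kube-apiserver.yaml /tmp/
# Restore the snapshot into a new data directory (root is needed to write under /var/lib)
sudo ETCDCTL_API=3 etcdctl snapshot restore /tmp/etcd-backup.db \
  --data-dir=/var/lib/etcd-restored
# Point the etcd manifest at the new data directory. This rewrites every /var/lib/etcd
# occurrence (the --data-dir flag, the volume mount, and the hostPath), so run it only once
sudo sed -i 's|/var/lib/etcd|/var/lib/etcd-restored|g' /etc/kubernetes/manifests/etcd.yaml
# Restore kube-apiserver
sudo mv /tmp/kube-apiserver.yaml /etc/kubernetes/manifests/
# Verify restoration (the static pods can take a minute or two to restart)
kubectl get namespace backup-test
kubectl get deployment -n backup-test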
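After the manifests change, the kubelet restarts the etcd and kube-apiserver static pods, so the verification commands above can fail for a short while. A minimal polling sketch, assuming the API server exposes its `/readyz` endpoint through `kubectl get --raw`; `wait_for_apiserver` and its two arguments (timeout and interval in seconds) are hypothetical:

```shell
# Hypothetical helper: poll /readyz until the API server answers or the timeout expires.
# Usage: wait_for_apiserver [timeout_seconds] [interval_seconds]
wait_for_apiserver() {
  timeout="${1:-120}"
  interval="${2:-5}"
  elapsed=0
  until kubectl get --raw='/readyz' >/dev/null 2>&1; do
    if [ "$elapsed" -ge "$timeout" ]; then
      echo "API server not ready after ${timeout}s" >&2
      return 1
    fi
    sleep "$interval"
    elapsed=$((elapsed + interval))
  done
  echo "API server is ready"
}
```

For example, `wait_for_apiserver 180 5 && kubectl get namespace backup-test` waits up to three minutes before checking for the restored namespace.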
Expected Outcome
- Backup created successfully
- After restore, backup-test namespace and deployment exist again
Lab 3: Cluster Upgrade¶
Objective: Upgrade Kubernetes cluster version
Tasks¶
- Check current version:
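The commands for this step are missing here. A minimal sketch, assuming the versions of interest are the installed kubeadm and kubelet binaries plus the version each node reports; `show_versions` is a hypothetical helper name:

```shell
# Hypothetical helper: print the versions that matter before an upgrade.
show_versions() {
  echo "kubeadm: $(kubeadm version -o short)"
  echo "kubelet: $(kubelet --version)"
  # The VERSION column shows the kubelet version reported by each node
  kubectl get nodes
}
```

Run it on the control plane node; the kubeadm and kubelet lines describe only the node it runs on, while `kubectl get nodes` covers the whole cluster.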
- Upgrade control plane:
# Update the package index and list the available kubeadm versions
sudo apt update
sudo apt-cache madison kubeadm
# Upgrade kubeadm. The revision suffix depends on the package repository
# (for example 1.28.0-1.1 on pkgs.k8s.io), so match it with a wildcard
sudo apt-mark unhold kubeadm
sudo apt-get install -y kubeadm='1.28.0-*'
sudo apt-mark hold kubeadm
# Plan upgrade
sudo kubeadm upgrade plan
# Apply upgrade
sudo kubeadm upgrade apply v1.28.0
# Upgrade kubelet and kubectl
sudo apt-mark unhold kubelet kubectl
sudo apt-get install -y kubelet='1.28.0-*' kubectl='1.28.0-*'
sudo apt-mark hold kubelet kubectl
sudo systemctl daemon-reload
sudo systemctl restart kubelet
- Upgrade worker nodes:
# Drain node
kubectl drain <worker-node> --ignore-daemonsets --delete-emptydir-data
# On worker node: upgrade kubeadm, run `sudo kubeadm upgrade node`, then upgrade kubelet and kubectl and restart kubelet
# Then uncordon
kubectl uncordon <worker-node>
- Verify upgrade:
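The commands for this step are missing here. A minimal sketch, assuming the check is that every node's kubelet reports the target version; `nodes_not_at_version` is a hypothetical helper name, and `v1.28.0` in the usage note is the target version used earlier in this lab:

```shell
# Hypothetical helper: print the nodes whose kubelet version differs from the one given.
# Usage: nodes_not_at_version v1.28.0
nodes_not_at_version() {
  want="$1"
  kubectl get nodes \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.nodeInfo.kubeletVersion}{"\n"}{end}' \
    | awk -F '\t' -v want="$want" '$2 != want {print $1}'
}
```

`nodes_not_at_version v1.28.0` prints nothing once the upgrade has reached every node. A plain `kubectl get nodes` shows the same information in the VERSION column, along with each node's Ready status.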
Expected Outcome
- All nodes upgraded to new version
- Cluster functioning normally after upgrade
Lab 4: Troubleshooting Broken Cluster¶
Objective: Diagnose and fix cluster issues
Scenario 1: Kubelet Not Running¶
# Check kubelet status
sudo systemctl status kubelet
# Check logs
sudo journalctl -u kubelet -f
# Common fixes:
sudo systemctl start kubelet
sudo systemctl enable kubelet
Scenario 2: API Server Not Responding¶
# Check the API server container (-a also lists exited containers, which is what a crashed API server leaves behind)
sudo crictl ps -a | grep kube-apiserver
# Check manifest
sudo cat /etc/kubernetes/manifests/kube-apiserver.yaml
# Check logs
sudo crictl logs <container-id>
Scenario 3: Node NotReady¶
# Check node conditions
kubectl describe node <node-name>
# Check kubelet on node
ssh <node> "sudo systemctl status kubelet"
# Check CNI
kubectl get pods -n kube-system | grep -E "calico|flannel|weave"
Scenario 4: Pods Pending¶
# Check pod events
kubectl describe pod <pod-name>
# Check scheduler
kubectl get pods -n kube-system | grep scheduler
# Check node resources
kubectl describe nodes | grep -A5 "Allocated resources"
Troubleshooting Checklist
1. Check component status: `kubectl get componentstatuses` (deprecated since v1.19, so treat its output as a hint only)
2. Check node status: `kubectl get nodes`
3. Check system pods: `kubectl get pods -n kube-system`
4. Check kubelet: `systemctl status kubelet`
5. Check logs: `journalctl -u kubelet`
Lab 5: Workload Scheduling¶
Objective: Control pod placement using various scheduling methods
Tasks¶
- Node Selector:
# Label node
kubectl label node <node-name> disktype=ssd
# Create pod with nodeSelector
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: nginx-ssd
spec:
  nodeSelector:
    disktype: ssd
  containers:
    - name: nginx
      image: nginx
EOF
kubectl get pod nginx-ssd -o wide
- Node Affinity:
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: nginx-affinity
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: disktype
                operator: In
                values:
                  - ssd
  containers:
    - name: nginx
      image: nginx
EOF
- Taints and Tolerations:
# Taint a node
kubectl taint nodes <node-name> dedicated=special:NoSchedule
# Create pod with toleration
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: nginx-toleration
spec:
  tolerations:
    - key: "dedicated"
      operator: "Equal"
      value: "special"
      effect: "NoSchedule"
  containers:
    - name: nginx
      image: nginx
EOF
- Cleanup:
kubectl delete pod nginx-ssd nginx-affinity nginx-toleration
kubectl taint nodes <node-name> dedicated-
kubectl label node <node-name> disktype-
Expected Outcome
- Pods scheduled only on nodes matching criteria
- Taints prevent scheduling unless tolerated
Lab 6: Services and Networking¶
Objective: Configure various service types and network policies
Tasks¶
- Create deployment and services:
kubectl create deployment web --image=nginx --replicas=3
# ClusterIP
kubectl expose deployment web --port=80 --name=web-clusterip
# NodePort
kubectl expose deployment web --port=80 --type=NodePort --name=web-nodeport
# Verify
kubectl get svc
- Create NetworkPolicy:
# Default-deny for ingress in the default namespace. Egress is left open on purpose:
# listing Egress here would also block the test pods' DNS lookups and outbound requests
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: deny-all
  namespace: default
spec:
  podSelector: {}
  policyTypes:
    - Ingress
---
# Allow ingress to the web pods on TCP 80 from pods labeled access=allowed
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-web
  namespace: default
spec:
  podSelector:
    matchLabels:
      app: web
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              access: allowed
      ports:
        - protocol: TCP
          port: 80
- Test NetworkPolicy:
# Requires a CNI plugin that enforces NetworkPolicy (Calico does)
# This should fail (no label)
kubectl run test --image=busybox --rm -it --restart=Never -- wget -qO- --timeout=2 http://web-clusterip
# This should work (has label)
kubectl run test --image=busybox --rm -it --restart=Never --labels="access=allowed" -- wget -qO- http://web-clusterip
- Cleanup:
kubectl delete deployment web
kubectl delete svc web-clusterip web-nodeport
kubectl delete networkpolicy deny-all allow-web
Lab 7: Storage - PV and PVC¶
Objective: Configure persistent storage
Tasks¶
- Create PersistentVolume:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-hostpath
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: /tmp/pv-data
  persistentVolumeReclaimPolicy: Retain
- Create PersistentVolumeClaim:
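apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-hostpath
spec:
  # An empty storageClassName binds the claim to the pre-created PV instead of
  # triggering dynamic provisioning on clusters that have a default StorageClass
  storageClassName: ""
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 500Mi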
- Use PVC in Pod:
apiVersion: v1
kind: Pod
metadata:
  name: pv-pod
spec:
  containers:
    - name: app
      image: nginx
      volumeMounts:
        - mountPath: /data
          name: storage
  volumes:
    - name: storage
      persistentVolumeClaim:
        claimName: pvc-hostpath
- Verify:
kubectl get pv,pvc
kubectl exec pv-pod -- ls /data
kubectl exec pv-pod -- sh -c "echo 'test data' > /data/test.txt"
kubectl exec pv-pod -- cat /data/test.txt
- Cleanup:
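The commands for this step are missing here. A minimal sketch, assuming cleanup means removing the three objects created in this lab; `cleanup_storage_lab` is a hypothetical helper name. The order matters: the claim cannot finish deleting while the pod still uses it.

```shell
# Hypothetical helper: remove the lab's pod, claim, and volume, in that order.
cleanup_storage_lab() {
  kubectl delete pod pv-pod --ignore-not-found
  kubectl delete pvc pvc-hostpath --ignore-not-found
  kubectl delete pv pv-hostpath --ignore-not-found
}
```

Because the PersistentVolume uses the Retain reclaim policy, the files written under /tmp/pv-data stay on the node after these objects are gone and have to be removed by hand if they are no longer wanted.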
Additional Practice Resources¶
- Killer.sh CKA Simulator (included with exam)
- Killercoda CKA Scenarios
- Kubernetes the Hard Way