Kubernetes Troubleshooting: Orphaned Pod Found - but Volume Paths Are Still Present on Disk

Problem Description

The system log (/var/log/messages) is flooded with "orphaned pod found" errors:

Apr 27 18:42:50 VM-47-99-centos kubelet: E0427 18:42:50.416097 1088010 kubelet_volumes.go:154] orphaned pod "bfadefa6-d6ed-4727-a882-62119485b426" found, but volume paths are still present on disk : There were a total of 1 errors similar to this. Turn up verbosity to see them.

Root Cause Analysis

Related references:
https://github.com/kubernetes/kubernetes/issues/60987
https://github.com/kubernetes/kubernetes/pull/68616

The key explanation from these references:

While meet Orphan Pod, kubelet will clean up it and its directorys (cleanupOrphanedPodDirs);
But if there are mount path in the directorys, the clean action will be skipped.

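In other words, when kubelet finds an orphaned Pod, it automatically cleans up that Pod's directory on the host. However, if a volume is still mounted under that directory, kubelet skips the cleanup and logs the error above instead.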
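To see what kubelet is complaining about for a given Pod, the leftover volume directories can be listed by hand. The sketch below assumes the default kubelet directory layout (<root>/pods/<pod-uid>/volumes/<plugin>/<volume-name>, with /var/lib/kubelet as the usual root); list_volume_paths is a hypothetical helper name, not a kubelet or kubectl command.

```shell
#!/bin/sh
# list_volume_paths ROOT POD_UID (hypothetical helper)
# Print every volumes/<plugin>/<volume-name> directory that still exists
# under the pod directory. ROOT is the kubelet root dir (assumed layout).
list_volume_paths() {
  root=$1
  pod_uid=$2
  for p in "$root/pods/$pod_uid"/volumes/*/*; do
    # If the glob matches nothing, $p is the literal pattern and -d fails.
    if [ -d "$p" ]; then
      echo "$p"
    fi
  done
  return 0
}
```

For example, calling list_volume_paths /var/lib/kubelet bfadefa6-d6ed-4727-a882-62119485b426 prints one line per remaining volume directory. Empty output means no directories of this kind remain for that Pod.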

Source code:

# kubernetes/pkg/volume/util/util.go

 notMnt, mntErr := mounter.IsLikelyNotMountPoint(mountPath) 
 if mntErr != nil { 
 	return mntErr 
 } 
 if notMnt { 
 	glog.V(4).Infof("%q is unmounted, deleting the directory", mountPath) 
 	return os.Remove(mountPath) 
 } 
 return fmt.Errorf("Failed to unmount path %v", mountPath) 
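
The snippet hinges on one question: is the path still a mount point? A rough way to ask the same question from a shell is sketched below. It assumes a Linux host and looks the path up in the mount table (/proc/mounts by default) instead of reproducing kubelet's own IsLikelyNotMountPoint heuristic; is_mounted is a hypothetical helper name.

```shell
#!/bin/sh
# is_mounted PATH [MOUNTS_FILE] (hypothetical helper)
# Succeed (exit 0) if PATH appears as a mount point, i.e. as the second
# field of a line in the mount table. MOUNTS_FILE defaults to /proc/mounts.
is_mounted() {
  awk -v target="$1" '
    $2 == target { found = 1 }
    END { exit !found }
  ' "${2:-/proc/mounts}"
}
```

A path for which is_mounted succeeds should be unmounted with umount before its directory is removed. The optional second argument exists only so the helper can be tried against a saved copy of the mount table.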

Solution

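#!/bin/sh

# Collect the unique orphaned pod UIDs that kubelet reported in the system log.
# sort -u is used because uniq alone only collapses adjacent duplicates.
orphanedPods=$(grep 'orphaned pod' /var/log/messages | awk -F '"' '{print $2}' | sort -u)
orphanedPodsNum=$(echo "$orphanedPods" | grep -c .)
printf 'orphanedPods: %s\n%s\n' "$orphanedPodsNum" "$orphanedPods"

for i in $orphanedPods
do
  podDir="/var/lib/kubelet/pods/$i"
  # Skip pods with an active mount: rm -rf would also delete the data on the mounted volume.
  if grep -qF "$podDir/" /proc/mounts; then
    echo "Skipping $i: something is still mounted under $podDir, unmount it first"
    continue
  fi
  echo "Deleting orphaned pod id: $i"
  rm -rf "$podDir"
done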