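Adapted from the book 《Kubernetes权威指南》 (The Definitive Guide to Kubernetes).
A basic function of the service discovery mechanism is that services inside the cluster can be reached by service name. This requires a cluster-wide DNS service that resolves service names to clusterIPs.
The virtual DNS service that Kubernetes provides is called skydns. It consists of four components: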
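To make the "service name to clusterIP" idea concrete, here is a minimal sketch of how the full DNS name of a Service is put together. The function name service_fqdn is hypothetical and only for illustration; the defaults (namespace "default", domain "cluster.local") are assumptions that match the values used later in this article, such as kubernetes.default.svc.cluster.local.

```shell
#!/bin/sh
# Hypothetical helper: build the fully qualified DNS name of a Service.
# $1 = service name
# $2 = namespace      (assumed default: "default")
# $3 = cluster domain (assumed default: "cluster.local", as used in this article)
service_fqdn() {
    svc=$1
    ns=${2:-default}
    domain=${3:-cluster.local}
    printf '%s.%s.svc.%s\n' "$svc" "$ns" "$domain"
}

service_fqdn redis-master          # prints redis-master.default.svc.cluster.local
service_fqdn kube-dns kube-system  # prints kube-dns.kube-system.svc.cluster.local
```

The DNS service only has to answer for names of this shape; the short name is a convenience that relies on the resolver configuration inside each pod.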
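etcd: storage for the DNS records
kube2sky: registers the Services from the Kubernetes master in etcd
skyDNS: provides DNS name resolution
healthz: provides health checks for the skydns service

When copying the code below, note that if the TAB width is not eight spaces you may get a file format error. YAML does not allow tab characters for indentation, so indent with spaces.

1. Create the rc file skydns-rc.yaml
vim skydns-rc.yaml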

apiVersion: v1
kind: ReplicationController
metadata:
  name: kube-dns-v11
  namespace: kube-system
  labels:
    k8s-app: kube-dns
    version: v11
    kubernetes.io/cluster-service: "true"
spec:
  replicas: 1
  selector:
    k8s-app: kube-dns
    version: v11
  template:
    metadata:
      labels:
        k8s-app: kube-dns
        version: v11
        kubernetes.io/cluster-service: "true"
    spec:
      containers:
      # etcd: stores the DNS records
      - name: etcd
        image: gcr.io/google_containers/etcd-amd64:2.2.1
        resources:
          limits:
            cpu: 100m
            memory: 50Mi
          requests:
            cpu: 100m
            memory: 50Mi
        command:
        - /usr/local/bin/etcd
        - -data-dir
        - /tmp/data
        - -listen-client-urls
        - http://127.0.0.1:2379,http://127.0.0.1:4001
        - -advertise-client-urls
        - http://127.0.0.1:2379,http://127.0.0.1:4001
        - -initial-cluster-token
        - skydns-etcd
        volumeMounts:
        - name: etcd-storage
          mountPath: /tmp/data
      # kube2sky: registers Services from the master in etcd
      - name: kube2sky
        image: gcr.io/google_containers/kube2sky-amd64:1.15
        resources:
          limits:
            cpu: 100m
            memory: 50Mi
          requests:
            cpu: 100m
            memory: 50Mi
        livenessProbe:
          httpGet:
            path: /healthz
            port: 8080
            scheme: HTTP
          initialDelaySeconds: 60
          timeoutSeconds: 5
          successThreshold: 1
          failureThreshold: 5
        readinessProbe:
          httpGet:
            path: /readiness
            port: 8081
            scheme: HTTP
          initialDelaySeconds: 30
          timeoutSeconds: 5
        args:
        # address of the Kubernetes master; adjust to your environment
        - --kube-master-url=http://192.168.56.102:8080
        - --domain=cluster.local
      # skydns: answers DNS queries from the records in etcd
      - name: skydns
        image: gcr.io/google_containers/skydns:2015-10-13-8c72f8c
        resources:
          limits:
            cpu: 100m
            memory: 50Mi
          requests:
            cpu: 100m
            memory: 50Mi
        args:
        - -machines=http://127.0.0.1:4001
        - -addr=0.0.0.0:53
        - -ns-rotate=false
        - -domain=cluster.local
        ports:
        - containerPort: 53
          name: dns
          protocol: UDP
        - containerPort: 53
          name: dns-tcp
          protocol: TCP
      # healthz: health check for the skydns service
      - name: healthz
        image: gcr.io/google_containers/exechealthz:1.0
        resources:
          limits:
            cpu: 10m
            memory: 20Mi
          requests:
            cpu: 10m
            memory: 20Mi
        args:
        - -cmd=nslookup kubernetes.default.svc.cluster.local 127.0.0.1 > /dev/null
        - -port=8080
        ports:
        - containerPort: 8080
          protocol: TCP
      volumes:
      - name: etcd-storage
        emptyDir: {}
      dnsPolicy: Default

2. Create the svc file for skydns
vim skydns-svc.yaml

apiVersion: v1
kind: Service
metadata:
  name: kube-dns
  namespace: kube-system
  labels:
    k8s-app: kube-dns
    kubernetes.io/cluster-service: "true"
    kubernetes.io/name: "KubeDNS"
spec:
  selector:
    k8s-app: kube-dns
  # must match the --cluster_dns value passed to kubelet in step 3
  clusterIP: 169.169.0.100
  ports:
  - name: dns
    port: 53
    protocol: UDP
  - name: dns-tcp
    port: 53
    protocol: TCP

If you cannot reach Google, change the image names as described in the troubleshooting section at the end of this article.
3. On every node, modify the kubelet startup parameters and add:
--cluster_dns=169.169.0.100 --cluster_domain=cluster.local
4. Create the resources
kubectl create -f skydns-rc.yaml
kubectl create -f skydns-svc.yaml
5. Check that the resources are healthy
kubectl get rc --namespace=kube-system
kubectl get pods --namespace=kube-system
kubectl get svc --namespace=kube-system
6. Create a Service for redis-master
vim redis-master-service.yaml

apiVersion: v1
kind: Service
metadata:
  name: redis-master
  labels:
    name: redis-master
spec:
  ports:
  - port: 6379
    targetPort: 6379
  selector:
    name: redis-master

kubectl get services
Check that the newly created redis-master Service is healthy.
7. Use a pod that includes the nslookup tool to verify that the DNS service works
vim busybox.yaml

apiVersion: v1
kind: Pod
metadata:
  name: busybox
  namespace: default
spec:
  containers:
  - name: busybox
    image: busybox
    command:
    - sleep
    - "3600"

kubectl create -f busybox.yaml
kubectl exec busybox -- nslookup redis-master
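When a short-name lookup like this one misbehaves, it helps to know which full names the resolver tries and where skydns would keep the matching record in etcd. The sketch below is an illustration only. The helper names expand_search and skydns_key are hypothetical, the search list is an assumption about what kubelet writes into a pod in the default namespace when --cluster_domain=cluster.local is set, and the resolver logic is simplified (each search suffix in order, then the name as typed). The key layout, with the DNS labels reversed under /skydns, is the one visible in the etcdctl listing later in this article.

```shell
#!/bin/sh
# Hypothetical helper: print the names a resolver would try for a short name.
# $1 = name as typed; remaining arguments = search suffixes, in order.
# Simplified: every suffix is tried first, then the bare name.
expand_search() {
    name=$1
    shift
    for suffix in "$@"; do
        printf '%s.%s\n' "$name" "$suffix"
    done
    printf '%s\n' "$name"
}

# Hypothetical helper: map a DNS name to the etcd key prefix used by skydns.
# The labels are reversed and placed under /skydns.
# The body runs in a subshell so the IFS change does not leak to the caller.
skydns_key() (
    IFS=.
    key=""
    for label in $1; do
        key="/$label$key"
    done
    printf '/skydns%s\n' "$key"
)

# Assumed search list for a pod in the "default" namespace.
for fqdn in $(expand_search redis-master \
        default.svc.cluster.local svc.cluster.local cluster.local); do
    printf '%s -> %s\n' "$fqdn" "$(skydns_key "$fqdn")"
done
```

The first candidate maps to /skydns/local/cluster/svc/default/redis-master. If the kubelet domain and the skydns domain do not agree, none of the candidates will line up with a key that kube2sky actually wrote.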
Experiment result: redis-master could not be resolved.
[root@kube-master kubernetes]# kubectl exec kube-dns-v11-bh30c -c etcd --namespace=kube-system etcdctl ls /skydns/local/cluster
/skydns/local/cluster/svc
/skydns/local/cluster/pod
It turns out that redis-master is registered under svc/default above, so the query has to be changed to:
kubectl exec busybox -- nslookup redis-master.default.svc
[root@kube-master kubernetes]# kubectl exec kube-dns-v11-bh30c -c etcd --namespace=kube-system etcdctl -- ls --recursive /skydns/local/cluster/
/skydns/local/cluster/svc
/skydns/local/cluster/svc/default
/skydns/local/cluster/svc/default/kubernetes
/skydns/local/cluster/svc/default/kubernetes/430884aa
/skydns/local/cluster/svc/default/redis-master
/skydns/local/cluster/svc/default/redis-master/3896faab
/skydns/local/cluster/svc/kube-system
/skydns/local/cluster/svc/kube-system/kube-dns
/skydns/local/cluster/svc/kube-system/kube-dns/1c613ef2
/skydns/local/cluster/pod
/skydns/local/cluster/pod/kube-system
/skydns/local/cluster/pod/kube-system/172-17-0-2
/skydns/local/cluster/pod/kube-system/172-17-0-2/c6db276b
/skydns/local/cluster/pod/default
/skydns/local/cluster/pod/default/172-17-0-3
/skydns/local/cluster/pod/default/172-17-0-3/edc91421

Troubleshooting
1. If you cannot reach Google and pause-amd cannot be pulled, do the following:
docker pull kubeguide/pause-amd64:3.0
On every node, add the following to the kubelet startup parameters:
--pod-infra-container-image=kubeguide/pause-amd64:3.0
Edit skydns-rc.yaml and change the images to the following:
outrider/etcd-amd64
outrider/kube2sky
outrider/skydns
outrider/exechealthz
2. The rc is created successfully but no pod appears, and kubectl describe rc mysql shows: No API token found for service account "default"
vim /etc/kubernetes/apiserver
In "--admission-control=NamespaceLifecycle,NamespaceExists,LimitRanger,SecurityContextDeny,ServiceAccount,ResourceQuota", delete SecurityContextDeny and ServiceAccount, then restart kube-apiserver.
3. no nodes: scheduling fails because no node is available
Communication between the node and the master is broken. Analyze the logs with journalctl -u kubelet | tail -300 to track down the cause.
4. certificate is valid for kubernetes.default.svc, kubernetes.default, kubernetes, localhost, not kube-master
or
Get https://192.168.56.102:6443/api/v1/services?resourceVersion=0: x509: certificate is valid for 192.168.124.17, 169.169.0.1, not 192.168.56.102, certificate signed by unknown authority
For this experiment, either do not use certificates or regenerate them.
