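When deploying an application to a new environment, we often use a Job controller to initialize data. This is the simplest use case: create a Job object so that a Pod runs reliably to completion. The Pod that initializes the data usually depends on some service endpoint, a database, or other middleware, so we need a condition check that requests in a loop, with a delay, until the endpoint returns data normally.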

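An HTTP endpoint

The never-give-up version

As the name suggests, if the probe URL does not respond normally, I keep retrying forever. Only once the probe succeeds do I run the work the Job is actually there to do.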

apiVersion: batch/v1
kind: Job
metadata:
  name: xxx-init
  labels:
    app.kubernetes.io/managed-by: "Helm"
spec:
  backoffLimit: 1
  activeDeadlineSeconds: 3600
  template:
    metadata:
      name: xxx-init
      labels:
        app.kubernetes.io/managed-by: "Helm"
    spec:
      imagePullSecrets:
        - name: secrest
      restartPolicy: Never
      containers:
      - name: data-init
        image: curl:7.74.0
        command: ["/bin/sh","-c"]
        args: 
        - |-
          probe_url=http://SERVICE_NAME:PORT/XX/XX
          while true
          do
              if [ "$(curl -m 3 -s -w '%{http_code}' -o /dev/null "$probe_url")" -eq 200 ]
              then
                  echo "init action"
                  exit 0
              else
                  echo "wait, retry..."
                  sleep 3
              fi
          done

Tip: the same behavior can also be written in another style.

probe_url=http://SERVICE_NAME:PORT/XX/XX
until [ "$(curl -m 3 -s -w '%{http_code}' -o /dev/null $probe_url)" -eq 200 ]
do
    echo "wait, retry..." && sleep 3
done

echo "do something"

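Improved version

Retrying forever may not be what we want in some scenarios. For example, we can give the probe a time limit and exit with a non-zero code once that limit is exceeded.

The pseudo-script looks like this: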

count=0
max_count=30
delay=10
probe_url=http://gitee-xx:port

until [ "$(curl -m 3 -s -w '%{http_code}' -o /dev/null "$probe_url")" -eq 200 ]
do
 if [ "$count" -lt "$max_count" ]
 then
     echo "wait, $count retries..."
     sleep $delay
     count=$(( count+1 ))
 else
     echo "Retry has reached the upper limit, exit"
     exit 1
 fi
done

echo "do something"
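The bounded loop above can be factored into a small reusable helper. What follows is only a sketch: the function names `retry` and `http_ok` are my own and not part of any tool, and since plain `sh` has no local variables the helper's variables are global.

```shell
#!/bin/sh
# retry MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it exits 0, at most MAX_ATTEMPTS times.
# Returns 0 on the first success, 1 once the attempts are used up.
# (Hypothetical helper name; variables are global in POSIX sh.)
retry() {
    max_attempts=$1
    delay=$2
    shift 2
    attempt=1
    while true
    do
        if "$@"
        then
            return 0
        fi
        if [ "$attempt" -ge "$max_attempts" ]
        then
            echo "giving up after $attempt attempts" >&2
            return 1
        fi
        echo "attempt $attempt failed, retrying in ${delay}s..." >&2
        sleep "$delay"
        attempt=$(( attempt + 1 ))
    done
}

# http_ok URL: succeeds only when the endpoint answers with HTTP 200.
# (Hypothetical helper name; same curl probe as in the scripts above.)
http_ok() {
    [ "$(curl -m 3 -s -w '%{http_code}' -o /dev/null "$1")" -eq 200 ]
}
```

In a Job this would be used as `retry 30 10 http_ok "http://SERVICE_NAME:PORT/XX/XX" || exit 1`, followed by the real initialization commands.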

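Tip: consider setting activeDeadlineSeconds on the Job. It caps how long the Job may stay active; once the deadline is exceeded, Kubernetes terminates the running Pods and marks the Job as failed with reason DeadlineExceeded. Automatic deletion N seconds after the Job finishes is a different field, ttlSecondsAfterFinished.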
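To show where the two time-related fields sit in a Job spec, here is a small sketch that prints a minimal manifest. The function name `render_init_job`, the `busybox:1.36` image, and the echo command are placeholders I made up for illustration. `activeDeadlineSeconds` bounds how long the Job may run, while `ttlSecondsAfterFinished` controls cleanup after it ends, assuming the cluster version supports that field.

```shell
#!/bin/sh
# render_init_job NAME DEADLINE_SECONDS TTL_SECONDS
# Prints a minimal Job manifest to stdout.
# (Hypothetical helper; image and command are placeholders.)
render_init_job() {
    cat <<EOF
apiVersion: batch/v1
kind: Job
metadata:
  name: $1
spec:
  backoffLimit: 1
  # Fail the Job if it is still active after this many seconds.
  activeDeadlineSeconds: $2
  # Delete the finished Job and its Pods this many seconds after it ends.
  ttlSecondsAfterFinished: $3
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: data-init
        image: busybox:1.36   # placeholder image, an assumption
        command: ["/bin/sh", "-c", "echo init action"]
EOF
}
```

For example, `render_init_job xxx-init 3600 600 | kubectl apply -f -` would create a Job that is failed after one hour and removed ten minutes after it finishes.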

PGSQL

Check whether the service accepts logins

psql -h localhost -p 5432 -U USERNAME -d demo -v ON_ERROR_STOP=1 -c 'select 1'

Tip: the demo database must already exist.

Or use the following

# Wait for PostgreSQL
until psql --host="$KONG_PG_HOST" --username="$POLLING_USER" "$POLLING_DATABASE" -w >/dev/null 2>&1
do
  echo "Waiting for PostgreSQL..."
  sleep 1
done

Example

apiVersion: batch/v1
kind: Job
metadata:
  name: pgsql-init
  labels:
    app.kubernetes.io/managed-by: "Helm"
spec:
  backoffLimit: 1
  activeDeadlineSeconds: 3600
  template:
    metadata:
      name: pgsql-init
      labels:
        app.kubernetes.io/managed-by: "Helm"
    spec:
      imagePullSecrets:
        - name: oschub
      restartPolicy: Never
      initContainers:
      - name: wait-pgsql-ready
        image: bitnami/postgresql:13
        env:
        - name: PG_HOST
          value: "gitee-postgres"
        - name: PG_PORT
          value: "5432"
        - name: PG_USER
          value: "zhang3"
        - name: PGPASSWORD
          value: "passwd"
        - name: PG_DBNAME
          value: "demo"
        command: ["/bin/sh","-c"]
        args: 
        - |-

          until psql -h $PG_HOST -p $PG_PORT -U $PG_USER -d $PG_DBNAME -v ON_ERROR_STOP=1 -c 'select 1' -w >/dev/null
          do
            echo "wait, retry..." && sleep 3
          done
      containers:
        ...
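If you only need to know that the server accepts connections, and not that a particular user can log in, PostgreSQL's `pg_isready` client tool is a possible alternative to running `select 1`. The sketch below assumes the image ships `pg_isready`; the function name `wait_for_postgres` and its defaults are my own. Note that `pg_isready` does not verify credentials or that the database exists.

```shell
#!/bin/sh
# wait_for_postgres HOST PORT [MAX_ATTEMPTS] [DELAY_SECONDS]
# Polls pg_isready until the server accepts connections.
# Returns 1 if it is still not ready after MAX_ATTEMPTS checks.
# (Hypothetical helper name; defaults of 30 attempts / 3s are assumptions.)
wait_for_postgres() {
    pg_host=$1
    pg_port=$2
    pg_max=${3:-30}
    pg_delay=${4:-3}
    pg_try=1
    # pg_isready exits 0 only when the server is accepting connections.
    until pg_isready -h "$pg_host" -p "$pg_port" -q
    do
        if [ "$pg_try" -ge "$pg_max" ]
        then
            echo "PostgreSQL not ready after $pg_try attempts" >&2
            return 1
        fi
        echo "Waiting for PostgreSQL ($pg_try/$pg_max)..."
        sleep "$pg_delay"
        pg_try=$(( pg_try + 1 ))
    done
    return 0
}
```

In the initContainer above this would be called as `wait_for_postgres "$PG_HOST" "$PG_PORT" || exit 1`.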

MySQL

MySQL can be probed as follows to check that the service is healthy

mysql --user="${MYSQL_USER}" --password="${MYSQL_PASSWORD}" --silent --execute "SELECT 1;"

Example

apiVersion: batch/v1
kind: Job
metadata:
  name: mysql-init
  labels:
    app.kubernetes.io/name: "MySQL-init"
    app.kubernetes.io/type: "Job"
    app.kubernetes.io/managed-by: "Helm"
spec:
  backoffLimit: 1
  activeDeadlineSeconds: 3600
  template:
    metadata:
      name: mysql-init
      labels:
        app.kubernetes.io/name: "MySQL-init"
        app.kubernetes.io/type: "Job"
        app.kubernetes.io/managed-by: "Helm"
    spec:
      imagePullSecrets:
        - name: oschub
      restartPolicy: Never
      initContainers:
      - name: wait-mysql-ready
        image: mysql:5.7
        env:
        - name: MYSQL_HOST
          value: "mysql-svc"
        - name: MYSQL_PORT
          value: "3306"
        - name: MYSQL_USER
          value: "root"
        - name: MYSQL_PWD
          value: "password"
        command: ["/bin/sh","-c"]
        args: 
        - |-

          until mysql --host=${MYSQL_HOST} --port=${MYSQL_PORT} --user=${MYSQL_USER} --password=${MYSQL_PWD} --silent --execute "SELECT 1;" >/dev/null
          do
            echo "wait, retry..." && sleep 3
          done
      containers:
        ...
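The loop in this example waits forever and relies on the Job's activeDeadlineSeconds to stop it. A variation is to give the wait its own wall-clock deadline. The sketch below uses `mysqladmin ping`, assuming the image ships `mysqladmin`; the function name `wait_for_mysql` and the `MYSQL_POLL_INTERVAL` variable are hypothetical. `mysqladmin ping` reports success as soon as the server answers, even if access is denied, so it checks reachability rather than login.

```shell
#!/bin/sh
# wait_for_mysql HOST PORT [TIMEOUT_SECONDS]
# Polls "mysqladmin ping" until the server answers or the timeout passes.
# (Hypothetical helper; MYSQL_POLL_INTERVAL is an assumed variable name.)
wait_for_mysql() {
    my_host=$1
    my_port=$2
    my_timeout=${3:-300}
    # date +%s (seconds since the epoch) is widely available, though not strictly POSIX.
    my_deadline=$(( $(date +%s) + my_timeout ))
    until mysqladmin ping --host="$my_host" --port="$my_port" --silent
    do
        if [ "$(date +%s)" -ge "$my_deadline" ]
        then
            echo "MySQL not reachable within ${my_timeout}s" >&2
            return 1
        fi
        echo "wait, retry..."
        sleep "${MYSQL_POLL_INTERVAL:-3}"
    done
    return 0
}
```

Called as `wait_for_mysql "$MYSQL_HOST" "$MYSQL_PORT" 600 || exit 1`, it fails the initContainer after ten minutes instead of waiting for the Job-level deadline.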

Elasticsearch

curl --fail localhost:9200/_cluster/health
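The one-liner above only tells you that the node answered; the response body also carries a `status` field (green, yellow, or red). A possible wait loop that looks at that field is sketched below. The function names `es_healthy` and `wait_for_es` are my own, and treating yellow as good enough is an assumption that suits single-node setups, where replica shards cannot be assigned.

```shell
#!/bin/sh
# es_healthy BASE_URL: succeeds when the cluster reports green or yellow.
# (Hypothetical helper name; accepting "yellow" is an assumption.)
es_healthy() {
    curl --fail -s -m 5 "$1/_cluster/health" \
        | grep -Eq '"status" *: *"(green|yellow)"'
}

# wait_for_es BASE_URL [DELAY_SECONDS]: polls until es_healthy succeeds.
wait_for_es() {
    until es_healthy "$1"
    do
        echo "wait, retry..."
        sleep "${2:-3}"
    done
}
```

Usage would be `wait_for_es http://localhost:9200`, followed by the commands that need Elasticsearch.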

MinIO

      initContainers:
      - name: wait-minio-ready
        imagePullPolicy: IfNotPresent
        image: curl:7.74.0
        command: ["/bin/sh","-c"]
        args: 
        - |-
          probe_url=http://gitee-minio:9000/minio/health/live
          until [ "$(curl -m 3 -s -w '%{http_code}' -o /dev/null $probe_url)" -eq 200 ]
          do
              echo "wait, retry..." && sleep 3
          done

References