BUG #15222: pg_isready fails connection after previous pg_isready claims success - Mailing list pgsql-bugs

From PG Bug reporting form
Subject BUG #15222: pg_isready fails connection after previous pg_isready claims success
Date
Msg-id 152778474106.26722.11024944991637906220@wrigleys.postgresql.org
Responses Re: BUG #15222: pg_isready fails connection after previous pg_isready claims success
List pgsql-bugs
The following bug has been logged on the website:

Bug reference:      15222
Logged by:          Jon Watte
Email address:      jwatte@gmail.com
PostgreSQL version: 10.4
Operating system:   Ubuntu 18.04 + docker.io
Description:

When starting up an instance of the postgres Docker container, I'm seeing
postgres reject a new connection after already having accepted previous
connections and pg_isready reporting it's ready.

I have a script that starts a new instance of postgres using the
postgres:latest docker container.
This script waits for the postgres server to be available, by using
pg_isready, and by running psql "select 1" as belt-and-suspenders.
Then it sets up a new database with psql, first calling pg_isready once
more to double-check readiness.
However, this second pg_isready call sometimes reports that
/var/run/postgresql:5432 is rejecting connections, as if the server were
not yet ready.

Here's the docker start-up script, run on the host:
docker pull postgres
docker run --rm \
  --name $POSTGRES_NAME \
  -e POSTGRES_PASSWORD=totally-my-password-in-cleartext \
  --network $NETWORKNAME \
  -v $OBSERVE_ROOT:/home/dev/observe \
  -d \
  postgres
# postgres needs some time to get started -- slower machines need more time
wait_for_postgres $POSTGRES_NAME

Here's the checking script:
function wait_for_postgres() {
  echo "Waiting for $1 to serve requests"
  for i in `seq 100`; do
    sleep 0.2
    echo -n "."
    if docker exec $1 pg_isready -q; then
      echo -n " (postgres is ready) "
      break
    fi
  done

  # believe it or not, we've seen cases where pg_isready returns true,
  # but then the server goes back to not responding to requests
  success=false
  for i in `seq 100`; do
    sleep 0.2
    echo -n "."
    if docker exec $1 psql -U postgres -q -c "select 1 as ready;"; then
      success=true
      break
    fi
  done

  # report final status
  echo -n "pg_isready: "
  docker exec $1 pg_isready || true
  if ! $success; then
    echo "postgres is still not ready in container $1"
    exit 1
  fi
}
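
The two loops above each stop at the first successful probe, so a server that answers once and then goes away again still passes. A minimal sketch of a stricter check follows, assuming the flapping is transient: it succeeds only after the probe has passed several times in a row. The helper name wait_until_stable and its parameters are hypothetical and not part of the original scripts.

```shell
# Hypothetical helper (not from the original report): return 0 only after
# the probe command has succeeded REQUIRED times in a row, 1 otherwise.
# Usage: wait_until_stable REQUIRED MAX_ATTEMPTS INTERVAL CMD [ARGS...]
wait_until_stable() {
  local required=$1
  local max_attempts=$2
  local interval=$3
  shift 3
  local streak=0
  local attempt=0
  while [ "$attempt" -lt "$max_attempts" ]; do
    attempt=$((attempt + 1))
    if "$@" >/dev/null 2>&1; then
      streak=$((streak + 1))
      if [ "$streak" -ge "$required" ]; then
        return 0
      fi
    else
      streak=0  # any failed probe resets the run of successes
    fi
    sleep "$interval"
  done
  return 1
}
```

It could be called as `wait_until_stable 5 100 0.2 docker exec "$POSTGRES_NAME" pg_isready -q`, where the values 5, 100 and 0.2 are illustrative. This narrows the window for a false "ready" but cannot rule it out, since the server may still stop accepting connections after the last probe.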


Here's the create-database script excerpt:
  DBNAME="${i}_${ENV}"
  echo "Creating database $DBNAME in server $POSTGRES_NAME"
  docker exec $POSTGRES_NAME pg_isready
  docker exec $POSTGRES_NAME psql -U postgres -c "create database ${DBNAME};"
  docker exec $POSTGRES_NAME psql -U postgres -c "grant all privileges on database ${DBNAME} to public;"
  echo "Loading schemas for $DBNAME"

Here's a log from running those scripts:
14:59:10 Using default tag: latest
14:59:11 latest: Pulling from library/postgres
14:59:11 Digest: sha256:c604b88af0e7adbe45cd2b9c329479c7a5305bd88da37d4806dcc56ab5a31d42
14:59:11 Status: Image is up to date for postgres:latest
14:59:11 9ea92edee64a7cf9d45e80427495a4270d7f4ffaadc48187d0ea13a4ea27dad2
14:59:14 Waiting for executor1-postgres to serve requests
14:59:14 ......... (postgres is ready) . ready 
14:59:17 -------
14:59:17      1
14:59:17 (1 row)
14:59:17 
14:59:17 pg_isready: /var/run/postgresql:5432 - accepting connections
14:59:17 Creating database observe_dev in server executor1-postgres
14:59:18 /var/run/postgresql:5432 - rejecting connections
14:59:18 Build step 'Run with timeout' marked build as failure

Here is the container I'm using:
postgres    latest    61d053fc271c    5 days ago    236MB

(It has also happened with previous container images)

The version in the container:
psql (10.4 (Debian 10.4-2.pgdg90+1))

