Thread: BUG #15222: pg_isready fails connection after previous pg_isreadyclaims success

BUG #15222: pg_isready fails connection after previous pg_isreadyclaims success

From
PG Bug reporting form
Date:
The following bug has been logged on the website:

Bug reference:      15222
Logged by:          Jon Watte
Email address:      jwatte@gmail.com
PostgreSQL version: 10.4
Operating system:   Ubuntu 18.04 + docker.io
Description:

When starting up an instance of the postgres Docker container, I'm seeing
postgres reject a new connection after already having accepted previous
connections and pg_isready reporting it's ready.

I have a script that starts a new instance of postgres using the
postgres:latest docker container.
This script waits for the postgres server to be available, by using
pg_isready, and by running psql "select 1" as belt-and-suspenders.
Then, it tries to set up a new database using the psql command, and also
first calling pg_isready to double-check readiness.
However, a second pg_isready command sometimes gets the error that
/var/run/postgresql:5432 is rejecting connections, as if it's not yet
ready.

Here's the docker start-up script, run on the host:
docker pull postgres
docker run --rm \
  --name $POSTGRES_NAME \
  -e POSTGRES_PASSWORD=totally-my-password-in-cleartext \
  --network $NETWORKNAME \
  -v $OBSERVE_ROOT:/home/dev/observe \
  -d \
  postgres
# postgres needs some time to get started -- slower machines need more
time
wait_for_postgres $POSTGRES_NAME

Here's the checking script:
function wait_for_postgres() {
  echo "Waiting for $1 to serve requests"
  for i in `seq 100`; do
    sleep 0.2
    echo -n "."
    if docker exec $1 pg_isready -q; then
      echo -n " (postgres is ready) "
      break
    fi
  done

  # believe it or not, we've seen cases where pg_isready returns true,
  # but the the server goes back to not responding to requests
  success=false
  for i in `seq 100`; do
    sleep 0.2
    echo -n "."
    if docker exec $1 psql -U postgres -q -c "select 1 as ready;"; then
      success=true
      break
    fi
  done

  # report final status
  echo -n "pg_isready: "
  docker exec $1 pg_isready || true
  if ! $success; then
    echo "postgres is still not ready in container $1"
    exit 1
  fi
}


Here's the create-database script excerpt:
  DBNAME="${i}_${ENV}"
  echo "Creating database $DBNAME in server $POSTGRES_NAME"
  docker exec $POSTGRES_NAME pg_isready
  docker exec $POSTGRES_NAME psql -U postgres -c "create database
${DBNAME};"
  docker exec $POSTGRES_NAME psql -U postgres -c "grant all privileges on
database ${DBNAME} to public;"
  echo "Loading schemas for $DBNAME"

Here's a log from running those scripts:
14:59:10 Using default tag: latest
14:59:11 latest: Pulling from library/postgres
14:59:11 Digest:
sha256:c604b88af0e7adbe45cd2b9c329479c7a5305bd88da37d4806dcc56ab5a31d42
14:59:11 Status: Image is up to date for postgres:latest
14:59:11 9ea92edee64a7cf9d45e80427495a4270d7f4ffaadc48187d0ea13a4ea27dad2
14:59:14 Waiting for executor1-postgres to serve requests
14:59:14 ......... (postgres is ready) . ready 
14:59:17 -------
14:59:17      1
14:59:17 (1 row)
14:59:17 
14:59:17 pg_isready: /var/run/postgresql:5432 - accepting connections
14:59:17 Creating database observe_dev in server executor1-postgres
14:59:18 /var/run/postgresql:5432 - rejecting connections
14:59:18 Build step 'Run with timeout' marked build as failure

Here is the container I'm using:
postgres                                                    latest
   61d053fc271c        5 days ago          236MB

(It has also happened with previous container images)

The version in the container:
psql (10.4 (Debian 10.4-2.pgdg90+1))


=?utf-8?q?PG_Bug_reporting_form?= <noreply@postgresql.org> writes:
> When starting up an instance of the postgres Docker container, I'm seeing
> postgres reject a new connection after already having accepted previous
> connections and pg_isready reporting it's ready.

Hmm, does anything show up in the postmaster log when this happens?
(It might be worth turning on log_connections to help ensure there's
some log trace.)

            regards, tom lane


How do I best turn on log_connections in the postgres log container?

Also, because this is intermittent, I'll need to make that modification to the postgres container each time we spin up a test container, as we pull a fresh new postgres container each time. What particular log files from inside the container should I look at, if any? (Compared to the docker logs output)

Sincerely,

jw





Sincerely,

Jon Watte


--
"I find that the harder I work, the more luck I seem to have." -- Thomas Jefferson

On Thu, May 31, 2018 at 9:51 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
PG Bug reporting form <noreply@postgresql.org> writes:
> When starting up an instance of the postgres Docker container, I'm seeing
> postgres reject a new connection after already having accepted previous
> connections and pg_isready reporting it's ready.

Hmm, does anything show up in the postmaster log when this happens?
(It might be worth turning on log_connections to help ensure there's
some log trace.)

                        regards, tom lane

Jon Watte <jwatte@gmail.com> writes:
> How do I best turn on log_connections in the postgres log container?

Somebody else will have to answer that --- I'm not very familiar with
any of the available container packagings.  If nothing else, ALTER
SYSTEM followed by a DB restart would do it, but there may be a nicer
way.

            regards, tom lane


I think I'm slightly closer to the culprit.
It appears that the Docker container re-starts Postgres the first time it's launched as part of initial set-up.
I guess that my "pg_ready" and "select 1" statements hit the window while the process is up, but re-configuring itself.
I think this is a surprising way of structuring the container -- why wouldn't this one-time setup just be run as part of baking the container image?

I could call "docker logs" and look for  PostgreSQL init process complete; ready for start up. in my wait_for_postgres function, perhaps?


The files belonging to this database system will be owned by user "postgres".
This user must also own the server process.

The database cluster will be initialized with locale "en_US.utf8".
The default database encoding has accordingly been set to "UTF8".
The default text search configuration will be set to "english".

Data page checksums are disabled.

fixing permissions on existing directory /var/lib/postgresql/data ... ok
creating subdirectories ... ok
selecting default max_connections ... 100
selecting default shared_buffers ... 128MB
selecting dynamic shared memory implementation ... posix
creating configuration files ... ok
running bootstrap script ... ok
performing post-bootstrap initialization ... ok
syncing data to disk ... ok

Success. You can now start the database server using:

    pg_ctl -D /var/lib/postgresql/data -l logfile start


WARNING: enabling "trust" authentication for local connections
You can change this by editing pg_hba.conf or using the option -A, or
--auth-local and --auth-host, the next time you run initdb.
waiting for server to start....2018-05-31 22:25:00.211 UTC [76] LOG:  listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
2018-05-31 22:25:00.227 UTC [77] LOG:  database system was shut down at 2018-05-31 22:25:00 UTC
2018-05-31 22:25:00.232 UTC [76] LOG:  database system is ready to accept connections
 done
server started
ALTER ROLE


/usr/local/bin/docker-entrypoint.sh: ignoring /docker-entrypoint-initdb.d/*

waiting for server to shut down...2018-05-31 22:25:00.391 UTC [76] LOG:  received fast shutdown request
.2018-05-31 22:25:00.395 UTC [76] LOG:  aborting any active transactions
2018-05-31 22:25:00.398 UTC [76] LOG:  worker process: logical replication launcher (PID 83) exited with exit code 1
2018-05-31 22:25:00.398 UTC [78] LOG:  shutting down
2018-05-31 22:25:00.429 UTC [76] LOG:  database system is shut down
 done
server stopped

PostgreSQL init process complete; ready for start up.

2018-05-31 22:25:00.507 UTC [1] LOG:  listening on IPv4 address "0.0.0.0", port 5432
2018-05-31 22:25:00.507 UTC [1] LOG:  listening on IPv6 address "::", port 5432
2018-05-31 22:25:00.509 UTC [1] LOG:  listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
2018-05-31 22:25:00.528 UTC [99] LOG:  database system was shut down at 2018-05-31 22:25:00 UTC
2018-05-31 22:25:00.535 UTC [1] LOG:  database system is ready to accept connections









Sincerely,

Jon Watte


--
"I find that the harder I work, the more luck I seem to have." -- Thomas Jefferson

On Thu, May 31, 2018 at 3:08 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Jon Watte <jwatte@gmail.com> writes:
> How do I best turn on log_connections in the postgres log container?

Somebody else will have to answer that --- I'm not very familiar with
any of the available container packagings.  If nothing else, ALTER
SYSTEM followed by a DB restart would do it, but there may be a nicer
way.

                        regards, tom lane