Re: BUG #15641: Autoprewarm worker fails to start on Windows with huge pages in use - Mailing list pgsql-bugs

From Mithun Cy
Subject Re: BUG #15641: Autoprewarm worker fails to start on Windows with huge pages in use
Date
Msg-id CADq3xVY_DjvRMv2DpFugyaW+7ZJAqeEU6vcFPossXTEGSE=toA@mail.gmail.com
In response to AW: BUG #15641: Autoprewarm worker fails to start on Windows with huge pages in use ("Hans Buschmann" <buschmann@nidsa.net>)
List pgsql-bugs
Thanks, Hans, for the simple, reproducible test.

On Sun, Feb 24, 2019 at 6:54 PM Hans Buschmann <buschmann@nidsa.net> wrote:
> Here is the start of  the error log:
>
> CPS PRD 2019-02-24 12:11:57 CET  00000  1:> LOG:  database system was interrupted; last known up at 2019-02-17 16:14:05 CET
> CPS PRD 2019-02-24 12:12:16 CET  00000  2:> LOG:  entering standby mode
> CPS PRD 2019-02-24 12:12:16 CET  00000  3:> LOG:  redo starts at 0/23000028
> CPS PRD 2019-02-24 12:12:16 CET  00000  4:> LOG:  consistent recovery state reached at 0/23000168
> CPS PRD 2019-02-24 12:12:16 CET  00000  5:> LOG:  invalid record length at 0/24000060: wanted 24, got 0
> CPS PRD 2019-02-24 12:12:16 CET  00000  9:> LOG:  database system is ready to accept read only connections
> CPS PRD 2019-02-24 12:12:16 CET  3D000  1:> FATAL:  database 16384 does not exist
> CPS PRD 2019-02-24 12:12:16 CET  00000 10:> LOG:  background worker "autoprewarm worker" (PID 3968) exited with exit code 1
> CPS PRD 2019-02-24 12:12:16 CET  00000  1:> LOG:  autoprewarm successfully prewarmed 0 of 12402 previously-loaded blocks
> CPS PRD 2019-02-24 12:12:17 CET  XX000  1:> FATAL:  could not connect to the primary server: FATAL:  no pg_hba.conf entry for replication connection from host "192.168.27.155", user "replicator", SSL off
> CPS PRD 2019-02-24 12:12:17 CET  55000  1:> ERROR:  could not map dynamic shared memory segment

According to the log, the autoprewarm master exited first ("autoprewarm successfully prewarmed 0 of 12402 previously-loaded blocks"). Only after that did we start getting "could not map dynamic shared memory segment".
That is, the master had already done dsm_detach, and the workers started throwing the error after that.

> This seems easy to reproduce:
>
> - Install/create a database with autoprewarm on and pg_prewarm loaded.
> - Fill the autoprewarm cache with some data
> - pg_dump the database
> - drop the database
> - create the database and pg_restore it from the dump
> - start the instance and logs are flooded
>
> I have taken no further investigation in the sourcecode due to limited skills so far...

I was able to reproduce the same.

"worker.bgw_restart_time" is never set for autoprewarm workers, so on error a worker gets restarted after some period of time (the default behavior). Since the database itself was dropped, the attempt to connect to it failed and the worker exited. But it was then restarted by the postmaster, and that is when we start seeing the DSM segment error above.

I think every autoprewarm worker should be set with "worker.bgw_restart_time = BGW_NEVER_RESTART;" so that there are no repeated attempts to prewarm a dropped database. I will think about this further and submit a patch for it.
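
To illustrate the effect of that setting, here is a rough, self-contained toy model. It is not PostgreSQL code and not the proposed patch: ToyWorker and toy_count_launches are hypothetical names made up for this sketch, and the restart logic is a simplification of what the postmaster does. Only BGW_NEVER_RESTART (defined as -1 in postmaster/bgworker.h) is taken from PostgreSQL. The model counts how often a worker is launched within a time window when its target database no longer exists.

```c
#include <assert.h>
#include <stdbool.h>

/* Same value as in PostgreSQL's postmaster/bgworker.h. */
#define BGW_NEVER_RESTART (-1)

/* Hypothetical, heavily reduced stand-in for a background worker entry. */
typedef struct ToyWorker
{
    int  bgw_restart_time;  /* seconds before a restart, or BGW_NEVER_RESTART */
    bool target_db_exists;  /* false models the dropped database */
} ToyWorker;

/*
 * Count how many times the worker is launched within window_secs seconds,
 * including the first launch at time 0.
 *
 * Simplifying assumptions: a worker whose database exists finishes its job
 * and is not launched again; a worker whose database is gone fails at once
 * and is relaunched bgw_restart_time seconds later unless restarts are
 * disabled.
 */
int
toy_count_launches(const ToyWorker *w, int window_secs)
{
    int launches = 0;
    int now = 0;

    assert(window_secs >= 0);
    /* A zero interval would never advance the clock in this toy model. */
    assert(w->bgw_restart_time == BGW_NEVER_RESTART || w->bgw_restart_time > 0);

    while (now <= window_secs)
    {
        launches++;

        if (w->target_db_exists)
            break;              /* connected and prewarmed: done */

        /* Models: FATAL: database ... does not exist */
        if (w->bgw_restart_time == BGW_NEVER_RESTART)
            break;              /* fail once, then stay down */

        now += w->bgw_restart_time;
    }

    return launches;
}
```

With an assumed restart interval of 60 seconds and a dropped database, the model launches the worker again every minute for as long as the window lasts. With BGW_NEVER_RESTART the worker fails exactly once, which is the behavior suggested above.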

--
Thanks and Regards
Mithun Chicklore Yogendra
EnterpriseDB: http://www.enterprisedb.com
