Re: 9.6rc1 Background worker starting multiple times - Mailing list pgsql-bugs

From Robert Haas
Subject Re: 9.6rc1 Background worker starting multiple times
Date
Msg-id CA+TgmobKpkOWh28A_A_Ptib9ZEgFOi3FE_V==V7TAAA--t8bvg@mail.gmail.com
Whole thread Raw
In response to 9.6rc1 Background worker starting multiple times  (Keith Fiske <keith@omniti.com>)
Responses Re: 9.6rc1 Background worker starting multiple times  (Keith Fiske <keith@omniti.com>)
List pgsql-bugs
On Thu, Sep 15, 2016 at 6:17 PM, Keith Fiske <keith@omniti.com> wrote:
> Running into an issue with pg_partman's (version 2.6.0) background worker
> process with 9.6rc1.
>
> https://github.com/keithf4/pg_partman/tree/master/test/test_bgw
>
> Noticed the issue when a user ran the pgtap tests against it and I'm seeing
> the same thing as well. It's working fine with 9.4 and 9.5. Looking in the
> postgresql logs it seems to be trying to start the initial background worker
> process multiple times and then one of them fails to start properly which
> then causes issues when maintenance tries to run. I've attached the
> postgresql log file from when 9.5 starts up
> (95_postgresql-2016-09-15_180306.log) and things are working normally and
> another log file from 9.6 (96_postgresql-2016-09-15_180131.log) that shows
> the odd startup behavior

I don't know what's happening here exactly, but I think your statement
that it is trying to start the same background worker multiple times
is not accurate.  There are a series of log entries like this:

2016-09-15 18:03:06 EDT [] [7688]: [8-1] user=,db=,e=00000 DEBUG:
registering background worker "pg_partman dynamic background worker
(dbname=keith)"
2016-09-15 18:03:06 EDT [] [7688]: [9-1] user=,db=,e=00000 DEBUG:
starting background worker process "pg_partman dynamic background
worker (dbname=keith)"
2016-09-15 18:03:06 EDT [] [7694]: [1-1] user=,db=,e=00000 LOG:
autovacuum launcher started
2016-09-15 18:03:06 EDT [] [7697]: [1-1] user=,db=,e=00000 DEBUG:
Before parsing dbname GUC in dynamic main func: keith
2016-09-15 18:03:06 EDT [] [7697]: [2-1] user=,db=,e=00000 DEBUG:  GUC
rawstring copy: keith
2016-09-15 18:03:06 EDT [] [7697]: [3-1] user=,db=,e=00000 DEBUG:
Entered foreach loop: name (keith), db_main_counter (0), dbcounter (0)
2016-09-15 18:03:06 EDT [] [7697]: [4-1] user=,db=,e=00000 DEBUG:
Parsing list: keith (0)
2016-09-15 18:03:06 EDT [] [7697]: [5-1] user=,db=,e=00000 DEBUG:
Before run_maint initialize connection for db keith
2016-09-15 18:03:06 EDT [] [7697]: [6-1] user=,db=,e=00000 DEBUG:
After run_maint initialize connection for db keith
2016-09-15 18:03:06 EDT [] [7697]: [7-1] user=,db=,e=00000 DEBUG:
Checking if pg_partman extension is installed in database: keith
2016-09-15 18:03:06 EDT [] [7697]: [8-1] user=,db=,e=00000 LOG:
pg_partman dynamic background worker (dbname=keith) dynamic background
worker initialized with role keith on database keith
2016-09-15 18:03:06 EDT [] [7697]: [9-1] user=,db=,e=00000 LOG:
pg_partman dynamic background worker (dbname=keith): SELECT
partman.run_maintenance(p_analyze := true, p_jobmon := true) called by
role keith on database keith
2016-09-15 18:03:06 EDT [] [7697]: [10-1] user=,db=,e=00000 DEBUG:
pg_partman dynamic BGW shutting down gracefully for database keith.
2016-09-15 18:03:06 EDT [] [7688]: [10-1] user=,db=,e=00000 DEBUG:
worker process: pg_partman dynamic background worker (dbname=keith)
(PID 7697) exited with exit code 0
2016-09-15 18:03:06 EDT [] [7688]: [11-1] user=,db=,e=00000 DEBUG:
unregistering background worker "pg_partman dynamic background worker
(dbname=keith)"

Note that the worker is registered (by something) at the beginning of
that sequence and then unregistered (because it exited with code 0) at
the end of that sequence.  So the trouble isn't that the bgworker
system itself is restarting the worker, but that your code keeps
reregistering it.  I can't guess why your code behaves differently on
9.6 than on 9.5 and prior off the top of my head, but you might want
to try to figure that out.  I don't rule out the possibility that
there's some bug in 9.6 that is tickling your code to behave
differently, but I don't think I want to try to go analyze your code
and figure out what the issue is either...

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

pgsql-bugs by date:

Previous
From: Tom Lane
Date:
Subject: Re: Segmentation fault with pg_restore
Next
From: Robert Haas
Date:
Subject: Re: Want a routine vacuum for postgres