Buildfarm failure from overly noisy warning message - Mailing list pgsql-hackers

From Tom Lane
Subject Buildfarm failure from overly noisy warning message
Msg-id 23314.1437922565@sss.pgh.pa.us
Responses Re: Buildfarm failure from overly noisy warning message  (Andres Freund <andres@anarazel.de>)
Re: Buildfarm failure from overly noisy warning message  (Kevin Grittner <kgrittn@ymail.com>)
List pgsql-hackers
chipmunk failed last night
http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=chipmunk&dt=2015-07-26%2007%3A36%3A32

like so:

================== pgsql.build/src/test/regress/regression.diffs ===================
*** /home/pgbfarm/buildroot/REL9_3_STABLE/pgsql.build/src/test/regress/expected/create_index.out    Sun Jul 26 10:37:41 2015
--- /home/pgbfarm/buildroot/REL9_3_STABLE/pgsql.build/src/test/regress/results/create_index.out    Sun Jul 26 10:51:48 2015
***************
*** 14,19 ****
--- 14,20 ----
  CREATE INDEX tenk1_hundred ON tenk1 USING btree(hundred int4_ops);
  CREATE INDEX tenk1_thous_tenthous ON tenk1(thousand, tenthous);
  CREATE INDEX tenk2_unique1 ON tenk2 USING btree(unique1 int4_ops);
+ WARNING:  could not send signal to process 30123: No such process
  CREATE INDEX tenk2_unique2 ON tenk2 USING btree(unique2 int4_ops);
  CREATE INDEX tenk2_hundred ON tenk2 USING btree(hundred int4_ops);
  CREATE INDEX rix ON road USING btree (name text_ops);

======================================================================

What's evidently happened here is that our session tried to boot an
autovacuum process off a table lock, only that process was gone by the
time we issued the kill() call.  No problem really ... but aside from
causing buildfarm noise, I could see somebody getting really panicky
if this message appeared on a production server.

I'm inclined to reduce the WARNING to LOG, and/or skip it altogether
if the error is ESRCH.  The relevant code in ProcSleep() is:
                ereport(LOG,
                        (errmsg("sending cancel to blocking autovacuum PID %d",
                                pid),
                         errdetail_log("%s", logbuf.data)));

                if (kill(pid, SIGINT) < 0)
                {
                    /* Just a warning to allow multiple callers */
                    ereport(WARNING,
                            (errmsg("could not send signal to process %d: %m",
                                    pid)));
                }
 

so logging failures at LOG level seems pretty reasonable.  One could
also argue that both of these ereports ought to be downgraded to DEBUG1
or less, since this mechanism is pretty well shaken out by now.
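
For illustration only, here is a minimal standalone sketch of one way to read the proposal above: stay silent when kill() fails with ESRCH (the target has already exited) and report any other failure at LOG rather than WARNING. This is not the actual ProcSleep() change. The SketchLevel enum and the functions level_for_kill_errno() and cancel_blocker() are hypothetical names, and fprintf() stands in for ereport() so the example compiles outside the PostgreSQL tree.

```c
#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical stand-ins for PostgreSQL's elevel constants (not the real values). */
typedef enum
{
    SKETCH_NONE,                /* say nothing */
    SKETCH_LOG,                 /* server log only */
    SKETCH_WARNING              /* also sent to the client */
} SketchLevel;

/*
 * Choose a report level for a failed kill().  ESRCH means the process was
 * already gone, which is harmless here, so it is not reported at all.
 * Any other errno is reported at LOG instead of WARNING.
 */
SketchLevel
level_for_kill_errno(int err)
{
    return (err == ESRCH) ? SKETCH_NONE : SKETCH_LOG;
}

/*
 * Send SIGINT to the blocking process and report a failure only when the
 * chosen level says so.  fprintf() is an assumption standing in for ereport().
 */
void
cancel_blocker(pid_t pid)
{
    if (kill(pid, SIGINT) < 0)
    {
        int         save_errno = errno;     /* keep errno safe across calls */

        if (level_for_kill_errno(save_errno) != SKETCH_NONE)
            fprintf(stderr, "LOG:  could not send signal to process %d: %s\n",
                    (int) pid, strerror(save_errno));
    }
}
```

With this shape, the race seen on chipmunk (the autovacuum worker exiting just before the kill() call) would produce no output at all, while an unexpected failure such as EPERM would still reach the server log.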

Thoughts?
        regards, tom lane


