Re: pgbench bug candidate: negative "initial connection time" - Mailing list pgsql-hackers

From Yugo NAGATA
Subject Re: pgbench bug candidate: negative "initial connection time"
Date
Msg-id 20210619004605.83386a39ffca891f809f38a5@sraoss.co.jp
Whole thread Raw
In response to Re: pgbench bug candidate: negative "initial connection time"  (Fabien COELHO <coelho@cri.ensmp.fr>)
List pgsql-hackers
Hello Horiguchi-san, Fabien,

On Fri, 18 Jun 2021 15:58:48 +0200 (CEST)
Fabien COELHO <coelho@cri.ensmp.fr> wrote:

> 
> >>>        /* must be something wrong */
> >>>        pg_log_error("%s() failed: %m", SOCKET_WAIT_METHOD);
> >>>        goto done;
> >> 
> >> Should say such like "thread %d aborted: %s() failed: ...".
> 
> After having a lookg, there are already plenty such cases. I'd say not to 
> change anything for beta, and think of it for the next round.

Agreed. Basically, I think the existing message should be left as is.

> >> ====
> >>   errno = THREAD_BARRIER_INIT(&barrier, nthreads);
> >>   if (errno != 0)
> >> +  {
> >>     pg_log_fatal("could not initialize barrier: %m");
> >> +    exit(1);
> >> 
> >> This is a run-time error. Maybe we should return 2 in that case.
> 
> I think that you are right, but there are plenty such places where exit 
> should be 2 instead of 1 if the doc is followed:
> 
> """Errors during the run such as database errors or problems in the script 
> will result in exit status 2."""
> 
> My beta take is to let these as they are, i.e. pretty inconsistent all 
> over pgbench, and schedule a cleanup on the next round.

As same as the below Fabian's comment about thread->logfile, 

> >> ===
> >>     if (thread->logfile == NULL)
> >>     {
> >>       pg_log_fatal("could not open logfile \"%s\": %m", logpath);
> >> -      goto done;
> >> +      exit(1);
> >> 
> >> Maybe we should exit with 2 this case.
> >
> > Yep.
> 
> The bench is not even started, this is not really run time yet, 1 seems 
> ok. The failure may be due to a typo in the path which comes from the 
> user.

the bench is not started at THREAD_BARRIER_INIT, so I think exit(1) is ok.  

> 
> >>         /* must be something wrong */
> >> -        pg_log_fatal("%s() failed: %m", SOCKET_WAIT_METHOD);
> >> +        pg_log_error("%s() failed: %m", SOCKET_WAIT_METHOD);
> >>         goto done;
> >> 
> >> Why doesn't a fatal error cause an immediate exit?
> >
> > Good point. I do not know, but I would expect it to be the case, and AFAICR 
> > it does not.
> >
> >> (And if we change this to fatal, we also need to change similar errors in 
> >> the same function to fatal.)
> >
> > Possibly.
> 
> On second look, I think that error is fine, indeed we do not stop the 
> process, so "fatal" it is not;

I replaced this 'fatal' with 'error' because we are aborting the client
instead of exit(1). When pgbench was rewritten to use common logging API
by the commit 30a3e772b40, somehow pg_log_fatal was used, but I am
wondering it should have be pg_log_error.

> Attached Yugo-san patch with some updates discussed in the previous mails, 
> so as to move things along.

Thank you for update. I agree with this fix.

Regards,
Yugo Nagata

-- 
Yugo NAGATA <nagata@sraoss.co.jp>



pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: Centralizing protective copying of utility statements
Next
From: Egor Rogov
Date:
Subject: pg_stats and range statistics