Re: pgbench bug candidate: negative "initial connection time" - Mailing list pgsql-hackers
From | Yugo NAGATA |
---|---|
Subject | Re: pgbench bug candidate: negative "initial connection time" |
Date | |
Msg-id | 20210619004605.83386a39ffca891f809f38a5@sraoss.co.jp Whole thread Raw |
In response to | Re: pgbench bug candidate: negative "initial connection time" (Fabien COELHO <coelho@cri.ensmp.fr>) |
List | pgsql-hackers |
Hello Horiguchi-san, Fabien, On Fri, 18 Jun 2021 15:58:48 +0200 (CEST) Fabien COELHO <coelho@cri.ensmp.fr> wrote: > > >>> /* must be something wrong */ > >>> pg_log_error("%s() failed: %m", SOCKET_WAIT_METHOD); > >>> goto done; > >> > >> Should say such like "thread %d aborted: %s() failed: ...". > > After having a lookg, there are already plenty such cases. I'd say not to > change anything for beta, and think of it for the next round. Agreed. Basically, I think the existing message should be left as is. > >> ==== > >> errno = THREAD_BARRIER_INIT(&barrier, nthreads); > >> if (errno != 0) > >> + { > >> pg_log_fatal("could not initialize barrier: %m"); > >> + exit(1); > >> > >> This is a run-time error. Maybe we should return 2 in that case. > > I think that you are right, but there are plenty such places where exit > should be 2 instead of 1 if the doc is followed: > > """Errors during the run such as database errors or problems in the script > will result in exit status 2.""" > > My beta take is to let these as they are, i.e. pretty inconsistent all > over pgbench, and schedule a cleanup on the next round. As same as the below Fabian's comment about thread->logfile, > >> === > >> if (thread->logfile == NULL) > >> { > >> pg_log_fatal("could not open logfile \"%s\": %m", logpath); > >> - goto done; > >> + exit(1); > >> > >> Maybe we should exit with 2 this case. > > > > Yep. > > The bench is not even started, this is not really run time yet, 1 seems > ok. The failure may be due to a typo in the path which comes from the > user. the bench is not started at THREAD_BARRIER_INIT, so I think exit(1) is ok. > > >> /* must be something wrong */ > >> - pg_log_fatal("%s() failed: %m", SOCKET_WAIT_METHOD); > >> + pg_log_error("%s() failed: %m", SOCKET_WAIT_METHOD); > >> goto done; > >> > >> Why doesn't a fatal error cause an immediate exit? > > > > Good point. I do not know, but I would expect it to be the case, and AFAICR > > it does not. > > > >> (And if we change this to fatal, we also need to change similar errors in > >> the same function to fatal.) > > > > Possibly. > > On second look, I think that error is fine, indeed we do not stop the > process, so "fatal" it is not; I replaced this 'fatal' with 'error' because we are aborting the client instead of exit(1). When pgbench was rewritten to use common logging API by the commit 30a3e772b40, somehow pg_log_fatal was used, but I am wondering it should have be pg_log_error. > Attached Yugo-san patch with some updates discussed in the previous mails, > so as to move things along. Thank you for update. I agree with this fix. Regards, Yugo Nagata -- Yugo NAGATA <nagata@sraoss.co.jp>
pgsql-hackers by date: