Re: pgbench logging broken by time logic changes - Mailing list pgsql-hackers
From | Thomas Munro |
---|---|
Subject | Re: pgbench logging broken by time logic changes |
Date | |
Msg-id | CA+hUKG+jhatozihmqy4=Ufd0OHwZadEd_vg-_+26hE3f4C3TgQ@mail.gmail.com Whole thread Raw |
In response to | pgbench logging broken by time logic changes (Gregory Smith <gregsmithpgsql@gmail.com>) |
Responses |
Re: pgbench logging broken by time logic changes
Re: pgbench logging broken by time logic changes |
List | pgsql-hackers |
Hi Greg, On Thu, Jun 17, 2021 at 2:49 AM Gregory Smith <gregsmithpgsql@gmail.com> wrote: > Back on March 10 Thomas Munro committed and wrestled multiple reworks of the pgbench code from Fabien and the crew. Thefeature to synchronize startup I'm looking forward to testing now that I have a packaged beta. Variations on that problemhave bit me so many times I added code last year to my pgbench processing pipeline to just throw out the first andlast 10% of every data set. Yeah, commit aeb57af8 is a nice improvement and was the main thing I wanted to get into the tree for 14 in this area, because it was measuring the wrong thing. > Before I could get to startup timing I noticed the pgbench logging output was broken via commit 547f04e7 "Improve timelogic": https://www.postgresql.org/message-id/E1lJqpF-00064e-C6%40gemulon.postgresql.org It does suck that we broke the logging and that it took 3 months for anyone to notice and report it to the list. Seems like it should be straightforward to fix, though, with fixes already proposed (though I haven't studied them yet, will do). > I have a lot of community oriented work backed up behind this right now, so I'm gonna be really honest. This time reworkcommit in its current form makes me uncomfortable at this point in the release schedule. The commit has already foughtthrough two rounds of platform specific bug fixes. But since the buildfarm doesn't test the logging feature, thatwhole process is suspect. It's true that this work produced a few rounds of small portability follow-ups: c427de42 (work around strange hacks elsewhere in the tree for AIX), 68b34b23 (missing calling convention specifier on Windows), and de91c3b9 (adjust pthread missing-function code for threadless builds). These were problems that didn't show up on developer or CI systems (including threadless and Windows), and IMHO are typical sorts of problems you expect to have to work through when stuff hits the build farm, especially when using new system interfaces. So I don't think any of that, on its own, supports reverting anything here. > My take on the PostgreSQL way to proceed: this bug exposes that pgbench logging is a feature we finally need to designtesting for. We need a new buildfarm test and then a march through a full release phase to see how it goes. Onlythen should we start messing with the time logic. Even if we fixed the source today on both my test platforms, I'd stillbe nervous that beta 2 could ship and more performance testing could fall over from this modification. And that's cuttingthings a little close. > > The fastest way to get me back to comfortable would be to unwind 547f04e7 and its associated fixes and take it back toreview. I understand the intent and value; I appreciate the work so far. The big industry architecture shift from Intelto ARM has me worried about time overhead again, the old code is wonky, and in the PG15 release cycle I already haveresources planned around this area. Let me study the proposed fixes on this and the other thread about pgbench logging for a bit. Glad to hear that you're working on this area. I guess you might be researching stuff along the same sorts of lines as in the thread "Reduce timing overhead of EXPLAIN ANALYZE using rdtsc?" (though that's about the executor). As I already expressed in that thread, if the backend's instrumentation code is improved as proposed there, we'll probably want to rip some of these pgbench changes out anyway and go back to common instrumentation code. For that reason, I'm not super attached to that new pg_time_usec_t stuff at all, and wouldn't be sad if we reverted that piece. I am moderately attached to the sync changes, though. pgbench 13 is objectively producing incorrect results in that respect.
pgsql-hackers by date: