Re: benchmarking Flex practices - Mailing list pgsql-hackers

From: John Naylor
Subject: Re: benchmarking Flex practices
Msg-id: CACPNZCvvAzZfWVTonLLwz9D2pBDKtnJEy0JtCR8H+4XGH5Higw@mail.gmail.com
In response to: Re: benchmarking Flex practices (Tom Lane <tgl@sss.pgh.pa.us>)

On Fri, Jun 21, 2019 at 12:02 AM Andres Freund <andres@anarazel.de> wrote:
> Might be worth also testing with a more repetitive testcase to measure
> both cache locality and branch prediction. I assume that with
> information_schema there's enough variability that these effects play a
> smaller role. And there's plenty real-world cases where there's a *lot*
> of very similar statements being parsed over and over. I'd probably just
> measure the statements pgbench generates or such.

I tried benchmarking with a query string consisting of just

BEGIN;
UPDATE pgbench_accounts SET abalance = abalance + 1 WHERE aid = 1;
SELECT abalance FROM pgbench_accounts WHERE aid = 1;
UPDATE pgbench_tellers SET tbalance = tbalance + 1 WHERE tid = 1;
UPDATE pgbench_branches SET bbalance = bbalance + 1 WHERE bid = 1;
INSERT INTO pgbench_history (tid, bid, aid, delta, mtime) VALUES (1, 1, 1, 1, CURRENT_TIMESTAMP);
END;

repeated about 500 times. With this, backtracking is about 3% slower:

HEAD:
1.15s

patch:
1.19s

patch + huge array:
1.19s

That difference is possibly significant enough to be evidence for your
assumption, and also to persuade us to keep things as they are.
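
(For concreteness, the driver could look something like the sketch
below. This is only a sketch, not the harness actually used here: the
function name drive_raw_parser and its interface are invented, and it
assumes the current raw_parser() signature. Timing would be done
externally, e.g. with psql's \timing.)

/*
 * Hypothetical benchmark driver: parse the given string N times.
 * The parse trees are thrown away; only the parsing cost matters.
 */
#include "postgres.h"

#include "fmgr.h"
#include "miscadmin.h"
#include "parser/parser.h"
#include "utils/builtins.h"

PG_MODULE_MAGIC;

PG_FUNCTION_INFO_V1(drive_raw_parser);

Datum
drive_raw_parser(PG_FUNCTION_ARGS)
{
	char	   *str = text_to_cstring(PG_GETARG_TEXT_PP(0));
	int32		count = PG_GETARG_INT32(1);
	int32		i;

	for (i = 0; i < count; i++)
	{
		(void) raw_parser(str);
		CHECK_FOR_INTERRUPTS();
	}

	PG_RETURN_INT32(count);
}

After CREATE FUNCTION ... LANGUAGE C STRICT, something like
SELECT drive_raw_parser($q$ ... $q$, 500) under \timing would give
comparable numbers.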

On Thu, Jun 20, 2019 at 10:52 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Huh.  That's really interesting, because removing backtracking was a
> demonstrable, significant win when we did it [1].  I wonder what has
> changed?  I'd be prepared to believe that today's machines are more
> sensitive to the amount of cache space eaten by the tables --- but that
> idea seems contradicted by your result that the table size isn't
> important.  (I'm wishing I'd documented the test case I used in 2005...)

It's possible that backtracking's branches are better predicted on
today's hardware than they were 15 years ago, but my uneducated hunch
is that over those 15 years our Bison grammar has gotten much worse in
cache misses and branch mispredictions, while the scanner hasn't.
That, plus the recent keyword lookup optimization, might have left
parsing almost completely dominated by Bison. If that's the case, the
3% slowdown above could be a significant portion of scanning time
taken in isolation.
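
(One way to check that would be to time the scanner with Bison out of
the picture entirely. Untested sketch below; the wrapper lex_count is
invented, but the entry points are the ones exposed in
src/include/parser/scanner.h, if I have the signatures right.)

/*
 * Untested sketch: run the core lexer to EOF without the grammar,
 * returning the number of tokens seen.
 */
#include "postgres.h"

#include "parser/scanner.h"

static int
lex_count(const char *str)
{
	core_yyscan_t yyscanner;
	core_yy_extra_type yyextra;
	core_YYSTYPE lval;
	YYLTYPE		lloc;
	int			ntokens = 0;

	yyscanner = scanner_init(str, &yyextra,
							 &ScanKeywords, ScanKeywordTokens);

	while (core_yylex(&lval, &lloc, yyscanner) != 0)
		ntokens++;

	scanner_finish(yyscanner);
	return ntokens;
}

Comparing that loop against raw_parser() on the same input would show
how much of total parse time the scanner actually accounts for.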

> Hm.  Smaller binary is definitely nice, but 31763 is close enough to
> 32768 that I'd have little faith in the optimization surviving for long.
> Is there any way we could buy back some more transitions?

I tried quickly ripping out the unicode escape support entirely. It
builds with warnings, but the point was just to get the size -- that
produced an array with only 28428 elements, and that's while keeping
all the no-backup rules intact. This might be unworkable and/or ugly,
but I wonder if it's possible to pull unicode escape handling into the
parsing stage, with "UESCAPE" being a keyword token that we have to
peek ahead to check for, as in the sketch below. I'll look for other
rules that could be more easily optimized, but I'm not terribly
optimistic.
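
(To make the peek-ahead idea concrete, here's a rough sketch of a
one-token lookahead filter. Everything in it is invented for
illustration -- the token names, raw_lex(), apply_uescape() -- but in
PostgreSQL the natural place for this logic would be the existing
base_yylex() filter in src/backend/parser/parser.c.)

/*
 * Sketch of a one-token lookahead filter between scanner and grammar.
 * raw_lex() stands in for the real scanner and apply_uescape() for
 * the de-escaping step; neither is defined here.
 */
#include <stdbool.h>
#include <stdlib.h>

enum token { TOK_EOF, TOK_IDENT, TOK_UIDENT, TOK_UESCAPE, TOK_SCONST };

extern enum token raw_lex(const char **text);
extern char *apply_uescape(const char *body, const char *escape);

static bool have_lookahead;
static enum token lookahead_tok;
static const char *lookahead_text;

static enum token
next_raw(const char **text)
{
	if (have_lookahead)
	{
		have_lookahead = false;
		*text = lookahead_text;
		return lookahead_tok;
	}
	return raw_lex(text);
}

/* What the grammar would call instead of the scanner directly. */
enum token
filtered_lex(const char **text)
{
	enum token	tok = next_raw(text);

	if (tok == TOK_UIDENT)
	{
		const char *body = *text;
		const char *escape = "\\";	/* SQL's default escape character */
		const char *peek_text;
		enum token	peek = next_raw(&peek_text);

		if (peek == TOK_UESCAPE)
		{
			/* UESCAPE must be followed by a string constant */
			if (next_raw(&peek_text) != TOK_SCONST)
				abort();	/* real code would raise a syntax error */
			escape = peek_text;
		}
		else
		{
			/* not UESCAPE, so push the token back for the next call */
			have_lookahead = true;
			lookahead_tok = peek;
			lookahead_text = peek_text;
		}
		*text = apply_uescape(body, escape);
		return TOK_IDENT;
	}
	return tok;
}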

-- 
John Naylor                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


