Re: Single client performance on trivial SELECTs - Mailing list pgsql-hackers

From Robert Haas
Subject Re: Single client performance on trivial SELECTs
Date
Msg-id BANLkTinKZ05HqyORBq8puC4kK8j2pZ0hLw@mail.gmail.com
Whole thread Raw
In response to Re: Single client performance on trivial SELECTs  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Single client performance on trivial SELECTs  (Noah Misch <noah@leadboat.com>)
List pgsql-hackers
On Thu, Apr 14, 2011 at 7:43 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I doubt that it's possible to make AllocSetAlloc radically cheaper.
> I think the more likely route to improvement there is going to be to
> find a way to do fewer pallocs.  For instance, if we had more rigorous
> rules about which data structures are read-only to which code, we could
> probably get rid of a lot of just-in-case tree copying that happens in
> the parser and planner.
>
> But at the same time, even if we could drive all palloc costs to zero,
> it would only make a 10% difference in this example.  And this sort of
> fairly flat profile is what I see in most cases these days --- we've
> been playing performance whack-a-mole for long enough now that there
> isn't much low-hanging fruit left.

There are other architectural approaches that we could take to
reducing the parsing overhead.  Random ideas:

1. Separate the parser into a parser for DML statements only, and
another parser for everything else, to avoid cluttering the main
parser with lots of keywords and non-terminals that aren't going to be
used for the kinds of queries people care about parsing speed.

2. Hand-code a parser, or use some tool other than bison to generate one.

3. Some kind of parse-tree cache, so that executing the same exact
statement many times doesn't require reparsing it every time.

It's fairly far down in the noise on this particular profile, but in
the low-hanging fruit department, I think we should fix
ScanKeywordLookup to use a smarter algorithm that is more like O(1)
rather than O(lg n) in the number of keywords.  It shouldn't be
terribly difficult to come up with some kind of hash function based
on, say, the first two characters of the keyword that would be a lot
faster than what we're doing now.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


pgsql-hackers by date:

Previous
From: David Fetter
Date:
Subject: Re: Single client performance on trivial SELECTs
Next
From: Robert Haas
Date:
Subject: Re: POSIX shared memory redux