Re: reducing the footprint of ScanKeyword (was Re: Large writable variables) - Mailing list pgsql-hackers

From Tom Lane
Subject Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)
Date
Msg-id 9174.1545938531@sss.pgh.pa.us
Whole thread Raw
In response to Re: reducing the footprint of ScanKeyword (was Re: Large writablevariables)  (Andrew Dunstan <andrew.dunstan@2ndquadrant.com>)
Responses Re: reducing the footprint of ScanKeyword (was Re: Large writablevariables)  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers
Andrew Dunstan <andrew.dunstan@2ndquadrant.com> writes:
> RD parsers are not terribly hard to write.

Sure, as long as they are for grammars that are (a) small, (b) static,
and (c) LL(1), which is strictly weaker than the LALR(1) grammar class
that bison can handle.  We already have a whole lot of constructs that
are at the edges of what bison can handle, which makes me dubious that
an RD parser could be built at all without a lot of performance-eating
lookahead and/or backtracking.

> A smaller project might be to see if we can replace the binary keyword
> search in ScanKeyword with a perfect hashing function generated by
> gperf, or something similar. I had a quick look at that, too.

Yeah, we've looked at gperf before, eg

https://www.postgresql.org/message-id/20170927183156.jqzcsy7ocjcbdnmo@alap3.anarazel.de

Perhaps it'd be a win but I'm not very convinced.

I don't know much about the theory of perfect hashing, but I wonder
if we could just roll our own tool for that.  Since we're not dealing
with extremely large keyword sets, perhaps brute force search for a
set of multipliers for a hash computation like
(char[0] * some_prime + char[1] * some_other_prime ...) mod table_size
would work.

            regards, tom lane


pgsql-hackers by date:

Previous
From: Stephen Frost
Date:
Subject: Re: row filtering for logical replication
Next
From: Alvaro Herrera
Date:
Subject: Re: removal of dangling temp tables