Re: reducing the footprint of ScanKeyword (was Re: Large writable variables) - Mailing list pgsql-hackers

From John Naylor
Subject Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)
Msg-id CAJVSVGWbC0KuZKV=wVNUvGWO5L-ay_aYRztEAanX8=bc1Ju0kA@mail.gmail.com
In response to Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)  (Andrew Gierth <andrew@tao11.riddles.org.uk>)
List pgsql-hackers
On 12/19/18, Andrew Gierth <andrew@tao11.riddles.org.uk> wrote:
> Is there any particular reason not to go further and use a perfect hash
> function for the lookup, rather than binary search?

When I was investigating faster algorithms, I ruled out gperf based on
discussions in the archives. The approach here has modest goals and
shouldn't be too invasive.

With the makefile support and separate keyword files in place, that'll
be one less thing to do if we ever decide to replace binary search.
The giant string will likely be useful as well.
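
To illustrate what I mean by the giant string, here is a toy sketch
(hypothetical names, only a handful of keywords, not what any patch
actually emits): all the keyword text sits in one constant blob, and
the binary search works off an offset table into it.

#include <string.h>

/* Toy example: a few keywords concatenated into one string, kept in
 * sorted order, with an offset table pointing at each one. */
static const char kw_string[] =
	"abort\0"
	"absolute\0"
	"access\0"
	"action";

static const unsigned short kw_offsets[] = {0, 6, 15, 22};

#define KW_COUNT (sizeof(kw_offsets) / sizeof(kw_offsets[0]))

/* Binary search over the offset table; returns the keyword's index,
 * or -1 if the text is not a keyword. */
static int
kw_lookup(const char *text)
{
	int		low = 0;
	int		high = KW_COUNT - 1;

	while (low <= high)
	{
		int		mid = (low + high) / 2;
		int		cmp = strcmp(text, kw_string + kw_offsets[mid]);

		if (cmp == 0)
			return mid;
		if (cmp < 0)
			high = mid - 1;
		else
			low = mid + 1;
	}
	return -1;
}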

Since we're on the subject, I think some kind of trie would be ideal
performance-wise, but it would be a large amount of work. The nice
thing about a trie is that it can be faster than a hash table on a
key miss. I found a paper describing some space-efficient trie
variations [1], but we'd likely have to implement the algorithm
ourselves, plus a way to emit a C representation of the result. I did
find some libraries, but they would have many of the same practical
difficulties that gperf had.
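
To show why a miss can be cheap, here is a deliberately naive trie
lookup (purely illustrative, and nothing like the space-efficient
variants in the paper):

#include <stddef.h>

/* Deliberately naive trie node; the variants in [1] pack this far
 * more tightly. */
typedef struct TrieNode
{
	int		keyword_id;				/* -1 if no keyword ends here */
	struct TrieNode *child[26];		/* 'a' .. 'z' only, for brevity */
} TrieNode;

/* Walk the trie one character at a time.  A miss is detected as soon
 * as some character has no matching child, often well before the end
 * of the input; there's no need to hash the whole string first. */
static int
trie_lookup(const TrieNode *root, const char *text)
{
	const TrieNode *node = root;

	for (; *text; text++)
	{
		int		c = *text - 'a';

		if (c < 0 || c >= 26 || node->child[c] == NULL)
			return -1;			/* early exit on a miss */
		node = node->child[c];
	}
	return node->keyword_id;
}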

[1] https://infoscience.epfl.ch/record/64394/files/triesearches.pdf

-John Naylor

