pgsql: Speed up "rare & frequent" type GIN queries. - Mailing list pgsql-committers

From Heikki Linnakangas
Subject pgsql: Speed up "rare & frequent" type GIN queries.
Date
Msg-id E1WBlXR-0004T4-Is@gemulon.postgresql.org
Responses Re: pgsql: Speed up "rare & frequent" type GIN queries.  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-committers
Speed up "rare & frequent" type GIN queries.

If you have a GIN query like "rare & frequent", we currently fetch all the
items that match either rare or frequent, call the consistent function for
each item, and let the consistent function filter out items that only match
one of the terms. However, if we can deduce that "rare" must be present for
the overall qual to be true, we can scan all the rare items, and for each
rare item, skip over to the next frequent item with the same or greater TID.
That greatly speeds up "rare & frequent" type queries.
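
The skip-ahead idea can be illustrated with a minimal sketch, which is not the code from this patch. It assumes both posting lists are plain sorted arrays and that a TID fits in one integer; `tid_t`, `skip_to` and `intersect_rare_frequent` are hypothetical names chosen for the example. The scan is driven by the rare list, and the frequent list is only probed at positions at or beyond the current rare TID.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an item pointer (TID): one sortable integer. */
typedef uint64_t tid_t;

/*
 * Return the index of the first element in list[lo..n) that is >= target,
 * or n if there is none.  Plain binary search over a sorted array.
 */
static size_t
skip_to(const tid_t *list, size_t lo, size_t n, tid_t target)
{
    size_t hi = n;

    while (lo < hi)
    {
        size_t mid = lo + (hi - lo) / 2;

        if (list[mid] < target)
            lo = mid + 1;
        else
            hi = mid;
    }
    return lo;
}

/*
 * Intersect two sorted TID lists, driving the scan from the rare list.
 * Writes matches to out (room for nrare entries) and returns the count.
 */
size_t
intersect_rare_frequent(const tid_t *rare, size_t nrare,
                        const tid_t *freq, size_t nfreq,
                        tid_t *out)
{
    size_t pos = 0;     /* current position in the frequent list */
    size_t nout = 0;

    for (size_t i = 0; i < nrare && pos < nfreq; i++)
    {
        /* Skip every frequent item below the current rare TID. */
        pos = skip_to(freq, pos, nfreq, rare[i]);
        if (pos < nfreq && freq[pos] == rare[i])
            out[nout++] = rare[i];
    }
    return nout;
}
```

The work is then proportional to the number of rare items times the cost of one skip, rather than to the length of the frequent list.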

To implement that, introduce the concept of a tri-state consistent function,
where the 3rd value is MAYBE, indicating that we don't know if that term is
present. Operator classes only provide a boolean consistent function, so we
simulate the tri-state consistent function by calling the boolean function
several times, with the MAYBE arguments set to all combinations of TRUE and
FALSE. Testing all combinations is only feasible for a small number of MAYBE
arguments, but it is envisioned that we'll provide a way for operator
classes to provide a native tri-state consistent function, which can be much
more efficient. But that is not included in this patch.
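
As a rough sketch of that simulation (not the code added by this patch), the function below calls a boolean consistent function once per TRUE/FALSE assignment of the MAYBE inputs and reports a definite answer only when every call agrees. The names `tri_t`, `bool_consistent_fn`, `tri_consistent` and `and_consistent`, the simplified two-argument signature, and the limits `MAX_MAYBE` and `MAX_ENTRIES` are all assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum { TRI_FALSE = 0, TRI_TRUE = 1, TRI_MAYBE = 2 } tri_t;

/* Boolean consistent function: check[i] says whether entry i is present. */
typedef bool (*bool_consistent_fn)(const bool *check, size_t nentries);

#define MAX_MAYBE   4   /* assumed cap: at most 2^4 = 16 calls */
#define MAX_ENTRIES 32  /* assumed cap on entries, for this sketch only */

tri_t
tri_consistent(bool_consistent_fn fn, const tri_t *in, size_t nentries)
{
    bool   check[MAX_ENTRIES];
    size_t maybe_idx[MAX_MAYBE];
    size_t nmaybe = 0;
    bool   first = false;

    assert(nentries <= MAX_ENTRIES);

    /* Copy the known inputs and remember where the MAYBE inputs are. */
    for (size_t i = 0; i < nentries; i++)
    {
        if (in[i] == TRI_MAYBE)
        {
            if (nmaybe == MAX_MAYBE)
                return TRI_MAYBE;   /* too many unknowns: stay conservative */
            maybe_idx[nmaybe++] = i;
            check[i] = false;
        }
        else
            check[i] = (in[i] == TRI_TRUE);
    }

    /* Try every TRUE/FALSE assignment of the MAYBE inputs. */
    for (unsigned combo = 0; combo < (1u << nmaybe); combo++)
    {
        bool result;

        for (size_t j = 0; j < nmaybe; j++)
            check[maybe_idx[j]] = ((combo >> j) & 1u) != 0;

        result = fn(check, nentries);
        if (combo == 0)
            first = result;
        else if (result != first)
            return TRI_MAYBE;       /* the answer depends on an unknown */
    }
    return first ? TRI_TRUE : TRI_FALSE;
}

/* Example qual: entry 0 AND entry 1, i.e. "rare & frequent". */
bool
and_consistent(const bool *check, size_t nentries)
{
    (void) nentries;
    return check[0] && check[1];
}
```

With `and_consistent`, the inputs (FALSE, MAYBE) yield FALSE: the qual cannot hold without entry 0, whatever entry 1 turns out to be. That is the kind of deduction that lets a scan be driven by the rare term alone.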

We were already using that trick for lossy pages, calling the consistent
function with the lossy entry set to TRUE and FALSE. Now that we have the
tri-state consistent function, use it for lossy pages too.

Alexander Korotkov, with a fair amount of refactoring by me.

Branch
------
master

Details
-------
http://git.postgresql.org/pg/commitdiff/dbc649fd773e7e16458bfbec2611bf15f4355bc4

Modified Files
--------------
src/backend/access/gin/Makefile   |    2 +-
src/backend/access/gin/ginget.c   |  327 ++++++++++++++++++++++++++-----------
src/backend/access/gin/ginlogic.c |  180 ++++++++++++++++++++
src/backend/access/gin/ginscan.c  |    2 +
src/include/access/gin_private.h  |   31 ++++
5 files changed, 443 insertions(+), 99 deletions(-)

