Hstore: Query speedups with Gin index - Mailing list pgsql-hackers

From Blake Smith
Subject Hstore: Query speedups with Gin index
Date
Msg-id CAPxT4eEe-MNs4FbVN1Le_nWsTgyw3N38tENbN5W7GPBrWB7cJA@mail.gmail.com
Responses Re: Hstore: Query speedups with Gin index  (Michael Paquier <michael.paquier@gmail.com>)
List pgsql-hackers

Hey everyone,

I'm looking for feedback on a contrib/hstore patch.

We've been experiencing slow "@>" queries on an hstore column that's covered by a GIN index. At the current PostgreSQL git HEAD, the hstore <-> GIN interface produces the following text items to be indexed:

hstore: "'a'=>'1234', 'b'=>'test'"
Produces indexed text items: "Ka", "V1234", "Kb", "Vtest"

For our production table (tens of millions of rows), I observed significant query speedups after changing the index strategy to the following:

hstore: "'a'=>'1234', 'b'=>'test'"
Produces indexed text items: "Ka", "KaV1234", "Kb", "KbVtest" 

The combined entry is used to support "contains (@>)" queries, and the key-only item is used to support "contains key (?)" queries. This change seems to help especially with hstore keys that have high cardinality. The downsides are that it requires an index rebuild and that the index will be larger.
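
To illustrate why the combined item can be more selective, here is a toy model in Python. It is only a sketch, not the patch itself: the function names, the in-memory inverted index, and the sample rows are all made up for illustration, NULL values are ignored, and the plain string concatenation does nothing to keep the key/value boundary unambiguous.

```python
from collections import defaultdict

def items_current(h):
    """Current strategy: keys and values are indexed as separate items."""
    out = []
    for k, v in h.items():
        out += ["K" + k, "V" + v]
    return out

def items_proposed(h):
    """Proposed strategy: a key-only item plus a combined key+value item."""
    out = []
    for k, v in h.items():
        out += ["K" + k, "K" + k + "V" + v]
    return out

def query_items_proposed(q):
    """For a contains (@>) query, only the combined items are looked up."""
    return ["K" + k + "V" + v for k, v in q.items()]

def build_index(rows, extract):
    """Toy inverted index: item -> set of row ids (hypothetical helper)."""
    index = defaultdict(set)
    for row_id, h in rows.items():
        for item in extract(h):
            index[item].add(row_id)
    return index

def candidates(index, items):
    """Rows that contain every query item; these still need a recheck."""
    sets = [index.get(item, set()) for item in items]
    return set.intersection(*sets) if sets else set()

# Made-up sample data: row 2 has the key 'a' and the value '1234',
# but not together as a pair.
rows = {
    1: {"a": "1234", "b": "test"},
    2: {"a": "test", "b": "1234"},
    3: {"a": "1234"},
}
query = {"a": "1234"}

current = candidates(build_index(rows, items_current), items_current(query))
proposed = candidates(build_index(rows, items_proposed), query_items_proposed(query))

print(sorted(current))   # [1, 2, 3]
print(sorted(proposed))  # [1, 3]
```

In this model the separate "Ka" and "V1234" items also match row 2, which then has to be thrown out by rechecking the row, while the combined "KaV1234" item narrows the candidates to the rows that actually contain the pair.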

Patch attached. Any thoughts on this change?

Thanks,

Blake