Re: BUG #14210: filter by "=" constraint doesn't work when hash index is present on a column - Mailing list pgsql-bugs

From Robert Haas
Subject Re: BUG #14210: filter by "=" constraint doesn't work when hash index is present on a column
Date
Msg-id CA+Tgmob9-OUU6J=ScpduDUGicEJSfhabvJkTn71C_hETSMwJqA@mail.gmail.com
Whole thread Raw
In response to Re: BUG #14210: filter by "=" constraint doesn't work when hash index is present on a column  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: BUG #14210: filter by "=" constraint doesn't work when hash index is present on a column  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-bugs
On Thu, Jun 23, 2016 at 7:08 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Daniel Newman <dtnewman@gmail.com> writes:
>> Yes. If i delete the index and recreate it, the bug is replicated.
>
> So the answer is that this got broken by commit 9f03ca915196dfc8,
> which appears to have imagined that _hash_form_tuple() is just an
> alias for index_form_tuple().  But it ain't.  As a result, construction
> of hash indexes larger than shared_buffers is broken in 9.5 and up:
> what gets entered into the index is garbage, because we are taking
> raw data as if it were already hashed.  (In fact, in this example,
> we seem to be taking *pointers* to raw data as the hash values.)

Oops.

>> Interestingly, I modified the pg_dump file a bit. At the end, it says:
>>> CREATE INDEX hash_issue_index ON hash_issue_table USING hash
>>> (hash_issue_column);
>>> DROP INDEX hash_issue_index;
>>> CREATE INDEX hash_issue_index ON hash_issue_table USING hash
>>> (hash_issue_column);
>> This is because the issue was not replicating (for some reason) when it
>> built the index the first time.
>
> I think what's happening there is that the first CREATE INDEX
> underestimates the size of the table and decides it doesn't need to
> use the _h_spool() code path.  The other path isn't broken.
>
> We can either revert the aforesaid commit so far as it affects hash,
> or do something to break _hash_form_tuple's encapsulation of the
> hash-value-for-data substitution.  I don't immediately see a non-messy
> way to do the latter.

Would it work to have something like _hash_form_tuple() except that
instead of returning an index tuple it would just return the Datum and
isnull flags and let the caller decide what to do with them?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

pgsql-bugs by date:

Previous
From: John McKown
Date:
Subject: Re: BUG #14214: i am enable to upload .xlsx file in my postgres database.
Next
From: Tom Lane
Date:
Subject: Re: BUG #14210: filter by "=" constraint doesn't work when hash index is present on a column