Re: BUG #14210: filter by "=" constraint doesn't work when hash index is present on a column - Mailing list pgsql-bugs

From Peter Geoghegan
Subject Re: BUG #14210: filter by "=" constraint doesn't work when hash index is present on a column
Date
Msg-id CAM3SWZTBAo4hjbBd780+MrOKiKp_TMo1N3A0Rw9_im8gbD7fQA@mail.gmail.com
Whole thread Raw
In response to Re: BUG #14210: filter by "=" constraint doesn't work when hash index is present on a column  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: BUG #14210: filter by "=" constraint doesn't work when hash index is present on a column
List pgsql-bugs
On Mon, Jun 27, 2016 at 1:22 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Peter Geoghegan <pg@heroku.com> writes:
>> I like this idea. Should I write a patch?
>
> BTW, while you're at it: it strikes me that the threshold should be
> either min(NBuffers, maintenance_work_mem) or
> min(NLocBuffer, maintenance_work_mem), depending on whether we're
> talking about a regular or temp table/index.  That is, there's a
> pre-existing bug here that when NLocBuffer is a good deal less than
> NBuffers (which is the typical case nowadays) you'll get a lot of
> thrashing between local buffers and kernel cache, if the index isn't
> quite big enough to trigger the sorting code.  This might not manifest
> as actual I/O, but it's not the intended behavior either.

Finally found time for this. Attached patch tests hash tuplesorts
along the lines we discussed. It adds one new tuplesort operation,
which does not spill to disk. It also asserts that hash values
retrieved through the tuplesort interface are in fact in sorted order.
I wanted to have something reliably trip when comparetup_index_hash()
gives wrong answers, even when a corrupt final index is perhaps not a
consequence of the underlying bug.

Due to how the hash build code works, maintenance_work_mem will need
to be 2 blocks smaller than the final index size in order to force a
sort (see code comments). I think that this does not matter, since
none of this is documented currently.

Having reviewed the coverage report for tuplesort.c [1], I think that
the only notable testing omission once this latest patch is committed
will be that there is no coverage for "backwards get tuple" cases
(this is more noticeable when looking at coverage statistics for
logtape.c, actually). This seems less important to me, and would be
awkward to test in any case, given my constraints.

[1] http://coverage.postgresql.org/src/backend/utils/sort/tuplesort.c.gcov.html
--
Peter Geoghegan

Attachment

pgsql-bugs by date:

Previous
From: Charles
Date:
Subject: Using row-level security results in less optimal query plans
Next
From: Amit Kapila
Date:
Subject: Re: Invalid indexes should not consume update overhead