Thread: Index plan returns different results to sequential scan

Index plan returns different results to sequential scan

From
John Burns
Date:
Hi… 

We’ve been using the postcode extension from PGXN for a number of years and it has not caused any problems until now. 
The original developer is no longer around, but a couple of minor changes have kept it operating since version 9.



There is an operator (%)  and underlying ‘C’ function (postcode_eq_partial) that provides a containment test 
e.g. ‘NW10 5AQ’ % ‘NW10’  is true , ‘L17 3PB’ % ‘NW10’ is false etc.


If we run a   query against a table with a postcode field but no index  using the % operator we get the correct result.

If we run the same query against the table after adding a tree index on the postcode result we get few fewer results.

If we drop and reindex we get a different number of results.

The query is SELECT * FROM XXX where postcode % ’NW10’   
To create a sample table  — create table XXX ( udprn bigint, postcode postcode ) 
To Index it  CREATE INDEX on XXX(postcode)

The underlying representation of the postcode is a 32 bit integer, so not especially esoteric.

Although the postcode package is probably not especially significant in the scheme of things , the behaviour of indexed versus non-indexed queries is worrisome.

I’ve had to make a couple of minor changes to the postcode package as the Postgres versions have changed, i.e. define TRUE and FALSE in the c header files and change the location of the include files when Postgres 16 arrived, but nothing else.

I’ve attached a sample data file to populate a table that exhibits the behaviour, and the tweaked version of the Postcode package.

Regards, John Burns

Version : PostgreSQL 16.2 (Ubuntu 16.2-1.pgdg22.04+1) on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0, 64-bit
john@impactdatametrics.com



Attachment

Re: Index plan returns different results to sequential scan

From
Tomas Vondra
Date:
On 3/21/24 18:25, John Burns wrote:
> Hi…
> 
> ...
> 
> Although the postcode package is probably not especially significant
> in the scheme of things , the behaviour of indexed versus non-indexed
> queries is worrisome.
> 

Seems like a thinko in the indexing, but that's just a guess.

> I’ve had to make a couple of minor changes to the postcode package
> as the Postgres versions have changed, i.e. define TRUE and FALSE in
> the c header files and change the location of the include files when
> Postgres 16 arrived, but nothing else.
Can you share a patch with the tweaks?

> 
> I’ve attached a sample data file to populate a table that exhibits 
> the behaviour, and the tweaked version of the Postcode package.
> 

I don't see any attachments :-(


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Index plan returns different results to sequential scan

From
Tomas Vondra
Date:

On 3/21/24 21:08, John Burns wrote:
> 
> 
>> On 21 Mar 2024, at 20:27, Tomas Vondra <tomas.vondra@enterprisedb.com> wrote:
>>
>>
>> Sorry about that … the patched version of the package, plus a sample data set is in the attached zip file,
>>
>> -John 
>>

Thanks, I can reproduce the issue. I don't know why is it happening, but
the behavior I see is that the (postcode % 'NW10') query somehow misses
rows with postcodes 'NW10 0AA' - 'NW10 3NL' when executed using index.

I don't see anything obviously wrong in the extension / index pages.

You suggested it used to work OK and then broke. Do you perhaps know at
which version it broke? Or did you use 9.x until now, upgrades to 16 and
realized it's broken?


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Index plan returns different results to sequential scan

From
Tomas Vondra
Date:
On 3/22/24 00:58, Tomas Vondra wrote:
> 
> 
> On 3/21/24 21:08, John Burns wrote:
>>
>>
>>> On 21 Mar 2024, at 20:27, Tomas Vondra <tomas.vondra@enterprisedb.com> wrote:
>>>
>>>
>>> Sorry about that … the patched version of the package, plus a sample data set is in the attached zip file,
>>>
>>> -John 
>>>
> 
> Thanks, I can reproduce the issue. I don't know why is it happening, but
> the behavior I see is that the (postcode % 'NW10') query somehow misses
> rows with postcodes 'NW10 0AA' - 'NW10 3NL' when executed using index.
> 
> I don't see anything obviously wrong in the extension / index pages.
> 
> You suggested it used to work OK and then broke. Do you perhaps know at
> which version it broke? Or did you use 9.x until now, upgrades to 16 and
> realized it's broken?
> 

After some bisecting, this seems to be broken since this PG12 commit:

https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=dd299df8189bd00fbe54b72c64f43b6af2ffeccd

==========================================================================
commit    dd299df8189bd00fbe54b72c64f43b6af2ffeccd
tree    931ef720687d61cf5e75464fa0b1c1d75fb3f9d3    tree
parent    e5adcb789d80ba565ccacb1ed4341a7c29085238    commit | diff
Make heap TID a tiebreaker nbtree index column.

Make nbtree treat all index tuples as having a heap TID attribute.
Index searches can distinguish duplicates by heap TID, since heap TID is
always guaranteed to be unique.  This general approach has numerous
benefits for performance, and is prerequisite to teaching VACUUM to
perform "retail index tuple deletion".

...
==========================================================================

I don't know what the problem is, or whether it's a big in PG or in the
postcode extension (I agree the extension is fairly straightforward).


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Index plan returns different results to sequential scan

From
Peter Geoghegan
Date:
On Thu, Mar 21, 2024 at 9:22 PM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
> I don't know what the problem is, or whether it's a big in PG or in the
> postcode extension (I agree the extension is fairly straightforward).

Could you try running amcheck's bt_index_check function against the index?

--
Peter Geoghegan



Re: Index plan returns different results to sequential scan

From
John Burns
Date:
I’m sorry, but I can’t really say when it broke — we suspect when we went from version 14 to 16 — we use this
functionalitya lot, and we would have noticed long ago if it was wrong. 

As additional information, I created a new operator that used another function — wrapping  the  function
postcode_cmp_partial(...)= 0 — and it exhibited the same behaviour.   
Not surprised since the postcode_eq_partial is a trivially simple function.

— John Burns


> On 22 Mar 2024, at 00:58, Tomas Vondra <tomas.vondra@enterprisedb.com> wrote:
>
>
>
> On 3/21/24 21:08, John Burns wrote:
>>
>>
>>> On 21 Mar 2024, at 20:27, Tomas Vondra <tomas.vondra@enterprisedb.com> wrote:
>>>
>>>
>>> Sorry about that … the patched version of the package, plus a sample data set is in the attached zip file,
>>>
>>> -John
>>>
>
> Thanks, I can reproduce the issue. I don't know why is it happening, but
> the behavior I see is that the (postcode % 'NW10') query somehow misses
> rows with postcodes 'NW10 0AA' - 'NW10 3NL' when executed using index.
>
> I don't see anything obviously wrong in the extension / index pages.
>
> You suggested it used to work OK and then broke. Do you perhaps know at
> which version it broke? Or did you use 9.x until now, upgrades to 16 and
> realized it's broken?
>
>
> regards
>
> --
> Tomas Vondra
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company




Re: Index plan returns different results to sequential scan

From
Tomas Vondra
Date:
On 3/22/24 02:40, Peter Geoghegan wrote:
> On Thu, Mar 21, 2024 at 9:22 PM Tomas Vondra
> <tomas.vondra@enterprisedb.com> wrote:
>> I don't know what the problem is, or whether it's a big in PG or in the
>> postcode extension (I agree the extension is fairly straightforward).
> 
> Could you try running amcheck's bt_index_check function against the index?
> 

I tried (on master). It doesn't report anything:

test=# select bt_index_check('xxx_postcode_idx', 't');
 bt_index_check
----------------

(1 row)

I also tried looking at the index using bt_page_items from pageinspect,
and I did not notice anything obviously wrong. But I don't have much
experience with btree at such low level ...


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Index plan returns different results to sequential scan

From
Tomas Vondra
Date:

On 3/22/24 08:43, Tomas Vondra wrote:
> On 3/22/24 02:40, Peter Geoghegan wrote:
>> On Thu, Mar 21, 2024 at 9:22 PM Tomas Vondra
>> <tomas.vondra@enterprisedb.com> wrote:
>>> I don't know what the problem is, or whether it's a big in PG or in the
>>> postcode extension (I agree the extension is fairly straightforward).
>>
>> Could you try running amcheck's bt_index_check function against the index?
>>
> 
> I tried (on master). It doesn't report anything:
> 
> test=# select bt_index_check('xxx_postcode_idx', 't');
>  bt_index_check
> ----------------
> 
> (1 row)
> 
> I also tried looking at the index using bt_page_items from pageinspect,
> and I did not notice anything obviously wrong. But I don't have much
> experience with btree at such low level ...
> 

I realized I can create an index on e5adcb789, apply dd299df81 and
create another index, and then compare what bt_page_items says for those.

FWIW xxx_postcode_idx (created before e5adcb789) keeps returning the
correct data even after dd299df81 gets applied.

I'll only show 5 rows - the only difference is in the first item (which
I guess is hikey?), all other items are exactly the same.

For the index created on e5adcb789 (that returns the correct results
even with the new build), I get this:



test=# select * from bt_page_items(get_Raw_page('xxx_postcode_idx', 1))
limit 5;
 itemoffset |  ctid  | itemlen | nulls | vars |          data
------------+--------+---------+-------+------+-------------------------
          1 | (2,15) |      16 | f     | f    | ce 51 20 50 00 00 00 00
          2 | (0,1)  |      16 | f     | f    | 27 48 20 01 00 00 00 00
          3 | (0,2)  |      16 | f     | f    | 30 48 20 01 00 00 00 00
          4 | (0,3)  |      16 | f     | f    | 41 48 20 01 00 00 00 00
          5 | (0,4)  |      16 | f     | f    | 41 48 20 01 00 00 00 00
(5 rows)

test=# select * from bt_page_items(get_Raw_page('xxx_postcode_idx', 2))
limit 5;
 itemoffset |  ctid  | itemlen | nulls | vars |          data
------------+--------+---------+-------+------+-------------------------
          1 | (4,5)  |      16 | f     | f    | 25 60 20 50 00 00 00 00
          2 | (2,15) |      16 | f     | f    | ce 51 20 50 00 00 00 00
          3 | (2,16) |      16 | f     | f    | d3 51 20 50 00 00 00 00
          4 | (2,17) |      16 | f     | f    | d3 51 20 50 00 00 00 00
          5 | (2,18) |      16 | f     | f    | d3 51 20 50 00 00 00 00
(5 rows)

test=# select * from bt_page_items(get_Raw_page('xxx_postcode_idx', 3))
limit 5;
 itemoffset |  ctid  | itemlen | nulls | vars |          data
------------+--------+---------+-------+------+-------------------------
          1 | (1,0)  |       8 | f     | f    |
          2 | (2,15) |      16 | f     | f    | ce 51 20 50 00 00 00 00
          3 | (4,5)  |      16 | f     | f    | 25 60 20 50 00 00 00 00
(3 rows)

test=# select * from bt_page_items(get_Raw_page('xxx_postcode_idx', 4))
limit 5;
 itemoffset | ctid  | itemlen | nulls | vars |          data
------------+-------+---------+-------+------+-------------------------
          1 | (4,5) |      16 | f     | f    | 25 60 20 50 00 00 00 00
          2 | (4,6) |      16 | f     | f    | 25 60 20 50 00 00 00 00
          3 | (4,7) |      16 | f     | f    | 25 60 20 50 00 00 00 00
          4 | (4,8) |      16 | f     | f    | 25 60 20 50 00 00 00 00
          5 | (4,9) |      16 | f     | f    | 30 60 20 50 00 00 00 00
(5 rows)



while for the index created with dd299df81 (and returning incomplete
results) I get this:



test=# select * from bt_page_items(get_Raw_page('xxx2_postcode_idx', 1))
limit 5;
 itemoffset | ctid  | itemlen | nulls | vars |          data
------------+-------+---------+-------+------+-------------------------
          1 | (2,1) |      16 | f     | f    | ce 51 20 50 00 00 00 00
          2 | (0,1) |      16 | f     | f    | 27 48 20 01 00 00 00 00
          3 | (0,2) |      16 | f     | f    | 30 48 20 01 00 00 00 00
          4 | (0,3) |      16 | f     | f    | 41 48 20 01 00 00 00 00
          5 | (0,4) |      16 | f     | f    | 41 48 20 01 00 00 00 00
(5 rows)

test=# select * from bt_page_items(get_Raw_page('xxx2_postcode_idx', 2))
limit 5;
 itemoffset |   ctid   | itemlen | nulls | vars |
data
------------+----------+---------+-------+------+-------------------------------------------------
          1 | (4,4097) |      24 | f     | f    | 25 60 20 50 00 00 00
00 00 00 00 00 04 00 04 00
          2 | (2,15)   |      16 | f     | f    | ce 51 20 50 00 00 00 00
          3 | (2,16)   |      16 | f     | f    | d3 51 20 50 00 00 00 00
          4 | (2,17)   |      16 | f     | f    | d3 51 20 50 00 00 00 00
          5 | (2,18)   |      16 | f     | f    | d3 51 20 50 00 00 00 00
(5 rows)

test=# select * from bt_page_items(get_Raw_page('xxx2_postcode_idx', 3))
limit 5;
 itemoffset |   ctid   | itemlen | nulls | vars |
data
------------+----------+---------+-------+------+-------------------------------------------------
          1 | (1,0)    |       8 | f     | f    |
          2 | (2,1)    |      16 | f     | f    | ce 51 20 50 00 00 00 00
          3 | (4,4097) |      24 | f     | f    | 25 60 20 50 00 00 00
00 00 00 00 00 04 00 04 00
(3 rows)

test=# select * from bt_page_items(get_Raw_page('xxx2_postcode_idx', 4))
limit 5;
 itemoffset | ctid  | itemlen | nulls | vars |          data
------------+-------+---------+-------+------+-------------------------
          1 | (4,5) |      16 | f     | f    | 25 60 20 50 00 00 00 00
          2 | (4,6) |      16 | f     | f    | 25 60 20 50 00 00 00 00
          3 | (4,7) |      16 | f     | f    | 25 60 20 50 00 00 00 00
          4 | (4,8) |      16 | f     | f    | 25 60 20 50 00 00 00 00
          5 | (4,9) |      16 | f     | f    | 30 60 20 50 00 00 00 00
(5 rows)



regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Index plan returns different results to sequential scan

From
Peter Geoghegan
Date:
On Thu, Mar 21, 2024 at 2:03 PM John Burns <john@impactdatametrics.com> wrote:
> The query is SELECT * FROM XXX where postcode % ’NW10’
> To create a sample table  — create table XXX ( udprn bigint, postcode postcode )
> To Index it  CREATE INDEX on XXX(postcode)

The opfamily's % operator uses the B-Tree equality strategy. This
means that it works the same way as = works in most other opfamilies.

I don't see how equality can work reliably here. A query with a
predicate "WHERE my_indexed_postcode_column % ‘NW10’" seems to work by
masking the value stored in the index, throwing away some amount of
suffix bytes in the process. But the values from the index are still
stored in their original order -- the temporarily masked suffix bytes
aren't masked in the index, of course (they're only masked
temporarily, by the cross-type equality operator %).

Wouldn't you need something closer to "WHERE
my_indexed_postcode_column >= ‘NW10’ and my_indexed_postcode_column <
‘NW11’" for this to work reliably?

The relevant rules for btree operator families are described here:

https://www.postgresql.org/docs/devel/btree-behavior.html

Offhand, I suspect that you don't see problems pre-12 B-Tree because
the B-Tree code happened to have been more forgiving of opfamilies
that were broken in this way. Earlier versions treated < and <= as the
same thing in certain contexts.

--
Peter Geoghegan



Re: Index plan returns different results to sequential scan

From
Tomas Vondra
Date:

On 3/23/24 03:24, Peter Geoghegan wrote:
> On Thu, Mar 21, 2024 at 2:03 PM John Burns <john@impactdatametrics.com> wrote:
>> The query is SELECT * FROM XXX where postcode % ’NW10’
>> To create a sample table  — create table XXX ( udprn bigint, postcode postcode )
>> To Index it  CREATE INDEX on XXX(postcode)
> 
> The opfamily's % operator uses the B-Tree equality strategy. This
> means that it works the same way as = works in most other opfamilies.
> 
> I don't see how equality can work reliably here. A query with a
> predicate "WHERE my_indexed_postcode_column % ‘NW10’" seems to work by
> masking the value stored in the index, throwing away some amount of
> suffix bytes in the process. But the values from the index are still
> stored in their original order -- the temporarily masked suffix bytes
> aren't masked in the index, of course (they're only masked
> temporarily, by the cross-type equality operator %).
> 
> Wouldn't you need something closer to "WHERE
> my_indexed_postcode_column >= ‘NW10’ and my_indexed_postcode_column <
> ‘NW11’" for this to work reliably?
> 

Yeah, I was not sure how come this could be processed using equality
operator, but I got distracted by bisecting this to a particular commit.
I didn't realize the commit might have changed how much we rely on the
opfamily to do things correctly.

I do think the '%' operator is pretty close to what we do for LIKE with
prefix patterns for text:

explain select * from t where a like 'aa%';
                              QUERY PLAN
-----------------------------------------------------------------------
 Index Only Scan using t_a_idx on t  (cost=0.29..4.31 rows=1 width=33)
   Index Cond: ((a ~>=~ 'aa'::text) AND (a ~<~ 'ab'::text))
   Filter: (a ~~ 'aa%'::text)
(3 rows)

So I guess postcode would need to do something similar - treat the '%'
operator as separate from opclass equality, and define a new support
procedure akin to text_support, translating the '%' into the range quals
that can match the index.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company