Thread: BUG #18206: Strange performance behaviour depending on location of field in query.

BUG #18206: Strange performance behaviour depending on location of field in query.

From
PG Bug reporting form
Date:
The following bug has been logged on the website:

Bug reference:      18206
Logged by:          Daniel Migowski
Email address:      dmigowski@ikoffice.de
PostgreSQL version: 15.5
Operating system:   Windows + Linux
Description:

I have a table with some columns id and a to z and these two queries have
drastically different runtimes.

  select count(id) FROM testtable t WHERE t.z IS NULL;  (35% slower than
WHERE t.a IS NULL)
  select count(id) FROM testtable t WHERE t.a IS NULL;

This just screems like somewhere PostgreSQL is iterating over fields over
and over again without assigning field indexes to the literals. 

Please create a test table this way:

drop table if exists testtable cascade;
SELECT generate_series::int4 as id, null::int4 a, null::int4 b,
    null::int4 c, null::int4 d, null::int4 e, null::int4 f, null::int4 g,
null::int4 h, null::int4 i, null::int4 j,
    null::int4 k, null::int4 l, null::int4 m, null::int4 n, null::int4 o,
null::int4 p, null::int4 q, null::int4 r, 
    null::int4 s, null::int4 t, null::int4 u, null::int4 v, null::int4 w,
null::int4 x, null::int4 y, null::int4 z
    into testtable 
   FROM generate_series(1,6000000,1);

Doesn't matters if parallel query is used, can also be deactivated. But it's
strange to see that reordering the physical column layout can have such a
large effect on the tables.


PG Bug reporting form <noreply@postgresql.org> writes:
> I have a table with some columns id and a to z and these two queries have
> drastically different runtimes.

>   select count(id) FROM testtable t WHERE t.z IS NULL;  (35% slower than
> WHERE t.a IS NULL)
>   select count(id) FROM testtable t WHERE t.a IS NULL;

> This just screems like somewhere PostgreSQL is iterating over fields over
> and over again without assigning field indexes to the literals. 

This is entirely unsurprising.  There's no cheap way to extract values
from a row that contains nulls: the offset of the field you want can't
be determined without iterating over all the fields before it, since
some of them might not be there.

One could imagine ways to optimize queries of this exact form: if
the WHERE clause is "WHERE t.z IS [NOT] NULL" and nothing else,
in principle it could be checked by examining z's bit in the nulls
bitmap, without really extracting any field values.  But that'd
require adding some remarkably ugly warts to the clause evaluation
mechanism, and I doubt it would be worth the trouble.

            regards, tom lane



Ah, ok, so it has to decode all the fields until the one needed is reached, because only be decoding one can find out
thelength of individual fields.  

Sorry for the noise. But please have a look at my previous report, which is the really interesting one.

-----Ursprüngliche Nachricht-----
Von: Tom Lane <tgl@sss.pgh.pa.us>
Gesendet: Sonntag, 19. November 2023 17:59
An: Daniel Migowski <dmigowski@ikoffice.de>
Cc: pgsql-bugs@lists.postgresql.org
Betreff: Re: BUG #18206: Strange performance behaviour depending on location of field in query.

PG Bug reporting form <noreply@postgresql.org> writes:
> I have a table with some columns id and a to z and these two queries
> have drastically different runtimes.

>   select count(id) FROM testtable t WHERE t.z IS NULL;  (35% slower
> than WHERE t.a IS NULL)
>   select count(id) FROM testtable t WHERE t.a IS NULL;

> This just screems like somewhere PostgreSQL is iterating over fields
> over and over again without assigning field indexes to the literals.

This is entirely unsurprising.  There's no cheap way to extract values from a row that contains nulls: the offset of
thefield you want can't be determined without iterating over all the fields before it, since some of them might not be
there.

One could imagine ways to optimize queries of this exact form: if the WHERE clause is "WHERE t.z IS [NOT] NULL" and
nothingelse, in principle it could be checked by examining z's bit in the nulls bitmap, without really extracting any
fieldvalues.  But that'd require adding some remarkably ugly warts to the clause evaluation mechanism, and I doubt it
wouldbe worth the trouble. 

            regards, tom lane