Thread: BUG #1252: Optimization of SELECT for NOT NULL case

BUG #1252: Optimization of SELECT for NOT NULL case

From
"PostgreSQL Bugs List"
Date:
The following bug has been logged online:

Bug reference:      1252
Logged by:          Alexander Kirpa

Email address:      postgres@bilteks.com

PostgreSQL version: 8.0 Beta

Operating system:   FreeBSD 5.2.1

Description:        Optimization of SELECT for NOT NULL case

Details:

CREATE TABLE x (id int4 NOT NULL);
Populate table, for instance, 10M rows;
SELECT count(*) from x WHERE x IS NULL;
Optimizator should rewrite "x IS NULL" to simple "false"

Best regards,
 Alexander Kirpa

Re: BUG #1252: Optimization of SELECT for NOT NULL case

From
Bruno Wolff III
Date:
On Tue, Sep 14, 2004 at 00:57:07 +0100,
  PostgreSQL Bugs List <pgsql-bugs@postgresql.org> wrote:
>
> Description:        Optimization of SELECT for NOT NULL case
>
> Details:
>
> CREATE TABLE x (id int4 NOT NULL);
> Populate table, for instance, 10M rows;
> SELECT count(*) from x WHERE x IS NULL;
> Optimizator should rewrite "x IS NULL" to simple "false"

Based on responses to other optimization requests I have seen, I think the
answer to this one is going to be that it isn't worth paying the cost
for every query to check for this case, since no one is going to write
a query like this except by mistake.

Re: BUG #1252: Optimization of SELECT for NOT NULL case

From
Tom Lane
Date:
Bruno Wolff III <bruno@wolff.to> writes:
> On Tue, Sep 14, 2004 at 00:57:07 +0100,
>> CREATE TABLE x (id int4 NOT NULL);
>> SELECT count(*) from x WHERE x IS NULL;
>> Optimizator should rewrite "x IS NULL" to simple "false"

> Based on responses to other optimization requests I have seen, I think the
> answer to this one is going to be that it isn't worth paying the cost
> for every query to check for this case, since no one is going to write
> a query like this except by mistake.

There's been some speculation about making the optimizer aware of table
constraints in general (eg, CHECK constraints) but I tend to agree that
doing this only for NOT NULL isn't very exciting.

            regards, tom lane

Re: BUG #1252: Optimization of SELECT for NOT NULL case

From
"Alexander Kirpa"
Date:
On 15 Sep 2004, at 12:21, Bruno Wolff III wrote:

> On Tue, Sep 14, 2004 at 00:57:07 +0100,
>   PostgreSQL Bugs List <pgsql-bugs@postgresql.org> wrote:
> >
> > Description:        Optimization of SELECT for NOT NULL case
> >
> > Details:
> >
> > CREATE TABLE x (id int4 NOT NULL);
> > Populate table, for instance, 10M rows;
> > SELECT count(*) from x WHERE x IS NULL;
> > Optimizator should rewrite "x IS NULL" to simple "false"
>
> Based on responses to other optimization requests I have seen, I think
> the answer to this one is going to be that it isn't worth paying the
> cost for every query to check for this case, since no one is going to
> write a query like this except by mistake.
>

Regarding your point of view possible exist reason for remove
optimization for case like below
SELECT count(*) from x where id>1 AND id<0
I basically don't see any significant difference in optimization
for "id IS NULL" and "id>1 AND id<0"

From other point of view not always well know NOT NULL a case
for some columns, especially in case JOIN and complex subselects
or dynamic SQL. Main reason to include this optimization, don't
need any a scan of table in optimized case as result well know
before any work with data from tables.

From last point of view optimization in DB servers need for make
work for application developer (not always with serious knowledge
in mathematics and relational databases) more easy and for
customer more swiftly.

And latest point of view: DB server developer have way for
enhance for instance  EXPLAIN statement by displaying hint for
static SQL.

In any case thank you for postgreSQL 8,  particularly for
background disk writer and improving using of shared buffers.
Good DB engine have more knowledge about information in
shared buffers and using her, that any disk cache system of OS,
plus avoiding or minimize using cache remove don't need
"moving" between cache and buffers.

Best regards,
 Alexander Kirpa

Re: BUG #1252: Optimization of SELECT for NOT NULL case

From
Bruno Wolff III
Date:
On Wed, Sep 15, 2004 at 23:29:43 +0300,
  Alexander Kirpa <postgres@bilteks.com> wrote:
> On 15 Sep 2004, at 12:21, Bruno Wolff III wrote:
>
> Regarding your point of view possible exist reason for remove
> optimization for case like below
> SELECT count(*) from x where id>1 AND id<0
> I basically don't see any significant difference in optimization
> for "id IS NULL" and "id>1 AND id<0"

People generally know whether or not a table is NOT NULL and aren't very
likely to write queries searching for NULL values in such a table.

A general constraint handler would be useful (for example in handling
partitions of data sets), but just doing it for NOT NULL seems like it
would be a net loss.