Thread: BUG #1252: Optimization of SELECT for NOT NULL case
The following bug has been logged online: Bug reference: 1252 Logged by: Alexander Kirpa Email address: postgres@bilteks.com PostgreSQL version: 8.0 Beta Operating system: FreeBSD 5.2.1 Description: Optimization of SELECT for NOT NULL case Details: CREATE TABLE x (id int4 NOT NULL); Populate table, for instance, 10M rows; SELECT count(*) from x WHERE x IS NULL; Optimizator should rewrite "x IS NULL" to simple "false" Best regards, Alexander Kirpa
On Tue, Sep 14, 2004 at 00:57:07 +0100, PostgreSQL Bugs List <pgsql-bugs@postgresql.org> wrote: > > Description: Optimization of SELECT for NOT NULL case > > Details: > > CREATE TABLE x (id int4 NOT NULL); > Populate table, for instance, 10M rows; > SELECT count(*) from x WHERE x IS NULL; > Optimizator should rewrite "x IS NULL" to simple "false" Based on responses to other optimization requests I have seen, I think the answer to this one is going to be that it isn't worth paying the cost for every query to check for this case, since no one is going to write a query like this except by mistake.
Bruno Wolff III <bruno@wolff.to> writes: > On Tue, Sep 14, 2004 at 00:57:07 +0100, >> CREATE TABLE x (id int4 NOT NULL); >> SELECT count(*) from x WHERE x IS NULL; >> Optimizator should rewrite "x IS NULL" to simple "false" > Based on responses to other optimization requests I have seen, I think the > answer to this one is going to be that it isn't worth paying the cost > for every query to check for this case, since no one is going to write > a query like this except by mistake. There's been some speculation about making the optimizer aware of table constraints in general (eg, CHECK constraints) but I tend to agree that doing this only for NOT NULL isn't very exciting. regards, tom lane
On 15 Sep 2004, at 12:21, Bruno Wolff III wrote: > On Tue, Sep 14, 2004 at 00:57:07 +0100, > PostgreSQL Bugs List <pgsql-bugs@postgresql.org> wrote: > > > > Description: Optimization of SELECT for NOT NULL case > > > > Details: > > > > CREATE TABLE x (id int4 NOT NULL); > > Populate table, for instance, 10M rows; > > SELECT count(*) from x WHERE x IS NULL; > > Optimizator should rewrite "x IS NULL" to simple "false" > > Based on responses to other optimization requests I have seen, I think > the answer to this one is going to be that it isn't worth paying the > cost for every query to check for this case, since no one is going to > write a query like this except by mistake. > Regarding your point of view possible exist reason for remove optimization for case like below SELECT count(*) from x where id>1 AND id<0 I basically don't see any significant difference in optimization for "id IS NULL" and "id>1 AND id<0" From other point of view not always well know NOT NULL a case for some columns, especially in case JOIN and complex subselects or dynamic SQL. Main reason to include this optimization, don't need any a scan of table in optimized case as result well know before any work with data from tables. From last point of view optimization in DB servers need for make work for application developer (not always with serious knowledge in mathematics and relational databases) more easy and for customer more swiftly. And latest point of view: DB server developer have way for enhance for instance EXPLAIN statement by displaying hint for static SQL. In any case thank you for postgreSQL 8, particularly for background disk writer and improving using of shared buffers. Good DB engine have more knowledge about information in shared buffers and using her, that any disk cache system of OS, plus avoiding or minimize using cache remove don't need "moving" between cache and buffers. Best regards, Alexander Kirpa
On Wed, Sep 15, 2004 at 23:29:43 +0300, Alexander Kirpa <postgres@bilteks.com> wrote: > On 15 Sep 2004, at 12:21, Bruno Wolff III wrote: > > Regarding your point of view possible exist reason for remove > optimization for case like below > SELECT count(*) from x where id>1 AND id<0 > I basically don't see any significant difference in optimization > for "id IS NULL" and "id>1 AND id<0" People generally know whether or not a table is NOT NULL and aren't very likely to write queries searching for NULL values in such a table. A general constraint handler would be useful (for example in handling partitions of data sets), but just doing it for NOT NULL seems like it would be a net loss.