Re: Select queries which violates table constrains - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: Select queries which violates table constrains
Date
Msg-id 5370DD81.6040006@vmware.com
Whole thread Raw
In response to Select queries which violates table constrains  (Joni Martikainen <joni@shade-fx.com>)
List pgsql-hackers
On 05/10/2014 09:24 PM, Joni Martikainen wrote:
> Hi,
>
> I investigated some select query performance issues and noticed that
> postgresql misses some obvious cases while processing SELECT query. I
> mean the case where WHERE clause contains statement which condition
> would be against table structure. (excuse my language, look the code)
>
> Example:
> Let the table be :
>
> CREATE TABLE test
> (
>     id numeric(3,0) NOT NULL,
>     somecolumn numeric(5,0) NOT NULL,
>     CONSTRAINT id_pk PRIMARY KEY (id)
> );
>
> Simple table with "somecolumn" column which has constraint NOT NULL.
>
> Let's do a following query to the table.
>
> SELECT somecolumn FROM test WHERE somecolumn IS NULL;
>
> Result is empty result set which is obvious because any null value would
> be against the table constrain.
> The thing here is that postgresql does SeqScan to this table in order to
> find out if there is any null values.
>
> Explain:
> "Seq Scan on test  (cost=0.00..1.06 rows=1 width=5)"
> "  Filter: (somecolumn IS NULL)"
> "Planning time: 0.778 ms"
>
> SeqScan can be avoided by making index for "somecolumn" and indexing all
> the null values. That index would be empty and very fast but also very
> pointless since table constraint here is simple.
> No one would do such a query in real life but some programmatically
> generated queries does this kind of things. Only way I found to go
> around this problem was to create those empty indexies but I think the
> query optimizer could be smarter here.
>
> I took a look of the optimizer code and I didn't find any code which
> avoids this kind of situations. (I expect that it would be optimizer's
> task to find out this kind of things)
>
> I was thinking some feature for optimizer where the optimizer could add
> a hint for an executor if some query plan path leads to the empty result
> set case. If executor sees this hint it could avoid doing seqscan and
> actually even index scans. This kind of query constraint vs. table
> constraint comparison should be anyway cheaper process to execute than
> seqscan.
>
> The question is that, is there any reason why such an optimization phase
> could not be implemented? Another question is that how is the query
> engine handling the partitioned table case? Am i right that table
> partitions are solved by table constrains and indexies are used to
> validate which child table to look for? And so forth could this kind of
> new optimization phase benefit partitioned tables?

Actually, the planner can perform that optimization. The trick is called 
"constraint exclusion". It is typically used for partitioning, where the 
WHERE-clause restricts the query to a single partition, and you would 
otherwise have to scan all the partitions. It is not usually a very 
useful optimization, and it is somewhat expensive to check for that 
case, so it is disabled by default except for partitioned tables. But if 
you do "set constraint_exclusion=on", you will get the plan you're 
looking for:

postgres=# set  constraint_exclusion=on;
SET
postgres=# explain SELECT somecolumn FROM test WHERE somecolumn IS NULL;                QUERY PLAN
------------------------------------------ Result  (cost=0.00..0.01 rows=1 width=0)   One-Time Filter: false Planning
time:0.071 ms
 
(3 rows)

- Heikki



pgsql-hackers by date:

Previous
From: Andres Freund
Date:
Subject: Re: Proposal for CSN based snapshots
Next
From: Tom Lane
Date:
Subject: Re: wrapping in extended mode doesn't work well with default pager