Thread: [BUGS] BUG #14715: Constraint exclusion isn't used in function usinglanguage sql

[BUGS] BUG #14715: Constraint exclusion isn't used in function usinglanguage sql

From

cliveevans@ntlworld.com

Date:

21 June 2017, 20:00:56

The following bug has been logged on the website:

Bug reference:      14715
Logged by:          Clive Evans
Email address:      cliveevans@ntlworld.com
PostgreSQL version: 9.6.3
Operating system:   CentOS Linux release 7.3.1611
Description:

The same query written using PL/PGSQL will only scan the expected partition
tables.

For example:
partition_test=# DO                                                                             
$$             
BEGIN                          
CREATE TABLE customer_reviews(   customer_id TEXT,   review_date DATE,   review_rating INTEGER,   review_votes INTEGER,
 review_helpful_votes INTEGER,   product_id CHAR(10),   product_title TEXT,
          product_sales_rank BIGINT,   product_group TEXT,   product_category TEXT,   product_subcategory TEXT,
similar_product_idsCHAR(10)[]
 
);                                
FOR n in 2000..2004
LOOP                                                     EXECUTE 'CREATE TABLE customer_reviews_' || n || ' (check
(review_date>= 
''' || n || '-01-01'' AND review_date < ''' || n + 1 || '-01-01'')) INHERITS
(customer_reviews) ;';
END LOOP;
END                             
$$ ;
DO
Time: 15.647 ms
partition_test=# CREATE OR REPLACE FUNCTION title_vs_review_sql(start_date
DATE, end_date DATE)
RETURNS TABLE (  title_length_bucket INTEGER,  review_average NUMERIC,  count BIGINT
) AS                 
$BODY$                    
SELECT                      width_bucket(length(product_title), 1, 50, 5) title_length_bucket,
round(avg(review_rating),2) AS review_average,   count(*)
 
FROM                            customer_reviews
WHERE                        product_group = 'Book'   AND                              review_date BETWEEN start_date
ANDend_date
 
GROUP BY   title_length_bucket
ORDER BY
                           
title_length_bucket        
$BODY$
LANGUAGE SQL;
CREATE FUNCTION
Time: 0.918 ms
partition_test=# CREATE OR REPLACE FUNCTION
title_vs_review_plpgsql(start_date DATE, end_date DATE)
RETURNS TABLE (  title_length_bucket INTEGER,  review_average NUMERIC,  count BIGINT
) AS
$BODY$
BEGIN
RETURN QUERY       SELECT           width_bucket(length(product_title), 1, 50, 5)
title_length_bucket,           round(avg(review_rating), 2) AS review_average,           count(*)       FROM
customer_reviews      WHERE           product_group = 'Book'           AND           review_date BETWEEN start_date AND
end_date      GROUP BY           title_length_bucket       ORDER BY           title_length_bucket;
 
END;
$BODY$
LANGUAGE PLPGSQL;
CREATE FUNCTION
Time: 1.375 ms
partition_test=# set auto_explain.log_min_duration to 0;
SET
Time: 0.190 ms
partition_test=# set auto_explain.log_nested_statements to true ;
SET
Time: 0.197 ms
partition_test=# select title_vs_review_plpgsql('2001-01-01',
'2001-12-31');title_vs_review_plpgsql 
-------------------------
(0 rows)

Time: 0.899 ms
partition_test=# select title_vs_review_sql('2001-01-01', '2001-12-31');title_vs_review_sql 
---------------------
(0 rows)

Time: 0.949 ms

When we check the logs for the query plans, I expect them both to only scan
the one child table, however:
< 2017-06-21 13:40:21.086 UTC > LOG:  duration: 0.013 ms  plan:Query Text: SELECT
width_bucket(length(product_title),1, 50, 5) title_length_bucket,        round(avg(review_rating), 2) AS
review_average,       count(*)    FROM       customer_reviews    WHERE        product_group = 'Book'        AND
review_dateBETWEEN start_date AND end_date    GROUP BY        title_length_bucket    ORDER BY
title_length_bucketGroupAggregate (cost=14.95..15.01 rows=2 width=44) (actual 
time=0.011..0.011 rows=0 loops=1)  Group Key: (width_bucket((length(customer_reviews.product_title))::double
precision, '1'::double precision, '50'::double precision, 5))  ->  Sort  (cost=14.95..14.95 rows=2 width=8) (actual
time=0.010..0.010
rows=0 loops=1)        Sort Key:
(width_bucket((length(customer_reviews.product_title))::double precision,
'1'::double precision, '50'::double precision, 5))        Sort Method: quicksort  Memory: 25kB        ->  Result
(cost=0.00..14.94rows=2 width=8) (actual 
time=0.005..0.005 rows=0 loops=1)              ->  Append  (cost=0.00..14.90 rows=2 width=36) (actual
time=0.004..0.004 rows=0 loops=1)                    ->  Seq Scan on customer_reviews  (cost=0.00..0.00
rows=1 width=36) (actual time=0.003..0.003 rows=0 loops=1)                          Filter: ((review_date >=
'2001-01-01'::date)AND 
(review_date <= '2001-12-31'::date) AND (product_group = 'Book'::text))                    ->  Seq Scan on
customer_reviews_2001 
(cost=0.00..14.90 rows=1 width=36) (actual time=0.000..0.000 rows=0
loops=1)                          Filter: ((review_date >= '2001-01-01'::date) AND
(review_date <= '2001-12-31'::date) AND (product_group = 'Book'::text))
< 2017-06-21 13:40:21.086 UTC > CONTEXT:  PL/pgSQL function
title_vs_review_plpgsql(date,date) line 3 at RETURN QUERY
< 2017-06-21 13:40:21.086 UTC > LOG:  duration: 0.640 ms  plan:Query Text: select title_vs_review_plpgsql('2001-01-01',
'2001-12-31');Result (cost=0.00..5.25 rows=1000 width=32) (actual time=0.636..0.636 
rows=0 loops=1)
< 2017-06-21 13:40:26.869 UTC > LOG:  duration: 0.030 ms  plan:Query Text: SELECT
width_bucket(length(product_title),1, 50, 5) title_length_bucket,    round(avg(review_rating), 2) AS review_average,
count(*)FROM  customer_reviewsWHERE    product_group = 'Book'    AND    review_date BETWEEN start_date AND
end_dateGROUPBY    title_length_bucketORDER BY    title_length_bucketGroupAggregate  (cost=74.68..74.88 rows=6
width=44)(actual 
time=0.028..0.028 rows=0 loops=1)  Group Key: (width_bucket((length(customer_reviews.product_title))::double
precision, '1'::double precision, '50'::double precision, 5))  ->  Sort  (cost=74.68..74.70 rows=6 width=8) (actual
time=0.026..0.026
rows=0 loops=1)        Sort Key:
(width_bucket((length(customer_reviews.product_title))::double precision,
'1'::double precision, '50'::double precision, 5))        Sort Method: quicksort  Memory: 25kB        ->  Result
(cost=0.00..74.61rows=6 width=8) (actual 
time=0.007..0.007 rows=0 loops=1)              ->  Append  (cost=0.00..74.50 rows=6 width=36) (actual
time=0.007..0.007 rows=0 loops=1)                    ->  Seq Scan on customer_reviews  (cost=0.00..0.00
rows=1 width=36) (actual time=0.003..0.003 rows=0 loops=1)                          Filter: ((review_date >= $1) AND
(review_date<= 
$2) AND (product_group = 'Book'::text))                    ->  Seq Scan on customer_reviews_2000
(cost=0.00..14.90 rows=1 width=36) (actual time=0.001..0.001 rows=0
loops=1)                          Filter: ((review_date >= $1) AND (review_date <=
$2) AND (product_group = 'Book'::text))                    ->  Seq Scan on customer_reviews_2001
(cost=0.00..14.90 rows=1 width=36) (actual time=0.001..0.001 rows=0
loops=1)                          Filter: ((review_date >= $1) AND (review_date <=
$2) AND (product_group = 'Book'::text))                    ->  Seq Scan on customer_reviews_2002
(cost=0.00..14.90 rows=1 width=36) (actual time=0.000..0.000 rows=0
loops=1)                          Filter: ((review_date >= $1) AND (review_date <=
$2) AND (product_group = 'Book'::text))                    ->  Seq Scan on customer_reviews_2003
(cost=0.00..14.90 rows=1 width=36) (actual time=0.001..0.001 rows=0
loops=1)                          Filter: ((review_date >= $1) AND (review_date <=
$2) AND (product_group = 'Book'::text))                    ->  Seq Scan on customer_reviews_2004
(cost=0.00..14.90 rows=1 width=36) (actual time=0.001..0.001 rows=0
loops=1)                          Filter: ((review_date >= $1) AND (review_date <=
$2) AND (product_group = 'Book'::text))
< 2017-06-21 13:40:26.869 UTC > CONTEXT:  SQL function "title_vs_review_sql"
statement 1
< 2017-06-21 13:40:26.869 UTC > LOG:  duration: 0.684 ms  plan:Query Text: select title_vs_review_sql('2001-01-01',
'2001-12-31');Result (cost=0.00..5.25 rows=1000 width=32) (actual time=0.680..0.680 
rows=0 loops=1)





--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

Re: [BUGS] BUG #14715: Constraint exclusion isn't used in functionusing language sql

From

Kyotaro HORIGUCHI

Date:

23 June 2017, 07:39:57

Hello,

At Wed, 21 Jun 2017 14:00:56 +0000, cliveevans@ntlworld.com wrote in
<20170621140056.27883.82221@wrigleys.postgresql.org>
> PostgreSQL version: 9.6.3
...
> The same query written using PL/PGSQL will only scan the expected partition
> tables.

The two are different in that the parameters of the PL/PgSQL
function are regarded as constants at the time of planning of the
inner SQL statement, while those of the SQL function are
not. Constraint exclusion is considered while planning so the SQL
function doesn't get benefit of it.

This doesn't seem a bug, but I couldn't find such a description
in the documentation, though. Maybe it should be here.

https://www.postgresql.org/docs/9.6/static/ddl-partitioning.html#DDL-PARTITIONING-CAVEATS

regareds,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



-- 
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

Re: [BUGS] BUG #14715: Constraint exclusion isn't used in functionusing language sql

From

Clive Evans

Date:

23 June 2017, 14:59:32

On 23/06/17 01:39, Kyotaro HORIGUCHI wrote:

Hello,

At Wed, 21 Jun 2017 14:00:56 +0000, cliveevans@ntlworld.com wrote in <20170621140056.27883.82221@wrigleys.postgresql.org>

PostgreSQL version: 9.6.3

...

The same query written using PL/PGSQL will only scan the expected partition
tables.

The two are different in that the parameters of the PL/PgSQL
function are regarded as constants at the time of planning of the
inner SQL statement, while those of the SQL function are
not. Constraint exclusion is considered while planning so the SQL
function doesn't get benefit of it.

I understand this, and I understand why. You can't plan based on what you don't yet know. I'm not entirely clear why the inner query can treat the parameter as a constant, whereas the outer one is forced to treat it as dynamic.

I assume it's something to do with when the plan is created. Possibly it's an attempt to save the planning time and re-use the plan in the case of a 'normal' SQL function, although this seems like a trade off that may well have significant downsides - in this case obviously, but also others.

Perhaps I'm looking for a new feature, rather than reporting a bug. Something that allows lazy planning, where there's a good chance that treating the parameters as constants is likely to improve the produced plan sufficiently to offset the additional overhead of repeatedly planning.

thanks,
--
Clive Evans

Re: [BUGS] BUG #14715: Constraint exclusion isn't used in functionusing language sql

From

Clive Evans

Date:

23 June 2017, 21:35:06

On 23/06/17 08:59, Clive Evans wrote:
>
>
> On 23/06/17 01:39, Kyotaro HORIGUCHI wrote:
>> Hello,
>>
>> At Wed, 21 Jun 2017 14:00:56 +0000,cliveevans@ntlworld.com  wrote
in<20170621140056.27883.82221@wrigleys.postgresql.org>
>>> PostgreSQL version: 9.6.3
>> ...
>>> The same query written using PL/PGSQL will only scan the expected partition
>>> tables.
>> The two are different in that the parameters of the PL/PgSQL
>> function are regarded as constants at the time of planning of the
>> inner SQL statement, while those of the SQL function are
>> not. Constraint exclusion is considered while planning so the SQL
>> function doesn't get benefit of it.
>
> I understand this, and I understand why. You can't plan based on what 
> you don't yet know. I'm not entirely clear why the inner query can 
> treat the parameter as a constant, whereas the outer one is forced to 
> treat it as dynamic.
>
> I assume it's something to do with when the plan is created. Possibly 
> it's an attempt to save the planning time and re-use the plan in the 
> case of a 'normal' SQL function, although this seems like a trade off 
> that may well have significant downsides - in this case obviously, but 
> also others.
>
> Perhaps I'm looking for a new feature, rather than reporting a bug. 
> Something that allows lazy planning, where there's a good chance that 
> treating the parameters as constants is likely to improve the produced 
> plan sufficiently to offset the additional overhead of repeatedly 
> planning.

Sorry, mail client went a bit rogue. Here it is again, (hopefully) 
legibly ...

thanks,
Clive


-- 
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs