Re: BUG #19059: PostgreSQL fails to evaluate the cheaper expression first, leading to 45X performance degradation - Mailing list pgsql-bugs

From Jinhui Lai
Subject Re: BUG #19059: PostgreSQL fails to evaluate the cheaper expression first, leading to 45X performance degradation
Date
Msg-id tencent_B680F384162B9BF0262DD675C6C21E9ED106@qq.com
Whole thread Raw
In response to Re: BUG #19059: PostgreSQL fails to evaluate the cheaper expression first, leading to 45X performance degradation  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-bugs
> We do do that at the top AND level (cf. order_qual_clauses()), but we have not bothered for OR clauses.

Hi, Tom. Thanks for your reply.
I have another case that influences both AND/OR clauses. You can reproduce it as follows:

-- Create tables t0 and t1. 
CREATE TABLE t0(c0 INT8); -- small table
INSERT INTO t0 SELECT * FROM generate_series(1, 1000);
CREATE TABLE t1(c1 INT8);  -- large table
INSERT INTO t1 SELECT * FROM generate_series(1, 10000000);

-- These two short-circuit evaluation examples happen before scanning any tables, as their plans only contain one row. 
SELECT (SELECT MIN(c1) FROM t1)>0 OR TRUE;  -- Time: 0.311 ms
SELECT (SELECT MIN(c1) FROM t1)>0 AND FALSE;  -- Time: 0.318 ms
explain SELECT (SELECT MIN(c1) FROM t1)>0 OR TRUE;
                QUERY PLAN                
------------------------------------------
 Result  (cost=0.00..0.01 rows=1 width=1)

-- These two short-circuit evaluation examples may happen after scanning table t0, as their execution times are similar to that of the query "SELECT (SELECT MIN(c0) FROM t0)>0". 
SELECT (SELECT MIN(c0) FROM t0)>0 OR (SELECT MIN(c1) FROM t1)>0;  -- Time: 0.416 ms
SELECT (SELECT MIN(c0) FROM t0)<0 AND (SELECT MIN(c1) FROM t1)>0; -- Time: 0.640 ms

SELECT (SELECT MIN(c0) FROM t0)>0;  -- Time: 0.665 ms

-- As demonstrated by the following two queries, the optimizer fails to reorder expressions in the SELECT clause for AND/OR operations. This can be observed in their execution plans. Since t0 is smaller than t1, evaluating t0 first (based on the cost model) would be more efficient. 
-- Particularly, given that PostgreSQL applies short-circuit evaluation during execution, the logical reordering of these expressions becomes a crucial optimization opportunity.

SELECT (SELECT MIN(c1) FROM t1)>0 OR (SELECT MIN(c0) FROM t0)>0;
Time: 148.815 ms
explain SELECT (SELECT MIN(c1) FROM t1)>0 OR (SELECT MIN(c0) FROM t0)>0;
                                          QUERY PLAN                                           
-----------------------------------------------------------------------------------------------
 Result  (cost=97348.95..97348.96 rows=1 width=1)
   InitPlan 1
     ->  Finalize Aggregate  (cost=97331.43..97331.44 rows=1 width=8)
           ->  Gather  (cost=97331.21..97331.42 rows=2 width=8)
                 Workers Planned: 2
                 ->  Partial Aggregate  (cost=96331.21..96331.22 rows=1 width=8)
                       ->  Parallel Seq Scan on t1  (cost=0.00..85914.57 rows=4166657 width=8)
   InitPlan 2
     ->  Aggregate  (cost=17.50..17.51 rows=1 width=8)
           ->  Seq Scan on t0  (cost=0.00..15.00 rows=1000 width=8)


SELECT (SELECT MIN(c1) FROM t1)>0 AND (SELECT MIN(c0) FROM t0)<0;
Time: 153.308 ms

explain SELECT (SELECT MIN(c1) FROM t1)>0 AND (SELECT MIN(c0) FROM t0)<0;
                                          QUERY PLAN                                           
-----------------------------------------------------------------------------------------------
 Result  (cost=97348.95..97348.96 rows=1 width=1)
   InitPlan 1
     ->  Finalize Aggregate  (cost=97331.43..97331.44 rows=1 width=8)
           ->  Gather  (cost=97331.21..97331.42 rows=2 width=8)
                 Workers Planned: 2
                 ->  Partial Aggregate  (cost=96331.21..96331.22 rows=1 width=8)
                       ->  Parallel Seq Scan on t1  (cost=0.00..85914.57 rows=4166657 width=8)
   InitPlan 2
     ->  Aggregate  (cost=17.50..17.51 rows=1 width=8)
           ->  Seq Scan on t0  (cost=0.00..15.00 rows=1000 width=8)

Thanks you once again. I look forward to your reply.
Best regard,
Jinhui




pgsql-bugs by date:

Previous
From: Vik Fearing
Date:
Subject: Re: BUG #19059: PostgreSQL fails to evaluate the cheaper expression first, leading to 45X performance degradation
Next
From: myzhen
Date:
Subject: coredump of language internal function