Thread: Performance of Query 4 on TPC-DS Benchmark

Performance of Query 4 on TPC-DS Benchmark

From
Ba Jinsheng
Date:
Hi all,

Please see this case:


Query 4 on TPC-DS benchmark:

with year_total as (
 select c_customer_id customer_id
       ,c_first_name customer_first_name
       ,c_last_name customer_last_name
       ,c_preferred_cust_flag customer_preferred_cust_flag
       ,c_birth_country customer_birth_country
       ,c_login customer_login
       ,c_email_address customer_email_address
       ,d_year dyear
       ,sum(((ss_ext_list_price-ss_ext_wholesale_cost-ss_ext_discount_amt)+ss_ext_sales_price)/2) year_total
       ,'s' sale_type
 from customer
     ,store_sales
     ,date_dim
 where c_customer_sk = ss_customer_sk
   and ss_sold_date_sk = d_date_sk
 group by c_customer_id
         ,c_first_name
         ,c_last_name
         ,c_preferred_cust_flag
         ,c_birth_country
         ,c_login
         ,c_email_address
         ,d_year
 union all
 select c_customer_id customer_id
       ,c_first_name customer_first_name
       ,c_last_name customer_last_name
       ,c_preferred_cust_flag customer_preferred_cust_flag
       ,c_birth_country customer_birth_country
       ,c_login customer_login
       ,c_email_address customer_email_address
       ,d_year dyear
       ,sum((((cs_ext_list_price-cs_ext_wholesale_cost-cs_ext_discount_amt)+cs_ext_sales_price)/2) ) year_total
       ,'c' sale_type
 from customer
     ,catalog_sales
     ,date_dim
 where c_customer_sk = cs_bill_customer_sk
   and cs_sold_date_sk = d_date_sk
 group by c_customer_id
         ,c_first_name
         ,c_last_name
         ,c_preferred_cust_flag
         ,c_birth_country
         ,c_login
         ,c_email_address
         ,d_year
union all
 select c_customer_id customer_id
       ,c_first_name customer_first_name
       ,c_last_name customer_last_name
       ,c_preferred_cust_flag customer_preferred_cust_flag
       ,c_birth_country customer_birth_country
       ,c_login customer_login
       ,c_email_address customer_email_address
       ,d_year dyear
       ,sum((((ws_ext_list_price-ws_ext_wholesale_cost-ws_ext_discount_amt)+ws_ext_sales_price)/2) ) year_total
       ,'w' sale_type
 from customer
     ,web_sales
     ,date_dim
 where c_customer_sk = ws_bill_customer_sk
   and ws_sold_date_sk = d_date_sk
 group by c_customer_id
         ,c_first_name
         ,c_last_name
         ,c_preferred_cust_flag
         ,c_birth_country
         ,c_login
         ,c_email_address
         ,d_year
         )
  select  
                  t_s_secyear.customer_id
                 ,t_s_secyear.customer_first_name
                 ,t_s_secyear.customer_last_name
                 ,t_s_secyear.customer_email_address
 from year_total t_s_firstyear
     ,year_total t_s_secyear
     ,year_total t_c_firstyear
     ,year_total t_c_secyear
     ,year_total t_w_firstyear
     ,year_total t_w_secyear
 where t_s_secyear.customer_id = t_s_firstyear.customer_id
   and t_s_firstyear.customer_id = t_c_secyear.customer_id
   and t_s_firstyear.customer_id = t_c_firstyear.customer_id
   and t_s_firstyear.customer_id = t_w_firstyear.customer_id
   and t_s_firstyear.customer_id = t_w_secyear.customer_id
   and t_s_firstyear.sale_type = 's'
   and t_c_firstyear.sale_type = 'c'
   and t_w_firstyear.sale_type = 'w'
   and t_s_secyear.sale_type = 's'
   and t_c_secyear.sale_type = 'c'
   and t_w_secyear.sale_type = 'w'
   and t_s_firstyear.dyear =  2001
   and t_s_secyear.dyear = 2001+1
   and t_c_firstyear.dyear =  2001
   and t_c_secyear.dyear =  2001+1
   and t_w_firstyear.dyear = 2001
   and t_w_secyear.dyear = 2001+1
   and t_s_firstyear.year_total > 0
   and t_c_firstyear.year_total > 0
   and t_w_firstyear.year_total > 0
   and case when t_c_firstyear.year_total > 0 then t_c_secyear.year_total / t_c_firstyear.year_total else null end
           > case when t_s_firstyear.year_total > 0 then t_s_secyear.year_total / t_s_firstyear.year_total else null end
   and case when t_c_firstyear.year_total > 0 then t_c_secyear.year_total / t_c_firstyear.year_total else null end
           > case when t_w_firstyear.year_total > 0 then t_w_secyear.year_total / t_w_firstyear.year_total else null end
 order by t_s_secyear.customer_id
         ,t_s_secyear.customer_first_name
         ,t_s_secyear.customer_last_name
         ,t_s_secyear.customer_email_address
limit 100;





The execution time is more than 50 minutes:
                                                                                                                                                                               QUERY PLAN                                                                                                                                                                              
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=1255378.56..1255378.57 rows=1 width=132) (actual time=3024403.311..3024403.342 rows=8 loops=1)
   CTE year_total
     ->  Append  (cost=197433.23..461340.62 rows=5041142 width=216) (actual time=4126.043..7897.747 rows=384208 loops=1)
           ->  HashAggregate  (cost=197433.23..233436.60 rows=2880270 width=216) (actual time=4126.042..4231.703 rows=190581 loops=1)
                 Group Key: customer.c_customer_id, customer.c_first_name, customer.c_last_name, customer.c_preferred_cust_flag, customer.c_birth_country, customer.c_login, customer.c_email_address, date_dim.d_year
                 Batches: 1  Memory Usage: 213017kB
                 ->  Hash Join  (cost=8151.60..103824.45 rows=2880270 width=174) (actual time=69.110..1686.608 rows=2685453 loops=1)
                       Hash Cond: (store_sales.ss_sold_date_sk = date_dim.d_date_sk)
                       ->  Hash Join  (cost=5103.00..93214.72 rows=2880270 width=174) (actual time=49.517..1162.567 rows=2750652 loops=1)
                             Hash Cond: (store_sales.ss_customer_sk = customer.c_customer_sk)
                             ->  Seq Scan on store_sales  (cost=0.00..80550.70 rows=2880270 width=30) (actual time=0.018..208.022 rows=2880404 loops=1)
                             ->  Hash  (cost=3853.00..3853.00 rows=100000 width=152) (actual time=49.271..49.271 rows=100000 loops=1)
                                   Buckets: 131072  Batches: 1  Memory Usage: 17161kB
                                   ->  Seq Scan on customer  (cost=0.00..3853.00 rows=100000 width=152) (actual time=0.011..26.448 rows=100000 loops=1)
                       ->  Hash  (cost=2135.49..2135.49 rows=73049 width=8) (actual time=19.369..19.370 rows=73049 loops=1)
                             Buckets: 131072  Batches: 1  Memory Usage: 3878kB
                             ->  Seq Scan on date_dim  (cost=0.00..2135.49 rows=73049 width=8) (actual time=0.037..11.763 rows=73049 loops=1)
           ->  HashAggregate  (cost=114410.03..132428.63 rows=1441488 width=216) (actual time=2369.202..2447.868 rows=136978 loops=1)
                 Group Key: customer_1.c_customer_id, customer_1.c_first_name, customer_1.c_last_name, customer_1.c_preferred_cust_flag, customer_1.c_birth_country, customer_1.c_login, customer_1.c_email_address, date_dim_1.d_year
                 Batches: 1  Memory Usage: 131097kB
                 ->  Hash Join  (cost=8151.60..67561.67 rows=1441488 width=177) (actual time=62.483..974.143 rows=1430939 loops=1)
                       Hash Cond: (catalog_sales.cs_sold_date_sk = date_dim_1.d_date_sk)
                       ->  Hash Join  (cost=5103.00..60728.94 rows=1441488 width=177) (actual time=46.571..687.972 rows=1434519 loops=1)
                             Hash Cond: (catalog_sales.cs_bill_customer_sk = customer_1.c_customer_sk)
                             ->  Seq Scan on catalog_sales  (cost=0.00..51841.88 rows=1441488 width=33) (actual time=0.029..128.238 rows=1441548 loops=1)
                             ->  Hash  (cost=3853.00..3853.00 rows=100000 width=152) (actual time=46.311..46.325 rows=100000 loops=1)
                                   Buckets: 131072  Batches: 1  Memory Usage: 17161kB
                                   ->  Seq Scan on customer customer_1  (cost=0.00..3853.00 rows=100000 width=152) (actual time=0.005..23.350 rows=100000 loops=1)
                       ->  Hash  (cost=2135.49..2135.49 rows=73049 width=8) (actual time=15.677..15.677 rows=73049 loops=1)
                             Buckets: 131072  Batches: 1  Memory Usage: 3878kB
                             ->  Seq Scan on date_dim date_dim_1  (cost=0.00..2135.49 rows=73049 width=8) (actual time=0.015..7.957 rows=73049 loops=1)
           ->  HashAggregate  (cost=61277.38..70269.68 rows=719384 width=216) (actual time=1166.953..1198.730 rows=56649 loops=1)
                 Group Key: customer_2.c_customer_id, customer_2.c_first_name, customer_2.c_last_name, customer_2.c_preferred_cust_flag, customer_2.c_birth_country, customer_2.c_login, customer_2.c_email_address, date_dim_2.d_year
                 Batches: 1  Memory Usage: 57369kB
                 ->  Hash Join  (cost=8151.60..37897.40 rows=719384 width=177) (actual time=68.327..508.594 rows=719119 loops=1)
                       Hash Cond: (web_sales.ws_sold_date_sk = date_dim_2.d_date_sk)
                       ->  Hash Join  (cost=5103.00..32960.30 rows=719384 width=177) (actual time=52.240..357.963 rows=719217 loops=1)
                             Hash Cond: (web_sales.ws_bill_customer_sk = customer_2.c_customer_sk)
                             ->  Seq Scan on web_sales  (cost=0.00..25968.84 rows=719384 width=33) (actual time=0.032..62.464 rows=719384 loops=1)
                             ->  Hash  (cost=3853.00..3853.00 rows=100000 width=152) (actual time=51.959..51.960 rows=100000 loops=1)
                                   Buckets: 131072  Batches: 1  Memory Usage: 17161kB
                                   ->  Seq Scan on customer customer_2  (cost=0.00..3853.00 rows=100000 width=152) (actual time=0.004..25.350 rows=100000 loops=1)
                       ->  Hash  (cost=2135.49..2135.49 rows=73049 width=8) (actual time=15.831..15.834 rows=73049 loops=1)
                             Buckets: 131072  Batches: 1  Memory Usage: 3878kB
                             ->  Seq Scan on date_dim date_dim_2  (cost=0.00..2135.49 rows=73049 width=8) (actual time=0.014..8.100 rows=73049 loops=1)
   ->  Sort  (cost=794037.94..794037.95 rows=1 width=132) (actual time=3024403.310..3024403.313 rows=8 loops=1)
         Sort Key: t_s_secyear.customer_id, t_s_secyear.customer_first_name, t_s_secyear.customer_last_name, t_s_secyear.customer_email_address
         Sort Method: quicksort  Memory: 26kB
         ->  Nested Loop  (cost=0.00..794037.93 rows=1 width=132) (actual time=354851.431..3024403.218 rows=8 loops=1)
               Join Filter: ((t_s_secyear.customer_id = t_w_secyear.customer_id) AND (CASE WHEN (t_c_firstyear.year_total > '0'::numeric) THEN (t_c_secyear.year_total / t_c_firstyear.year_total) ELSE NULL::numeric END > CASE WHEN (t_w_firstyear.year_total > '0'::numeric) THEN (t_w_secyear.year_total / t_w_firstyear.year_total) ELSE NULL::numeric END))
               Rows Removed by Join Filter: 810136
               ->  Nested Loop  (cost=0.00..668006.23 rows=1 width=308) (actual time=33554.075..3021248.646 rows=72 loops=1)
                     Join Filter: ((t_s_secyear.customer_id = t_c_secyear.customer_id) AND (CASE WHEN (t_c_firstyear.year_total > '0'::numeric) THEN (t_c_secyear.year_total / t_c_firstyear.year_total) ELSE NULL::numeric END > CASE WHEN (t_s_firstyear.year_total > '0'::numeric) THEN (t_s_secyear.year_total / t_s_firstyear.year_total) ELSE NULL::numeric END))
                     Rows Removed by Join Filter: 11876277
                     ->  Nested Loop  (cost=0.00..541974.53 rows=1 width=320) (actual time=14866.104..3001271.961 rows=437 loops=1)
                           Join Filter: (t_s_firstyear.customer_id = t_s_secyear.customer_id)
                           Rows Removed by Join Filter: 44702488
                           ->  Nested Loop  (cost=0.00..415941.57 rows=2 width=156) (actual time=11739.944..2946020.749 rows=1171 loops=1)
                                 Join Filter: (t_s_firstyear.customer_id = t_w_firstyear.customer_id)
                                 Rows Removed by Join Filter: 112695277
                                 ->  Nested Loop  (cost=0.00..277302.08 rows=9 width=104) (actual time=8139.729..2351733.795 rows=9952 loops=1)
                                       Join Filter: (t_s_firstyear.customer_id = t_c_firstyear.customer_id)
                                       Rows Removed by Join Filter: 997895870
                                       ->  CTE Scan on year_total t_s_firstyear  (cost=0.00..138631.41 rows=42 width=52) (actual time=4126.046..4234.598 rows=37923 loops=1)
                                             Filter: ((year_total > '0'::numeric) AND (sale_type = 's'::text) AND (dyear = 2001))
                                             Rows Removed by Filter: 346285
                                       ->  CTE Scan on year_total t_c_firstyear  (cost=0.00..138631.41 rows=42 width=52) (actual time=28.926..60.356 rows=26314 loops=37923)
                                             Filter: ((year_total > '0'::numeric) AND (sale_type = 'c'::text) AND (dyear = 2001))
                                             Rows Removed by Filter: 357894
                                 ->  CTE Scan on year_total t_w_firstyear  (cost=0.00..138631.41 rows=42 width=52) (actual time=49.572..59.057 rows=11324 loops=9952)
                                       Filter: ((year_total > '0'::numeric) AND (sale_type = 'w'::text) AND (dyear = 2001))
                                       Rows Removed by Filter: 372884
                           ->  CTE Scan on year_total t_s_secyear  (cost=0.00..126028.55 rows=126 width=164) (actual time=0.002..44.949 rows=38175 loops=1171)
                                 Filter: ((sale_type = 's'::text) AND (dyear = 2002))
                                 Rows Removed by Filter: 346033
                     ->  CTE Scan on year_total t_c_secyear  (cost=0.00..126028.55 rows=126 width=52) (actual time=21.023..44.097 rows=27177 loops=437)
                           Filter: ((sale_type = 'c'::text) AND (dyear = 2002))
                           Rows Removed by Filter: 357031
               ->  CTE Scan on year_total t_w_secyear  (cost=0.00..126028.55 rows=126 width=52) (actual time=36.137..43.090 rows=11252 loops=72)
                     Filter: ((sale_type = 'w'::text) AND (dyear = 2002))
                     Rows Removed by Filter: 372956
 Planning Time: 4.529 ms
 Execution Time: 3024486.695 ms
(83 rows)




If we disable this checking:
diff --git a/src/backend/optimizer/util/pathnode.c b/src/backend/optimizer/util/pathnode.c
index c42742d2c7..d47d0f0e59 100644
--- a/src/backend/optimizer/util/pathnode.c
+++ b/src/backend/optimizer/util/pathnode.c
@@ -555,16 +555,6 @@ add_path(RelOptInfo *parent_rel, Path *new_path)
                                                }
                                                break;
                                        case COSTS_BETTER2:
-                                               if (keyscmp != PATHKEYS_BETTER1)
-                                               {
-                                                       outercmp = bms_subset_compare(PATH_REQ_OUTER(new_path),
-                                                                                                                      PATH_REQ_OUTER(old_path));
-                                                       if ((outercmp == BMS_EQUAL ||
-                                                                outercmp == BMS_SUBSET2) &&
-                                                               new_path->rows >= old_path->rows &&
-                                                               new_path->parallel_safe <= old_path->parallel_safe)
-                                                               accept_new = false; /* old dominates new */
-                                               }
                                                break;
                                        case COSTS_DIFFERENT:
 

The execution time is reduced to around 8 seconds:
                                                                                                                                                                                                                                                                                         QUERY PLAN                                                                                                                                                                                                                                                                                        
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=1255392.10..1255392.10 rows=1 width=132) (actual time=8068.870..8068.887 rows=8 loops=1)
   CTE year_total
     ->  Append  (cost=197441.91..461358.17 rows=5041411 width=216) (actual time=4004.918..7547.884 rows=384208 loops=1)
           ->  HashAggregate  (cost=197441.91..233447.56 rows=2880452 width=216) (actual time=4004.917..4112.220 rows=190581 loops=1)
                 Group Key: customer.c_customer_id, customer.c_first_name, customer.c_last_name, customer.c_preferred_cust_flag, customer.c_birth_country, customer.c_login, customer.c_email_address, date_dim.d_year
                 Batches: 1  Memory Usage: 213017kB
                 ->  Hash Join  (cost=8151.60..103827.22 rows=2880452 width=174) (actual time=73.291..1594.391 rows=2685453 loops=1)
                       Hash Cond: (store_sales.ss_sold_date_sk = date_dim.d_date_sk)
                       ->  Hash Join  (cost=5103.00..93217.01 rows=2880452 width=174) (actual time=52.407..1100.642 rows=2750652 loops=1)
                             Hash Cond: (store_sales.ss_customer_sk = customer.c_customer_sk)
                             ->  Seq Scan on store_sales  (cost=0.00..80552.52 rows=2880452 width=30) (actual time=0.017..213.629 rows=2880404 loops=1)
                             ->  Hash  (cost=3853.00..3853.00 rows=100000 width=152) (actual time=52.083..52.083 rows=100000 loops=1)
                                   Buckets: 131072  Batches: 1  Memory Usage: 17161kB
                                   ->  Seq Scan on customer  (cost=0.00..3853.00 rows=100000 width=152) (actual time=0.013..28.104 rows=100000 loops=1)
                       ->  Hash  (cost=2135.49..2135.49 rows=73049 width=8) (actual time=20.589..20.592 rows=73049 loops=1)
                             Buckets: 131072  Batches: 1  Memory Usage: 3878kB
                             ->  Seq Scan on date_dim  (cost=0.00..2135.49 rows=73049 width=8) (actual time=0.036..12.464 rows=73049 loops=1)
           ->  HashAggregate  (cost=114414.19..132433.87 rows=1441575 width=216) (actual time=2186.304..2263.365 rows=136978 loops=1)
                 Group Key: customer_1.c_customer_id, customer_1.c_first_name, customer_1.c_last_name, customer_1.c_preferred_cust_flag, customer_1.c_birth_country, customer_1.c_login, customer_1.c_email_address, date_dim_1.d_year
                 Batches: 1  Memory Usage: 131097kB
                 ->  Hash Join  (cost=8151.60..67563.00 rows=1441575 width=177) (actual time=63.765..880.282 rows=1430939 loops=1)
                       Hash Cond: (catalog_sales.cs_sold_date_sk = date_dim_1.d_date_sk)
                       ->  Hash Join  (cost=5103.00..60730.04 rows=1441575 width=177) (actual time=47.735..619.585 rows=1434519 loops=1)
                             Hash Cond: (catalog_sales.cs_bill_customer_sk = customer_1.c_customer_sk)
                             ->  Seq Scan on catalog_sales  (cost=0.00..51842.75 rows=1441575 width=33) (actual time=0.033..121.738 rows=1441548 loops=1)
                             ->  Hash  (cost=3853.00..3853.00 rows=100000 width=152) (actual time=47.433..47.433 rows=100000 loops=1)
                                   Buckets: 131072  Batches: 1  Memory Usage: 17161kB
                                   ->  Seq Scan on customer customer_1  (cost=0.00..3853.00 rows=100000 width=152) (actual time=0.004..23.960 rows=100000 loops=1)
                       ->  Hash  (cost=2135.49..2135.49 rows=73049 width=8) (actual time=15.755..15.756 rows=73049 loops=1)
                             Buckets: 131072  Batches: 1  Memory Usage: 3878kB
                             ->  Seq Scan on date_dim date_dim_1  (cost=0.00..2135.49 rows=73049 width=8) (actual time=0.017..8.030 rows=73049 loops=1)
           ->  HashAggregate  (cost=61277.38..70269.68 rows=719384 width=216) (actual time=1121.896..1152.982 rows=56649 loops=1)
                 Group Key: customer_2.c_customer_id, customer_2.c_first_name, customer_2.c_last_name, customer_2.c_preferred_cust_flag, customer_2.c_birth_country, customer_2.c_login, customer_2.c_email_address, date_dim_2.d_year
                 Batches: 1  Memory Usage: 57369kB
                 ->  Hash Join  (cost=8151.60..37897.40 rows=719384 width=177) (actual time=63.239..470.700 rows=719119 loops=1)
                       Hash Cond: (web_sales.ws_sold_date_sk = date_dim_2.d_date_sk)
                       ->  Hash Join  (cost=5103.00..32960.30 rows=719384 width=177) (actual time=47.365..329.499 rows=719217 loops=1)
                             Hash Cond: (web_sales.ws_bill_customer_sk = customer_2.c_customer_sk)
                             ->  Seq Scan on web_sales  (cost=0.00..25968.84 rows=719384 width=33) (actual time=0.036..60.068 rows=719384 loops=1)
                             ->  Hash  (cost=3853.00..3853.00 rows=100000 width=152) (actual time=47.076..47.078 rows=100000 loops=1)
                                   Buckets: 131072  Batches: 1  Memory Usage: 17161kB
                                   ->  Seq Scan on customer customer_2  (cost=0.00..3853.00 rows=100000 width=152) (actual time=0.005..23.909 rows=100000 loops=1)
                       ->  Hash  (cost=2135.49..2135.49 rows=73049 width=8) (actual time=15.615..15.615 rows=73049 loops=1)
                             Buckets: 131072  Batches: 1  Memory Usage: 3878kB
                             ->  Seq Scan on date_dim date_dim_2  (cost=0.00..2135.49 rows=73049 width=8) (actual time=0.019..8.060 rows=73049 loops=1)
   ->  Sort  (cost=794033.93..794033.94 rows=1 width=132) (actual time=8068.869..8068.872 rows=8 loops=1)
         Sort Key: t_s_secyear.customer_id, t_s_secyear.customer_first_name, t_s_secyear.customer_last_name, t_s_secyear.customer_email_address
         Sort Method: quicksort  Memory: 26kB
         ->  Hash Join  (cost=794032.91..794033.92 rows=1 width=132) (actual time=8061.079..8068.861 rows=8 loops=1)
               Hash Cond: (t_s_secyear.customer_id = t_s_firstyear.customer_id)
               Join Filter: ((CASE WHEN (t_c_firstyear.year_total > '0'::numeric) THEN (t_c_secyear.year_total / t_c_firstyear.year_total) ELSE NULL::numeric END > CASE WHEN (t_s_firstyear.year_total > '0'::numeric) THEN (t_s_secyear.year_total / t_s_firstyear.year_total) ELSE NULL::numeric END) AND (CASE WHEN (t_c_firstyear.year_total > '0'::numeric) THEN (t_c_secyear.year_total / t_c_firstyear.year_total) ELSE NULL::numeric END > CASE WHEN (t_w_firstyear.year_total > '0'::numeric) THEN (t_w_secyear.year_total / t_w_firstyear.year_total) ELSE NULL::numeric END))
               Rows Removed by Join Filter: 11
               ->  Merge Join  (cost=390716.09..390717.01 rows=16 width=268) (actual time=7877.494..7885.971 rows=2793 loops=1)
                     Merge Cond: (t_s_secyear.customer_id = t_c_secyear.customer_id)
                     ->  Sort  (cost=126039.67..126039.99 rows=126 width=164) (actual time=7764.696..7768.885 rows=38175 loops=1)
                           Sort Key: t_s_secyear.customer_id
                           Sort Method: quicksort  Memory: 7021kB
                           ->  CTE Scan on year_total t_s_secyear  (cost=0.00..126035.28 rows=126 width=164) (actual time=4004.934..7746.174 rows=38175 loops=1)
                                 Filter: ((sale_type = 's'::text) AND (dyear = 2002))
                                 Rows Removed by Filter: 346033
                     ->  Sort  (cost=264676.42..264676.48 rows=26 width=104) (actual time=112.790..113.328 rows=7188 loops=1)
                           Sort Key: t_c_firstyear.customer_id
                           Sort Method: quicksort  Memory: 687kB
                           ->  Hash Join  (cost=138639.33..264675.81 rows=26 width=104) (actual time=83.825..109.977 rows=7188 loops=1)
                                 Hash Cond: (t_c_secyear.customer_id = t_c_firstyear.customer_id)
                                 ->  CTE Scan on year_total t_c_secyear  (cost=0.00..126035.28 rows=126 width=52) (actual time=20.337..42.859 rows=27177 loops=1)
                                       Filter: ((sale_type = 'c'::text) AND (dyear = 2002))
                                       Rows Removed by Filter: 357031
                                 ->  Hash  (cost=138638.80..138638.80 rows=42 width=52) (actual time=63.460..63.461 rows=26314 loops=1)
                                       Buckets: 32768 (originally 1024)  Batches: 1 (originally 1)  Memory Usage: 1725kB
                                       ->  CTE Scan on year_total t_c_firstyear  (cost=0.00..138638.80 rows=42 width=52) (actual time=28.367..59.710 rows=26314 loops=1)
                                             Filter: ((year_total > '0'::numeric) AND (sale_type = 'c'::text) AND (dyear = 2001))
                                             Rows Removed by Filter: 357894
               ->  Hash  (cost=403316.74..403316.74 rows=6 width=156) (actual time=182.649..182.650 rows=479 loops=1)
                     Buckets: 1024  Batches: 1  Memory Usage: 59kB
                     ->  Merge Join  (cost=403316.35..403316.74 rows=6 width=156) (actual time=177.380..182.540 rows=479 loops=1)
                           Merge Cond: (t_s_firstyear.customer_id = t_w_firstyear.customer_id)
                           ->  Sort  (cost=138639.93..138640.04 rows=42 width=52) (actual time=75.347..77.869 rows=37909 loops=1)
                                 Sort Key: t_s_firstyear.customer_id
                                 Sort Method: quicksort  Memory: 3273kB
                                 ->  CTE Scan on year_total t_s_firstyear  (cost=0.00..138638.80 rows=42 width=52) (actual time=0.002..60.630 rows=37923 loops=1)
                                       Filter: ((year_total > '0'::numeric) AND (sale_type = 's'::text) AND (dyear = 2001))
                                       Rows Removed by Filter: 346285
                           ->  Sort  (cost=264676.42..264676.48 rows=26 width=104) (actual time=102.001..102.053 rows=1306 loops=1)
                                 Sort Key: t_w_firstyear.customer_id
                                 Sort Method: quicksort  Memory: 139kB
                                 ->  Hash Join  (cost=138639.33..264675.81 rows=26 width=104) (actual time=93.653..101.516 rows=1306 loops=1)
                                       Hash Cond: (t_w_secyear.customer_id = t_w_firstyear.customer_id)
                                       ->  CTE Scan on year_total t_w_secyear  (cost=0.00..126035.28 rows=126 width=52) (actual time=33.759..40.504 rows=11252 loops=1)
                                             Filter: ((sale_type = 'w'::text) AND (dyear = 2002))
                                             Rows Removed by Filter: 372956
                                       ->  Hash  (cost=138638.80..138638.80 rows=42 width=52) (actual time=59.875..59.875 rows=11324 loops=1)
                                             Buckets: 16384 (originally 1024)  Batches: 1 (originally 1)  Memory Usage: 763kB
                                             ->  CTE Scan on year_total t_w_firstyear  (cost=0.00..138638.80 rows=42 width=52) (actual time=48.632..58.272 rows=11324 loops=1)
                                                   Filter: ((year_total > '0'::numeric) AND (sale_type = 'w'::text) AND (dyear = 2001))
                                                   Rows Removed by Filter: 372884
 Planning Time: 6.118 ms
 Execution Time: 8179.205 ms
(98 rows)



The execution time is reduced by 99%. Similarly to my previous reports, I wonder if we can optimize the logic somewhere to improve the performance.


Environment:

The default configurations of PostgreSQL incur the error: "ERROR:  could not resize shared memory segment "/PostgreSQL.3539600020" to 2097152 bytes: No space left on device"
Therefore, to run this query, I set up "work_mem = 1GB"

For the benchmark, I used 1 GB data, and my entire data folder can be downloaded here: https://drive.google.com/file/d/1iK5gfyKudfn2BczpoZbNRY_IAD_rITZu/view?usp=sharing

The connection string is: postgresql://ubuntu:ubuntu@127.0.0.1:5432/tpcds"

tpch=# select version();
                                             version                                              
--------------------------------------------------------------------------------------------------
 PostgreSQL 17.0 on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 13.2.0-23ubuntu4) 13.2.0, 64-bit
(1 row)




Best regards,

Jinsheng Ba

 

Notice: This email is generated from the account of an NUS alumnus. Contents, views, and opinions therein are solely those of the sender.

Re: Performance of Query 4 on TPC-DS Benchmark

From
Thomas Munro
Date:
On Mon, Nov 11, 2024 at 8:36 AM Ba Jinsheng <bajinsheng@u.nus.edu> wrote:
> The default configurations of PostgreSQL incur the error: "ERROR:  could not resize shared memory segment
"/PostgreSQL.3539600020"to 2097152 bytes: No space left on device" 

No comment on your optimiser experiments for now, but for this error:
it reminds me of a low/default --shm-size limit from Docker, or other
similar container stuff?



Re: Performance of Query 4 on TPC-DS Benchmark

From
Alena Rybakina
Date:

Hi!

On 10.11.2024 22:35, Ba Jinsheng wrote:
P {margin-top:0;margin-bottom:0;}
Hi all,

Please see this case:


Query 4 on TPC-DS benchmark:

with year_total as (
 select c_customer_id customer_id
       ,c_first_name customer_first_name
       ,c_last_name customer_last_name
       ,c_preferred_cust_flag customer_preferred_cust_flag
       ,c_birth_country customer_birth_country
       ,c_login customer_login
       ,c_email_address customer_email_address
       ,d_year dyear
       ,sum(((ss_ext_list_price-ss_ext_wholesale_cost-ss_ext_discount_amt)+ss_ext_sales_price)/2) year_total
       ,'s' sale_type
 from customer
     ,store_sales
     ,date_dim
 where c_customer_sk = ss_customer_sk
   and ss_sold_date_sk = d_date_sk
 group by c_customer_id
         ,c_first_name
         ,c_last_name
         ,c_preferred_cust_flag
         ,c_birth_country
         ,c_login
         ,c_email_address
         ,d_year
 union all
 select c_customer_id customer_id
       ,c_first_name customer_first_name
       ,c_last_name customer_last_name
       ,c_preferred_cust_flag customer_preferred_cust_flag
       ,c_birth_country customer_birth_country
       ,c_login customer_login
       ,c_email_address customer_email_address
       ,d_year dyear
       ,sum((((cs_ext_list_price-cs_ext_wholesale_cost-cs_ext_discount_amt)+cs_ext_sales_price)/2) ) year_total
       ,'c' sale_type
 from customer
     ,catalog_sales
     ,date_dim
 where c_customer_sk = cs_bill_customer_sk
   and cs_sold_date_sk = d_date_sk
 group by c_customer_id
         ,c_first_name
         ,c_last_name
         ,c_preferred_cust_flag
         ,c_birth_country
         ,c_login
         ,c_email_address
         ,d_year
union all
 select c_customer_id customer_id
       ,c_first_name customer_first_name
       ,c_last_name customer_last_name
       ,c_preferred_cust_flag customer_preferred_cust_flag
       ,c_birth_country customer_birth_country
       ,c_login customer_login
       ,c_email_address customer_email_address
       ,d_year dyear
       ,sum((((ws_ext_list_price-ws_ext_wholesale_cost-ws_ext_discount_amt)+ws_ext_sales_price)/2) ) year_total
       ,'w' sale_type
 from customer
     ,web_sales
     ,date_dim
 where c_customer_sk = ws_bill_customer_sk
   and ws_sold_date_sk = d_date_sk
 group by c_customer_id
         ,c_first_name
         ,c_last_name
         ,c_preferred_cust_flag
         ,c_birth_country
         ,c_login
         ,c_email_address
         ,d_year
         )
  select  
                  t_s_secyear.customer_id
                 ,t_s_secyear.customer_first_name
                 ,t_s_secyear.customer_last_name
                 ,t_s_secyear.customer_email_address
 from year_total t_s_firstyear
     ,year_total t_s_secyear
     ,year_total t_c_firstyear
     ,year_total t_c_secyear
     ,year_total t_w_firstyear
     ,year_total t_w_secyear
 where t_s_secyear.customer_id = t_s_firstyear.customer_id
   and t_s_firstyear.customer_id = t_c_secyear.customer_id
   and t_s_firstyear.customer_id = t_c_firstyear.customer_id
   and t_s_firstyear.customer_id = t_w_firstyear.customer_id
   and t_s_firstyear.customer_id = t_w_secyear.customer_id
   and t_s_firstyear.sale_type = 's'
   and t_c_firstyear.sale_type = 'c'
   and t_w_firstyear.sale_type = 'w'
   and t_s_secyear.sale_type = 's'
   and t_c_secyear.sale_type = 'c'
   and t_w_secyear.sale_type = 'w'
   and t_s_firstyear.dyear =  2001
   and t_s_secyear.dyear = 2001+1
   and t_c_firstyear.dyear =  2001
   and t_c_secyear.dyear =  2001+1
   and t_w_firstyear.dyear = 2001
   and t_w_secyear.dyear = 2001+1
   and t_s_firstyear.year_total > 0
   and t_c_firstyear.year_total > 0
   and t_w_firstyear.year_total > 0
   and case when t_c_firstyear.year_total > 0 then t_c_secyear.year_total / t_c_firstyear.year_total else null end
           > case when t_s_firstyear.year_total > 0 then t_s_secyear.year_total / t_s_firstyear.year_total else null end
   and case when t_c_firstyear.year_total > 0 then t_c_secyear.year_total / t_c_firstyear.year_total else null end
           > case when t_w_firstyear.year_total > 0 then t_w_secyear.year_total / t_w_firstyear.year_total else null end
 order by t_s_secyear.customer_id
         ,t_s_secyear.customer_first_name
         ,t_s_secyear.customer_last_name
         ,t_s_secyear.customer_email_address
limit 100;





The execution time is more than 50 minutes:
                                                                                                                                                                               QUERY PLAN                                                                                                                                                                              
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=1255378.56..1255378.57 rows=1 width=132) (actual time=3024403.311..3024403.342 rows=8 loops=1)
   CTE year_total
     ->  Append  (cost=197433.23..461340.62 rows=5041142 width=216) (actual time=4126.043..7897.747 rows=384208 loops=1)
           ->  HashAggregate  (cost=197433.23..233436.60 rows=2880270 width=216) (actual time=4126.042..4231.703 rows=190581 loops=1)
                 Group Key: customer.c_customer_id, customer.c_first_name, customer.c_last_name, customer.c_preferred_cust_flag, customer.c_birth_country, customer.c_login, customer.c_email_address, date_dim.d_year
                 Batches: 1  Memory Usage: 213017kB
                 ->  Hash Join  (cost=8151.60..103824.45 rows=2880270 width=174) (actual time=69.110..1686.608 rows=2685453 loops=1)
                       Hash Cond: (store_sales.ss_sold_date_sk = date_dim.d_date_sk)
                       ->  Hash Join  (cost=5103.00..93214.72 rows=2880270 width=174) (actual time=49.517..1162.567 rows=2750652 loops=1)
                             Hash Cond: (store_sales.ss_customer_sk = customer.c_customer_sk)
                             ->  Seq Scan on store_sales  (cost=0.00..80550.70 rows=2880270 width=30) (actual time=0.018..208.022 rows=2880404 loops=1)
                             ->  Hash  (cost=3853.00..3853.00 rows=100000 width=152) (actual time=49.271..49.271 rows=100000 loops=1)
                                   Buckets: 131072  Batches: 1  Memory Usage: 17161kB
                                   ->  Seq Scan on customer  (cost=0.00..3853.00 rows=100000 width=152) (actual time=0.011..26.448 rows=100000 loops=1)
                       ->  Hash  (cost=2135.49..2135.49 rows=73049 width=8) (actual time=19.369..19.370 rows=73049 loops=1)
                             Buckets: 131072  Batches: 1  Memory Usage: 3878kB
                             ->  Seq Scan on date_dim  (cost=0.00..2135.49 rows=73049 width=8) (actual time=0.037..11.763 rows=73049 loops=1)
           ->  HashAggregate  (cost=114410.03..132428.63 rows=1441488 width=216) (actual time=2369.202..2447.868 rows=136978 loops=1)
                 Group Key: customer_1.c_customer_id, customer_1.c_first_name, customer_1.c_last_name, customer_1.c_preferred_cust_flag, customer_1.c_birth_country, customer_1.c_login, customer_1.c_email_address, date_dim_1.d_year
                 Batches: 1  Memory Usage: 131097kB
                 ->  Hash Join  (cost=8151.60..67561.67 rows=1441488 width=177) (actual time=62.483..974.143 rows=1430939 loops=1)
                       Hash Cond: (catalog_sales.cs_sold_date_sk = date_dim_1.d_date_sk)
                       ->  Hash Join  (cost=5103.00..60728.94 rows=1441488 width=177) (actual time=46.571..687.972 rows=1434519 loops=1)
                             Hash Cond: (catalog_sales.cs_bill_customer_sk = customer_1.c_customer_sk)
                             ->  Seq Scan on catalog_sales  (cost=0.00..51841.88 rows=1441488 width=33) (actual time=0.029..128.238 rows=1441548 loops=1)
                             ->  Hash  (cost=3853.00..3853.00 rows=100000 width=152) (actual time=46.311..46.325 rows=100000 loops=1)
                                   Buckets: 131072  Batches: 1  Memory Usage: 17161kB
                                   ->  Seq Scan on customer customer_1  (cost=0.00..3853.00 rows=100000 width=152) (actual time=0.005..23.350 rows=100000 loops=1)
                       ->  Hash  (cost=2135.49..2135.49 rows=73049 width=8) (actual time=15.677..15.677 rows=73049 loops=1)
                             Buckets: 131072  Batches: 1  Memory Usage: 3878kB
                             ->  Seq Scan on date_dim date_dim_1  (cost=0.00..2135.49 rows=73049 width=8) (actual time=0.015..7.957 rows=73049 loops=1)
           ->  HashAggregate  (cost=61277.38..70269.68 rows=719384 width=216) (actual time=1166.953..1198.730 rows=56649 loops=1)
                 Group Key: customer_2.c_customer_id, customer_2.c_first_name, customer_2.c_last_name, customer_2.c_preferred_cust_flag, customer_2.c_birth_country, customer_2.c_login, customer_2.c_email_address, date_dim_2.d_year
                 Batches: 1  Memory Usage: 57369kB
                 ->  Hash Join  (cost=8151.60..37897.40 rows=719384 width=177) (actual time=68.327..508.594 rows=719119 loops=1)
                       Hash Cond: (web_sales.ws_sold_date_sk = date_dim_2.d_date_sk)
                       ->  Hash Join  (cost=5103.00..32960.30 rows=719384 width=177) (actual time=52.240..357.963 rows=719217 loops=1)
                             Hash Cond: (web_sales.ws_bill_customer_sk = customer_2.c_customer_sk)
                             ->  Seq Scan on web_sales  (cost=0.00..25968.84 rows=719384 width=33) (actual time=0.032..62.464 rows=719384 loops=1)
                             ->  Hash  (cost=3853.00..3853.00 rows=100000 width=152) (actual time=51.959..51.960 rows=100000 loops=1)
                                   Buckets: 131072  Batches: 1  Memory Usage: 17161kB
                                   ->  Seq Scan on customer customer_2  (cost=0.00..3853.00 rows=100000 width=152) (actual time=0.004..25.350 rows=100000 loops=1)
                       ->  Hash  (cost=2135.49..2135.49 rows=73049 width=8) (actual time=15.831..15.834 rows=73049 loops=1)
                             Buckets: 131072  Batches: 1  Memory Usage: 3878kB
                             ->  Seq Scan on date_dim date_dim_2  (cost=0.00..2135.49 rows=73049 width=8) (actual time=0.014..8.100 rows=73049 loops=1)
   ->  Sort  (cost=794037.94..794037.95 rows=1 width=132) (actual time=3024403.310..3024403.313 rows=8 loops=1)
         Sort Key: t_s_secyear.customer_id, t_s_secyear.customer_first_name, t_s_secyear.customer_last_name, t_s_secyear.customer_email_address
         Sort Method: quicksort  Memory: 26kB
         ->  Nested Loop  (cost=0.00..794037.93 rows=1 width=132) (actual time=354851.431..3024403.218 rows=8 loops=1)
               Join Filter: ((t_s_secyear.customer_id = t_w_secyear.customer_id) AND (CASE WHEN (t_c_firstyear.year_total > '0'::numeric) THEN (t_c_secyear.year_total / t_c_firstyear.year_total) ELSE NULL::numeric END > CASE WHEN (t_w_firstyear.year_total > '0'::numeric) THEN (t_w_secyear.year_total / t_w_firstyear.year_total) ELSE NULL::numeric END))
               Rows Removed by Join Filter: 810136
               ->  Nested Loop  (cost=0.00..668006.23 rows=1 width=308) (actual time=33554.075..3021248.646 rows=72 loops=1)
                     Join Filter: ((t_s_secyear.customer_id = t_c_secyear.customer_id) AND (CASE WHEN (t_c_firstyear.year_total > '0'::numeric) THEN (t_c_secyear.year_total / t_c_firstyear.year_total) ELSE NULL::numeric END > CASE WHEN (t_s_firstyear.year_total > '0'::numeric) THEN (t_s_secyear.year_total / t_s_firstyear.year_total) ELSE NULL::numeric END))
                     Rows Removed by Join Filter: 11876277
                     ->  Nested Loop  (cost=0.00..541974.53 rows=1 width=320) (actual time=14866.104..3001271.961 rows=437 loops=1)
                           Join Filter: (t_s_firstyear.customer_id = t_s_secyear.customer_id)
                           Rows Removed by Join Filter: 44702488
                           ->  Nested Loop  (cost=0.00..415941.57 rows=2 width=156) (actual time=11739.944..2946020.749 rows=1171 loops=1)
                                 Join Filter: (t_s_firstyear.customer_id = t_w_firstyear.customer_id)
                                 Rows Removed by Join Filter: 112695277
                                 ->  Nested Loop  (cost=0.00..277302.08 rows=9 width=104) (actual time=8139.729..2351733.795 rows=9952 loops=1)
                                       Join Filter: (t_s_firstyear.customer_id = t_c_firstyear.customer_id)
                                       Rows Removed by Join Filter: 997895870
                                       ->  CTE Scan on year_total t_s_firstyear  (cost=0.00..138631.41 rows=42 width=52) (actual time=4126.046..4234.598 rows=37923 loops=1)
                                             Filter: ((year_total > '0'::numeric) AND (sale_type = 's'::text) AND (dyear = 2001))
                                             Rows Removed by Filter: 346285
                                       ->  CTE Scan on year_total t_c_firstyear  (cost=0.00..138631.41 rows=42 width=52) (actual time=28.926..60.356 rows=26314 loops=37923)
                                             Filter: ((year_total > '0'::numeric) AND (sale_type = 'c'::text) AND (dyear = 2001))
                                             Rows Removed by Filter: 357894
                                 ->  CTE Scan on year_total t_w_firstyear  (cost=0.00..138631.41 rows=42 width=52) (actual time=49.572..59.057 rows=11324 loops=9952)
                                       Filter: ((year_total > '0'::numeric) AND (sale_type = 'w'::text) AND (dyear = 2001))
                                       Rows Removed by Filter: 372884
                           ->  CTE Scan on year_total t_s_secyear  (cost=0.00..126028.55 rows=126 width=164) (actual time=0.002..44.949 rows=38175 loops=1171)
                                 Filter: ((sale_type = 's'::text) AND (dyear = 2002))
                                 Rows Removed by Filter: 346033
                     ->  CTE Scan on year_total t_c_secyear  (cost=0.00..126028.55 rows=126 width=52) (actual time=21.023..44.097 rows=27177 loops=437)
                           Filter: ((sale_type = 'c'::text) AND (dyear = 2002))
                           Rows Removed by Filter: 357031
               ->  CTE Scan on year_total t_w_secyear  (cost=0.00..126028.55 rows=126 width=52) (actual time=36.137..43.090 rows=11252 loops=72)
                     Filter: ((sale_type = 'w'::text) AND (dyear = 2002))
                     Rows Removed by Filter: 372956
 Planning Time: 4.529 ms
 Execution Time: 3024486.695 ms
(83 rows)



I have noticed significant underestimation here. In addition, these nodes are executed noticeably slower in the plan, you can see it by the current time in the explain.

Have you tried any tools to improve the cardinality yet, like aqo [0]?


[0] https://github.com/postgrespro/aqo

-- 
Regards,
Alena Rybakina
Postgres Professional

Re: Performance of Query 4 on TPC-DS Benchmark

From
Alena Rybakina
Date:
On 10.11.2024 23:16, Alena Rybakina wrote:

Hi!

On 10.11.2024 22:35, Ba Jinsheng wrote:
P {margin-top:0;margin-bottom:0;}
Hi all,

Please see this case:


Query 4 on TPC-DS benchmark:

with year_total as (
 select c_customer_id customer_id
       ,c_first_name customer_first_name
       ,c_last_name customer_last_name
       ,c_preferred_cust_flag customer_preferred_cust_flag
       ,c_birth_country customer_birth_country
       ,c_login customer_login
       ,c_email_address customer_email_address
       ,d_year dyear
       ,sum(((ss_ext_list_price-ss_ext_wholesale_cost-ss_ext_discount_amt)+ss_ext_sales_price)/2) year_total
       ,'s' sale_type
 from customer
     ,store_sales
     ,date_dim
 where c_customer_sk = ss_customer_sk
   and ss_sold_date_sk = d_date_sk
 group by c_customer_id
         ,c_first_name
         ,c_last_name
         ,c_preferred_cust_flag
         ,c_birth_country
         ,c_login
         ,c_email_address
         ,d_year
 union all
 select c_customer_id customer_id
       ,c_first_name customer_first_name
       ,c_last_name customer_last_name
       ,c_preferred_cust_flag customer_preferred_cust_flag
       ,c_birth_country customer_birth_country
       ,c_login customer_login
       ,c_email_address customer_email_address
       ,d_year dyear
       ,sum((((cs_ext_list_price-cs_ext_wholesale_cost-cs_ext_discount_amt)+cs_ext_sales_price)/2) ) year_total
       ,'c' sale_type
 from customer
     ,catalog_sales
     ,date_dim
 where c_customer_sk = cs_bill_customer_sk
   and cs_sold_date_sk = d_date_sk
 group by c_customer_id
         ,c_first_name
         ,c_last_name
         ,c_preferred_cust_flag
         ,c_birth_country
         ,c_login
         ,c_email_address
         ,d_year
union all
 select c_customer_id customer_id
       ,c_first_name customer_first_name
       ,c_last_name customer_last_name
       ,c_preferred_cust_flag customer_preferred_cust_flag
       ,c_birth_country customer_birth_country
       ,c_login customer_login
       ,c_email_address customer_email_address
       ,d_year dyear
       ,sum((((ws_ext_list_price-ws_ext_wholesale_cost-ws_ext_discount_amt)+ws_ext_sales_price)/2) ) year_total
       ,'w' sale_type
 from customer
     ,web_sales
     ,date_dim
 where c_customer_sk = ws_bill_customer_sk
   and ws_sold_date_sk = d_date_sk
 group by c_customer_id
         ,c_first_name
         ,c_last_name
         ,c_preferred_cust_flag
         ,c_birth_country
         ,c_login
         ,c_email_address
         ,d_year
         )
  select  
                  t_s_secyear.customer_id
                 ,t_s_secyear.customer_first_name
                 ,t_s_secyear.customer_last_name
                 ,t_s_secyear.customer_email_address
 from year_total t_s_firstyear
     ,year_total t_s_secyear
     ,year_total t_c_firstyear
     ,year_total t_c_secyear
     ,year_total t_w_firstyear
     ,year_total t_w_secyear
 where t_s_secyear.customer_id = t_s_firstyear.customer_id
   and t_s_firstyear.customer_id = t_c_secyear.customer_id
   and t_s_firstyear.customer_id = t_c_firstyear.customer_id
   and t_s_firstyear.customer_id = t_w_firstyear.customer_id
   and t_s_firstyear.customer_id = t_w_secyear.customer_id
   and t_s_firstyear.sale_type = 's'
   and t_c_firstyear.sale_type = 'c'
   and t_w_firstyear.sale_type = 'w'
   and t_s_secyear.sale_type = 's'
   and t_c_secyear.sale_type = 'c'
   and t_w_secyear.sale_type = 'w'
   and t_s_firstyear.dyear =  2001
   and t_s_secyear.dyear = 2001+1
   and t_c_firstyear.dyear =  2001
   and t_c_secyear.dyear =  2001+1
   and t_w_firstyear.dyear = 2001
   and t_w_secyear.dyear = 2001+1
   and t_s_firstyear.year_total > 0
   and t_c_firstyear.year_total > 0
   and t_w_firstyear.year_total > 0
   and case when t_c_firstyear.year_total > 0 then t_c_secyear.year_total / t_c_firstyear.year_total else null end
           > case when t_s_firstyear.year_total > 0 then t_s_secyear.year_total / t_s_firstyear.year_total else null end
   and case when t_c_firstyear.year_total > 0 then t_c_secyear.year_total / t_c_firstyear.year_total else null end
           > case when t_w_firstyear.year_total > 0 then t_w_secyear.year_total / t_w_firstyear.year_total else null end
 order by t_s_secyear.customer_id
         ,t_s_secyear.customer_first_name
         ,t_s_secyear.customer_last_name
         ,t_s_secyear.customer_email_address
limit 100;





The execution time is more than 50 minutes:
                                                                                                                                                                               QUERY PLAN                                                                                                                                                                              
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=1255378.56..1255378.57 rows=1 width=132) (actual time=3024403.311..3024403.342 rows=8 loops=1)
   CTE year_total
     ->  Append  (cost=197433.23..461340.62 rows=5041142 width=216) (actual time=4126.043..7897.747 rows=384208 loops=1)
           ->  HashAggregate  (cost=197433.23..233436.60 rows=2880270 width=216) (actual time=4126.042..4231.703 rows=190581 loops=1)
                 Group Key: customer.c_customer_id, customer.c_first_name, customer.c_last_name, customer.c_preferred_cust_flag, customer.c_birth_country, customer.c_login, customer.c_email_address, date_dim.d_year
                 Batches: 1  Memory Usage: 213017kB
                 ->  Hash Join  (cost=8151.60..103824.45 rows=2880270 width=174) (actual time=69.110..1686.608 rows=2685453 loops=1)
                       Hash Cond: (store_sales.ss_sold_date_sk = date_dim.d_date_sk)
                       ->  Hash Join  (cost=5103.00..93214.72 rows=2880270 width=174) (actual time=49.517..1162.567 rows=2750652 loops=1)
                             Hash Cond: (store_sales.ss_customer_sk = customer.c_customer_sk)
                             ->  Seq Scan on store_sales  (cost=0.00..80550.70 rows=2880270 width=30) (actual time=0.018..208.022 rows=2880404 loops=1)
                             ->  Hash  (cost=3853.00..3853.00 rows=100000 width=152) (actual time=49.271..49.271 rows=100000 loops=1)
                                   Buckets: 131072  Batches: 1  Memory Usage: 17161kB
                                   ->  Seq Scan on customer  (cost=0.00..3853.00 rows=100000 width=152) (actual time=0.011..26.448 rows=100000 loops=1)
                       ->  Hash  (cost=2135.49..2135.49 rows=73049 width=8) (actual time=19.369..19.370 rows=73049 loops=1)
                             Buckets: 131072  Batches: 1  Memory Usage: 3878kB
                             ->  Seq Scan on date_dim  (cost=0.00..2135.49 rows=73049 width=8) (actual time=0.037..11.763 rows=73049 loops=1)
           ->  HashAggregate  (cost=114410.03..132428.63 rows=1441488 width=216) (actual time=2369.202..2447.868 rows=136978 loops=1)
                 Group Key: customer_1.c_customer_id, customer_1.c_first_name, customer_1.c_last_name, customer_1.c_preferred_cust_flag, customer_1.c_birth_country, customer_1.c_login, customer_1.c_email_address, date_dim_1.d_year
                 Batches: 1  Memory Usage: 131097kB
                 ->  Hash Join  (cost=8151.60..67561.67 rows=1441488 width=177) (actual time=62.483..974.143 rows=1430939 loops=1)
                       Hash Cond: (catalog_sales.cs_sold_date_sk = date_dim_1.d_date_sk)
                       ->  Hash Join  (cost=5103.00..60728.94 rows=1441488 width=177) (actual time=46.571..687.972 rows=1434519 loops=1)
                             Hash Cond: (catalog_sales.cs_bill_customer_sk = customer_1.c_customer_sk)
                             ->  Seq Scan on catalog_sales  (cost=0.00..51841.88 rows=1441488 width=33) (actual time=0.029..128.238 rows=1441548 loops=1)
                             ->  Hash  (cost=3853.00..3853.00 rows=100000 width=152) (actual time=46.311..46.325 rows=100000 loops=1)
                                   Buckets: 131072  Batches: 1  Memory Usage: 17161kB
                                   ->  Seq Scan on customer customer_1  (cost=0.00..3853.00 rows=100000 width=152) (actual time=0.005..23.350 rows=100000 loops=1)
                       ->  Hash  (cost=2135.49..2135.49 rows=73049 width=8) (actual time=15.677..15.677 rows=73049 loops=1)
                             Buckets: 131072  Batches: 1  Memory Usage: 3878kB
                             ->  Seq Scan on date_dim date_dim_1  (cost=0.00..2135.49 rows=73049 width=8) (actual time=0.015..7.957 rows=73049 loops=1)
           ->  HashAggregate  (cost=61277.38..70269.68 rows=719384 width=216) (actual time=1166.953..1198.730 rows=56649 loops=1)
                 Group Key: customer_2.c_customer_id, customer_2.c_first_name, customer_2.c_last_name, customer_2.c_preferred_cust_flag, customer_2.c_birth_country, customer_2.c_login, customer_2.c_email_address, date_dim_2.d_year
                 Batches: 1  Memory Usage: 57369kB
                 ->  Hash Join  (cost=8151.60..37897.40 rows=719384 width=177) (actual time=68.327..508.594 rows=719119 loops=1)
                       Hash Cond: (web_sales.ws_sold_date_sk = date_dim_2.d_date_sk)
                       ->  Hash Join  (cost=5103.00..32960.30 rows=719384 width=177) (actual time=52.240..357.963 rows=719217 loops=1)
                             Hash Cond: (web_sales.ws_bill_customer_sk = customer_2.c_customer_sk)
                             ->  Seq Scan on web_sales  (cost=0.00..25968.84 rows=719384 width=33) (actual time=0.032..62.464 rows=719384 loops=1)
                             ->  Hash  (cost=3853.00..3853.00 rows=100000 width=152) (actual time=51.959..51.960 rows=100000 loops=1)
                                   Buckets: 131072  Batches: 1  Memory Usage: 17161kB
                                   ->  Seq Scan on customer customer_2  (cost=0.00..3853.00 rows=100000 width=152) (actual time=0.004..25.350 rows=100000 loops=1)
                       ->  Hash  (cost=2135.49..2135.49 rows=73049 width=8) (actual time=15.831..15.834 rows=73049 loops=1)
                             Buckets: 131072  Batches: 1  Memory Usage: 3878kB
                             ->  Seq Scan on date_dim date_dim_2  (cost=0.00..2135.49 rows=73049 width=8) (actual time=0.014..8.100 rows=73049 loops=1)
   ->  Sort  (cost=794037.94..794037.95 rows=1 width=132) (actual time=3024403.310..3024403.313 rows=8 loops=1)
         Sort Key: t_s_secyear.customer_id, t_s_secyear.customer_first_name, t_s_secyear.customer_last_name, t_s_secyear.customer_email_address
         Sort Method: quicksort  Memory: 26kB
         ->  Nested Loop  (cost=0.00..794037.93 rows=1 width=132) (actual time=354851.431..3024403.218 rows=8 loops=1)
               Join Filter: ((t_s_secyear.customer_id = t_w_secyear.customer_id) AND (CASE WHEN (t_c_firstyear.year_total > '0'::numeric) THEN (t_c_secyear.year_total / t_c_firstyear.year_total) ELSE NULL::numeric END > CASE WHEN (t_w_firstyear.year_total > '0'::numeric) THEN (t_w_secyear.year_total / t_w_firstyear.year_total) ELSE NULL::numeric END))
               Rows Removed by Join Filter: 810136
               ->  Nested Loop  (cost=0.00..668006.23 rows=1 width=308) (actual time=33554.075..3021248.646 rows=72 loops=1)
                     Join Filter: ((t_s_secyear.customer_id = t_c_secyear.customer_id) AND (CASE WHEN (t_c_firstyear.year_total > '0'::numeric) THEN (t_c_secyear.year_total / t_c_firstyear.year_total) ELSE NULL::numeric END > CASE WHEN (t_s_firstyear.year_total > '0'::numeric) THEN (t_s_secyear.year_total / t_s_firstyear.year_total) ELSE NULL::numeric END))
                     Rows Removed by Join Filter: 11876277
                     ->  Nested Loop  (cost=0.00..541974.53 rows=1 width=320) (actual time=14866.104..3001271.961 rows=437 loops=1)
                           Join Filter: (t_s_firstyear.customer_id = t_s_secyear.customer_id)
                           Rows Removed by Join Filter: 44702488
                           ->  Nested Loop  (cost=0.00..415941.57 rows=2 width=156) (actual time=11739.944..2946020.749 rows=1171 loops=1)
                                 Join Filter: (t_s_firstyear.customer_id = t_w_firstyear.customer_id)
                                 Rows Removed by Join Filter: 112695277
                                 ->  Nested Loop  (cost=0.00..277302.08 rows=9 width=104) (actual time=8139.729..2351733.795 rows=9952 loops=1)
                                       Join Filter: (t_s_firstyear.customer_id = t_c_firstyear.customer_id)
                                       Rows Removed by Join Filter: 997895870
                                       ->  CTE Scan on year_total t_s_firstyear  (cost=0.00..138631.41 rows=42 width=52) (actual time=4126.046..4234.598 rows=37923 loops=1)
                                             Filter: ((year_total > '0'::numeric) AND (sale_type = 's'::text) AND (dyear = 2001))
                                             Rows Removed by Filter: 346285
                                       ->  CTE Scan on year_total t_c_firstyear  (cost=0.00..138631.41 rows=42 width=52) (actual time=28.926..60.356 rows=26314 loops=37923)
                                             Filter: ((year_total > '0'::numeric) AND (sale_type = 'c'::text) AND (dyear = 2001))
                                             Rows Removed by Filter: 357894
                                 ->  CTE Scan on year_total t_w_firstyear  (cost=0.00..138631.41 rows=42 width=52) (actual time=49.572..59.057 rows=11324 loops=9952)
                                       Filter: ((year_total > '0'::numeric) AND (sale_type = 'w'::text) AND (dyear = 2001))
                                       Rows Removed by Filter: 372884
                           ->  CTE Scan on year_total t_s_secyear  (cost=0.00..126028.55 rows=126 width=164) (actual time=0.002..44.949 rows=38175 loops=1171)
                                 Filter: ((sale_type = 's'::text) AND (dyear = 2002))
                                 Rows Removed by Filter: 346033
                     ->  CTE Scan on year_total t_c_secyear  (cost=0.00..126028.55 rows=126 width=52) (actual time=21.023..44.097 rows=27177 loops=437)
                           Filter: ((sale_type = 'c'::text) AND (dyear = 2002))
                           Rows Removed by Filter: 357031
               ->  CTE Scan on year_total t_w_secyear  (cost=0.00..126028.55 rows=126 width=52) (actual time=36.137..43.090 rows=11252 loops=72)
                     Filter: ((sale_type = 'w'::text) AND (dyear = 2002))
                     Rows Removed by Filter: 372956
 Planning Time: 4.529 ms
 Execution Time: 3024486.695 ms
(83 rows)



I have noticed significant underestimation here. In addition, these nodes are executed noticeably slower in the plan, you can see it by the current time in the explain.
Sorry, I meant actual time, not current time.

-- 
Regards,
Alena Rybakina
Postgres Professional

Re: Performance of Query 4 on TPC-DS Benchmark

From
Andrei Lepikhov
Date:
On 11/11/24 02:35, Ba Jinsheng wrote:
> Hi all,
> 
> Please see this case:
> 
> 
> Query 4 on TPC-DS benchmark:
Thank you for interesting example!
Looking into explains I see two sortings:
->  Sort  (cost=794037.94..794037.95 rows=1 width=132)
    (actual time=3024403.310..3024403.313 rows=8 loops=1)
->  Sort  (cost=794033.93..794033.94 rows=1 width=132)
    (actual time=8068.869..8068.872 rows=8 loops=1)

Almost the same cost and different execution time. So, I think, the core 
of the problem in accuracy of selectivity estimation.
In this specific example I see lots of composite scan filters:
- ((sale_type = 'w'::text) AND (dyear = 2002))
- ((year_total > '0'::numeric) AND (sale_type = 'w'::text) AND (dyear = 
2001))
- ((year_total > '0'::numeric) AND (sale_type = 's'::text) AND (dyear = 
2001))

It is all the time a challenge for PostgreSQL to estimate such a filter 
because of absent information on joint column distribution.
Can you research this way by building extended statistics on these 
clauses? It could move the plan to the more optimal direction.

-- 
regards, Andrei Lepikhov



Re: Performance of Query 4 on TPC-DS Benchmark

From
Ba Jinsheng
Date:

>It is all the time a challenge for PostgreSQL to estimate such a filter
>because of absent information on joint column distribution.
>Can you research this way by building extended statistics on these
>clauses? It could move the plan to the more optimal direction.

Thanks a lot for your effort to analyze this issue, and we really appreciate your suggestions!  Currently, we focus on exposing these issues that affect performance. In the future, we may consider to look into such a direction as you suggested. 


> Have you tried any tools to improve the cardinality yet, like aqo [0]?
Yes, but it takes nearly 1 hour to run this query at a time, so I only run "EXPLAIN ANALYZE" once, and the performance seems slightly improved.


              QUERY PLAN                                                                                                                                                                                                                                                                              
                                     
 Limit  (cost=293880.50..293880.50 rows=1 width=132) (actual time=2527921.078..2527921.233 rows=8 loops=1)
   CTE year_total
     ->  Gather  (cost=115049.92..233367.07 rows=384208 width=216) (actual time=1116.139..4005.105 rows=384208 loops=1)
           Workers Planned: 2
           Workers Launched: 2
           ->  Parallel Append  (cost=114049.92..193946.27 rows=160087 width=216) (actual time=2430.791..2510.131 rows=128069 loops=3)
                 ->  HashAggregate  (cost=190763.57..193145.83 rows=190581 width=216) (actual time=3977.521..4070.200 rows=190581 loops=1)
                       Group Key: customer.c_customer_id, customer.c_first_name, customer.c_last_name, customer.c_preferred_cust_flag, customer.c_birth_country, customer.c_login, customer.c_email_address, date_dim.d_year
                       Worker 1:  Batches: 1  Memory Usage: 120857kB
                       ->  Hash Join  (cost=8151.60..103486.35 rows=2685453 width=174) (actual time=64.667..1605.601 rows=2685453 loops=1)
                             Hash Cond: (store_sales.ss_sold_date_sk = date_dim.d_date_sk)
                             ->  Hash Join  (cost=5103.00..93216.88 rows=2750652 width=174) (actual time=48.111..1121.801 rows=2750652 loops=1)
                                   Hash Cond: (store_sales.ss_customer_sk = customer.c_customer_sk)
                                   ->  Seq Scan on store_sales  (cost=0.00..80552.52 rows=2880404 width=30) (actual time=0.068..230.529 rows=2880404 loops=1)
                                   ->  Hash  (cost=3853.00..3853.00 rows=100000 width=152) (actual time=47.735..47.735 rows=100000 loops=1)
                                         Buckets: 131072  Batches: 1  Memory Usage: 17161kB
                                         ->  Seq Scan on customer  (cost=0.00..3853.00 rows=100000 width=152) (actual time=0.012..25.023 rows=100000 loops=1)
                             ->  Hash  (cost=2135.49..2135.49 rows=73049 width=8) (actual time=16.242..16.242 rows=73049 loops=1)
                                   Buckets: 131072  Batches: 1  Memory Usage: 3878kB
                                   ->  Seq Scan on date_dim  (cost=0.00..2135.49 rows=73049 width=8) (actual time=0.074..8.744 rows=73049 loops=1)
                 ->  HashAggregate  (cost=114049.92..115762.15 rows=136978 width=216) (actual time=2199.723..2268.851 rows=136978 loops=1)
                       Group Key: customer_1.c_customer_id, customer_1.c_first_name, customer_1.c_last_name, customer_1.c_preferred_cust_flag, customer_1.c_birth_country, customer_1.c_login, customer_1.c_email_address, date_dim_1.d_year
                       Worker 0:  Batches: 1  Memory Usage: 88089kB
                       ->  Hash Join  (cost=8151.60..67544.41 rows=1430939 width=177) (actual time=81.920..911.231 rows=1430939 loops=1)
                             Hash Cond: (catalog_sales.cs_sold_date_sk = date_dim_1.d_date_sk)
                             ->  Hash Join  (cost=5103.00..60729.97 rows=1434519 width=177) (actual time=53.469..638.140 rows=1434519 loops=1)
                                   Hash Cond: (catalog_sales.cs_bill_customer_sk = customer_1.c_customer_sk)
                                   ->  Seq Scan on catalog_sales  (cost=0.00..51842.75 rows=1441548 width=33) (actual time=0.066..134.023 rows=1441548 loops=1)
                                   ->  Hash  (cost=3853.00..3853.00 rows=100000 width=152) (actual time=52.937..52.937 rows=100000 loops=1)
                                         Buckets: 131072  Batches: 1  Memory Usage: 17161kB
                                         ->  Seq Scan on customer customer_1  (cost=0.00..3853.00 rows=100000 width=152) (actual time=0.019..27.549 rows=100000 loops=1)
                             ->  Hash  (cost=2135.49..2135.49 rows=73049 width=8) (actual time=27.968..27.968 rows=73049 loops=1)
                                   Buckets: 131072  Batches: 1  Memory Usage: 3878kB
                                   ->  Seq Scan on date_dim date_dim_1  (cost=0.00..2135.49 rows=73049 width=8) (actual time=0.099..14.115 rows=73049 loops=1)
                 ->  HashAggregate  (cost=61268.33..61976.44 rows=56649 width=216) (actual time=1115.125..1142.838 rows=56649 loops=1)
                       Group Key: customer_2.c_customer_id, customer_2.c_first_name, customer_2.c_last_name, customer_2.c_preferred_cust_flag, customer_2.c_birth_country, customer_2.c_login, customer_2.c_email_address, date_dim_2.d_year
                       Batches: 1  Memory Usage: 35865kB
                       ->  Hash Join  (cost=8151.60..37896.96 rows=719119 width=177) (actual time=85.606..491.698 rows=719119 loops=1)
                             Hash Cond: (web_sales.ws_sold_date_sk = date_dim_2.d_date_sk)
                             ->  Hash Join  (cost=5103.00..32960.30 rows=719217 width=177) (actual time=59.536..342.685 rows=719217 loops=1)
                                   Hash Cond: (web_sales.ws_bill_customer_sk = customer_2.c_customer_sk)
                                   ->  Seq Scan on web_sales  (cost=0.00..25968.84 rows=719384 width=33) (actual time=0.032..67.592 rows=719384 loops=1)
                                   ->  Hash  (cost=3853.00..3853.00 rows=100000 width=152) (actual time=59.430..59.430 rows=100000 loops=1)
                                         Buckets: 131072  Batches: 1  Memory Usage: 17161kB
                                         ->  Seq Scan on customer customer_2  (cost=0.00..3853.00 rows=100000 width=152) (actual time=0.006..33.826 rows=100000 loops=1)
                             ->  Hash  (cost=2135.49..2135.49 rows=73049 width=8) (actual time=25.997..25.998 rows=73049 loops=1)
                                   Buckets: 131072  Batches: 1  Memory Usage: 3878kB
                                   ->  Seq Scan on date_dim date_dim_2  (cost=0.00..2135.49 rows=73049 width=8) (actual time=0.016..13.499 rows=73049 loops=1)
   ->  Sort  (cost=60513.43..60513.44 rows=1 width=132) (actual time=2527921.077..2527921.080 rows=8 loops=1)
         Sort Key: t_s_secyear.customer_id, t_s_secyear.customer_first_name, t_s_secyear.customer_last_name, t_s_secyear.customer_email_address
         Sort Method: quicksort  Memory: 26kB
         ->  Nested Loop  (cost=0.00..60513.42 rows=1 width=132) (actual time=388081.669..2527921.053 rows=8 loops=1)
               Join Filter: ((t_s_secyear.customer_id = t_c_firstyear.customer_id) AND (CASE WHEN (t_c_firstyear.year_total > '0'::numeric) THEN (t_c_secyear.year_total / t_c_firstyear.year_total) ELSE NULL::numeric END > CASE WHEN (t_s_firstyear.year_total > '0'::numeric) THEN (t_s_secyear.year_total / t_s_firstyear.year_total) ELSE NULL::numeric END) AND (CASE WHEN (t_c_firstyear.year_total > '0'::numeric) THEN (t_c_secyear.year_total / t_c_firstyear.year_total) ELSE NULL::numeric END > CASE WHEN (t_w_firstyear.year_total > '0'::numeric) THEN (t_w_secyear.year_total / t_w_firstyear.year_total) ELSE NULL::numeric END))
               Rows Removed by Join Filter: 1289378
               ->  Nested Loop  (cost=0.00..49947.59 rows=1 width=372) (actual time=106080.918..2524931.374 rows=49 loops=1)
                     Join Filter: (t_s_secyear.customer_id = t_w_secyear.customer_id)
                     Rows Removed by Join Filter: 4962083
                     ->  Nested Loop  (cost=0.00..40342.26 rows=1 width=320) (actual time=10316.060..2506476.401 rows=441 loops=1)
                           Join Filter: (t_s_secyear.customer_id = t_c_secyear.customer_id)
                           Rows Removed by Join Filter: 44298069
                           ->  Nested Loop  (cost=0.00..30736.94 rows=1 width=268) (actual time=7429.522..2435241.747 rows=1630 loops=1)
                                 Join Filter: (t_s_firstyear.customer_id = t_s_secyear.customer_id)
                                 Rows Removed by Join Filter: 165296120
                                 ->  Nested Loop  (cost=0.00..21131.61 rows=1 width=104) (actual time=3984.981..2240540.776 rows=4330 loops=1)
                                       Join Filter: (t_s_firstyear.customer_id = t_w_firstyear.customer_id)
                                       Rows Removed by Join Filter: 429435722
                                       ->  CTE Scan on year_total t_s_firstyear  (cost=0.00..10565.72 rows=3 width=52) (actual time=3984.972..4058.789 rows=37923 loops=1)
                                             Filter: ((year_total > '0'::numeric) AND (sale_type = 's'::text) AND (dyear = 2001))
                                             Rows Removed by Filter: 346285
                                       ->  CTE Scan on year_total t_w_firstyear  (cost=0.00..10565.72 rows=3 width=52) (actual time=0.001..58.349 rows=11324 loops=37923)
                                             Filter: ((year_total > '0'::numeric) AND (sale_type = 'w'::text) AND (dyear = 2001))
                                             Rows Removed by Filter: 372884
                                 ->  CTE Scan on year_total t_s_secyear  (cost=0.00..9605.20 rows=10 width=164) (actual time=20.421..42.865 rows=38175 loops=4330)
                                       Filter: ((sale_type = 's'::text) AND (dyear = 2002))
                                       Rows Removed by Filter: 346033
                           ->  CTE Scan on year_total t_c_secyear  (cost=0.00..9605.20 rows=10 width=52) (actual time=5.974..42.155 rows=27177 loops=1630)
                                 Filter: ((sale_type = 'c'::text) AND (dyear = 2002))
                                 Rows Removed by Filter: 357031
                     ->  CTE Scan on year_total t_w_secyear  (cost=0.00..9605.20 rows=10 width=52) (actual time=0.002..41.219 rows=11252 loops=441)
                           Filter: ((sale_type = 'w'::text) AND (dyear = 2002))
                           Rows Removed by Filter: 372956
               ->  CTE Scan on year_total t_c_firstyear  (cost=0.00..10565.72 rows=3 width=52) (actual time=8.525..59.572 rows=26314 loops=49)
                     Filter: ((year_total > '0'::numeric) AND (sale_type = 'c'::text) AND (dyear = 2001))
                     Rows Removed by Filter: 357894
 Planning Time: 17.924 ms
 Execution Time: 2527936.040 ms
(86 rows)
Notice: This email is generated from the account of an NUS alumnus. Contents, views, and opinions therein are solely those of the sender.

Re: Performance of Query 4 on TPC-DS Benchmark

From
Ba Jinsheng
Date:

>> The default configurations of PostgreSQL incur the error: "ERROR:  could not resize shared memory segment "/PostgreSQL.3539600020" to 2097152 bytes: No space left on device"

>No comment on your optimiser experiments for now, but for this error:
>it reminds me of a low/default --shm-size limit from Docker, or other
>similar container stuff?


Aha, you are right. The error disappears if running PostgreSQL out of docker container.
Notice: This email is generated from the account of an NUS alumnus. Contents, views, and opinions therein are solely those of the sender.