Thread: Question of Parallel Hash Join on TPC-H Benchmark
Hi everyone,
Consider query 2 of the TPC-H benchmark:
explain analyze
select s_acctbal, s_name, n_name, p_partkey, p_mfgr, s_address, s_phone, s_comment
from PART, SUPPLIER, PARTSUPP, NATION, REGION
where p_partkey = ps_partkey
  and s_suppkey = ps_suppkey
  and p_size = 30
  and p_type like '%STEEL'
  and s_nationkey = n_nationkey
  and n_regionkey = r_regionkey
  and r_name = 'ASIA'
  and ps_supplycost = (
    select min(ps_supplycost)
    from PARTSUPP, SUPPLIER, NATION, REGION
    where p_partkey = ps_partkey
      and s_suppkey = ps_suppkey
      and s_nationkey = n_nationkey
      and n_regionkey = r_regionkey
      and r_name = 'ASIA'
  )
order by s_acctbal desc, n_name, s_name, p_partkey
limit 100;
I ran it on PostgreSQL and got this query plan:
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=66839.16..66839.17 rows=1 width=195) (actual time=267.054..267.114 rows=100 loops=1)
-> Sort (cost=66839.16..66839.17 rows=1 width=195) (actual time=267.053..267.108 rows=100 loops=1)
Sort Key: supplier.s_acctbal DESC, nation.n_name, supplier.s_name, part.p_partkey
Sort Method: top-N heapsort Memory: 71kB
-> Hash Join (cost=37874.96..66839.15 rows=1 width=195) (actual time=227.273..266.816 rows=485 loops=1)
Hash Cond: ((part.p_partkey = partsupp.ps_partkey) AND ((SubPlan 1) = partsupp.ps_supplycost))
-> Gather (cost=1000.00..6458.40 rows=804 width=30) (actual time=0.567..0.733 rows=826 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Parallel Seq Scan on part (cost=0.00..5378.00 rows=335 width=30) (actual time=0.156..15.960 rows=275 loops=3)
Filter: (((p_type)::text ~~ '%STEEL'::text) AND (p_size = 30))
Rows Removed by Filter: 66391
-> Hash (cost=30567.96..30567.96 rows=160000 width=175) (actual time=225.816..225.820 rows=160240 loops=1)
Buckets: 65536 Batches: 8 Memory Usage: 4648kB
-> Hash Join (cost=407.96..30567.96 rows=160000 width=175) (actual time=5.200..162.555 rows=160240 loops=1)
Hash Cond: (partsupp.ps_suppkey = supplier.s_suppkey)
-> Seq Scan on partsupp (cost=0.00..25560.00 rows=800000 width=14) (actual time=0.016..59.835 rows=800000 loops=1)
-> Hash (cost=382.96..382.96 rows=2000 width=169) (actual time=5.169..5.172 rows=2003 loops=1)
Buckets: 2048 Batches: 1 Memory Usage: 413kB
-> Hash Join (cost=2.46..382.96 rows=2000 width=169) (actual time=0.108..4.336 rows=2003 loops=1)
Hash Cond: (supplier.s_nationkey = nation.n_nationkey)
-> Seq Scan on supplier (cost=0.00..323.00 rows=10000 width=144) (actual time=0.013..2.478 rows=10000 loops=1)
-> Hash (cost=2.40..2.40 rows=5 width=33) (actual time=0.069..0.071 rows=5 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 9kB
-> Hash Join (cost=1.07..2.40 rows=5 width=33) (actual time=0.060..0.066 rows=5 loops=1)
Hash Cond: (nation.n_regionkey = region.r_regionkey)
-> Seq Scan on nation (cost=0.00..1.25 rows=25 width=37) (actual time=0.011..0.013 rows=25 loops=1)
-> Hash (cost=1.06..1.06 rows=1 width=4) (actual time=0.029..0.029 rows=1 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 9kB
-> Seq Scan on region (cost=0.00..1.06 rows=1 width=4) (actual time=0.014..0.015 rows=1 loops=1)
Filter: (r_name = 'ASIA'::bpchar)
Rows Removed by Filter: 4
SubPlan 1
-> Aggregate (cost=48.70..48.71 rows=1 width=32) (actual time=0.019..0.019 rows=1 loops=1311)
-> Nested Loop (cost=0.85..48.70 rows=1 width=6) (actual time=0.014..0.018 rows=1 loops=1311)
Join Filter: (region_1.r_regionkey = nation_1.n_regionkey)
Rows Removed by Join Filter: 3
-> Seq Scan on region region_1 (cost=0.00..1.06 rows=1 width=4) (actual time=0.001..0.001 rows=1 loops=1311)
Filter: (r_name = 'ASIA'::bpchar)
Rows Removed by Filter: 4
-> Nested Loop (cost=0.85..47.58 rows=4 width=10) (actual time=0.010..0.017 rows=4 loops=1311)
-> Nested Loop (cost=0.71..46.96 rows=4 width=10) (actual time=0.009..0.013 rows=4 loops=1311)
-> Index Scan using partsupp_pkey on partsupp partsupp_1 (cost=0.42..13.75 rows=4 width=10) (actual time=0.007..0.008 rows=4 loops=1311)
Index Cond: (ps_partkey = part.p_partkey)
-> Index Scan using supplier_pkey on supplier supplier_1 (cost=0.29..8.30 rows=1 width=8) (actual time=0.001..0.001 rows=1 loops=5244)
Index Cond: (s_suppkey = partsupp_1.ps_suppkey)
-> Index Scan using nation_pkey on nation nation_1 (cost=0.14..0.16 rows=1 width=8) (actual time=0.001..0.001 rows=1 loops=5244)
Index Cond: (n_nationkey = supplier_1.s_nationkey)
(50 rows)
I am wondering why the Hash Join on the fifth line of the plan is not parallel.
According to this document, https://www.postgresql.org/docs/current/parallel-safety.html, I understand it should be eligible to run in parallel, and that we could then get a more efficient query plan, as follows:
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=42019.25..42019.25 rows=1 width=195) (actual time=134.698..142.258 rows=100 loops=1)
-> Sort (cost=42019.25..42019.25 rows=1 width=195) (actual time=134.697..142.252 rows=100 loops=1)
Sort Key: supplier.s_acctbal DESC, nation.n_name, supplier.s_name, part.p_partkey
Sort Method: top-N heapsort Memory: 69kB
-> Gather (cost=26845.97..42019.24 rows=1 width=195) (actual time=116.843..142.014 rows=485 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Parallel Hash Join (cost=25845.97..41019.14 rows=1 width=195) (actual time=120.427..130.569 rows=162 loops=3)
Hash Cond: ((part.p_partkey = partsupp.ps_partkey) AND ((SubPlan 1) = partsupp.ps_supplycost))
-> Parallel Seq Scan on part (cost=0.00..5378.00 rows=335 width=30) (actual time=0.102..9.739 rows=275 loops=3)
Filter: (((p_type)::text ~~ '%STEEL'::text) AND (p_size = 30))
Rows Removed by Filter: 66391
-> Parallel Hash (cost=23217.97..23217.97 rows=66667 width=175) (actual time=97.833..97.837 rows=53413 loops=3)
Buckets: 65536 Batches: 8 Memory Usage: 4800kB
-> Hash Join (cost=407.96..23217.97 rows=66667 width=175) (actual time=4.596..69.280 rows=53413 loops=3)
Hash Cond: (partsupp.ps_suppkey = supplier.s_suppkey)
-> Parallel Seq Scan on partsupp (cost=0.00..20893.33 rows=333333 width=14) (actual time=0.039..26.690 rows=266667 loops=3)
-> Hash (cost=382.96..382.96 rows=2000 width=169) (actual time=4.543..4.545 rows=2003 loops=3)
Buckets: 2048 Batches: 1 Memory Usage: 413kB
-> Hash Join (cost=2.46..382.96 rows=2000 width=169) (actual time=0.106..3.658 rows=2003 loops=3)
Hash Cond: (supplier.s_nationkey = nation.n_nationkey)
-> Seq Scan on supplier (cost=0.00..323.00 rows=10000 width=144) (actual time=0.015..1.071 rows=10000 loops=3)
-> Hash (cost=2.40..2.40 rows=5 width=33) (actual time=0.070..0.072 rows=5 loops=3)
Buckets: 1024 Batches: 1 Memory Usage: 9kB
-> Hash Join (cost=1.07..2.40 rows=5 width=33) (actual time=0.061..0.067 rows=5 loops=3)
Hash Cond: (nation.n_regionkey = region.r_regionkey)
-> Seq Scan on nation (cost=0.00..1.25 rows=25 width=37) (actual time=0.009..0.011 rows=25 loops=3)
-> Hash (cost=1.06..1.06 rows=1 width=4) (actual time=0.031..0.031 rows=1 loops=3)
Buckets: 1024 Batches: 1 Memory Usage: 9kB
-> Seq Scan on region (cost=0.00..1.06 rows=1 width=4) (actual time=0.014..0.016 rows=1 loops=3)
Filter: (r_name = 'ASIA'::bpchar)
Rows Removed by Filter: 4
SubPlan 1
-> Aggregate (cost=48.70..48.71 rows=1 width=32) (actual time=0.017..0.017 rows=1 loops=3810)
-> Nested Loop (cost=0.85..48.70 rows=1 width=6) (actual time=0.012..0.016 rows=1 loops=3810)
Join Filter: (region_1.r_regionkey = nation_1.n_regionkey)
Rows Removed by Join Filter: 3
-> Seq Scan on region region_1 (cost=0.00..1.06 rows=1 width=4) (actual time=0.001..0.001 rows=1 loops=3810)
Filter: (r_name = 'ASIA'::bpchar)
Rows Removed by Filter: 4
-> Nested Loop (cost=0.85..47.58 rows=4 width=10) (actual time=0.006..0.015 rows=4 loops=3810)
-> Nested Loop (cost=0.71..46.96 rows=4 width=10) (actual time=0.005..0.011 rows=4 loops=3810)
-> Index Scan using partsupp_pkey on partsupp partsupp_1 (cost=0.42..13.75 rows=4 width=10) (actual time=0.003..0.004 rows=4 loops=3810)
Index Cond: (ps_partkey = part.p_partkey)
-> Index Scan using supplier_pkey on supplier supplier_1 (cost=0.29..8.30 rows=1 width=8) (actual time=0.001..0.001 rows=1 loops=15240)
Index Cond: (s_suppkey = partsupp_1.ps_suppkey)
-> Index Scan using nation_pkey on nation nation_1 (cost=0.14..0.16 rows=1 width=8) (actual time=0.001..0.001 rows=1 loops=15240)
Index Cond: (n_nationkey = supplier_1.s_nationkey)
(50 rows)
Best regards,
Jinsheng Ba
Forgot to mention:
TPC-H data scale: 1 GB
PostgreSQL version: https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=a94d5b3728dd3e5fd4adb25350712eb785bb9ab6
On 10/10/24 16:16, Ba Jinsheng wrote:
> TPC-H data scale: 1 GB
> PostgreSQL version: https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=a94d5b3728dd3e5fd4adb25350712eb785bb9ab6

The planner likely does not believe it'd be a win. It's not just about
parallel safety - starting parallel workers and copying the data between
processes is not free, and the planner tries to not do parallel stuff
unnecessarily. I'd bet 1GB is fairly small for parallelism.

You can try setting parallel_setup_cost and parallel_tuple_cost to 0,
and reducing min_parallel_{table,index}_scan_size. That should force a
parallel plan, and you'll see if parallelism is helpful.

regards

--
Tomas Vondra
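A minimal sketch of the session-level settings Tomas suggests (the zero costs simply coax the planner into parallel plans for experimentation and are not production advice):

SET parallel_setup_cost = 0;           -- treat starting parallel workers as free
SET parallel_tuple_cost = 0;           -- treat worker-to-leader tuple transfer as free
SET min_parallel_table_scan_size = 0;  -- consider parallel scans even for tiny tables
SET min_parallel_index_scan_size = 0;  -- likewise for indexes

With these set, re-running the EXPLAIN ANALYZE above shows whether a forced parallel plan actually wins.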
Hi,

On Oct 10, 2024 at 22:41 +0800, Tomas Vondra <tomas@vondra.me> wrote:
> You can try setting parallel_setup_cost and parallel_tuple_cost to 0,
> and reducing min_parallel_{table,index}_scan_size. That should force a
> parallel plan, and you'll see if parallelism is helpful.

In addition to this, it seems you're eager for a parallel-aware HashJoin.
Check if enable_parallel_hash = on; an EXPLAIN (VERBOSE) will show related GUCs.

--
Zhang Mingli
www.hashdata.xyz
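A quick sketch of that check. One assumption worth flagging: as far as I know it is EXPLAIN's SETTINGS option (added in PostgreSQL 12), rather than VERBOSE, that prints planner GUCs changed from their defaults:

SHOW enable_parallel_hash;      -- must be on for a parallel-aware hash join
SET enable_parallel_hash = on;  -- on is the default since PostgreSQL 11
EXPLAIN (ANALYZE, SETTINGS)     -- SETTINGS lists non-default planner settings
SELECT count(*) FROM partsupp;  -- any query against the TPC-H tables works here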
Thanks for your replies!
I configured these variables, but the query plan I got is worse:
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=57899.45..57899.45 rows=1 width=195) (actual time=4261.283..4261.806 rows=100 loops=1)
-> Sort (cost=57899.45..57899.45 rows=1 width=195) (actual time=4261.281..4261.795 rows=100 loops=1)
Sort Key: supplier.s_acctbal DESC, nation.n_name, supplier.s_name, part.p_partkey
Sort Method: top-N heapsort Memory: 71kB
-> Hash Join (cost=29025.99..57899.44 rows=1 width=195) (actual time=281.174..4261.118 rows=485 loops=1)
Hash Cond: ((part.p_partkey = partsupp.ps_partkey) AND ((SubPlan 1) = partsupp.ps_supplycost))
-> Gather (cost=0.00..5378.00 rows=804 width=30) (actual time=0.607..0.807 rows=826 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Parallel Seq Scan on part (cost=0.00..5378.00 rows=335 width=30) (actual time=0.156..17.662 rows=275 loops=3)
Filter: (((p_type)::text ~~ '%STEEL'::text) AND (p_size = 30))
Rows Removed by Filter: 66391
-> Hash (cost=22718.99..22718.99 rows=160000 width=175) (actual time=108.521..108.833 rows=160240 loops=1)
Buckets: 65536 Batches: 8 Memory Usage: 4648kB
-> Gather (cost=297.99..22718.99 rows=160000 width=175) (actual time=5.148..53.343 rows=160240 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Parallel Hash Join (cost=297.99..22718.99 rows=66667 width=175) (actual time=2.076..71.222 rows=53413 loops=3)
Hash Cond: (partsupp.ps_suppkey = supplier.s_suppkey)
-> Parallel Seq Scan on partsupp (cost=0.00..20893.33 rows=333333 width=14) (actual time=0.036..28.325 rows=266667 loops=3)
-> Parallel Hash (cost=287.58..287.58 rows=833 width=169) (actual time=1.600..1.602 rows=668 loops=3)
Buckets: 2048 Batches: 1 Memory Usage: 464kB
-> Parallel Hash Join (cost=2.28..287.58 rows=833 width=169) (actual time=0.047..1.870 rows=1002 loops=2)
Hash Cond: (supplier.s_nationkey = nation.n_nationkey)
-> Parallel Seq Scan on supplier (cost=0.00..264.67 rows=4167 width=144) (actual time=0.004..0.518 rows=5000 loops=2)
-> Parallel Hash (cost=2.25..2.25 rows=3 width=33) (actual time=0.020..0.023 rows=2 loops=2)
Buckets: 1024 Batches: 1 Memory Usage: 40kB
-> Parallel Hash Join (cost=1.05..2.25 rows=3 width=33) (actual time=0.028..0.040 rows=5 loops=1)
Hash Cond: (nation.n_regionkey = region.r_regionkey)
-> Parallel Seq Scan on nation (cost=0.00..1.15 rows=15 width=37) (actual time=0.002..0.004 rows=25 loops=1)
-> Parallel Hash (cost=1.04..1.04 rows=1 width=4) (actual time=0.010..0.010 rows=1 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 40kB
-> Parallel Seq Scan on region (cost=0.00..1.04 rows=1 width=4) (actual time=0.005..0.006 rows=1 loops=1)
Filter: (r_name = 'ASIA'::bpchar)
Rows Removed by Filter: 4
SubPlan 1
-> Aggregate (cost=48.67..48.68 rows=1 width=32) (actual time=3.144..3.144 rows=1 loops=1311)
-> Nested Loop (cost=0.85..48.67 rows=1 width=6) (actual time=1.306..3.143 rows=1 loops=1311)
Join Filter: (region_1.r_regionkey = nation_1.n_regionkey)
Rows Removed by Join Filter: 3
-> Gather (cost=0.00..1.04 rows=1 width=4) (actual time=0.647..3.118 rows=1 loops=1311)
Workers Planned: 1
Workers Launched: 1
-> Parallel Seq Scan on region region_1 (cost=0.00..1.04 rows=1 width=4) (actual time=0.001..0.002 rows=0 loops=2622)
Filter: (r_name = 'ASIA'::bpchar)
Rows Removed by Filter: 2
-> Nested Loop (cost=0.85..47.58 rows=4 width=10) (actual time=0.010..0.024 rows=4 loops=1311)
-> Nested Loop (cost=0.71..46.96 rows=4 width=10) (actual time=0.008..0.017 rows=4 loops=1311)
-> Index Scan using partsupp_pkey on partsupp partsupp_1 (cost=0.42..13.75 rows=4 width=10) (actual time=0.005..0.006 rows=4 loops=1311)
Index Cond: (ps_partkey = part.p_partkey)
-> Index Scan using supplier_pkey on supplier supplier_1 (cost=0.29..8.30 rows=1 width=8) (actual time=0.002..0.002 rows=1 loops=5244)
Index Cond: (s_suppkey = partsupp_1.ps_suppkey)
-> Index Scan using nation_pkey on nation nation_1 (cost=0.14..0.16 rows=1 width=8) (actual time=0.001..0.001 rows=1 loops=5244)
Index Cond: (n_nationkey = supplier_1.s_nationkey)
Planning Time: 10.079 ms
Execution Time: 4262.062 ms
However, the Hash Join on the fifth line is still not parallel, while the other Hash Joins are. This is not what I intended. (The regression seems to come from SubPlan 1, where a Gather node now runs once per outer row, loops=1311, paying the parallel-worker startup cost each time.)
I tried another thing. I reset every configuration to its default and applied the following patch:
diff --git a/src/backend/optimizer/util/clauses.c b/src/backend/optimizer/util/clauses.c
index b4e085e9d4..a0dc032d79 100644
--- a/src/backend/optimizer/util/clauses.c
+++ b/src/backend/optimizer/util/clauses.c
@@ -803,8 +803,6 @@ max_parallel_hazard_test(char proparallel, max_parallel_hazard_context *context)
 			Assert(context->max_hazard != PROPARALLEL_UNSAFE);
 			context->max_hazard = proparallel;
 			/* done if we are not expecting any unsafe functions */
-			if (context->max_interesting == proparallel)
-				return true;
 			break;
 		case PROPARALLEL_UNSAFE:
 			context->max_hazard = proparallel;
Then I can get a more efficient query plan. Compared to the original execution under the default configuration, the estimated cost is reduced by around 37% and the execution time by around 50%. (If I read max_parallel_hazard_test() correctly, the removed early return is what reports a parallel-restricted construct to the caller, so without it the planner treats such expressions as parallel-safe; presumably that is why the join with the SubPlan in its hash condition is now allowed under a Gather.)
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=42019.25..42019.25 rows=1 width=195) (actual time=134.698..142.258 rows=100 loops=1)
-> Sort (cost=42019.25..42019.25 rows=1 width=195) (actual time=134.697..142.252 rows=100 loops=1)
Sort Key: supplier.s_acctbal DESC, nation.n_name, supplier.s_name, part.p_partkey
Sort Method: top-N heapsort Memory: 69kB
-> Gather (cost=26845.97..42019.24 rows=1 width=195) (actual time=116.843..142.014 rows=485 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Parallel Hash Join (cost=25845.97..41019.14 rows=1 width=195) (actual time=120.427..130.569 rows=162 loops=3)
Hash Cond: ((part.p_partkey = partsupp.ps_partkey) AND ((SubPlan 1) = partsupp.ps_supplycost))
-> Parallel Seq Scan on part (cost=0.00..5378.00 rows=335 width=30) (actual time=0.102..9.739 rows=275 loops=3)
Filter: (((p_type)::text ~~ '%STEEL'::text) AND (p_size = 30))
Rows Removed by Filter: 66391
-> Parallel Hash (cost=23217.97..23217.97 rows=66667 width=175) (actual time=97.833..97.837 rows=53413 loops=3)
Buckets: 65536 Batches: 8 Memory Usage: 4800kB
-> Hash Join (cost=407.96..23217.97 rows=66667 width=175) (actual time=4.596..69.280 rows=53413 loops=3)
Hash Cond: (partsupp.ps_suppkey = supplier.s_suppkey)
-> Parallel Seq Scan on partsupp (cost=0.00..20893.33 rows=333333 width=14) (actual time=0.039..26.690 rows=266667 loops=3)
-> Hash (cost=382.96..382.96 rows=2000 width=169) (actual time=4.543..4.545 rows=2003 loops=3)
Buckets: 2048 Batches: 1 Memory Usage: 413kB
-> Hash Join (cost=2.46..382.96 rows=2000 width=169) (actual time=0.106..3.658 rows=2003 loops=3)
Hash Cond: (supplier.s_nationkey = nation.n_nationkey)
-> Seq Scan on supplier (cost=0.00..323.00 rows=10000 width=144) (actual time=0.015..1.071 rows=10000 loops=3)
-> Hash (cost=2.40..2.40 rows=5 width=33) (actual time=0.070..0.072 rows=5 loops=3)
Buckets: 1024 Batches: 1 Memory Usage: 9kB
-> Hash Join (cost=1.07..2.40 rows=5 width=33) (actual time=0.061..0.067 rows=5 loops=3)
Hash Cond: (nation.n_regionkey = region.r_regionkey)
-> Seq Scan on nation (cost=0.00..1.25 rows=25 width=37) (actual time=0.009..0.011 rows=25 loops=3)
-> Hash (cost=1.06..1.06 rows=1 width=4) (actual time=0.031..0.031 rows=1 loops=3)
Buckets: 1024 Batches: 1 Memory Usage: 9kB
-> Seq Scan on region (cost=0.00..1.06 rows=1 width=4) (actual time=0.014..0.016 rows=1 loops=3)
Filter: (r_name = 'ASIA'::bpchar)
Rows Removed by Filter: 4
SubPlan 1
-> Aggregate (cost=48.70..48.71 rows=1 width=32) (actual time=0.017..0.017 rows=1 loops=3810)
-> Nested Loop (cost=0.85..48.70 rows=1 width=6) (actual time=0.012..0.016 rows=1 loops=3810)
Join Filter: (region_1.r_regionkey = nation_1.n_regionkey)
Rows Removed by Join Filter: 3
-> Seq Scan on region region_1 (cost=0.00..1.06 rows=1 width=4) (actual time=0.001..0.001 rows=1 loops=3810)
Filter: (r_name = 'ASIA'::bpchar)
Rows Removed by Filter: 4
-> Nested Loop (cost=0.85..47.58 rows=4 width=10) (actual time=0.006..0.015 rows=4 loops=3810)
-> Nested Loop (cost=0.71..46.96 rows=4 width=10) (actual time=0.005..0.011 rows=4 loops=3810)
-> Index Scan using partsupp_pkey on partsupp partsupp_1 (cost=0.42..13.75 rows=4 width=10) (actual time=0.003..0.004 rows=4 loops=3810)
Index Cond: (ps_partkey = part.p_partkey)
-> Index Scan using supplier_pkey on supplier supplier_1 (cost=0.29..8.30 rows=1 width=8) (actual time=0.001..0.001 rows=1 loops=15240)
Index Cond: (s_suppkey = partsupp_1.ps_suppkey)
-> Index Scan using nation_pkey on nation nation_1 (cost=0.14..0.16 rows=1 width=8) (actual time=0.001..0.001 rows=1 loops=15240)
Index Cond: (n_nationkey = supplier_1.s_nationkey)
Planning Time: 2.046 ms
Execution Time: 142.437 ms
(50 rows)
Best regards,
Jinsheng Ba
On 10/10/2024 23:52, Ba Jinsheng wrote:
> Thanks for your replies!
> I configured these variables, but the query plan I got is worse:

Could you provide SQL dump and settings to play with this case locally? I would discover what we need to transform SubPlan 1 into a SEMI JOIN.

Also, I usually force parallel workers with settings like below:

max_parallel_workers_per_gather = 32
min_parallel_table_scan_size = 0
min_parallel_index_scan_size = 0
max_worker_processes = 64
parallel_setup_cost = 0.001
parallel_tuple_cost = 0.0001

just to be sure it's not the case that something down the plan tree forbids generating a parallel query plan.

P.S. To discover other options, the AQO extension may be helpful. It stores cardinalities from all plan nodes after execution and uses them to calculate selectivities on the subsequent execution.

--
regards, Andrei Lepikhov
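To apply Andrei's settings cluster-wide rather than per session, a sketch using ALTER SYSTEM (this assumes superuser access; note that max_worker_processes only takes effect after a server restart, while the rest can be reloaded):

ALTER SYSTEM SET max_parallel_workers_per_gather = 32;
ALTER SYSTEM SET min_parallel_table_scan_size = 0;
ALTER SYSTEM SET min_parallel_index_scan_size = 0;
ALTER SYSTEM SET max_worker_processes = 64;   -- requires a restart
ALTER SYSTEM SET parallel_setup_cost = 0.001;
ALTER SYSTEM SET parallel_tuple_cost = 0.0001;
SELECT pg_reload_conf();                      -- applies the reload-safe settings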
> Could you provide SQL dump and settings to play with this case locally?
The dump file is too big, so I put it on Google Drive: https://drive.google.com/file/d/1e0s6ZLKLEPbZzS6BzftwpmVspWi7Okd1/view?usp=sharing
I also share my data directory here: https://drive.google.com/file/d/1ZBLHanIRwxbaMQIhRUSPv4I7y8g_0AWi/view?usp=sharing
> Also, I usually force parallel workers with settings like below:
> max_parallel_workers_per_gather = 32
> min_parallel_table_scan_size = 0
> min_parallel_index_scan_size = 0
> max_worker_processes = 64
> parallel_setup_cost = 0.001
> parallel_tuple_cost = 0.0001
I tried these configuration parameters and got the same worse query plan: the Hash Join on the fifth line is still not parallel, while the Hash Joins below it are. It remains an inefficient query plan.
I changed the code to generate an efficient query plan (only the Hash Join on the fifth line is parallel), so I am wondering whether it is possible to optimize the code so that this efficient query plan is chosen by default? I believe it would, at least, improve the performance of PostgreSQL on the standard TPC-H benchmark.
If you need, I can provide my environment in docker for your analysis.
Best regards,
Jinsheng Ba
On 13/10/2024 01:11, Ba Jinsheng wrote:
> I changed the code to generate an efficient query plan (only the Hash Join
> on the fifth line is parallel), so I am wondering whether it is possible to
> optimize the code so that this efficient query plan is chosen by default?
> If you need, I can provide my environment in docker for your analysis.

Thanks for sharing the case! I will investigate it in a while.

--
regards, Andrei Lepikhov
Thanks for looking into it!
We are working on a research project to automatically look for such cases.
If it matters for PostgreSQL, I will report more such cases.
Best regards,
Jinsheng Ba
On 13/10/2024 15:57, Ba Jinsheng wrote:
> We are working on a research project to automatically look for such cases.
> If it matters for PostgreSQL, I will report more such cases.

I made an attempt last year [1]. However, at least in production, it provided too many false-positive cases to trigger automatic re-optimisation [2] on each poorly estimated query plan. I wonder if you have any progress in that area. Living near Singapore (Chon Buri, Thailand), I would like to visit a meetup related to this problem, if you organised one.

Of course, reporting performance issues is always beneficial.

[1] https://github.com/danolivo/pg_track_optimizer
[2] https://postgrespro.com/docs/enterprise/16/realtime-query-replanning

--
regards, Andrei Lepikhov
> I wonder if you have any progress in that area. Living near Singapore
> (Chon Buri, Thailand), I would like to visit a meetup related to this
> problem, if you organised one.
That is great! Unfortunately, I moved to ETH Zurich, Switzerland, as a postdoc last month.
Our research is still at an early exploration stage, and we would be happy to talk more about it when we have made significant progress.
> Of course, reporting performance issues is always beneficial.

Sounds good! I will report more cases that I believe are real performance issues.
Best regards,
Jinsheng Ba