Re: [HACKERS] [POC] Faster processing at Gather node - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: [HACKERS] [POC] Faster processing at Gather node
Date
Msg-id CAA4eK1KhNpH9PY+qYMaQcWdgHa=P+fo8cPCmdNcOLwLpoCjayQ@mail.gmail.com
Whole thread Raw
In response to Re: [HACKERS] [POC] Faster processing at Gather node  (Andres Freund <andres@anarazel.de>)
Responses Re: [HACKERS] [POC] Faster processing at Gather node  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers
On Sun, Nov 5, 2017 at 6:54 AM, Andres Freund <andres@anarazel.de> wrote
> On 2017-11-05 01:05:59 +0100, Robert Haas wrote:
>> skip-gather-project-v1.patch does what it says on the tin.  I still
>> don't have a test case for this, and I didn't find that it helped very
>> much,

I am also wondering in which case it can help and I can't think of the
case.  Basically, as part of projection in the gather, I think we are
just deforming the tuple which we anyway need to perform before
sending the tuple to the client (printtup) or probably at the upper
level of the node.

>> and you said this looked like a big bottleneck in your
>> testing, so here you go.
>

Is it possible that it shows the bottleneck only for 'explain analyze'
statement as we don't deform the tuple for that at a later stage?

> The query where that showed a big benefit was
>
> SELECT * FROM lineitem WHERE l_suppkey > '5012' OFFSET 1000000000 LIMIT 1;
>
> (i.e a not very selective filter, and then just throwing the results away)
>
> still shows quite massive benefits:
>
> before:
> set parallel_setup_cost=0;set parallel_tuple_cost=0;set min_parallel_table_scan_size=0;set
max_parallel_workers_per_gather=8;
> tpch_5[17938][1]=# explain analyze SELECT * FROM lineitem WHERE l_suppkey > '5012' OFFSET 1000000000 LIMIT 1;
>
┌────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
> │                                                                 QUERY PLAN
>
├────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
> │ Limit  (cost=635802.67..635802.69 rows=1 width=127) (actual time=8675.097..8675.097 rows=0 loops=1)
> │   ->  Gather  (cost=0.00..635802.67 rows=27003243 width=127) (actual time=0.289..7904.849 rows=26989780 loops=1)
> │         Workers Planned: 8
> │         Workers Launched: 7
> │         ->  Parallel Seq Scan on lineitem  (cost=0.00..635802.67 rows=3375405 width=127) (actual
time=0.124..528.667rows=3373722 loops=8)
 
> │               Filter: (l_suppkey > 5012)
> │               Rows Removed by Filter: 376252
> │ Planning time: 0.098 ms
> │ Execution time: 8676.125 ms
>
└────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
> (9 rows)
> after:
> tpch_5[19754][1]=# EXPLAIN ANALYZE SELECT * FROM lineitem WHERE l_suppkey > '5012' OFFSET 1000000000 LIMIT 1;
>
┌────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
> │                                                                 QUERY PLAN
>
├────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
> │ Limit  (cost=635802.67..635802.69 rows=1 width=127) (actual time=5984.916..5984.916 rows=0 loops=1)
> │   ->  Gather  (cost=0.00..635802.67 rows=27003243 width=127) (actual time=0.214..5123.238 rows=26989780 loops=1)
> │         Workers Planned: 8
> │         Workers Launched: 7
> │         ->  Parallel Seq Scan on lineitem  (cost=0.00..635802.67 rows=3375405 width=127) (actual
time=0.025..649.887rows=3373722 loops=8)
 
> │               Filter: (l_suppkey > 5012)
> │               Rows Removed by Filter: 376252
> │ Planning time: 0.076 ms
> │ Execution time: 5986.171 ms
>
└────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
> (9 rows)
>
> so there clearly is still benefit (this is scale 100, but that shouldn't
> make much of a difference).
>

Do you see the benefit if the query is executed without using Explain Analyze?


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

pgsql-hackers by date:

Previous
From: Thomas Munro
Date:
Subject: Re: [HACKERS] Flexible configuration for full-text search
Next
From: David Rowley
Date:
Subject: Re: [Sender Address Forgery]Re: [HACKERS] path toward fasterpartition pruning