Re: BUG #15324: Non-deterministic behaviour from parallelised sub-query - Mailing list pgsql-bugs

From Amit Kapila
Subject Re: BUG #15324: Non-deterministic behaviour from parallelised sub-query
Date
Msg-id CAA4eK1JqimRvJt5=nuukG+hXGA0P2tD=D0ewYrO4u0ig_TTacg@mail.gmail.com
In response to Re: BUG #15324: Non-deterministic behaviour from parallelised sub-query  (Stephen Frost <sfrost@snowman.net>)
Responses Re: BUG #15324: Non-deterministic behaviour from parallelised sub-query  (Andrew Fletcher <andy@prestigedigital.com>)
Re: BUG #15324: Non-deterministic behaviour from parallelised sub-query  (Amit Kapila <amit.kapila16@gmail.com>)
List pgsql-bugs
On Wed, Aug 15, 2018 at 4:40 PM, Stephen Frost <sfrost@snowman.net> wrote:
> Greetings,
>
> * Amit Kapila (amit.kapila16@gmail.com) wrote:
>> On Tue, Aug 14, 2018 at 9:14 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> > Marko Tiikkaja <marko@joh.to> writes:
>> >> Marking the function parallel safe doesn't seem wrong to me.  The
>> >> non-parallel-safe part is that the input gets fed to it in different order
>> >> in different workers.  And I don't really think that to be the function's
>> >> fault.
>> >
>> > So that basically opens the question of whether *any* window function
>> > calculation can safely be pushed down to parallel workers.
>>
>> I think we can consider it as a parallel-restricted operation.  For
>> the purpose of testing, I have marked row_number as
>> parallel-restricted in pg_proc and I get the below plan:
>>
>> postgres=# Explain select count(*) from qwr where (a, b) in (select a,
>> row_number() over() from qwr);
>>                                                QUERY PLAN
>> --------------------------------------------------------------------------------------------------------
>>  Aggregate  (cost=46522.12..46522.13 rows=1 width=8)
>>    ->  Hash Semi Join  (cost=24352.08..46362.12 rows=64001 width=0)
>>          Hash Cond: ((qwr.a = qwr_1.a) AND (qwr.b = (row_number() OVER (?))))
>>          ->  Gather  (cost=0.00..18926.01 rows=128002 width=8)
>>                Workers Planned: 2
>>                ->  Parallel Seq Scan on qwr  (cost=0.00..18926.01 rows=64001 width=8)
>>          ->  Hash  (cost=21806.06..21806.06 rows=128002 width=12)
>>                ->  WindowAgg  (cost=0.00..20526.04 rows=128002 width=12)
>>                      ->  Gather  (cost=0.00..18926.01 rows=128002 width=4)
>>                            Workers Planned: 2
>>                            ->  Parallel Seq Scan on qwr qwr_1  (cost=0.00..18926.01 rows=64001 width=4)
>> (11 rows)
>>
>> This seems okay, though the results of the above parallel execution
>> are not the same as those of serial execution.  I think the reason for
>> it is that we don't get rows in a predictable order from workers.
>
> You wouldn't get them in a predictable order even without
> parallelization due to the lack of an ordering, so this hardly seems
> like an issue.
>

Right.
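
To illustrate the point about ordering, here is a minimal sketch of a
variant whose result would not depend on the order in which rows arrive.
It assumes qwr has a unique, non-null column, here called "id"; that
column is hypothetical and is not part of the reported schema.

```sql
-- Assumption: qwr has a unique, non-null column "id" (hypothetical;
-- it is not part of the schema in the bug report).
-- Ordering the window by a unique key fixes the numbering, so the
-- serial and parallel executions should assign the same row numbers.
SELECT count(*)
FROM qwr
WHERE (a, b) IN (
    SELECT a, row_number() OVER (ORDER BY id)
    FROM qwr
);
```

Without such an ORDER BY, as in the original query, the numbering is
whatever the scan happens to produce, with or without workers.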

>> > Somewhat like the LIMIT/OFFSET case, it seems to me that we could only
>> > expect to do this safely if the row ordering induced by the WINDOW clause
>> > can be proven to be fully deterministic.  The planner has no such smarts
>> > at the moment AFAIR.  In principle you could do it if there were
>> > partitioning/ordering by a primary key, but I'm not excited about the
>> > prospects of that being true often enough in practice to justify making
>> > the check.
>>
>> Yeah, I am also not sure if it is worth adding the additional checks.
>> So, for now, we can treat any window function calculation as
>> parallel-restricted and if later anybody has a reason strong enough to
>> relax the restriction for some particular case, we will consider it.
>
> Seems likely that we'll want this at some point, but certainly seems
> like new work and not a small bit of it.
>

Yeah, let me summarize the problems which require patches:
(a) Consider the presence of a LIMIT/OFFSET in a sub-select as making
it parallel-unsafe.
(b) Consider the presence of any window function calculation as a
parallel-restricted operation.
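
For (b), the pg_proc change mentioned in the quoted text can be
reproduced roughly as follows on a scratch cluster.  This is only a
sketch: the exact statement is an assumption rather than a copy of the
test session, it needs superuser, and it edits the system catalog
directly, so it is wrapped in a transaction that is rolled back.

```sql
-- Testing hack only; do not run this on a cluster you care about.
-- pg_proc.proparallel: 's' = safe, 'r' = restricted, 'u' = unsafe.
BEGIN;

UPDATE pg_proc
SET proparallel = 'r'
WHERE proname = 'row_number';

-- Confirm the marking before looking at the plan.
SELECT proname, proparallel
FROM pg_proc
WHERE proname = 'row_number';

EXPLAIN
SELECT count(*)
FROM qwr
WHERE (a, b) IN (SELECT a, row_number() OVER () FROM qwr);

-- Undo the catalog change.
ROLLBACK;
```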

Initially, I will prepare two separate patches for them and then we
will see if we want to combine them into one before committing.  It
might take me a few days to come up with the patches, so if anyone else
wants to take the lead, feel free to do so.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

