Re: [HACKERS] parallelize queries containing initplans - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: [HACKERS] parallelize queries containing initplans
Date
Msg-id CAA4eK1KcAAjhfsDn2dnnV2L-mdDm1eraAR-5CcECK2kSMYyj_w@mail.gmail.com
In response to Re: [HACKERS] parallelize queries containing initplans  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On Thu, Nov 16, 2017 at 10:44 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Thu, Nov 16, 2017 at 5:23 AM, Kuntal Ghosh
> <kuntalghosh.2007@gmail.com> wrote:
>> I've tested the above-mentioned scenario with this patch and it is
>> working fine. Also, I've created a text column named 'vartext',
>> inserted some random-length texts (max length 100) and tweaked the
>> above query as follows:
>> select ten,count(*) from tenk1 a where a.ten in (select
>> b.ten from tenk1 b where (select a.vartext from tenk1 c where c.ten =
>> a.ten limit 1) = b.vartext limit 1) group by a.ten;
>> This query is equivalent to select ten,count(*) from tenk1 a group by
>> a.ten. It also produced the expected result without throwing any
>> error.
>
> Great!  I have committed the patch; thanks for testing.
>

Thanks.

> As I said in the commit message, there's a lot more work that could be
> done here.  I think we should consider trying to revise this whole
> system so that instead of serializing the values and passing them to
> the workers, we allocate an array of slots where each slot has a Datum
> field, an isnull flag, and a dsa_pointer (which maybe could be union'd
> to the Datum?).  If we're passing a parameter by value, we could just
> store it in the Datum field; if it's null, we just set isnull.  If
> it's being passed by reference, we dsa_allocate() space for it, copy
> it into that space, and then store the dsa_pointer.
>
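
To make the slot layout above concrete, here is a rough sketch. Everything
in it is an assumption for illustration: ParamSlot, SketchArea and the
sketch_* / slot_set_* functions are made-up names, the byval flag is my
addition, and a fixed buffer addressed by offsets stands in for a DSA area
and dsa_allocate(). It is not PostgreSQL code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-ins for Datum and dsa_pointer; NOT the real PostgreSQL types. */
typedef uintptr_t SketchDatum;
typedef size_t SketchDsaPointer;	/* offset into the shared area */
#define SKETCH_INVALID_POINTER ((SketchDsaPointer) 0)

/* Hypothetical shared area: a bump allocator over a fixed buffer. */
typedef struct SketchArea
{
	char		buf[1024];
	size_t		used;			/* next free offset; 0 means "invalid" */
} SketchArea;

/* Hypothetical slot, one per parameter handed to the workers. */
typedef struct ParamSlot
{
	bool		isnull;
	bool		byval;			/* assumption: records which union arm is set */
	union
	{
		SketchDatum value;		/* pass-by-value parameter */
		SketchDsaPointer ptr;	/* pass-by-reference: copy in shared area */
	}			u;
} ParamSlot;

static void
sketch_area_init(SketchArea *area)
{
	area->used = 1;				/* reserve offset 0 as the invalid pointer */
}

/* Plays the role of dsa_allocate(); returns the invalid pointer when full. */
static SketchDsaPointer
sketch_allocate(SketchArea *area, size_t size)
{
	SketchDsaPointer p;

	if (size > sizeof(area->buf) - area->used)
		return SKETCH_INVALID_POINTER;
	p = area->used;
	area->used += size;
	return p;
}

/* Plays the role of dsa_get_address(). */
static void *
sketch_get_address(SketchArea *area, SketchDsaPointer p)
{
	assert(p != SKETCH_INVALID_POINTER && p < area->used);
	return area->buf + p;
}

static void
slot_set_null(ParamSlot *slot)
{
	slot->isnull = true;
	slot->byval = true;
	slot->u.value = 0;
}

static void
slot_set_byval(ParamSlot *slot, SketchDatum value)
{
	slot->isnull = false;
	slot->byval = true;
	slot->u.value = value;
}

/* Copy the referenced bytes into the shared area and remember the offset. */
static bool
slot_set_byref(ParamSlot *slot, SketchArea *area, const void *data, size_t len)
{
	SketchDsaPointer p = sketch_allocate(area, len);

	if (p == SKETCH_INVALID_POINTER)
		return false;
	memcpy(sketch_get_address(area, p), data, len);
	slot->isnull = false;
	slot->byval = false;
	slot->u.ptr = p;
	return true;
}
```

A caller would initialize the area once with sketch_area_init(), fill one
slot per parameter, and readers would resolve by-reference slots through
sketch_get_address(). Because each slot is fixed-size, a single slot can be
overwritten later without re-serializing the others.
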
> The advantage of this is that it would be possible to imagine the
> contents of a slot changing while parallelism is running, which
> doesn't really work with the current serialized-blob representation.
> That would in turn allow us to imagine letting parallel-safe InitPlans
> be evaluated by the first participant that needs the value rather
> than before launching workers, which would be good, not only because
> of the possibility of deferring work for InitPlans attached at or
> above the Gather but also because it could be used for InitPlans below
> the Gather (as long as they don't depend on any parameters computed
> below the Gather).
>

That would be cool, but I think determining whether an InitPlan depends
on any parameter computed below the Gather could be tricky.
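
Leaving the dependency question aside, the "first participant that needs
the value evaluates it" part could look roughly like the sketch below. This
is only my guess at one possible shape, not a proposal for the actual
patch: LazySlot, lazy_slot_get() and example_initplan() are hypothetical
names, C11 atomics stand in for whatever synchronization we would really
use, and a uintptr_t stands in for Datum.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum
{
	SLOT_EMPTY,					/* nobody has evaluated the InitPlan yet */
	SLOT_COMPUTING,				/* one participant is evaluating it */
	SLOT_READY					/* value and isnull are valid */
};

/* Hypothetical shared slot whose contents are filled in on first use. */
typedef struct LazySlot
{
	atomic_int	state;
	bool		isnull;
	uintptr_t	value;			/* stand-in for Datum */
} LazySlot;

/* Hypothetical callback that runs the InitPlan and returns its result. */
typedef uintptr_t (*InitPlanFn) (void *arg, bool *isnull);

static void
lazy_slot_init(LazySlot *slot)
{
	atomic_init(&slot->state, SLOT_EMPTY);
	slot->isnull = true;
	slot->value = 0;
}

/*
 * Return the slot's value, evaluating the InitPlan only if this caller is
 * the first one to ask.  Later callers wait until the value is published.
 */
static uintptr_t
lazy_slot_get(LazySlot *slot, InitPlanFn eval, void *arg, bool *isnull)
{
	int			expected = SLOT_EMPTY;

	assert(slot != NULL && eval != NULL);

	if (atomic_compare_exchange_strong(&slot->state, &expected,
									   SLOT_COMPUTING))
	{
		/* We won the race: compute, then publish with a seq_cst store. */
		slot->value = eval(arg, &slot->isnull);
		atomic_store(&slot->state, SLOT_READY);
	}
	else
	{
		/*
		 * Someone else is computing or has finished.  Busy-waiting keeps
		 * the sketch short; real code would sleep instead, and would need
		 * a way to recover if the evaluator fails before publishing.
		 */
		while (atomic_load(&slot->state) != SLOT_READY)
			;
	}

	*isnull = slot->isnull;
	return slot->value;
}

/* Example InitPlan stand-in: counts its invocations via arg, returns 7. */
static uintptr_t
example_initplan(void *arg, bool *isnull)
{
	(*(int *) arg)++;
	*isnull = false;
	return 7;
}
```

Calling lazy_slot_get() twice on the same slot with example_initplan runs
the callback once; the second call just reads the published value. None of
this addresses the harder part, which is deciding at plan time whether the
InitPlan is safe to evaluate this way.
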

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

