Re: a raft of parallelism-related bug fixes - Mailing list pgsql-hackers

From Robert Haas
Subject Re: a raft of parallelism-related bug fixes
Date
Msg-id CA+Tgmoa77b167BMyUEERb6q4n-+OY-L+XDmLd2cBZ6Mt3FSGZw@mail.gmail.com
Whole thread Raw
In response to Re: a raft of parallelism-related bug fixes  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On Mon, Nov 2, 2015 at 9:29 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Wed, Oct 28, 2015 at 10:23 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>> On Sun, Oct 18, 2015 at 12:17 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>>>> So reviewing patch 13 isn't possible without prior knowledge.
>>>
>>> The basic question for patch 13 is whether ephemeral record types can
>>> occur in executor tuples in any contexts that I haven't identified.  I
>>> know that a tuple table slot can contain have a column that is of type
>>> record or record[], and those records can themselves contain
>>> attributes of type record or record[], and so on as far down as you
>>> like.  I *think* that's the only case.  For example, I don't believe
>>> that a TupleTableSlot can contain a *named* record type that has an
>>> anonymous record buried down inside of it somehow.  But I'm not
>>> positive I'm right about that.
>>
>> I have done some more testing and investigation and determined that
>> this optimism was unwarranted.  It turns out that the type information
>> for composite and record types gets stored in two different places.
>> First, the TupleTableSlot has a type OID, indicating the sort of the
>> value it expects to be stored for that slot attribute.  Second, the
>> value itself contains a type OID and typmod.  And these don't have to
>> match.  For example, consider this query:
>>
>> select row_to_json(i) from int8_tbl i(x,y);
>>
>> Without i(x,y), the HeapTuple passed to row_to_json is labelled with
>> the pg_type OID of int8_tbl.  But with the query as written, it's
>> labeled as an anonymous record type.  If I jigger things by hacking
>> the code so that this is planned as Gather (single-copy) -> SeqScan,
>> with row_to_json evaluated at the Gather node, then the sequential
>> scan kicks out a tuple with a transient record type and stores it into
>> a slot whose type OID is still that of int8_tbl.  My previous patch
>> failed to deal with that; the attached one does.
>>
>> The previous patch was also defective in a few other respects.  The
>> most significant of those, maybe, is that it somehow thought it was OK
>> to assume that transient typmods from all workers could be treated
>> interchangeably rather than individually.  To fix this, I've changed
>> the TupleQueueFunnel implemented by tqueue.c to be merely a
>> TupleQueueReader which handles reading from a single worker only.
>> nodeGather.c therefore creates one TupleQueueReader per worker instead
>> of a single TupleQueueFunnel for all workers; accordingly, the logic
>> for multiplexing multiple queues now lives in nodeGather.c.  This is
>> probably how I should have done it originally - someone, I think Jeff
>> Davis - complained previously that tqueue.c had no business embedding
>> the round-robin policy decision, and he was right.  So this addresses
>> that complaint as well.
>
> Here is an updated version.  This is rebased over recent commits, and
> I added a missing check for attisdropped.

Committed.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



pgsql-hackers by date:

Previous
From: Peter Geoghegan
Date:
Subject: Re: Better name for PQsslAttributes()
Next
From: Tom Lane
Date:
Subject: Re: Better name for PQsslAttributes()