Re: a raft of parallelism-related bug fixes - Mailing list pgsql-hackers

From Robert Haas
Subject Re: a raft of parallelism-related bug fixes
Date
Msg-id CA+TgmoZMH6mJyXX=YLSOvJ8jULFqGgXWZCr_rbkc1nJ+177VSQ@mail.gmail.com
Whole thread Raw
In response to Re: a raft of parallelism-related bug fixes  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: a raft of parallelism-related bug fixes  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On Sun, Oct 18, 2015 at 12:17 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>> So reviewing patch 13 isn't possible without prior knowledge.
>
> The basic question for patch 13 is whether ephemeral record types can
> occur in executor tuples in any contexts that I haven't identified.  I
> know that a tuple table slot can contain have a column that is of type
> record or record[], and those records can themselves contain
> attributes of type record or record[], and so on as far down as you
> like.  I *think* that's the only case.  For example, I don't believe
> that a TupleTableSlot can contain a *named* record type that has an
> anonymous record buried down inside of it somehow.  But I'm not
> positive I'm right about that.

I have done some more testing and investigation and determined that
this optimism was unwarranted.  It turns out that the type information
for composite and record types gets stored in two different places.
First, the TupleTableSlot has a type OID, indicating the sort of the
value it expects to be stored for that slot attribute.  Second, the
value itself contains a type OID and typmod.  And these don't have to
match.  For example, consider this query:

select row_to_json(i) from int8_tbl i(x,y);

Without i(x,y), the HeapTuple passed to row_to_json is labelled with
the pg_type OID of int8_tbl.  But with the query as written, it's
labeled as an anonymous record type.  If I jigger things by hacking
the code so that this is planned as Gather (single-copy) -> SeqScan,
with row_to_json evaluated at the Gather node, then the sequential
scan kicks out a tuple with a transient record type and stores it into
a slot whose type OID is still that of int8_tbl.  My previous patch
failed to deal with that; the attached one does.

The previous patch was also defective in a few other respects.  The
most significant of those, maybe, is that it somehow thought it was OK
to assume that transient typmods from all workers could be treated
interchangeably rather than individually.  To fix this, I've changed
the TupleQueueFunnel implemented by tqueue.c to be merely a
TupleQueueReader which handles reading from a single worker only.
nodeGather.c therefore creates one TupleQueueReader per worker instead
of a single TupleQueueFunnel for all workers; accordingly, the logic
for multiplexing multiple queues now lives in nodeGather.c.  This is
probably how I should have done it originally - someone, I think Jeff
Davis - complained previously that tqueue.c had no business embedding
the round-robin policy decision, and he was right.  So this addresses
that complaint as well.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachment

pgsql-hackers by date:

Previous
From: Robert Haas
Date:
Subject: Re: quieting DEBUG3
Next
From: Andres Freund
Date:
Subject: Re: Add EXTRA_CFLAGS to configure