Thread: pgsql: Modify tqueue infrastructure to support transient record types.

pgsql: Modify tqueue infrastructure to support transient record types.

From
Robert Haas
Date:
Modify tqueue infrastructure to support transient record types.

Commit 4a4e6893aa080b9094dadbe0e65f8a75fee41ac6, which introduced this
mechanism, failed to account for the fact that the RECORD pseudo-type
uses transient typmods that are only meaningful within a single
backend.  Transferring such tuples without modification between two
cooperating backends does not work.  This commit installs a system
for passing the tuple descriptors over the same shm_mq being used to
send the tuples themselves.  The two sides might not assign the same
transient typmod to any given tuple descriptor, so we must also
substitute the appropriate receiver-side typmod for the one used by
the sender.  That adds some CPU overhead, but still seems better than
being unable to pass records between cooperating parallel processes.

Along the way, move the logic for handling multiple tuple queues from
tqueue.c to nodeGather.c; tqueue.c now provides a TupleQueueReader,
which reads from a single queue, rather than a TupleQueueFunnel, which
potentially reads from multiple queues.  This change was suggested
previously as a way to make sure that nodeGather.c rather than tqueue.c
had policy control over the order in which to read from queues, but
it wasn't clear to me until now how good an idea it was.  typmod
mapping needs to be performed separately for each queue, and it is
much simpler if the tqueue.c code handles that and leaves multiplexing
multiple queues to higher layers of the stack.

Branch
------
master

Details
-------
http://git.postgresql.org/pg/commitdiff/6e71dd7ce9766582da453f493bc371d64977282f

Modified Files
--------------
src/backend/executor/nodeGather.c |  138 ++++--
src/backend/executor/tqueue.c     |  983 ++++++++++++++++++++++++++++++++-----
src/include/executor/tqueue.h     |   12 +-
src/include/nodes/execnodes.h     |    4 +-
src/tools/pgindent/typedefs.list  |    2 +-
5 files changed, 986 insertions(+), 153 deletions(-)


Re: pgsql: Modify tqueue infrastructure to support transient record types.

From
Kevin Grittner
Date:
On Friday, November 6, 2015 3:59 PM, Robert Haas <rhaas@postgresql.org> wrote:

> Modified Files
> --------------

> src/backend/executor/tqueue.c    |  983 ++++++++++++++++++++++++++++++++-----

This patch created some referenced local variables:

tqueue.c: In function ‘tqueueReceiveSlot’:
tqueue.c:124:18: error: variable ‘tup’ set but not used [-Werror=unused-but-set-variable]
  HeapTupleHeader tup;
                                   ^
tqueue.c: In function ‘TupleQueueRemapTuple’:
tqueue.c:612:8: error: variable ‘dirty’ set but not used [-Werror=unused-but-set-variable]
  bool  dirty = false;
       ^
cc1: all warnings being treated as errors

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: pgsql: Modify tqueue infrastructure to support transient record types.

From
Amit Kapila
Date:
On Sat, Nov 7, 2015 at 3:29 AM, Robert Haas <rhaas@postgresql.org> wrote:
Modify tqueue infrastructure to support transient record types.


I am getting below compiler warning on Windows:
tqueue.c(662): warning C4715: 'TupleQueueRemap' : not all control paths return a value

Attached patch fixes the problem.


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
Attachment

Re: pgsql: Modify tqueue infrastructure to support transient record types.

From
Amit Kapila
Date:
On Sat, Nov 7, 2015 at 3:29 AM, Robert Haas <rhaas@postgresql.org> wrote:
>
> Modify tqueue infrastructure to support transient record types.
>

+static HeapTuple
+gather_readnext(GatherState *gatherstate)
+{
..
+       if (readerdone)
+       {
+           DestroyTupleQueueReader(reader);
+           --gatherstate->nreaders;
+           if (gatherstate->nreaders == 0)
+           {
+               ExecShutdownGather(gatherstate);
+               return NULL;
+           }
..
}

I think after readers are done, it's not good to call ShutdownGather,
because it will destroy the parallel context as well and the same is
needed for the cases when after the readers are done we still need
to access dsm, like for rescan and for scanning the data from local
node.

Here, we should just shutdown the workers and that is what we were
doing previous to this commit.  Attached patch fixes this problem.


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
Attachment

Re: pgsql: Modify tqueue infrastructure to support transient record types.

From
Robert Haas
Date:
On Mon, Nov 9, 2015 at 8:18 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Sat, Nov 7, 2015 at 3:29 AM, Robert Haas <rhaas@postgresql.org> wrote:
>>
>> Modify tqueue infrastructure to support transient record types.
>>
>
> +static HeapTuple
> +gather_readnext(GatherState *gatherstate)
> +{
> ..
> +       if (readerdone)
> +       {
> +           DestroyTupleQueueReader(reader);
> +           --gatherstate->nreaders;
> +           if (gatherstate->nreaders == 0)
> +           {
> +               ExecShutdownGather(gatherstate);
> +               return NULL;
> +           }
> ..
> }
>
> I think after readers are done, it's not good to call ShutdownGather,
> because it will destroy the parallel context as well and the same is
> needed for the cases when after the readers are done we still need
> to access dsm, like for rescan and for scanning the data from local
> node.
>
> Here, we should just shutdown the workers and that is what we were
> doing previous to this commit.  Attached patch fixes this problem.

Oops.  Rebasing fail.  Committed.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company