Home > mailing lists

Re: BUG #16040: PL/PGSQL RETURN QUERY statement never uses a parallel plan - Mailing list pgsql-bugs

From	Pavel Stehule
Subject	Re: BUG #16040: PL/PGSQL RETURN QUERY statement never uses a parallel plan
Date	March 22, 2020 09:48:14
Msg-id	CAFj8pRBfi7fNtRLsAsCo0Y-mm+X6Y=9Z1ce8bocbWP5MQ65irQ@mail.gmail.com Whole thread Raw
In response to	Re: BUG #16040: PL/PGSQL RETURN QUERY statement never uses a parallel plan (Tom Lane <tgl@sss.pgh.pa.us>)
List	pgsql-bugs

Tree view

ne 22. 3. 2020 v 4:23 odesílatel Tom Lane <tgl@sss.pgh.pa.us> napsal:

PG Bug reporting form <noreply@postgresql.org> writes:
> [ $SUBJECT ]

I got around to looking at this today, and what I find is that the
problem is that exec_stmt_return_query() uses a portal (i.e. a cursor)
to read the results of the query. That seemed like a good idea, back
in the late bronze age, because it allowed plpgsql to fetch the query
results a few rows at a time and not risk blowing out memory with a huge
SPI result. However, the parallel-query infrastructure refuses to
parallelize when the query is being read via a cursor.

I think that the latter restriction is probably sane, because we don't
want to suspend execution of a parallel query while we've got worker
processes waiting. And there might be some implementation restrictions
lurking under it too --- that's not a part of the code I know in any
detail.

However, there's no fundamental reason why exec_stmt_return_query has
to use a cursor. It's going to run the query to completion immediately
anyway, and shove all the result rows into a tuplestore. What we lack
is a way to get the SPI query to pass its results directly to a
tuplestore, without the SPITupleTable intermediary. (Note that the
tuplestore can spill a large result to disk, whereas SPITupleTable
can't do that.)

So, attached is a draft patch to enable that. By getting rid of the
intermediate SPITupleTable, this should improve the performance of
RETURN QUERY somewhat even without considering the possibility of
parallelizing the source query. I've not tried to measure that though.
I've also not looked for other places that could use this new
infrastructure, but there may well be some.

One thing I'm not totally pleased about with this is adding another
SPI interface routine using the old parameter-values API (that is,
null flags as char ' '/'n'). That was the path of least resistance
given the other moving parts in pl_exec.c and spi.c, but maybe we
should try to modernize that before we set it in stone.

Another thing standing between this patch and committability is suitable
additions to the SPI documentation. But I saw no value in writing that
before the previous point is settled.

I will go add this to the next commitfest (for v14), but I wonder
if we should try to squeeze it into v13? This isn't the only
complaint we've gotten about non-parallelizability of RETURN QUERY.

Pavel

regards, tom lane

pgsql-bugs by date:

From: Tom Lane
Date: 22 March 2020, 06:23:03
Subject: Re: BUG #16040: PL/PGSQL RETURN QUERY statement never uses a parallel plan

From: Andres Freund
Date: 22 March 2020, 09:52:17
Subject: Re: BUG #16293: postgres segfaults and returns SQLSTATE 08006

Re: BUG #16040: PL/PGSQL RETURN QUERY statement never uses a parallel plan - Mailing list pgsql-bugs

Previous

Next