Re: libpq pipelining - Mailing list pgsql-hackers

From Matt Newell
Subject Re: libpq pipelining
Date
Msg-id 4208748.4mnkvVCjOB@obsidian
In response to Re: libpq pipelining  (Craig Ringer <craig@2ndquadrant.com>)
On Thursday, December 04, 2014 10:30:46 PM Craig Ringer wrote:
> On 12/04/2014 05:08 PM, Heikki Linnakangas wrote:
> > A good API is crucial for this. It should make it easy to write an
> > application that does pipelining, and to handle all the error conditions
> > in a predictable way. I'd suggest that you write the documentation
> > first, before writing any code, so that we can discuss the API. It
> > doesn't have to be in SGML format yet, a plain-text description of the
> > API will do.
>
> I strongly agree.
>
First pass at the documentation changes attached, along with a new example
that demonstrates pipelining 3 queries, with the middle one resulting in a
PGRES_FATAL_ERROR response.

With the API I am proposing, only 2 new functions (PQgetFirstQuery,
PQgetLastQuery) are required to be able to match each result to the query that
caused it.  Another function, PQgetNextQuery, allows iterating through the
pending queries, and PQgetQueryCommand returns the original query text.

Adding the ability to set a user-supplied pointer on the PGquery struct might
make things much easier for some frameworks, and other users might want a
callback, but I don't think either is required.
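From the application side this might look roughly as follows. To be clear, this is only a sketch against the proposed patch: PGquery, PQgetFirstQuery, and PQgetQueryCommand do not exist in stock libpq, multiple PQsendQuery calls on a busy connection only work with the patch applied, and the handling of the NULL results that delimit each query is glossed over.

```c
#include <stdio.h>
#include <libpq-fe.h>

/* Sketch: queue three queries, then match each result back to the
 * query that produced it via the proposed PQgetFirstQuery(). */
static void
pipeline_demo(PGconn *conn)
{
    PQsendQuery(conn, "SELECT 1");
    PQsendQuery(conn, "SELECT bogus");   /* will yield PGRES_FATAL_ERROR */
    PQsendQuery(conn, "SELECT 2");

    PGresult *res;
    while ((res = PQgetResult(conn)) != NULL)
    {
        /* Proposed accessor: the query whose results are being read now. */
        PGquery *query = PQgetFirstQuery(conn);

        printf("query \"%s\": %s\n",
               PQgetQueryCommand(query),          /* original query text */
               PQresStatus(PQresultStatus(res)));
        PQclear(res);
    }
}
```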

> Applications need to be able to reliably predict what will happen if
> there's an error in the middle of a pipeline.
>
Yes, the API I am proposing makes it easy to get the results for each
submitted query independently of the success or failure of previous queries in
the pipeline.

> Consideration of implicit transactions (autocommit), the whole pipeline
> being one transaction, or multiple transactions is needed.
The more I think about this the more confident I am that no extra work is
needed.

Unless we start doing some preliminary processing of the query inside of
libpq, our hands are tied wrt sending a sync at the end of each query.  The
reason for this is that we rely on the ReadyForQuery message to indicate the
end of a query, so without the sync there is no way to tell if the next result
is from another statement in the current query, or the first statement in the
next query.
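Concretely, in the v3 extended query protocol the server emits ReadyForQuery only in response to a Sync, so the per-query Sync is exactly what delimits each query's results on the wire (-> is client to server, <- is server to client):

```
->  Parse("SELECT 1")  Bind  Describe  Execute  Sync
->  Parse("SELECT 2")  Bind  Describe  Execute  Sync
<-  ParseComplete  BindComplete  RowDescription  DataRow  CommandComplete
<-  ReadyForQuery                    (end of first query's results)
<-  ParseComplete  BindComplete  RowDescription  DataRow  CommandComplete
<-  ReadyForQuery                    (end of second query's results)
```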

I also don't see a reason to need multiple queries without a Sync between
them.  If the user wants all queries to succeed or fail together, it should be
no problem to start the pipeline with BEGIN and complete it with COMMIT.  But
I may be missing some detail...
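That is, the application can get all-or-nothing behavior just by queuing the transaction commands itself. Again a sketch against the patched PQsendQuery (which queues rather than fails on a busy connection); the table is a made-up example:

```c
/* All-or-nothing pipeline: bracket the queued queries in a transaction.
 * If the UPDATE errors, every later statement fails with "current
 * transaction is aborted" and the COMMIT performs a rollback, so the
 * INSERT does not take effect either. */
PQsendQuery(conn, "BEGIN");
PQsendQuery(conn, "INSERT INTO t VALUES (1)");
PQsendQuery(conn, "UPDATE t SET x = x + 1");
PQsendQuery(conn, "COMMIT");
/* ...then consume the results as usual with PQgetResult()... */
```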


>
> Apps need to be able to wait for the result of a query partway through a
> pipeline, e.g. scheduling four queries, then waiting for the result of
> the 2nd.
>
Right.  With the API I am proposing the user does have to process each result
until it gets to the one it wants, but that's no problem.  It would also be
trivial to add a function

PGresult * PQgetNextQueryResult(PGquery *query);

that discards all results from previous queries, very similar to how PQexec
discards all results from previous async queries.
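Such a function could presumably be built on the two proposed accessors, along these lines. This is purely hypothetical; I pass the PGconn explicitly since it isn't settled whether PGquery would carry a connection pointer, and it assumes PQgetResult/PQgetFirstQuery behave as described above:

```c
/* Hypothetical sketch: discard the results of queries queued before
 * "query", then return the first result belonging to "query". */
PGresult *
PQgetNextQueryResult(PGconn *conn, PGquery *query)
{
    PGresult *res;

    while ((res = PQgetResult(conn)) != NULL)
    {
        if (PQgetFirstQuery(conn) == query)
            return res;      /* first result of the query we want */
        PQclear(res);        /* belongs to an earlier query: discard */
    }
    return NULL;             /* no more results pending */
}
```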

It would also be possible to queue the results and allow retrieving them out
of order, but I think that adds unnecessary complexity and might also make it
easy for users to never retrieve and free some results.

> There are probably plenty of other wrinkly bits to think about.

Yup, I'm sure I'm still missing some significant things at this point...

Matt Newell

