Re: RFC: Async query processing - Mailing list pgsql-hackers

From Claudio Freire
Subject Re: RFC: Async query processing
Date
Msg-id CAGTBQpYc11rBtoNdMh8daoafnYcAJyU_0EdaeB03Fk2TjBSCNA@mail.gmail.com
In response to RFC: Async query processing  (Florian Weimer <fweimer@redhat.com>)
Responses Re: RFC: Async query processing  (Florian Weimer <fweimer@redhat.com>)
List pgsql-hackers
On Sun, Nov 3, 2013 at 3:58 PM, Florian Weimer <fweimer@redhat.com> wrote:
> I would like to add truly asynchronous query processing to libpq, enabling
> command pipelining.  The idea is to allow applications to auto-tune to
> the bandwidth-delay product and reduce the number of context switches when
> running against a local server.
...
> If the application is not interested in intermediate query results, it would
> use something like this:
...
> If there is no need to exit from the loop early (say, because errors are
> expected to be extremely rare), the PQgetResultNoWait call can be left out.

Making such a distinction doesn't seem wise to me. It sounds like
you're oversimplifying, and that is why you need "modes": to work
around the evidently restrictive limits of the simplified interface.
It would then be only a (short) matter of time before some other
limitation requires yet another mode.


>   PGAsyncMode oldMode = PQsetsendAsyncMode(conn, PQASYNC_RESULT);
>   bool more_data;
>   do {
>      more_data = ...;
>      if (more_data) {
>        int ret = PQsendQueryParams(conn,
>          "INSERT ... RETURNING ...", ...);
>        if (ret == 0) {
>          // handle low-level error
>        }
>      }
>      // Consume all pending results.
>      while (1) {
>        PGresult *res;
>        if (more_data) {
>          res = PQgetResultNoWait(conn);
>        } else {
>          res = PQgetResult(conn);
>        }

Somehow, that code looks backwards. I mean, really backwards. Wouldn't
that be !more_data?

In any case, pipelining like that, without a clear distinction, in the
wire protocol, of which results pertain to which query, could be a
recipe for trouble when subtle bugs, either in lib usage or
implementation, mistakenly treat one query's result as another's.

Notice that it's not an uncommon mistake, and this is much more likely
with such an unclear interface.
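
To make the attribution concern concrete, here is a minimal sketch of the
bookkeeping a client would need on its own side, assuming results come back
in the order the queries were sent. PendingQueue, pending_push and
pending_pop are hypothetical names used for illustration only; none of
them is part of libpq.

```c
#include <assert.h>
#include <stddef.h>

#define PENDING_MAX 16

/* Hypothetical client-side bookkeeping; not a libpq structure. */
typedef struct {
    int    tags[PENDING_MAX]; /* caller-chosen id for each in-flight query */
    size_t head;              /* index of the oldest in-flight query */
    size_t count;             /* number of queries still awaiting a result */
} PendingQueue;

static void pending_init(PendingQueue *q)
{
    assert(q != NULL);
    q->head = 0;
    q->count = 0;
}

/* Record that a query was sent. Returns 1 on success, 0 if the queue is full. */
static int pending_push(PendingQueue *q, int tag)
{
    assert(q != NULL);
    if (q->count == PENDING_MAX)
        return 0;
    q->tags[(q->head + q->count) % PENDING_MAX] = tag;
    q->count++;
    return 1;
}

/*
 * Call when a complete result arrives. Returns the tag of the query the
 * result belongs to, or -1 if nothing was in flight (a usage or
 * implementation bug that would otherwise go unnoticed).
 */
static int pending_pop(PendingQueue *q)
{
    assert(q != NULL);
    if (q->count == 0)
        return -1;
    int tag = q->tags[q->head];
    q->head = (q->head + 1) % PENDING_MAX;
    q->count--;
    return tag;
}
```

Every caller has to get this pairing right by hand, and nothing on the wire
confirms it, which is the kind of subtle bug described above.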


> Instead of buffering the results, we could buffer the encoded command
> messages in PQASYNC_RESULT mode.  This means that PQsendQueryParams would
> not block when it cannot send the (complete) command message, but store in
> the connection object so that the subsequent PQgetResultNoWait and
> PQgetResult would send it.  This might work better with single-tuple result
> mode.  We cannot avoid buffering either multiple queries or multiple
> responses if we want to utilize the link bandwidth, or we'd risk deadlocks.

This is a non-solution. Such an implementation, at least as described,
would remove neither network latency nor context switches; it would
be purely an API change with no externally visible change in
behavior.

An effective solution must include multi-command packets. Without
knowing the wire protocol in detail, something like:

PARSE: INSERT blah
BIND: args
EXECUTE with DISCARD
PARSE: INSERT blah
BIND: args
EXECUTE with DISCARD
PARSE: SELECT  blah
BIND: args
EXECUTE with FETCH ALL

All of that in one packet would be efficient and error-free (IMO).
This precludes multiple result-containing commands, but it could hold
multiple result-less ones.

This could be better specified with expectations:

PARSE: INSERT blah
EXPECT: ParseComplete
BIND: args
EXECUTE
EXPECT: CommandComplete
PARSE: INSERT blah
EXPECT: ParseComplete
BIND: args
EXECUTE
EXPECT: CommandComplete
PARSE: SELECT blah
EXPECT: ParseComplete
BIND: args
EXECUTE

A failed expectation would discard the command buffer and send an
ExpectationFailed message.
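
The expectation check described above can be sketched as a small
simulation. This is only an illustration under stated assumptions: the
BatchStep type, the Reply enum and batch_run are hypothetical, the
"outcomes" array stands in for what the server would produce for each step,
and nothing here touches the real wire protocol.

```c
#include <assert.h>
#include <stddef.h>

/* Possible outcomes of one step; REPLY_ANY marks a step with no EXPECT. */
typedef enum {
    REPLY_PARSE_COMPLETE,
    REPLY_COMMAND_COMPLETE,
    REPLY_ERROR,
    REPLY_ANY
} Reply;

/* One queued command plus its (optional) expectation. Hypothetical type. */
typedef struct {
    const char *text;   /* e.g. "PARSE: INSERT blah" */
    Reply       expect; /* REPLY_ANY if the step carries no expectation */
} BatchStep;

/*
 * Walk the batch against the simulated outcome of each step.
 * Returns the number of steps that completed as expected. If an
 * expectation fails, *expectation_failed is set to 1 and every step
 * from the failing one onwards counts as discarded.
 */
static size_t batch_run(const BatchStep *steps, const Reply *outcomes,
                        size_t nsteps, int *expectation_failed)
{
    assert(steps != NULL && outcomes != NULL && expectation_failed != NULL);
    *expectation_failed = 0;
    for (size_t i = 0; i < nsteps; i++) {
        if (steps[i].expect != REPLY_ANY && steps[i].expect != outcomes[i]) {
            *expectation_failed = 1; /* would trigger ExpectationFailed */
            return i;
        }
    }
    return nsteps;
}
```

With a batch of PARSE, EXECUTE, PARSE where the EXECUTE yields an error
instead of CommandComplete, batch_run stops after the first step and the
second PARSE is never acted upon.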

API ways of performing that could vary. Perhaps a
PQqueueQueryExpectNoResults could build up a message buffer, flushed
either by an explicit call to a flush command or by a PQsendQuery /
PQgetResult. Perhaps all of libpq could work with a message queue and
flush it only at specific points (PQgetResult, for instance), so that
multiple PQsendQuery calls could be turned into "expect command
complete" entries.
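
As a rough sketch of the message-queue idea, assuming a fixed-size
client-side buffer and a flush that stands in for a single network write:
OutBuf, outbuf_queue and outbuf_flush are hypothetical names, not libpq
functions, and the flush here only counts writes instead of sending
anything.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define OUTBUF_CAP 1024

/* Hypothetical outgoing message queue; not a libpq structure. */
typedef struct {
    char     data[OUTBUF_CAP];
    size_t   len;     /* bytes queued but not yet written */
    unsigned flushes; /* number of simulated network writes so far */
} OutBuf;

static void outbuf_init(OutBuf *b)
{
    assert(b != NULL);
    b->len = 0;
    b->flushes = 0;
}

/* Stand-in for one write to the socket: counts it and empties the buffer. */
static void outbuf_flush(OutBuf *b)
{
    assert(b != NULL);
    if (b->len == 0)
        return; /* nothing queued, so no write is needed */
    b->flushes++;
    b->len = 0;
}

/*
 * Queue one encoded command without writing it. Flushes first if the
 * message would not fit. Returns 0 if the message can never fit.
 */
static int outbuf_queue(OutBuf *b, const char *msg)
{
    assert(b != NULL && msg != NULL);
    size_t n = strlen(msg);
    if (n > OUTBUF_CAP)
        return 0;
    if (b->len + n > OUTBUF_CAP)
        outbuf_flush(b);
    memcpy(b->data + b->len, msg, n);
    b->len += n;
    return 1;
}
```

Queuing three commands and then flushing once at the point where a result
is actually needed yields one write instead of three, which is where the
saving in round trips and context switches would come from.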


