Re: PATCH: Batch/pipelining support for libpq - Mailing list pgsql-hackers

From Matthieu Garrigues
Subject Re: PATCH: Batch/pipelining support for libpq
Date
Msg-id CAJkzx4T5E-2cQe3dtv2R78dYFvz+in8PY7A8MArvLhs_pg75gg@mail.gmail.com
Whole thread Raw
In response to PATCH: Batch/pipelining support for libpq  (Craig Ringer <craig@2ndquadrant.com>)
Responses Re: PATCH: Batch/pipelining support for libpq  (Alvaro Herrera <alvherre@2ndquadrant.com>)
Re: PATCH: Batch/pipelining support for libpq  (Dave Cramer <davecramer@postgres.rocks>)
List pgsql-hackers
Hi,

It seems like this patch is nearly finished. I fixed all the remaining
issues. I'm also asking
a confirmation of the test scenarios you want to see in the next
version of the patch.

> Hi,
>
> On 2020-07-10 19:01:49 -0400, Alvaro Herrera wrote:
> > Totally unasked for, here's a rebase of this patch series.  I didn't do
> > anything other than rebasing to current master, solving a couple of very
> > trivial conflicts, fixing some whitespace complaints by git apply, and
> > running tests to verify everthing works.
> >
> > I don't foresee working on this at all, so if anyone is interested in
> > seeing this feature in, I encourage them to read and address
> > Horiguchi-san's feedback.
>
> Nor am I planning to do so, but I do think its a pretty important
> improvement.
>
>
Fixed

>
>
> > +/*
> > + * PQrecyclePipelinedCommand
> > + * Push a command queue entry onto the freelist. It must be a dangling entry
> > + * with null next pointer and not referenced by any other entry's next pointer.
> > + */
>
> Dangling sounds a bit like it's already freed.
>
>
Fixed

>
>
> > +/*
> > + * PQbatchSendQueue
> > + * End a batch submission by sending a protocol sync. The connection will
> > + * remain in batch mode and unavailable for new synchronous command execution
> > + * functions until all results from the batch are processed by the client.
>
> I feel like the reference to the protocol sync is a bit too low level
> for an external API. It should first document what the function does
> from a user's POV.
>
> I think it'd also be good to document whether / whether not queries can
> already have been sent before PQbatchSendQueue is called or not.
>
Fixed

>
>
>
> > + if (conn->batch_status == PQBATCH_MODE_ABORTED && conn->queryclass != PGQUERY_SYNC)
> > + {
> > + /*
> > + * In an aborted batch we don't get anything from the server for each
> > + * result; we're just discarding input until we get to the next sync
> > + * from the server. The client needs to know its queries got aborted
> > + * so we create a fake PGresult to return immediately from
> > + * PQgetResult.
> > + */
> > + conn->result = PQmakeEmptyPGresult(conn,
> > +   PGRES_BATCH_ABORTED);
> > + if (!conn->result)
> > + {
> > + printfPQExpBuffer(&conn->errorMessage,
> > +  libpq_gettext("out of memory"));
> > + pqSaveErrorResult(conn);
> > + return 0;
>
> Is there any way an application can recover at this point? ISTM we'd be
> stuck in the previous asyncStatus, no?
>

conn->result is null when malloc(sizeof(PGresult)) returns null. It's
very unlikely unless
the server machine is out of memory, so the server will probably be
unresponsive anyway.

I'm leaving this as it is but if anyone has a solution simple to
implement I'll fix it.

>
>
> > +/* pqBatchFlush
> > + * In batch mode, data will be flushed only when the out buffer reaches the threshold value.
> > + * In non-batch mode, data will be flushed all the time.
> > + */
> > +static int
> > +pqBatchFlush(PGconn *conn)
> > +{
> > + if ((conn->batch_status == PQBATCH_MODE_OFF)||(conn->outCount >= OUTBUFFER_THRESHOLD))
> > + return(pqFlush(conn));
> > + return 0; /* Just to keep compiler quiet */
> > +}
>
> unnecessarily long line.
>
Fixed

>
> > +/*
> > + * Connection's outbuffer threshold is set to 64k as it is safe
> > + * in Windows as per comments in pqSendSome() API.
> > + */
> > +#define OUTBUFFER_THRESHOLD 65536
>
> I don't think the comment explains much. It's fine to send more than 64k
> with pqSendSome(), they'll just be send with separate pgsecure_write()
> invocations. And only on windows.
>
> It clearly makes sense to start sending out data at a certain
> granularity to avoid needing unnecessary amounts of memory, and to make
> more efficient use of latency / serer side compute.
>
> It's not implausible that 64k is the right amount for that, I just don't
> think the explanation above is good.
>

Fixed

> > diff --git a/src/test/modules/test_libpq/testlibpqbatch.c b/src/test/modules/test_libpq/testlibpqbatch.c
> > new file mode 100644
> > index 0000000000..4d6ba266e5
> > --- /dev/null
> > +++ b/src/test/modules/test_libpq/testlibpqbatch.c
> > @@ -0,0 +1,1456 @@
> > +/*
> > + * src/test/modules/test_libpq/testlibpqbatch.c
> > + *
> > + *
> > + * testlibpqbatch.c
> > + * Test of batch execution functionality
> > + */
> > +
> > +#ifdef WIN32
> > +#include <windows.h>
> > +#endif
>
> ISTM that this shouldn't be needed in a test program like this?
> Shouldn't libpq abstract all of this away?
>

Fixed.

>
> > +static void
> > +simple_batch(PGconn *conn)
> > +{
>
> ISTM that all or at least several of these should include tests of
> transactional behaviour with pipelining (e.g. using both implicit and
> explicit transactions inside a single batch, using transactions across
> batches, multiple explicit transactions inside a batch).
>

@Andres, just to make sure I understood, here is the test scenarios I'll add:

Implicit and explicit multiple transactions:
   start batch:
     create_table
     insert X
     begin transaction
       insert X
     commit transaction
     begin transaction
       insert Y
     commit transaction
  end batch

Transaction across batches:
   start batch:
     create_table
     begin transaction
       insert X
   end batch
   start batch:
       insert Y
    commit transaction
  end batch

>
>
> > + PGresult   *res = NULL;
> > + const char *dummy_params[1] = {"1"};
> > + Oid dummy_param_oids[1] = {INT4OID};
> > +
> > + fprintf(stderr, "simple batch... ");
> > + fflush(stderr);
>
> Why do we need fflush()? IMO that shouldn't be needed in a use like
> this? Especially not on stderr, which ought to be unbuffered?
>
Removed.

>
> > + /*
> > + * Enter batch mode and dispatch a set of operations, which we'll then
> > + * process the results of as they come in.
> > + *
> > + * For a simple case we should be able to do this without interim
> > + * processing of results since our out buffer will give us enough slush to
> > + * work with and we won't block on sending. So blocking mode is fine.
> > + */
> > + if (PQisnonblocking(conn))
> > + {
> > + fprintf(stderr, "Expected blocking connection mode\n");
> > + goto fail;
> > + }
>
> Perhaps worth adding a helper for this?
>
> #define EXPECT(condition, explanation) \
>

Fixed (and saved 600 lines :).



Once I get your confirmation of the test scenarios, I'll implement
them and share another patch.

Attachment

pgsql-hackers by date:

Previous
From: Ranier Vilela
Date:
Subject: Re: Clang UndefinedBehaviorSanitize (Postgres14) Detected undefined-behavior
Next
From: Alvaro Herrera
Date:
Subject: Re: list of extended statistics on psql