Re: PATCH: Batch/pipelining support for libpq - Mailing list pgsql-hackers

From Alvaro Herrera
Subject Re: PATCH: Batch/pipelining support for libpq
Date
Msg-id 20201114003753.GA30616@alvherre.pgsql
Whole thread Raw
In response to Re: PATCH: Batch/pipelining support for libpq  (Alvaro Herrera <alvherre@alvh.no-ip.org>)
Responses Re: PATCH: Batch/pipelining support for libpq  ("Daniel Verite" <daniel@manitou-mail.org>)
Re: PATCH: Batch/pipelining support for libpq  ("David G. Johnston" <david.g.johnston@gmail.com>)
List pgsql-hackers
Here's a v25.

I made a few more changes to the docs per David's suggestions; I also
reordered the sections quite a bit.  It's now like this:
 * Batch Mode
   * Using Batch Mode
     * Issuing Queries
     * Processing Results
     * Error Handling
     * Interleaving Result Processing and Query Dispatch
     * Ending Batch Mode
   * Functions Associated with Batch Mode
   * When to Use Batching

To me as a reader, this makes more sense, but if you disagree, I think
we should discuss further changes.  (For example, maybe we should move
the "Functions" section at the end?)  The original had "When to Use
Batching" at the start, but it seemed to me that the points it's making
are not as critical as understanding what it is.

I reworked the test program to better fit the TAP model; I found that if
one test mecha failed in whatever way, the connection would be in a
weird state and cause the next test to also fail.  I changed so that it
runs one test and exits; then the t/001_libpq_async.pl file (hmm, need
to rename to 001_batch.pl I guess) calls it once for each test.
I adapted the test code to our code style.  I also removed the "timings"
stuff; I think that's something better left to pgbench.

(I haven't looked at Daniel's pgbench stuff yet, but I will do that
next.)

While looking at how the tests worked, I gave a hard stare at the new
libpq code and cleaned it up also.  There's a lot of minor changes, but
nothing truly substantial.  I moved the code around a lot, to keep
things where grouped together they belong.

I'm not 100% clear on things like PGconn->batch_status and how
PGconn->asyncStatus works.  Currently everything seems to work
correctly, but I'm worried that because we add new status values to
asyncStatus, some existing code might not be doing everything correctly
(for example when changing to/from ASYNC_BUSY in some cases, are we 100%
we shouldn't be changing to ASYNC_QUEUED?)


While looking this over I noticed a thread from 2014[1] where Matt
Newell tried to implement this stuff and apparently the main review
comment he got before abandoning the patch was that the user would like
a way to access the query that corresponded to each result.  The current
patch set does not address that need; the approach to this problem is
that it's on the application's head to keep track of this.  Honestly I
don't understand why it would be otherwise ... I'm not sure that it
makes sense to expect that the application is stupid enough that it
doesn't keep track in which order it sent things, but bright enough to
keep pointers to the queries it sent (??).  So this seems okay to me.
But added Heikki and Claudio to CC because of that old thread.


[1] https://postgr.es/m/2059418.jZQvL3lH90@obsidian


Attachment

pgsql-hackers by date:

Previous
From: Tomas Vondra
Date:
Subject: Re: Zedstore - compressed in-core columnar storage
Next
From: Michael Paquier
Date:
Subject: Re: Misc typos