Re: PATCH: Batch/pipelining support for libpq - Mailing list pgsql-hackers

From Craig Ringer
Subject Re: PATCH: Batch/pipelining support for libpq
Msg-id CAMsr+YEMCXXcZHE4FjH9FZDfqPZNfVZV2drHAs0wn9hFzda=Kg@mail.gmail.com
In response to Re: PATCH: Batch/pipelining support for libpq  (Michael Paquier <michael.paquier@gmail.com>)
List pgsql-hackers
On 24 May 2016 at 00:00, Michael Paquier <michael.paquier@gmail.com> wrote:
> On Mon, May 23, 2016 at 8:50 AM, Andres Freund <andres@anarazel.de> wrote:
>>> This should be very useful for optimising FDWs, Postgres-XC, etc.
>>
>> And optimizing normal clients.
>>
>> Not easy, but I'd be very curious how much psql's performance improves
>> when replaying a .sql style dump, and replaying from a !tty fd, after
>> hacking it up to use the batch API.

I haven't tried that, but I agree it'd be interesting. So would pg_restore, for that matter, though its use of COPY for the bulk of its work means batching wouldn't make tons of difference there.

I think it'd be safe to enable it automatically in psql's --single-transaction mode. It's also safe to send everything after an explicit BEGIN and until the next COMMIT as a batch via libpq, and since psql already parses the SQL enough to detect statement boundaries, that shouldn't be too hard to handle.
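As a rough illustration of the BEGIN ... COMMIT idea above, here is a minimal sketch that tags already-split statements with a batch number. It is not code from the patch or from psql: `starts_with_keyword` and `assign_batches` are hypothetical names, the input is assumed to be split into statements already, and the keyword matching is deliberately naive (it ignores START TRANSACTION, END, ROLLBACK, comments and quoting).

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: does stmt begin with the upper-case keyword kw
 * (compared case-insensitively), followed by whitespace, ';' or end of string? */
static int starts_with_keyword(const char *stmt, const char *kw)
{
    while (isspace((unsigned char) *stmt))
        stmt++;
    for (; *kw != '\0'; kw++, stmt++)
        if (toupper((unsigned char) *stmt) != *kw)
            return 0;
    return *stmt == '\0' || *stmt == ';' || isspace((unsigned char) *stmt);
}

/*
 * Hypothetical helper: for each of the n statements, set batch_id[i] to
 *   0      if the statement is outside any explicit transaction
 *          (send it on its own), or
 *   k > 0  if it belongs to the k-th BEGIN ... COMMIT run
 *          (BEGIN and COMMIT themselves are included in the run).
 * Returns the number of BEGIN ... COMMIT runs found.
 */
int assign_batches(const char *const *stmts, size_t n, int *batch_id)
{
    int nbatches = 0;
    int in_txn = 0;

    assert(stmts != NULL && batch_id != NULL);

    for (size_t i = 0; i < n; i++)
    {
        if (!in_txn && starts_with_keyword(stmts[i], "BEGIN"))
        {
            in_txn = 1;
            nbatches++;
        }
        batch_id[i] = in_txn ? nbatches : 0;
        if (in_txn && starts_with_keyword(stmts[i], "COMMIT"))
            in_txn = 0;
    }
    return nbatches;
}
```

Statements that share a nonzero batch number are the ones that could be queued together; anything tagged 0 would go through the existing one-at-a-time path.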

However, psql is synchronous, using the PQexec family of blocking calls. It's all fairly well contained in SendQuery and PSQLexec, but making it use batching would still require restructuring those to use the asynchronous nonblocking API and append each query to a pending-list, plus the addition of a select() loop to handle results and dispatch more work. MainLoop() isn't structured around a select or poll; it loops over gets. So while it'd be interesting to see what difference batching made, the changes needed to make psql use it would be a bit more invasive. It's far from a rewrite, but to avoid lots of code duplication it'd have to change everything to use nonblocking mode and a select loop, which is a big change for such a core tool.
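To make the pending-list part concrete, here is a minimal sketch of a FIFO of queries awaiting results. It assumes results come back in the order the queries were sent, so the oldest pending entry is always the one the next result belongs to. `PendingQuery`, `PendingList`, `pending_append` and `pending_pop` are hypothetical names, not the patch's PGcommandQueueEntry and not psql code, and the select() loop that would drive the list is left out.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical list cell: one query that has been sent but not yet answered. */
typedef struct PendingQuery
{
    struct PendingQuery *next;
    char       *sql;            /* private copy of the query text */
} PendingQuery;

/* Hypothetical FIFO: append at the tail when sending, pop the head on a result. */
typedef struct PendingList
{
    PendingQuery *head;
    PendingQuery *tail;
} PendingList;

/* Record a query as pending. Returns 0 on success, -1 if out of memory. */
int pending_append(PendingList *list, const char *sql)
{
    PendingQuery *cell;

    assert(list != NULL && sql != NULL);

    cell = malloc(sizeof(*cell));
    if (cell == NULL)
        return -1;
    cell->sql = malloc(strlen(sql) + 1);
    if (cell->sql == NULL)
    {
        free(cell);
        return -1;
    }
    strcpy(cell->sql, sql);
    cell->next = NULL;

    if (list->tail != NULL)
        list->tail->next = cell;
    else
        list->head = cell;
    list->tail = cell;
    return 0;
}

/* Remove the oldest pending query and return its text (caller frees it),
 * or NULL if nothing is pending. */
char *pending_pop(PendingList *list)
{
    PendingQuery *cell;
    char       *sql;

    assert(list != NULL);

    cell = list->head;
    if (cell == NULL)
        return NULL;
    list->head = cell->next;
    if (list->head == NULL)
        list->tail = NULL;
    sql = cell->sql;
    free(cell);
    return sql;
}
```

In a nonblocking client, the dispatch side would call `pending_append` after each query is sent, and the result-handling side would call `pending_pop` each time a complete result has been read, so error messages and output can be tied back to the statement that produced them.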

This is a bit of a side-project and I've got to get back to "real work" so I don't expect to do a proper patch for psql any time soon. I'd rather not try to build too much on this until it's seen some review and I know the API won't need a drastic rewrite anyway. I'll see if I can do a hacked-up version one evening to see what it does for performance though.

> Did you consider the use of simple_list.c instead of introducing a new
> mimic as PGcommandQueueEntry? It would be cool avoiding adding new
> list emulations on frontends.

Nope. I didn't realise it was there; I've done very little on the C client and library side so far. So thanks, I'll update it accordingly.

--
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
