Re: [HACKERS] Pipelining executions to postgresql server - Mailing list pgsql-jdbc

From Mikko Tiihonen
Subject Re: [HACKERS] Pipelining executions to postgresql server
Date
Msg-id 1415058981678.77807@nitorcreations.com
In response to Re: [HACKERS] Pipelining executions to postgresql server  (Craig Ringer <craig@2ndquadrant.com>)
Responses Re: [HACKERS] Pipelining executions to postgresql server  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: [HACKERS] Pipelining executions to postgresql server  (Craig Ringer <craig@2ndquadrant.com>)
Re: [HACKERS] Pipelining executions to postgresql server  (Craig Ringer <craig@2ndquadrant.com>)
List pgsql-jdbc
> Craig Ringer wrote:
> On 11/02/2014 09:27 PM, Mikko Tiihonen wrote:
> > Is the following summary correct:
> > - the network protocol supports pipelining
> Yes.
>
> All you have to do is *not* send a Sync message and be aware that the
> server will discard all input until the next Sync, so pipelining +
> autocommit doesn't make a ton of sense for error handling reasons.

I do not quite grasp why not sending Sync is so important. My proof-of-concept setup ran queries with autocommit
enabled.
When looking with Wireshark I could observe that the client sent 3-10 groups of P/B/D/E/S messages
(Parse/Bind/Describe/Execute/Sync) to the server before the server started sending the corresponding 1/2/T/D/C/Z
replies (ParseComplete/BindComplete/RowDescription/DataRow/CommandComplete/ReadyForQuery) for each request. Every 10
requests the test application waited for all the replies to arrive, so as not to overflow the network buffers (which
is known to cause deadlocks with the current pgjdbc driver).
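
A minimal sketch of the windowing I describe above, in case it helps the discussion. It is not pgjdbc code: the `Wire` interface, the `LoopbackWire` stand-in and the `WindowedPipeline` class are hypothetical names invented for illustration, and the in-memory "server" simply answers requests in order.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Hypothetical transport; NOT a pgjdbc interface. */
interface Wire {
    void send(String sql);   // queue one P/B/D/E/S group without waiting
    String receive();        // block until the replies for the oldest request arrive
}

/** In-memory stand-in for a server that answers requests strictly in order. */
class LoopbackWire implements Wire {
    private final Deque<String> pending = new ArrayDeque<>();
    int maxOutstanding = 0;  // highest number of unanswered requests seen so far

    public void send(String sql) {
        pending.addLast(sql);
        maxOutstanding = Math.max(maxOutstanding, pending.size());
    }

    public String receive() {
        return "ok:" + pending.removeFirst();
    }
}

class WindowedPipeline {
    /** Sends queries back to back, draining all replies every `window` requests. */
    static List<String> run(Wire wire, List<String> queries, int window) {
        List<String> replies = new ArrayList<>();
        int inFlight = 0;
        for (String sql : queries) {
            wire.send(sql);
            inFlight++;
            if (inFlight == window) {
                // Window is full: read every outstanding reply before sending more.
                for (; inFlight > 0; inFlight--) {
                    replies.add(wire.receive());
                }
            }
        }
        // Drain the final, possibly partial, window.
        for (; inFlight > 0; inFlight--) {
            replies.add(wire.receive());
        }
        return replies;
    }
}
```

With 25 queries and a window of 10, the client never has more than 10 unanswered requests, and replies come back in request order. A fixed request count is only a rough proxy for buffer usage, since request and reply sizes vary.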

If I want separate error handling for each execution, then shouldn't I be sending a separate Sync for each pipelined
operation?
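
To make the question concrete, here is a toy model of the "discard until the next Sync" rule quoted above. It is only a simulation under my own assumptions: `SyncSemantics` and its string-based step format are hypothetical, and it does not model transactions (outside an explicit transaction block, an error also rolls back the earlier statements that share the same Sync).

```java
import java.util.ArrayList;
import java.util.List;

class SyncSemantics {
    /**
     * Simulates the skip-until-Sync rule. Each step is either "SYNC", a
     * statement name, or a statement name prefixed with "!" to make it fail.
     * Returns one outcome per statement: "ok", "error" or "skipped".
     */
    static List<String> simulate(List<String> steps) {
        List<String> outcomes = new ArrayList<>();
        boolean discarding = false;
        for (String step : steps) {
            if (step.equals("SYNC")) {
                discarding = false;       // Sync ends error recovery
            } else if (discarding) {
                outcomes.add("skipped");  // input is discarded until the next Sync
            } else if (step.startsWith("!")) {
                outcomes.add("error");
                discarding = true;        // start discarding after the failure
            } else {
                outcomes.add("ok");
            }
        }
        return outcomes;
    }
}
```

Under this model, `a, !b, c, SYNC` gives `ok, error, skipped`, while `a, SYNC, !b, SYNC, c, SYNC` gives `ok, error, ok`. That is the behaviour I am after with autocommit: a failure in one pipelined operation does not take the following ones down with it.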

> > - the server handles operations in order, starting the processing of the next operation only after fully processing the
> >   previous one
> >    - thus pipelining is invisible to the server
>
> As far as I know, yes. The server just doesn't care.
>
> > - libpq driver does not support pipelining, but that is due to internal limitations
>
> Yep.
>
> > - if proper error handling is done by the client then there is no reason why pipelining could not be supported by any
> >   pg client
>
> Indeed, and most should support it. Sending batches of related queries
> would make things a LOT faster.
>
> PgJDBC's batch support is currently write-oriented. There is no
> fundamental reason it can't be expanded for reads. I've already written
> a patch to do just that for the case of returning generated keys.
>
> https://github.com/ringerc/pgjdbc/tree/batch-returning-support
>
> and just need to rebase it so I can send a pull for upstream PgJDBC.
> It's already linked in the issues documenting the limitations in batch
> support.

Your code looked good. Returning inserts are an important thing to optimize.

> If you want to have more general support for batches that return rowsets
> there's no fundamental technical reason why it can't be added. It just
> requires some tedious refactoring of the driver to either:
>
> - Sync and wait before it fills its *send* buffer, rather than trying
>   to manage its receive buffer (the server send buffer), so it can
>   reliably avoid deadlocks; or
>
> - Do async I/O in a select()-like loop over a protocol state machine,
>   so it can simultaneously read and write on the wire.
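
The first option can be sketched as a simple byte budget: count the bytes written since the last drain, and Sync and read all replies before a message would push the total past the budget. This is only an illustration, not driver code. `SendBudget`, `drainPoints` and the budget value are hypothetical, and real socket buffer sizes depend on the OS and connection settings.

```java
import java.util.ArrayList;
import java.util.List;

class SendBudget {
    /**
     * Returns the indexes of the messages before which the client should Sync
     * and read all pending replies, so that it does not write more than
     * `budget` bytes without reading. A single message larger than the budget
     * is still sent on its own.
     */
    static List<Integer> drainPoints(int[] messageSizes, int budget) {
        List<Integer> points = new ArrayList<>();
        int queued = 0;  // bytes written since the last drain
        for (int i = 0; i < messageSizes.length; i++) {
            if (queued > 0 && queued + messageSizes[i] > budget) {
                points.add(i);  // Sync and wait for replies before message i
                queued = 0;
            }
            queued += messageSizes[i];
        }
        return points;
    }
}
```

For five 400-byte messages and a hypothetical 1000-byte budget, the drain points are before messages 2 and 4. The client only has to track what it has written itself, instead of guessing how much reply data the server has queued.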

I also think async I/O is the way to go. Luckily that has already been done
in pgjdbc-ng (https://github.com/impossibl/pgjdbc-ng), which is built on top
of the Netty Java NIO library. It has quite good feature parity with the original
pgjdbc driver. Now that I have tried the proof of concept with the original
pgjdbc driver, I'll next see whether I can enable pipelining with pgjdbc-ng.

> I might need to do some of that myself soon, but it's a big (and
> therefore error-prone) job I've so far avoided by making smaller, more
> targeted changes.
>
> For JDBC the JDBC batch interface is the right place to do this, and you
> should not IMO attempt to add pipelining outside that interface.
> (Multiple open resultsets from portals, yes, but not pipelining of queries).

I do not think the JDBC batch interface even allows doing updates to multiple
tables when using prepared statements?

--
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

