Thread: Add PQsendSyncMessage() to libpq

Add PQsendSyncMessage() to libpq

From
Anton Kirilov
Date:
Hello,

Recently I have been trying to use libpq's pipeline mode in a project,
and in the process I have noticed that the PQpipelineSync() function
has a deficiency (which, to be fair, could be an advantage in other
situations): It combines the establishment of a synchronization point
in a pipeline with a send buffer flush, i.e. a system call. In my use
case I build up a pipeline of several completely independent queries,
so a synchronization point is required between each of them, but
performing a system call for each is just unnecessary overhead,
especially if the system is severely affected by any mitigations for
Spectre or other security vulnerabilities. That's why I propose to add
an interface to libpq to establish a synchronization point in a
pipeline without performing any further actions.

I have attached a patch that introduces PQsendSyncMessage(), a
function that is equivalent to PQpipelineSync(), except that it does
not flush anything to the server; the user must subsequently call
PQflush() instead. Alternatively, the new function is equivalent to
PQsendFlushRequest(), except that it sends a sync message instead of a
flush request. In addition to reducing the system call overhead of
libpq's pipeline mode, it also makes it easier for the operating
system to send as much of the pipeline as possible in a single TCP (or
lower level protocol) packet when the database is running remotely.

I would appeciate your thoughts on my proposal.

Best wishes,
Anton Kirilov

Attachment

Re: Add PQsendSyncMessage() to libpq

From
Denis Laxalde
Date:
Anton Kirilov wrote:
> I would appeciate your thoughts on my proposal.

This sounds like a useful addition to me. I've played a bit with it in 
Psycopg and it works fine.


diff --git a/src/interfaces/libpq/fe-exec.c b/src/interfaces/libpq/fe-exec.c
index a16bbf32ef..e2b32c1379 100644
--- a/src/interfaces/libpq/fe-exec.c
+++ b/src/interfaces/libpq/fe-exec.c
@@ -82,6 +82,7 @@ static int    PQsendDescribe(PGconn *conn, char desc_type,
  static int     check_field_number(const PGresult *res, int field_num);
  static void pqPipelineProcessQueue(PGconn *conn);
  static int     pqPipelineFlush(PGconn *conn);
+static int     send_sync_message(PGconn *conn, int flush);

Could (should?) be:
static int     send_sync_message(PGconn *conn, bool flush);


diff --git a/src/test/modules/libpq_pipeline/libpq_pipeline.c 
b/src/test/modules/libpq_pipeline/libpq_pipeline.c
index f48da7d963..829907957a 100644
--- a/src/test/modules/libpq_pipeline/libpq_pipeline.c
+++ b/src/test/modules/libpq_pipeline/libpq_pipeline.c
@@ -244,6 +244,104 @@ test_multi_pipelines(PGconn *conn)
         fprintf(stderr, "ok\n");
  }

+static void
+test_multi_pipelines_noflush(PGconn *conn)
+{

Maybe test_multi_pipelines() could be extended with an additional 
PQsendQueryParams()+PQsendSyncMessage() step instead of adding this 
extra test case?



Re: Add PQsendSyncMessage() to libpq

From
Anton Kirilov
Date:
Hello,

On 25/04/2023 15:23, Denis Laxalde wrote:
> This sounds like a useful addition to me. I've played a bit with it in
> Psycopg and it works fine.

Thank you very much for reviewing my patch! I have attached a new
version of it that addresses your comments and that has been rebased on
top of the current tip of the master branch (by fixing a merge
conflict), i.e. commit 7b7fa85130330128b404eddebd4f33c6739454b0.

For the sake of others who might read this e-mail thread, I would like
to mention that my patch is complete (including documentation and tests,
but modulo review comments, of course), and that it passes the tests, i.e.:

make check
make -C src/test/modules/libpq_pipeline check

Best wishes,
Anton Kirilov

Attachment

Re: Add PQsendSyncMessage() to libpq

From
Denis Laxalde
Date:
Hello,

Anton Kirilov a écrit :
> On 25/04/2023 15:23, Denis Laxalde wrote:
>> This sounds like a useful addition to me. I've played a bit with it in
>> Psycopg and it works fine.
> 
> Thank you very much for reviewing my patch! I have attached a new
> version of it that addresses your comments and that has been rebased on
> top of the current tip of the master branch (by fixing a merge
> conflict), i.e. commit 7b7fa85130330128b404eddebd4f33c6739454b0.
> 
> For the sake of others who might read this e-mail thread, I would like
> to mention that my patch is complete (including documentation and tests,
> but modulo review comments, of course), and that it passes the tests, i.e.:
> 
> make check
> make -C src/test/modules/libpq_pipeline check

Thank you; this V2 looks good to me.
Marking as ready for committer.




Re: Add PQsendSyncMessage() to libpq

From
Michael Paquier
Date:
On Thu, Apr 27, 2023 at 01:06:27PM +0200, Denis Laxalde wrote:
> Thank you; this V2 looks good to me.
> Marking as ready for committer.

Please note that we are in a stabilization period for v16 and that the
first commit fest of v17 should start in July, so it will perhaps take
some time before this is looked at by a committer.

Speaking of which, what was the performance impact of your application
once PQflush() was moved out of the pipeline sync?  Just asking for
curiosity..
--
Michael

Attachment

Re: Add PQsendSyncMessage() to libpq

From
Denis Laxalde
Date:
Michael Paquier a écrit :
> On Thu, Apr 27, 2023 at 01:06:27PM +0200, Denis Laxalde wrote:
>> Thank you; this V2 looks good to me.
>> Marking as ready for committer.
> 
> Please note that we are in a stabilization period for v16 and that the
> first commit fest of v17 should start in July, so it will perhaps take
> some time before this is looked at by a committer.

Yes, I am aware; totally fine by me.

> Speaking of which, what was the performance impact of your application
> once PQflush() was moved out of the pipeline sync?  Just asking for
> curiosity..

I have no metrics for that; but maybe Anton has some?
(In Psycopg, we generally do not expect users to handle the sync 
operation themselves, it's done under the hood; and I only found one 
situation where the flush could be avoided, but that's largely because 
our design, there can be more in general I think.)




Re: Add PQsendSyncMessage() to libpq

From
Robert Haas
Date:
On Fri, Mar 24, 2023 at 6:39 PM Anton Kirilov <antonvkirilov@gmail.com> wrote:
> I have attached a patch that introduces PQsendSyncMessage(), a
> function that is equivalent to PQpipelineSync(), except that it does
> not flush anything to the server; the user must subsequently call
> PQflush() instead. Alternatively, the new function is equivalent to
> PQsendFlushRequest(), except that it sends a sync message instead of a
> flush request. In addition to reducing the system call overhead of
> libpq's pipeline mode, it also makes it easier for the operating
> system to send as much of the pipeline as possible in a single TCP (or
> lower level protocol) packet when the database is running remotely.

I wonder whether this is the naming that we want. The two names are
significantly different. Something like PQpipelineSendSync() would be
more similar.

I also wonder, really even more, whether it would be better to do
something like PQpipelinePutSync(PGconn *conn, bool flush) with
PQpipelineSync(conn) just meaning PQpipelinePutSync(conn, true). We're
basically using the function name as a Boolean parameter to select the
behavior, which is fine if you only have one parameter and it's a
Boolean, but it's obviously unworkable if you have say 3 Boolean
parameters because you don't want 8 different functions, and what if
you need an integer parameter for some reason?

So I'd favor exposing a function that is effectively an extended
version of PQpipelineSendSync() with an additional Boolean parameter,
and that way if for some reason somebody needs to extend it again,
they can just make an even more extended version with yet another
parameter. That way, all the functionality is always available by
calling the newest function, and older ones are still there for older
applications.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: Add PQsendSyncMessage() to libpq

From
Anton Kirilov
Date:
Hello,

On 28/04/2023 13:06, Robert Haas wrote:
> On Fri, Mar 24, 2023 at 6:39 PM Anton Kirilov <antonvkirilov@gmail.com> wrote:
>> I have attached a patch that introduces PQsendSyncMessage()...
> 
> I wonder whether this is the naming that we want. The two names are
> significantly different. Something like PQpipelineSendSync() would be
> more similar.

The reason is that the function is modeled after PQsendFlushRequest(), 
since it felt closer to what I was trying to achieve, i.e. appending a 
protocol message to the output buffer without doing any actual I/O 
operations.

> I also wonder, really even more, whether it would be better to do
> something like PQpipelinePutSync(PGconn *conn, bool flush) with
> PQpipelineSync(conn) just meaning PQpipelinePutSync(conn, true).

Actually I believe that there is another issue with PQpipelineSync() 
that has to do with ergonomics - according to a comment inside its body 
( 

https://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=src/interfaces/libpq/fe-exec.c;h=a16bbf32ef5c0043eee9c92ab82bf4f11386ee47;hb=HEAD#l3189

) it could fail silently to send all the buffered data, which seems to 
be problematic when operating in non-blocking mode. In practice, this 
means that all calls to PQpipelineSync() must be followed by execution 
of PQflush() to check whether the application should poll for write 
readiness. I suppose that that was the reason why I was going for a 
solution that did not combine changing the connection state with doing 
I/O operations.

In any case I am not particularly attached to any naming or the exact 
shape of the new API, as long as it achieves the same goal (reducing the 
number of system calls), but before I make any substantial changes to my 
patch, I would like to hear your thoughts on the matter.

Best wishes,
Anton Kirilov



Re: Add PQsendSyncMessage() to libpq

From
Anton Kirilov
Date:
Hello,

On 28/04/2023 09:08, Denis Laxalde wrote:
> Michael Paquier a écrit :
>> Speaking of which, what was the performance impact of your application
>> once PQflush() was moved out of the pipeline sync?  Just asking for
>> curiosity..
> 
> I have no metrics for that; but maybe Anton has some?
I did a quick check using the TechEmpower Framework Benchmarks ( 
https://www.techempower.com/benchmarks/ ) - they define 4 Web 
application tests that are database-bound. Everything was running on a 
single machine, and 3 of the tests had an improvement of 29.16%, 32.30%, 
and 41.78% respectively in the number of requests per second (Web 
application requests, not database queries), while the last test 
regressed by 0.66% (which I would say is practically no difference, 
given that there is always some measurement noise). I will try to get 
the changes from my patch tested in the project's continuous 
benchmarking environment, which has a proper set up with 3 servers 
(client, application server, and database) connected by a 10GbE link.

Best wishes,
Anton Kirilov



Re: Add PQsendSyncMessage() to libpq

From
Michael Paquier
Date:
On Sun, Apr 30, 2023 at 01:59:17AM +0100, Anton Kirilov wrote:
> I did a quick check using the TechEmpower Framework Benchmarks (
> https://www.techempower.com/benchmarks/ ) - they define 4 Web application
> tests that are database-bound. Everything was running on a single machine,
> and 3 of the tests had an improvement of 29.16%, 32.30%, and 41.78%
> respectively in the number of requests per second (Web application requests,
> not database queries), while the last test regressed by 0.66% (which I would
> say is practically no difference, given that there is always some
> measurement noise). I will try to get the changes from my patch tested in
> the project's continuous benchmarking environment, which has a proper set up
> with 3 servers (client, application server, and database) connected by a
> 10GbE link.

Well, these are nice numbers.  At ~1% I am ready to buy the noise
argument, but what would the range of the usual noise when it comes to
multiple runs under the same conditions?

Let's make sure that the API interface is the most intuitive (Robert
has commented about that a few days ago, still need to follow up on
that).
--
Michael

Attachment

Re: Add PQsendSyncMessage() to libpq

From
Michael Paquier
Date:
On Sat, Apr 29, 2023 at 05:06:03PM +0100, Anton Kirilov wrote:
> In any case I am not particularly attached to any naming or the exact shape
> of the new API, as long as it achieves the same goal (reducing the number of
> system calls), but before I make any substantial changes to my patch, I
> would like to hear your thoughts on the matter.

Another thing that may matter in terms of extensibility?  Would a
boolean argument really be the best design?  Could it be better to
have instead one API with a bits32 and some flags controlling its
internals?
--
Michael

Attachment

Re: Add PQsendSyncMessage() to libpq

From
Robert Haas
Date:
On Mon, May 1, 2023 at 8:42 PM Michael Paquier <michael@paquier.xyz> wrote:
> Another thing that may matter in terms of extensibility?  Would a
> boolean argument really be the best design?  Could it be better to
> have instead one API with a bits32 and some flags controlling its
> internals?

I wondered that, too. If we never add any more Boolean parameters to
this function then that would end up a waste, but maybe we will and
then it will be genius. Not sure what's best.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: Add PQsendSyncMessage() to libpq

From
Alvaro Herrera
Date:
On 2023-May-02, Robert Haas wrote:

> On Mon, May 1, 2023 at 8:42 PM Michael Paquier <michael@paquier.xyz> wrote:
> > Another thing that may matter in terms of extensibility?  Would a
> > boolean argument really be the best design?  Could it be better to
> > have instead one API with a bits32 and some flags controlling its
> > internals?
> 
> I wondered that, too. If we never add any more Boolean parameters to
> this function then that would end up a waste, but maybe we will and
> then it will be genius. Not sure what's best.

I agree that adding a flag is the way to go, since it improve chances
that we won't end up with ten different functions in case we decide to
have eight other behaviors.  One more function and we're done.  And
while I can't think of any use for a future flag, we (I) already didn't
of this one either, so let's not make the same mistake.

We already have 'int' flag masks in PQcopyResult() and
PQsetTraceFlags().  We were using bits32 initially for flag stuff in the
PQtrace facilities, until [1] reminded us that we shouldn't let c.h
creep into app-land, so that was turned into plain 'int'.

[1]
https://www.postgresql.org/message-id/TYAPR01MB2990B6C6A32ACF15D97AE94AFEBD0%40TYAPR01MB2990.jpnprd01.prod.outlook.com

-- 
Álvaro Herrera        Breisgau, Deutschland  —  https://www.EnterpriseDB.com/
"No nos atrevemos a muchas cosas porque son difíciles,
pero son difíciles porque no nos atrevemos a hacerlas" (Séneca)



Re: Add PQsendSyncMessage() to libpq

From
Anton Kirilov
Date:
Hello,

> On 3 May 2023, at 11:03, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
>
> On 2023-May-02, Robert Haas wrote:
>
>> On Mon, May 1, 2023 at 8:42 PM Michael Paquier <michael@paquier.xyz> wrote:
>>> Another thing that may matter in terms of extensibility?  Would a
>>> boolean argument really be the best design?  Could it be better to
>>> have instead one API with a bits32 and some flags controlling its
>>> internals?
>>
>> I wondered that, too. If we never add any more Boolean parameters to
>> this function then that would end up a waste, but maybe we will and
>> then it will be genius. Not sure what's best.
>
> I agree that adding a flag is the way to go, since it improve chances
> that we won't end up with ten different functions in case we decide to
> have eight other behaviors.  One more function and we're done.  And
> while I can't think of any use for a future flag, we (I) already didn't
> of this one either, so let's not make the same mistake.

Thank you all for the feedback! Do you have any thoughts on the other issue with PQpipelineSync() I have mentioned in
myprevious message? Am I just misunderstanding what the code comment means and how the API is supposed to be used by
anychance? 

Best wishes,
Anton Kirilov



Re: Add PQsendSyncMessage() to libpq

From
Alvaro Herrera
Date:
On 2023-May-04, Anton Kirilov wrote:

> Thank you all for the feedback! Do you have any thoughts on the other
> issue with PQpipelineSync() I have mentioned in my previous message?

Eh, I hadn't seen that one.

> Am I just misunderstanding what the code comment means and how the API
> is supposed to be used by any chance?

I think you have it right: it is possible that the buffer has not been
fully flushed by the time PQpipelineSync returns.

If you want to make sure it's fully flushed, your only option is to have
the call block.  That would make it no longer non-blocking, so it has to
be explicitly requested behavior.
I think this means to add yet another behavior flag for the new
function: have it block, waiting for the buffer to be flushed.

So your application can put several sync points in the queue, with no
flushing (and of course no blocking), and have it flush+block only on
the "last" one.  Of course, for other users, you want the current
behavior: have it flush opportunistically but not block.  So you can't
make it a single flag.

-- 
Álvaro Herrera         PostgreSQL Developer  —  https://www.EnterpriseDB.com/



Re: Add PQsendSyncMessage() to libpq

From
Anton Kirilov
Date:
Hello,

On Thu, 4 May 2023, 11:36 Alvaro Herrera, <alvherre@alvh.no-ip.org> wrote:
On 2023-May-04, Anton Kirilov wrote:
If you want to make sure it's fully flushed, your only option is to have
the call block.

Surely PQflush() returning 0 would signify that the output buffer has been fully flushed? Which means that there is another, IMHO simpler option than introducing an extra flag - make the new function return the same values as PQflush(), i.e. 0 for no error and fully flushed output, -1 for error, and 1 for partial flush (so that the user may start polling for write readiness). Of course, the function would never return 1 (but would block instead) unless the user has called PQsetnonblocking() beforehand.

Best wishes,
Anton Kirilov

Re: Add PQsendSyncMessage() to libpq

From
Michael Paquier
Date:
On Wed, May 03, 2023 at 12:03:57PM +0200, Alvaro Herrera wrote:
> We already have 'int' flag masks in PQcopyResult() and
> PQsetTraceFlags().  We were using bits32 initially for flag stuff in the
> PQtrace facilities, until [1] reminded us that we shouldn't let c.h
> creep into app-land, so that was turned into plain 'int'.
>
> [1]
https://www.postgresql.org/message-id/TYAPR01MB2990B6C6A32ACF15D97AE94AFEBD0%40TYAPR01MB2990.jpnprd01.prod.outlook.com

Indeed.  Good point!
--
Michael

Attachment

Re: Add PQsendSyncMessage() to libpq

From
Anton Kirilov
Date:
Hello,

On 05/05/2023 16:02, Anton Kirilov wrote:
> On Thu, 4 May 2023, 11:36 Alvaro Herrera, <alvherre@alvh.no-ip.org 
> <mailto:alvherre@alvh.no-ip.org>> wrote:
> 
>     On 2023-May-04, Anton Kirilov wrote:
>     If you want to make sure it's fully flushed, your only option is to have
>     the call block.
> 
> 
> Surely PQflush() returning 0 would signify that the output buffer has 
> been fully flushed?
Since I haven't got any further comments, I assume that I am correct, so 
here is an updated version of the patch that should address all feedback 
that I have received so far and all issues that I have identified.

Thanks,
Anton Kirilov
Attachment

Re: Add PQsendSyncMessage() to libpq

From
Anton Kirilov
Date:
Hello,

On 02/05/2023 00:55, Michael Paquier wrote:
 > Well, these are nice numbers.  At ~1% I am ready to buy the noise
 > argument, but what would the range of the usual noise when it comes to
 > multiple runs under the same conditions>

I managed to get my patch tested in the TechEmpower Framework Benchmarks 
continuous benchmarking environment, and even though it takes roughly a 
week to get a new set of results, now there had been a couple of runs 
both with and without my changes. All 4 database-bound Web application 
tests (single query, multiple queries, fortunes, and data updates) saw 
improvements, by approximately 8.94%, 0.64%, 9.54%, and 2.78% 
respectively. The standard errors were 0.65% or less, so there was 
practically no change in the second test. However, I have seen another 
implementation experience a much larger improvement (~6.69%) in that 
test from essentially the same optimization, so I think that my own code 
has another bottleneck. Note that these test runs were not in the same 
benchmarking environment as the one I used previously for a quick check, 
so the values differ. Also, another set of results should become 
available in a week or so (and would be based on my optimization).

Links to the test runs:

https://www.techempower.com/benchmarks/#section=test&runid=1ecf679a-9686-4de7-a3b7-de16a1a84bb6&l=zik0zi-35r&w=zhb2tb-zik0zj-zik0zj-sf&test=db

https://www.techempower.com/benchmarks/#section=test&runid=aab00736-445c-4b7f-83b5-451c47c83395&l=zik0zi-35r&w=zhb2tb-zik0zj-zik0zj-sf&test=db

https://www.techempower.com/benchmarks/#section=test&runid=bc7f7570-a88e-48e3-9874-06d7dc0a0f74&l=zik0zi-35r&w=zhb2tb-zik0zj-zik0zj-sf&test=db

https://www.techempower.com/benchmarks/#section=test&runid=e6dd1abd-7aa2-4846-9b44-d8fd8a23d385&l=zik0zi-35r&w=zhb2tb-zik0zj-zik0zj-sf&test=db
(ordered chronologically; the first 2 did not include my optimization)

Best wishes,
Anton Kirilov



Re: Add PQsendSyncMessage() to libpq

From
Daniel Gustafsson
Date:
> On 21 May 2023, at 19:17, Anton Kirilov <antonvkirilov@gmail.com> wrote:

> .. here is an updated version of the patch

This hunk here:

-    if (PQflush(conn) < 0)
+    const int ret = flags & PG_PIPELINEPUTSYNC_FLUSH ? PQflush(conn) : 0;
+
+    if (ret < 0)

..is causing this compiler warning:

fe-exec.c: In function ‘PQpipelinePutSync’:
fe-exec.c:3203:2: error: ISO C90 forbids mixed declarations and code [-Werror=declaration-after-statement]
3203 | const int ret = flags & PG_PIPELINEPUTSYNC_FLUSH ? PQflush(conn) : 0;
     | ^~~~~
cc1: all warnings being treated as errors

Also, the patch no longer applies. Please rebase and send an updated version.

--
Daniel Gustafsson




Re: Add PQsendSyncMessage() to libpq

From
Anton Kirilov
Date:
Hello,

On 05/07/2023 21:45, Daniel Gustafsson wrote:
> Please rebase and send an updated version.
Here it is (including the warning fix).

Thanks,
Anton Kirilov
Attachment

Re: Add PQsendSyncMessage() to libpq

From
Jelte Fennema-Nio
Date:
On Fri, 28 Apr 2023 at 14:07, Robert Haas <robertmhaas@gmail.com> wrote:
> I wonder whether this is the naming that we want. The two names are
> significantly different. Something like PQpipelineSendSync() would be
> more similar.
>
> I also wonder, really even more, whether it would be better to do
> something like PQpipelinePutSync(PGconn *conn, bool flush) with
> PQpipelineSync(conn) just meaning PQpipelinePutSync(conn, true). We're
> basically using the function name as a Boolean parameter to select the
> behavior, which is fine if you only have one parameter and it's a
> Boolean, but it's obviously unworkable if you have say 3 Boolean
> parameters because you don't want 8 different functions, and what if
> you need an integer parameter for some reason?

On Wed, 3 May 2023 at 12:04, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
> I agree that adding a flag is the way to go, since it improve chances
> that we won't end up with ten different functions in case we decide to
> have eight other behaviors.  One more function and we're done.  And
> while I can't think of any use for a future flag, we (I) already didn't
> of this one either, so let's not make the same mistake.

On Sat, 29 Apr 2023 at 18:07, Anton Kirilov <antonvkirilov@gmail.com> wrote:
> The reason is that the function is modeled after PQsendFlushRequest(),
> since it felt closer to what I was trying to achieve, i.e. appending a
> protocol message to the output buffer without doing any actual I/O
> operations.

Sorry for being late to the party, but I think the current API naming
and the flag argument don't fit well with the current libpq API that
we have. I much prefer something similar to the original version of
the patch.

I think this function should be named something with the "PQsend"
prefix since that's the way we name all our public async message
sending functions in libpq. The "Put" word we only use in internal
libpq functions, so I feel it has no place in the external API
surface. My proposal would be to call the function PQsendPipelineSync
(i.e. having the PQsend prefix while still looking similar to the
existing PQpipelineSync).

Also I think the flag argument is completely unnecessary. I understand
the argument that we didn't foresee the need for this non-flushing
behaviour either, and the follow up reasoning that we thus should add
a flag for future things we didn't forsee. But I think it's looking at
the situation from the wrong direction. Instead of looking at it as
adding another version of our current PQpipelineSync API, we should
look at it as an addition to our current list of PQsend functions for
a new packet type. And none of those PQsend functions ever needed a
flag. Which makes sense, because they are the lowest level building
blocks that make sense from a user perspective: They send a message
type over the socket and don't do anything else. And if the assumption
that this is the lowest level building block is wrong, then it will
almost certainly be wrong for all other PQsend functions too. And thus
we'll need a solution that fits for all of them.

Finally, I have one suggestion for a behavioural change: I think the
function should still call pqPipelineFlush, just like all of our other
PQsend functions (except PQsendFlushRequest, but that seems like an
oversight there too).



Re: Add PQsendSyncMessage() to libpq

From
Alvaro Herrera
Date:
On 2023-Nov-07, Jelte Fennema-Nio wrote:

> I think this function should be named something with the "PQsend"
> prefix since that's the way we name all our public async message
> sending functions in libpq. The "Put" word we only use in internal
> libpq functions, so I feel it has no place in the external API
> surface. My proposal would be to call the function PQsendPipelineSync
> (i.e. having the PQsend prefix while still looking similar to the
> existing PQpipelineSync).

Argued that way, it makes sense to me.

> Also I think the flag argument is completely unnecessary. [...]
> Instead of looking at it as adding another version of our current
> PQpipelineSync API, we should look at it as an addition to our current
> list of PQsend functions for a new packet type. And none of those
> PQsend functions ever needed a flag.

True.

> Finally, I have one suggestion for a behavioural change: I think the
> function should still call pqPipelineFlush, just like all of our other
> PQsend functions (except PQsendFlushRequest, but that seems like an
> oversight there too).

I agree.

So, yeah, it looks like this will be pretty similar to Anton's original
patch, with PQpipelineSync() being just PQsendPipelineSync() + PQflush().

-- 
Álvaro Herrera        Breisgau, Deutschland  —  https://www.EnterpriseDB.com/
"In fact, the basic problem with Perl 5's subroutines is that they're not
crufty enough, so the cruft leaks out into user-defined code instead, by
the Conservation of Cruft Principle."  (Larry Wall, Apocalypse 6)



Re: Add PQsendSyncMessage() to libpq

From
Anton Kirilov
Date:
Hello,

Thanks for the feedback!

On 07/11/2023 09:23, Jelte Fennema-Nio wrote:
 > But I think it's looking at the situation from the wrong direction. 
[...] we should look at it as an addition to our current list of PQsend 
functions for a new packet type. And none of those PQsend functions ever 
needed a flag. Which makes sense, because they are the lowest level 
building blocks that make sense from a user perspective: They send a 
message type over the socket and don't do anything else.

Yes, I think that this is quite close to my thinking when I created the 
original version of the patch. Also, the protocol specification states 
that the Sync message lacks parameters.

Since there haven't been any comments from the other people who have 
chimed in on this e-mail thread, I will assume that there is consensus 
(we are doing a U-turn with the implementation approach after all), so 
here is the updated version of the patch.

Best wishes,
Anton Kirilov
Attachment

Re: Add PQsendSyncMessage() to libpq

From
Anthonin Bonnefoy
Date:
Hi,

I've played a bit with the patch on my side. One thing that would be
great would be to make this
available in pgbench through a \syncpipeline meta command. That would
make it easier for users
to test whether there's a positive impact with their queries or not.

I've wrote a patch to add it to pgbench (don't want to mess with the
thread's attachment so here's a GH link
https://github.com/bonnefoa/postgres/commit/047b5b05169e36361fe29fef9f430da045ef012d).
Here's some quick results:

echo "\set aid1 random(1, 100000 * :scale)
\set aid2 random(1, 100000 * :scale)
\startpipeline
select 1;
select * from pgbench_accounts where aid=:aid1;
select 2;
\syncpipeline
select 1;
select * from pgbench_accounts where aid=:aid2;
select 2;
\endpipeline" > /tmp/pipeline_without_flush.sql
pgbench -T30 -Mextended -f /tmp/pipeline_without_flush.sql -h127.0.0.1
latency average = 0.383 ms
initial connection time = 2.810 ms
tps = 2607.587877 (without initial connection time)

echo "\set aid1 random(1, 100000 * :scale)
\set aid2 random(1, 100000 * :scale)
\startpipeline
select 1;
select * from pgbench_accounts where aid=:aid1;
select 2;
\endpipeline
\startpipeline
select 1;
select * from pgbench_accounts where aid=:aid2;
select 2;
\endpipeline" > /tmp/pipeline_with_flush.sql
pgbench -T30 -Mextended -f /tmp/pipeline_with_flush.sql -h127.0.0.1
latency average = 0.437 ms
initial connection time = 2.602 ms
tps = 2290.527462 (without initial connection time)

I took some perfs and the main change is from the server spending less time in
ReadCommand which makes sense since the commands are sent in a single tcp
frame with the \syncpipeline version.

Regards,
Anthonin

On Sun, Nov 12, 2023 at 2:37 PM Anton Kirilov <antonvkirilov@gmail.com> wrote:
>
> Hello,
>
> Thanks for the feedback!
>
> On 07/11/2023 09:23, Jelte Fennema-Nio wrote:
>  > But I think it's looking at the situation from the wrong direction.
> [...] we should look at it as an addition to our current list of PQsend
> functions for a new packet type. And none of those PQsend functions ever
> needed a flag. Which makes sense, because they are the lowest level
> building blocks that make sense from a user perspective: They send a
> message type over the socket and don't do anything else.
>
> Yes, I think that this is quite close to my thinking when I created the
> original version of the patch. Also, the protocol specification states
> that the Sync message lacks parameters.
>
> Since there haven't been any comments from the other people who have
> chimed in on this e-mail thread, I will assume that there is consensus
> (we are doing a U-turn with the implementation approach after all), so
> here is the updated version of the patch.
>
> Best wishes,
> Anton Kirilov



Re: Add PQsendSyncMessage() to libpq

From
Jelte Fennema-Nio
Date:
On Sun, 12 Nov 2023 at 14:37, Anton Kirilov <antonvkirilov@gmail.com> wrote:
> Since there haven't been any comments from the other people who have
> chimed in on this e-mail thread, I will assume that there is consensus
> (we are doing a U-turn with the implementation approach after all), so
> here is the updated version of the patch.

The new patch looks great to me. And indeed consensus seems to have
been reached on the approach and that this patch is useful. So I'm
taking the liberty of marking this patch as Ready for Committer.



Re: Add PQsendSyncMessage() to libpq

From
Jelte Fennema-Nio
Date:
On Mon, 13 Nov 2023 at 09:20, Anthonin Bonnefoy
<anthonin.bonnefoy@datadoghq.com> wrote:
> \syncpipeline
> tps = 2607.587877 (without initial connection time)
> ...
> \endpipeline
> \startpipeline
> tps = 2290.527462 (without initial connection time)

Those are some nice improvements. And I think once this patch is in,
it would make sense to add the pgbench feature you're suggesting.
Because indeed that makes it see what perf improvements can be gained
for your workload.



Re: Add PQsendSyncMessage() to libpq

From
Michael Paquier
Date:
On Fri, Dec 29, 2023 at 12:52:30PM +0100, Jelte Fennema-Nio wrote:
> On Mon, 13 Nov 2023 at 09:20, Anthonin Bonnefoy
> <anthonin.bonnefoy@datadoghq.com> wrote:
> > \syncpipeline
> > tps = 2607.587877 (without initial connection time)
> > ...
> > \endpipeline
> > \startpipeline
> > tps = 2290.527462 (without initial connection time)
>
> Those are some nice improvements. And I think once this patch is in,
> it would make sense to add the pgbench feature you're suggesting.
> Because indeed that makes it see what perf improvements can be gained
> for your workload.

Yeah, that sounds like a good idea seen from here.  (Still need to
look at the core patch.)
--
Michael

Attachment

Re: Add PQsendSyncMessage() to libpq

From
Michael Paquier
Date:
On Sun, Dec 31, 2023 at 09:37:31AM +0900, Michael Paquier wrote:
> On Fri, Dec 29, 2023 at 12:52:30PM +0100, Jelte Fennema-Nio wrote:
>> Those are some nice improvements. And I think once this patch is in,
>> it would make sense to add the pgbench feature you're suggesting.
>> Because indeed that makes it see what perf improvements can be gained
>> for your workload.
>
> Yeah, that sounds like a good idea seen from here.  (Still need to
> look at the core patch.)

 PQpipelineSync(PGconn *conn)
+{
+    return PQsendPipelineSync(conn) && pqFlush(conn) >= 0;
+}
[...]
+     * Give the data a push if we're past the size threshold. In nonblock
+     * mode, don't complain if we're unable to send it all; the caller is
+     * expected to execute PQflush() at some point anyway.
      */
-    if (PQflush(conn) < 0)
+    if (pqPipelineFlush(conn) < 0)
         goto sendFailed;

I was looking at this patch, and calling PQpipelineSync() would now
cause two calls of PQflush() to be issued when the output buffer
threshold has been reached.  Could that lead to regressions?

A second thing I find disturbing is that pqAppendCmdQueueEntry() would
be called before the final pqFlush(), which could cause the commands
to be listed in a queue even if the flush fails when calling
PQpipelineSync().

Hence, as a whole, wouldn't it be more consistent if the new
PQsendPipelineSync() and the existing PQpipelineSync() call an
internal static routine (PQPipelineSyncInternal?) that can switch
between both modes?  Let's just make the extra argument a boolean.
--
Michael

Attachment

Re: Add PQsendSyncMessage() to libpq

From
Michael Paquier
Date:
On Wed, Jan 10, 2024 at 03:40:36PM +0900, Michael Paquier wrote:
> Hence, as a whole, wouldn't it be more consistent if the new
> PQsendPipelineSync() and the existing PQpipelineSync() call an
> internal static routine (PQPipelineSyncInternal?) that can switch
> between both modes?  Let's just make the extra argument a boolean.

Yeah, I'll go with that after a second look.  Attached is what I am
finishing with, and I have reproduced some numbers with the pgbench
metacommand mentioned upthread, which is reeeaaally nice.

I have also made a few edits to the tests.
--
Michael

Attachment

Re: Add PQsendSyncMessage() to libpq

From
Jelte Fennema-Nio
Date:
On Mon, 15 Jan 2024 at 08:50, Michael Paquier <michael@paquier.xyz> wrote:
> Yeah, I'll go with that after a second look.  Attached is what I am
> finishing with, and I have reproduced some numbers with the pgbench
> metacommand mentioned upthread, which is reeeaaally nice.

Code looks good to me. But one small notes on the test.

+     /* second pipeline */
+     if (PQsendQueryParams(conn, "SELECT $1", 1, dummy_param_oids,
+                                     dummy_params, NULL, NULL, 0) != 1)
+           pg_fatal("dispatching first SELECT failed: %s",
PQerrorMessage(conn));

Error message should be "second SELECT" not "first SELECT". Same note
for the error message in the third pipeline, where it still says
"second SELECT".


+     res = PQgetResult(conn);
+     if (res == NULL)
+           pg_fatal("PQgetResult returned null when there's a
pipeline item: %s",
+                        PQerrorMessage(conn));
+
+     if (PQresultStatus(res) != PGRES_TUPLES_OK)
+           pg_fatal("Unexpected result code %s from first pipeline item",
+                        PQresStatus(PQresultStatus(res)));
+     PQclear(res);
+     res = NULL;
+
+     if (PQgetResult(conn) != NULL)
+           pg_fatal("PQgetResult returned something extra after first result");

same issue: s/first/second/g (and s/second/third/g for the existing
part of the test).



Re: Add PQsendSyncMessage() to libpq

From
Alvaro Herrera
Date:
On 2024-Jan-15, Michael Paquier wrote:

Looks good!  Just some small notes,

> +/*
> + * Wrapper for PQpipelineSync and PQsendPipelineSync.
>   *
>   * It's legal to start submitting more commands in the pipeline immediately,
>   * without waiting for the results of the current pipeline. There's no need to

the new function pqPipelineSyncInternal is not a wrapper for these other
two functions -- the opposite is true actually.  We tend to use the term
"workhorse" or "internal workhorse" for this kind of thing.

In the docs, after this patch we have

- PQpipelineSync
- PQsendFlushRequest
- PQsendPipelineSync

Wouldn't it make more sense to add the new function in the middle of the
two existing ones instead?


Looking again at the largish comment that's now atop
pqPipelineSyncInternal(), I think most of it should be removed -- these
things should be explained in the SGML docs, and I think they are, in
the "Using Pipeline Mode" section.  We can just have the lines this
patch is adding.

-- 
Álvaro Herrera               48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
"I can't go to a restaurant and order food because I keep looking at the
fonts on the menu.  Five minutes later I realize that it's also talking
about food" (Donald Knuth)



Re: Add PQsendSyncMessage() to libpq

From
Michael Paquier
Date:
On Mon, Jan 15, 2024 at 10:01:59AM +0100, Jelte Fennema-Nio wrote:
> Error message should be "second SELECT" not "first SELECT". Same note
> for the error message in the third pipeline, where it still says
> "second SELECT".
>
> same issue: s/first/second/g (and s/second/third/g for the existing
> part of the test).

Ugh, yes.  The note in the test was wrong.  Thanks for
double-checking.
--
Michael

Attachment

Re: Add PQsendSyncMessage() to libpq

From
Michael Paquier
Date:
On Mon, Jan 15, 2024 at 10:49:56AM +0100, Alvaro Herrera wrote:
> the new function pqPipelineSyncInternal is not a wrapper for these other
> two functions -- the opposite is true actually.  We tend to use the term
> "workhorse" or "internal workhorse" for this kind of thing.

Indeed, makes sense.

> In the docs, after this patch we have
>
> - PQpipelineSync
> - PQsendFlushRequest
> - PQsendPipelineSync
>
> Wouldn't it make more sense to add the new function in the middle of the
> two existing ones instead?

Ordering PQsendPipelineSync just after PQpipelineSync is OK by me.
I've applied the patch with all these modifications to move on with
the subject.

> Looking again at the largish comment that's now atop
> pqPipelineSyncInternal(), I think most of it should be removed -- these
> things should be explained in the SGML docs, and I think they are, in
> the "Using Pipeline Mode" section.  We can just have the lines this
> patch is adding.

Hmm.  The first two sentences about being able to submit more commands
to the pipeline are documented in the subsection "Issuing Queries".
The third sentence is implied in the second paragraph of this
subsection.  The 4th paragraph of the comment where sync commands
cannot be issued until all the results from the pipeline have been
consumed is mentioned in the first paragraph in "Using Pipeline Mode".
So you are right that this could be entirely removed.

How about the attached to remove all that, then?
--
Michael

Attachment

Re: Add PQsendSyncMessage() to libpq

From
Alvaro Herrera
Date:
On 2024-Jan-16, Michael Paquier wrote:

> I've applied the patch with all these modifications to move on with
> the subject.

Thanks!

> On Mon, Jan 15, 2024 at 10:49:56AM +0100, Alvaro Herrera wrote:

> > Looking again at the largish comment that's now atop
> > pqPipelineSyncInternal(), I think most of it should be removed -- these
> > things should be explained in the SGML docs, and I think they are, in
> > the "Using Pipeline Mode" section.  We can just have the lines this
> > patch is adding.
> 
> Hmm.  The first two sentences about being able to submit more commands
> to the pipeline are documented in the subsection "Issuing Queries".
> The third sentence is implied in the second paragraph of this
> subsection.  The 4th paragraph of the comment where sync commands
> cannot be issued until all the results from the pipeline have been
> consumed is mentioned in the first paragraph in "Using Pipeline Mode".
> So you are right that this could be entirely removed.

(I'm pretty sure that the history of this comment is that Craig Ringer
wrote it for his prototype patch, and then I took the various parts and
struggled to add them as SGML docs as it made logical sense.  So if
there's anything in the comment that's important and not covered by the
docs, that would be a docs bug.)  I agree with your findings.

> How about the attached to remove all that, then?

Looks good, thank you.

-- 
Álvaro Herrera         PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"Tiene valor aquel que admite que es un cobarde" (Fernandel)



Re: Add PQsendSyncMessage() to libpq

From
Michael Paquier
Date:
On Tue, Jan 16, 2024 at 02:55:12PM +0100, Alvaro Herrera wrote:
> On 2024-Jan-16, Michael Paquier wrote:
>> How about the attached to remove all that, then?
>
> Looks good, thank you.

Thanks for double-checking.  Done.
--
Michael

Attachment

Re: Add PQsendSyncMessage() to libpq

From
Anton Kirilov
Date:
Hello,

On 17/01/2024 07:30, Michael Paquier wrote:
> Thanks for double-checking.  Done.
Thank you very much for taking care of my patch!

One thing that I noticed is that the TODO list on the PostgreSQL Wiki 
still contained an entry ( https://wiki.postgresql.org/wiki/Todo#libpq ) 
about adding pipelining support to libpq - perhaps it ought to be updated?

Best wishes,
Anton Kirilov