Thread: PATCH: Batch/pipelining support for libpq
Hi,

On 2016-05-23 17:19:09 +0800, Craig Ringer wrote:
> Hi all
>
> Following on from the foreign table batch inserts thread[1], here's a patch
> to add support for pipelining queries into asynchronous batches in libpq.

Yay!

> I'm measuring 300x (not %) performance improvements doing batches on
> servers over the Internet, so this seems pretty worthwhile. It turned out
> to be way less invasive than I expected too.

yay^2.

> (I intentionally didn't add any way for clients to annotate each work-item
> in a batch with their own private data. I think that'd be really useful and
> would make implementing clients easier, but should be a separate patch).
>
> This should be very useful for optimising FDWs, Postgres-XC, etc.

And optimizing normal clients.

Not easy, but I'd be very curious how much psql's performance improves
when replaying a .sql style dump, and replaying from a !tty fd, after
hacking it up to use the batch API.

- Andres
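For context on why the speedup can be so dramatic over high-latency links: with one synchronous round trip per statement, elapsed time grows as N × (RTT + server time), while a pipeline pays roughly one RTT overall. A toy model sketching this (the numbers below are made up purely to illustrate the effect, not taken from the patch's benchmark):

```c
/*
 * Toy latency model: synchronous per-statement round trips vs. a
 * pipelined batch.  All times in milliseconds; illustrative only.
 */
static double
unbatched_ms(int nstmts, double rtt_ms, double server_ms)
{
	/* one full round trip per statement */
	return nstmts * (rtt_ms + server_ms);
}

static double
batched_ms(int nstmts, double rtt_ms, double server_ms)
{
	/* statements stream without waiting: roughly one RTT in total */
	return rtt_ms + nstmts * server_ms;
}
```

With 10,000 statements, a 100 ms RTT and 0.25 ms of server time each, the model gives 1,002,500 ms unbatched vs 2,600 ms batched, so a 300x observation over the Internet is entirely plausible.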
On Mon, May 23, 2016 at 8:50 AM, Andres Freund <andres@anarazel.de> wrote:
> On 2016-05-23 17:19:09 +0800, Craig Ringer wrote:
>> Following on from the foreign table batch inserts thread[1], here's a patch
>> to add support for pipelining queries into asynchronous batches in libpq.
>
> Yay!
>
>> I'm measuring 300x (not %) performance improvements doing batches on
>> servers over the Internet, so this seems pretty worthwhile. It turned out
>> to be way less invasive than I expected too.
>
> yay^2.

I'll follow this mood. Yeha.

>> (I intentionally didn't add any way for clients to annotate each work-item
>> in a batch with their own private data. I think that'd be really useful and
>> would make implementing clients easier, but should be a separate patch).
>>
>> This should be very useful for optimising FDWs, Postgres-XC, etc.
>
> And optimizing normal clients.
>
> Not easy, but I'd be very curious how much psql's performance improves
> when replaying a .sql style dump, and replaying from a !tty fd, after
> hacking it up to use the batch API.

Did you consider using simple_list.c instead of introducing a new
mimic such as PGcommandQueueEntry? It would be nice to avoid adding
new list emulations on the frontend.
--
Michael
Craig Ringer <craig@2ndquadrant.com> writes:
> On 24 May 2016 at 00:00, Michael Paquier <michael.paquier@gmail.com> wrote:
>> Did you consider the use of simple_list.c instead of introducing a new
>> mimic as PGcommandQueueEntry? It would be cool avoiding adding new
>> list emulations on frontends.

> I'd have to extend simple_list to add a generic object version, like
>
> struct my_list_elem
> {
>     PG_SIMPLE_LIST_ATTRS;
>     mytype mycol;
>     myothertype myothercol;
> };
>
> Objections?

That doesn't look exactly "generic".

> I could add a void* version that's a simple clone of the string version,
> but having to malloc both a list cell and its contents separately is
> annoying.

I'd be okay with a void* version, but I'm not sure about this.

			regards, tom lane
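For illustration, the void* flavour being discussed could look roughly like this. This is a hypothetical sketch, not the actual simple_list.c API; the type and function names here are made up:

```c
#include <stdlib.h>

/*
 * Hypothetical generic variant of the frontend simple list, holding
 * an opaque payload per cell.  Sketch only; not simple_list.c code.
 */
typedef struct PtrListCell
{
	struct PtrListCell *next;
	void	   *ptr;
} PtrListCell;

typedef struct PtrList
{
	PtrListCell *head;
	PtrListCell *tail;
} PtrList;

/* Returns 1 on success, 0 on OOM: libpq must not exit on failure. */
static int
ptrlist_append(PtrList *list, void *ptr)
{
	PtrListCell *cell = malloc(sizeof(PtrListCell));

	if (cell == NULL)
		return 0;
	cell->ptr = ptr;
	cell->next = NULL;
	if (list->tail)
		list->tail->next = cell;
	else
		list->head = cell;
	list->tail = cell;
	return 1;
}

static int
ptrlist_length(const PtrList *list)
{
	int			n = 0;

	for (const PtrListCell *c = list->head; c != NULL; c = c->next)
		n++;
	return n;
}
```

This also shows the annoyance Craig mentions: the caller allocates the payload and the list allocates the cell, so every append is two allocations and every teardown two frees.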
On 5/23/16 4:19 AM, Craig Ringer wrote:
> + Batching less useful when information from one operation is required by the

Should be "Batching is less useful".
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461
From: pgsql-hackers-owner@postgresql.org [mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Craig Ringer
>> I'll follow this mood. Yeha.
>
> BTW, I've published the HTML-ified SGML docs to
> http://2ndquadrant.github.io/postgres/libpq-batch-mode.html as a preview.

Sorry for my late reply. Fantastic performance improvement, nice
documentation, and amazing rapid development! I think I'll join the
review & testing in the 2016/9 CF.

Regards
Takayuki Tsunakawa
2016-05-24 5:01 GMT+03:00 Craig Ringer <craig@2ndquadrant.com>:
> On 24 May 2016 at 00:00, Michael Paquier <michael.paquier@gmail.com> wrote:
>> On Mon, May 23, 2016 at 8:50 AM, Andres Freund <andres@anarazel.de> wrote:
>>> yay^2.
>>
>> I'll follow this mood. Yeha.
>
> BTW, I've published the HTML-ified SGML docs to
> http://2ndquadrant.github.io/postgres/libpq-batch-mode.html as a preview.

Typo detected: "Returns 1 if the batch curently being received" -- "curently".

--
// Dmitry.
On Fri, Jun 3, 2016 at 8:51 PM, Dmitry Igrishin <dmitigr@gmail.com> wrote:
>> BTW, I've published the HTML-ified SGML docs to
>> http://2ndquadrant.github.io/postgres/libpq-batch-mode.html as a preview.
> Typo detected: "Returns 1 if the batch curently being received" -- "curently".

I am looking a bit more seriously at this patch and assigned myself as
a reviewer.

testlibpqbatch.c:1239:73: warning: format specifies type 'long' but
the argument has type '__darwin_suseconds_t' (aka 'int') [-Wformat]
    printf("batch insert elapsed: %ld.%06lds\n",
           elapsed_time.tv_sec, elapsed_time.tv_usec);

macOS complains here. You may want to replace %06lds by just %06d.

(lldb) bt
* thread #1: tid = 0x0000, 0x0000000108bcdd56 libpq.5.dylib`closePGconn(conn=0x00007fd642d002d0) + 438 at fe-connect.c:3010, stop reason = signal SIGSTOP
  * frame #0: 0x0000000108bcdd56 libpq.5.dylib`closePGconn(conn=0x00007fd642d002d0) + 438 at fe-connect.c:3010
    frame #1: 0x0000000108bc9db0 libpq.5.dylib`PQfinish(conn=0x00007fd642d002d0) + 32 at fe-connect.c:3072
    frame #2: 0x0000000108bc9ede libpq.5.dylib`PQping(conninfo="dbname=postgres port=5432 host='/tmp' connect_timeout=5") + 46 at fe-connect.c:539
    frame #3: 0x0000000108bb5210 pg_ctl`test_postmaster_connection(pm_pid=78218, do_checkpoint='\0') + 976 at pg_ctl.c:681
    frame #4: 0x0000000108bb388e pg_ctl`do_start + 302 at pg_ctl.c:915
    frame #5: 0x0000000108bb29b4 pg_ctl`main(argc=5, argv=0x00007fff5704e5c0) + 2836 at pg_ctl.c:2416
    frame #6: 0x00007fff8b8b65ad libdyld.dylib`start + 1
(lldb) down 1
frame #0: 0x0000000108bcdd56 libpq.5.dylib`closePGconn(conn=0x00007fd642d002d0) + 438 at fe-connect.c:3010
   3007		queue = conn->cmd_queue_recycle;
   3008		{
   3009			PGcommandQueueEntry *prev = queue;
-> 3010			queue = queue->next;
   3011			free(prev);
   3012		}
   3013		conn->cmd_queue_recycle = NULL;

This patch generates a core dump; use for example pg_ctl start -w and
you'll bump into the trace above. There is something wrong with the
queue handling.
Do you have plans for a more generic structure for the command queue list?

+   <application>libpq</application> supports queueing up mulitiple queries into
s/mulitiple/multiple/.

+   <para>
+    An example of batch use may be found in the source distribution in
+    <filename>src/test/examples/libpqbatch.c</filename>.
+   </para>
You mean testlibpqbatch.c here.

+   <para>
+    Batching less useful when information from one operation is required by the
+    client before it knows enough to send the next operation. The client must
"Batching *is* less useful".

src/test/examples/.gitignore needs a new entry for the new test binary.

+		fprintf(stderr, "internal error, COPY in batch mode");
+		abort();
I don't think that's a good idea. defaultNoticeProcessor can be
overridden to allow applications to have error messages sent
elsewhere. Error messages should also use libpq_gettext, and perhaps
be stored in conn->errorMessage as we do for OOMs happening on the
client side, reporting them back even if they are not expected (those
are blocked by PQsendQueryStart in your patch).

src/test/examples is a good idea to show people what this new API can
do, but this is never getting compiled. It could as well be possible
to include tests in src/test/modules/, in the same shape as what
postgres_fdw is doing by connecting to itself, and link it to libpq.
As this patch complicates fe-exec.c quite a lot, I think that this
would be worth it. Thoughts?
--
Michael
On 10 August 2016 at 14:44, Michael Paquier <michael.paquier@gmail.com> wrote:

> I am looking a bit more seriously at this patch and assigned myself as
> a reviewer.

Much appreciated.

> macos complains here. You may want to replace %06lds by just %06d.

Yeah, or cast to a type known to be big enough. Will amend.

> This patch generates a core dump, use for example pg_ctl start -w and
> you'll bump into the trace above. There is something wrong with the
> queue handling.

Huh. I didn't see that here (Fedora 23). I'll look more closely.

> Do you have plans for a more generic structure for the command queue list?

No plans, no. This was a weekend experiment that turned into a useful
patch and I'm having to scrape up time for it amongst much more
important things like logical failover / sequence decoding and various
other replication work.

Thanks for the docs review too, will amend.

> +		fprintf(stderr, "internal error, COPY in batch mode");
> +		abort();
>
> I don't think that's a good idea.
>
> Error messages should also use libpq_gettext, and perhaps
> be stored in conn->errorMessage as we do so for OOMs happening on
> client-side and reporting them back even if they are not expected
> (those are blocked PQsendQueryStart in your patch).
>
> src/test/examples is a good idea to show people what this new API can
> do, but this is never getting compiled. It could as well be possible
> to include tests in src/test/modules/, in the same shape as what
> postgres_fdw is doing by connecting to itself and link it to libpq. As
> this patch complicates quite a lot fe-exec.c, I think that this would
> be worth it. Thoughts?
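The portable fix Craig alludes to is the cast idiom: tv_sec is a time_t and tv_usec a suseconds_t, and their widths vary across platforms, so both are cast to a type with a known printf length modifier. A sketch of the idiom (not the patch's actual code; the helper name is made up):

```c
#include <stdio.h>
#include <sys/time.h>

/*
 * tv_sec (time_t) and tv_usec (suseconds_t) have platform-dependent
 * widths -- int on macOS, long on glibc -- so cast explicitly before
 * handing them to a %ld conversion.  Returns the snprintf result.
 */
static int
format_elapsed(char *buf, size_t buflen, const struct timeval *tv)
{
	return snprintf(buf, buflen, "%ld.%06lds",
					(long) tv->tv_sec, (long) tv->tv_usec);
}
```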
Craig Ringer wrote:

> Updated patch attached.

Please find attached a couple of fixes for typos I came across in
the doc part.

Also, it appears that PQqueriesInBatch() doesn't work as documented.
It says:
"Returns the number of queries still in the queue for this batch"
but in fact it's implemented as a boolean:

+/* PQqueriesInBatch
+ *   Return true if there are queries currently pending in batch mode
+ */
+int
+PQqueriesInBatch(PGconn *conn)
+{
+	if (!PQisInBatchMode(conn))
+		return false;
+
+	return conn->cmd_queue_head != NULL;
+}

However, is this function really needed? It doesn't seem essential to
the API. You don't call it in the test program either.

Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite
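If the function keeps its documented count semantics, the body would need to walk the queue rather than test the head pointer. A hedged sketch of what that would look like, using a minimal stand-in struct rather than the patch's actual PGcommandQueueEntry:

```c
#include <stddef.h>

/*
 * Minimal stand-in for the patch's command queue entry; only the
 * linkage needed for counting is modeled here.
 */
typedef struct CommandQueueEntry
{
	struct CommandQueueEntry *next;
} CommandQueueEntry;

/*
 * Count semantics matching the documentation: the number of queued
 * queries, not merely whether the queue is non-empty.
 */
static int
queue_count(const CommandQueueEntry *head)
{
	int			n = 0;

	for (const CommandQueueEntry *e = head; e != NULL; e = e->next)
		n++;
	return n;
}
```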
On 6 September 2016 at 16:10, Daniel Verite <daniel@manitou-mail.org> wrote:
> Craig Ringer wrote:
>
>> Updated patch attached.
>
> Please find attached a couple fixes for typos I've came across in
> the doc part.

Thanks, will apply and post a rebased patch soon, or if someone picks
this up in the mean time they can apply your diff on top of the patch.

> Also it appears that PQqueriesInBatch() doesn't work as documented.
> It says:
> "Returns the number of queries still in the queue for this batch"
> but in fact it's implemented as a boolean:

Whoops. Will fix.

I think the function is useful and necessary. There's no reason not to
expose that, and it's also good for when your query dispatch isn't as
tightly coupled to your query handling as in the example, so your app
might need to keep processing until it sees the end of queued results.

--
Craig Ringer                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Tue, Sep 6, 2016 at 8:01 PM, Craig Ringer <craig@2ndquadrant.com> wrote:
> On 6 September 2016 at 16:10, Daniel Verite <daniel@manitou-mail.org> wrote:
>> Please find attached a couple fixes for typos I've came across in
>> the doc part.
>
> Thanks, will apply and post a rebased patch soon, or if someone picks
> this up in the mean time they can apply your diff on top of the patch.

Could you send an updated patch then? At the same time I am noticing
that git --check is complaining... This patch has tests and a
well-documented feature, so I'll take a look at it soon at the top of
my list. Moved to next CF for now.
--
Michael
On 3 October 2016 at 10:10, Michael Paquier <michael.paquier@gmail.com> wrote:
> Could you send an updated patch then? At the same time I am noticing
> that git --check is complaining... This patch has tests and a
> well-documented feature, so I'll take a look at it soon at the top of
> my list. Moved to next CF for now.

Thanks.

I'd really like to teach psql in non-interactive mode to use it, but
(a) I'm concerned about possible subtle behaviour differences arising
if we do that, and (b) I won't have the time. I think it's mostly of
interest to app authors and driver developers and that's what it's
aimed at. pg_restore could benefit a lot too.

--
Craig Ringer                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Craig Ringer wrote:

> I think it's mostly of interest to app authors and driver developers
> and that's what it's aimed at. pg_restore could benefit a lot too.

Wouldn't pgbench benefit from it?
It was mentioned some time ago [1], in relationship to the \into
construct, how client-server latency was important enough to justify
the use of a "\;" separator between statements, to send them as a
group.

But with the libpq batch API, maybe this could be modernized with
meta-commands like this:
\startbatch
...
\endbatch
which would essentially call PQbeginBatchMode() and PQsendEndBatch().
Inside the batch section, collecting results would be interleaved with
sending queries. Interdependencies between results and subsequent
queries could be handled or ignored, depending on how sophisticated
we'd want this.

This might also draw more users to the batch API, because it would
make it easier to check how exactly it affects the performance of
specific sequences of SQL statements grouped in a batch. For instance,
it would make sense for programmers to benchmark mock-ups of their
code with pgbench with/without batching, before embarking on adapting
it from blocking mode to asynchronous/non-blocking mode.

[1] https://www.postgresql.org/message-id/alpine.DEB.2.20.1607140925400.1962%40sto

Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite
On Mon, Oct 3, 2016 at 11:52 PM, Daniel Verite <daniel@manitou-mail.org> wrote:
> Wouldn't pgbench benefit from it?
> It was mentioned some time ago [1], in relationship to the
> \into construct, how client-server latency was important enough to
> justify the use of a "\;" separator between statements, to send them
> as a group.
>
> But with the libpq batch API, maybe this could be modernized
> with meta-commands like this:
> \startbatch
> ...
> \endbatch

Or just \batch [on|off], which sounds like a damn good idea to me for
some users willing to test some workloads before integrating it in an
application.
--
Michael
On 4 Oct. 2016 15:15, "Michael Paquier" <michael.paquier@gmail.com> wrote:
> On Mon, Oct 3, 2016 at 11:52 PM, Daniel Verite <daniel@manitou-mail.org> wrote:
>> Wouldn't pgbench benefit from it?
>> It was mentioned some time ago [1], in relationship to the
>> \into construct, how client-server latency was important enough to
>> justify the use of a "\;" separator between statements, to send them
>> as a group.
>>
>> But with the libpq batch API, maybe this could be modernized
>> with meta-commands like this:
>> \startbatch
>> ...
>> \endbatch
>
> Or just \batch [on|off], which sounds like a damn good idea to me for
> some users willing to test some workloads before integrating it in an
> application.

A batch isn't necessarily terminated by a commit, so I'm more keen on
start/end batch. It's more in line with begin/commit. Batch is not
only a mode, you also have to delineate batches.
On 04/10/16 20:15, Michael Paquier wrote:
> Or just \batch [on|off], which sounds like a damn good idea to me for
> some users willing to test some workloads before integrating it in an
> application.

+1

'\batch' is a bit easier to find and to remember than '\startbatch'.
On Mon, Oct 3, 2016 at 12:48 PM, Craig Ringer <craig@2ndquadrant.com> wrote:
> On 3 October 2016 at 10:10, Michael Paquier <michael.paquier@gmail.com> wrote:
>> Could you send an updated patch then? At the same time I am noticing
>> that git --check is complaining... This patch has tests and a
>> well-documented feature, so I'll take a look at it soon at the top of
>> my list. Moved to next CF for now.
>
> Thanks.
>
> I'd really like to teach psql in non-interactive mode to use it, but
> (a) I'm concerned about possible subtle behaviour differences arising
> if we do that and (b) I won't have the time. I think it's mostly of
> interest to app authors and driver developers and that's what it's
> aimed at. pg_restore could benefit a lot too.

Looking at it now... The most interesting comments are first.

I wanted to congratulate you. I rarely see a patch with this level of
detail within the first versions. Anybody can review the patch just by
looking at the code, and especially the docs, without looking at the
thread. There are even tests to show what this does for the client.

+PQisInBatchMode           172
+PQqueriesInBatch          173
+PQbeginBatchMode          174
+PQendBatchMode            175
+PQsendEndBatch            176
+PQgetNextQuery            177
+PQbatchIsAborted          178
This set of routines is a bit inconsistent. Why not just prefix them
with PQbatch? Like this, for example:
PQbatchStatus(): consists of disabled/inactive/none, active, error.
This covers both PQbatchIsAborted() and PQisInBatchMode().
PQbatchBegin()
PQbatchEnd()
PQbatchQueueSync() or PQbatchQueueFlush, same as PQsendEndBatch(), to
add and process a sync message in the queue.
PQbatchQueueCount(): returns N>0 if there are N entries, 0 if empty,
-1 on failure.
PQbatchQueueProcess(): returns 1 if processing can begin, 0 if not,
-1 on failure (OOM).

+  <para>
+   Much like asynchronous query mode, there is no performance disadvantage to
+   using batching and pipelining. It somewhat increased client application
+   complexity and extra caution is required to prevent client/server network
+   deadlocks, but can offer considerable performance improvements.
+  </para>
I would reword that a bit: "it increases client application complexity
and extra caution is required to prevent client/server deadlocks, but
offers considerable performance improvements".

+   ("ping time") is high, and when many small operations are being performed in
Nit: should use <quote> here. Still, not quoting it would be fine.

+   After entering batch mode the application dispatches requests
+   using normal asynchronous <application>libpq</application> functions like
+   <function>PQsendQueryParams</function>, <function>PQsendPrepare</function>,
+   etc. The asynchronous requests are followed by a <link
It may be better to list all the functions here, PQsendQuery, etc.

 * query work remains or an error has occurred (e.g. out of
 * memory). */
-PGresult *PQgetResult(PGconn *conn)
Some noise in the patch.

git diff --check complains:
	usage_exit(argv[0]);
warning: 1 line adds whitespace errors.

+++ b/src/interfaces/libpq/t/001_libpq_async.pl
@@ -0,0 +1,15 @@
+use strict;
+use warnings;
+
+use Config;
+use PostgresNode;
+use TestLib;
+use Test::More;
+
+my $node = get_new_node('main');
+$node->init;
+$node->start;
+
+my $port = $node->port;
+
+$node->stop('fast');
Er... This does nothing...

The changes in src/test/examples/ are not necessary anymore. You moved
all the tests to test_libpq (for the best, actually).
+		while (queue != NULL)
+		{
+			PGcommandQueueEntry *prev = queue;
+
+			queue = queue->next;
+			free(prev);
+		}
This should free prev->query.

/* blah comment
 * blah2 comment
 */
A lot of comment blocks are like that; those should be reformatted.

Running make check directly in src/test/modules/test_libpq/ does not work:
# Postmaster PID for node "main" is 10225
# Running: testlibpqbatch dbname=postgres 10000 disallowed_in_batch
Command 'testlibpqbatch' not found in [...PATH list ...]
The problem is that testlibpqbatch is not getting compiled, but I
think it should be.

+ * testlibpqbatch.c
+ *		Test of batch execution funtionality
[...]
+	fprintf(stderr, "%s is not a recognised test name\n", argv[3]);
s/funtionality/functionality/
s/recognised/recognized/

+	if (!PQsendQueryParams(conn, "SELECT $1", 1, dummy_param_oids,
+						   dummy_params, NULL, NULL, 0))
+	{
+		fprintf(stderr, "dispatching first SELECT failed: %s\n", PQerrorMessage(conn));
+		goto fail;
+	}
+
+	if (!PQsendEndBatch(conn))
+	{
+		fprintf(stderr, "Ending first batch failed: %s\n", PQerrorMessage(conn));
+		goto fail;
+	}
+
+	if (!PQsendQueryParams(conn, "SELECT $1", 1, dummy_param_oids,
+						   dummy_params, NULL, NULL, 0))
+	{
+		fprintf(stderr, "dispatching second SELECT failed: %s\n", PQerrorMessage(conn));
+		goto fail;
+	}
It may be better to use a loop here and queue a bunch of queries.

You could just remove the VERBOSE flag in the tests; having a more
talkative test is always better.

+  <para>
+   The batch API was introduced in PostgreSQL 9.6, but clients using it can
+   use batches on server versions 8.4 and newer. Batching works on any server
+   that supports the v3 extended query protocol.
+  </para>
PostgreSQL 10, not 9.6. It may be a good idea to check for
PG_PROTOCOL_MAJOR < 3 and issue an error for the new routines.

+int
+PQgetNextQuery(PGconn *conn)
[...]
+	return false;
The new routines had better return explicitly 0 or 1, not a boolean,
for consistency with the rest.

+   for processing. Returns 0 and has no effect if there are no query results
+   pending, batch mode is not enabled, or if the query currently processed
+   is incomplete or still has pending results. See <link
It would be useful for the user to make a difference between those
different statuses. As of your patch, PQgetNextQuery() returns false/0
in all those cases, so it is a bit hard to know what's going on. Maybe
PQgetNextQuery() could be renamed to PQbatchGetNext or PQbatchQueueNext
to outline the fact that it is beginning the next query in the queue.
This helps also to understand that it can just be used with batch
mode. Actually no, I am wrong. It is possible to distinguish each of
those errors with respectively PQgetResult == NULL, PQisInBatchMode()
and PQqueriesInBatch(). Definitely it should be mentioned in the docs
that it is possible to make a difference between all those states.

+	entry = PQmakePipelinedCommand(conn);
+	entry->queryclass = PGQUERY_SYNC;
+	entry->query = NULL;
PQmakePipelinedCommand() returns NULL, and boom.

+	bool		in_batch;		/* connection is in batch (pipelined) mode */
+	bool		batch_aborted;	/* current batch is aborted, discarding until next Sync */
Having only one flag would be fine. batch_aborted is set or used only
when in_batch is set, so both have a strong link.

	/* OK, it's launched! */
-	conn->asyncStatus = PGASYNC_BUSY;
+	if (conn->in_batch)
+		PQappendPipelinedCommand(conn, pipeCmd);
+	else
+		conn->asyncStatus = PGASYNC_BUSY;
No, it is put in the queue.

+  <para>
+   Returns the number of queries still in the queue for this batch, not
+   including any query that's currently having results being processed.
+   This is the number of times <function>PQgetNextQuery</function> has to be
+   called before the query queue is empty again.
+
+<synopsis>
+int PQqueriesInBatch(PGconn *conn);
+</synopsis>
This is not true. It does not return a count, just 0 or 1.

It may be a good idea to add a test for COPY and trigger a failure.
If I read the code correctly, it seems to me that it is possible to
enable batch mode and then use PQexec(); PQsendPrepare will just
happily queue the command. Shouldn't PQexec() be prevented in batch
mode? Similar remark for PQexecParams(), PQexecPrepared(),
PQdescribePrepared() and PQprepare(). In short, everything calling
PQexecFinish().

I haven't run the tests directly on Windows with MSVC, but they won't
run as vcregress modulescheck lacks knowledge about modules using TAP.
I cannot blame you for that.
--
Michael
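The leak Michael points out in the queue teardown (and the related crash in closePGconn) comes down to freeing each entry's owned payload before the entry itself. A minimal sketch of a correct teardown, using a simplified stand-in for the patch's PGcommandQueueEntry:

```c
#include <stdlib.h>

/*
 * Simplified entry mirroring the patch's command queue: each entry
 * owns a malloc'd query string that must be freed along with it
 * (it may be NULL, e.g. for Sync entries).
 */
typedef struct QueueEntry
{
	struct QueueEntry *next;
	char	   *query;
} QueueEntry;

static void
free_queue(QueueEntry *queue)
{
	while (queue != NULL)
	{
		QueueEntry *prev = queue;

		queue = queue->next;
		if (prev->query)
			free(prev->query);	/* the fix: release the payload too */
		free(prev);
	}
}
```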
Hi all. I thought I'd share some experience from Npgsql regarding
batching/pipelining - hope this isn't off-topic.

Npgsql has supported batching for quite a while, similar to what this
patch proposes - with a single Sync message sent at the end.

It has recently come to my attention that this implementation is
problematic because it forces the batch to occur within a transaction;
in other words, there's no option for a non-transactional batch. This
can be a problem for several reasons: users may want to send off a
batch of inserts, not caring whether one of them fails (e.g. because
of a unique constraint violation). In other words, in some scenarios
it may be appropriate for later batched statements to be executed when
an earlier batched statement raised an error. If Sync is only sent at
the very end, this isn't possible. Another example of a problem (which
actually happened) is that transactions acquire row-level locks, and
so may trigger deadlocks if two different batches update the same rows
in reverse order. Both of these issues wouldn't occur if the batch
weren't implicitly transactional.

My current plan is to modify the batch implementation based on whether
we're in an (explicit) transaction or not. If we're in a transaction,
then it makes perfect sense to send a single Sync at the end as is
being proposed here - any failure would cause the transaction to fail
anyway, so skipping all subsequent statements until the batch's end
makes sense. However, if we're not in an explicit transaction, I plan
to insert a Sync message after each individual Execute, making
non-transactional batched statements more or less identical in
behavior to non-transactional unbatched statements. Note that this
means that a batch can generate multiple errors, not just one.

I'm sharing this since it may be relevant to the libpq batching
implementation as well, and also to get any feedback regarding how
Npgsql should act.
On 10/4/16 11:54 PM, Michael Paquier wrote:
> +  <para>
> +   Much like asynchronous query mode, there is no performance disadvantage to
> +   using batching and pipelining. It somewhat increased client application
> +   complexity and extra caution is required to prevent client/server network
> +   deadlocks, but can offer considerable performance improvements.
> +  </para>
> I would reword that a bit "it increases client application complexity
> and extra caution is required to prevent client/server deadlocks but
> offers considerable performance improvements".

Unrelated, but another doc bug, on line 4647:

+    The batch API was introduced in PostgreSQL 9.6, but clients using it can

That should read 10.0 (or just 10?)
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461
On 12 October 2016 at 19:51, Shay Rojansky <roji@roji.org> wrote: > Hi all. I thought I'd share some experience from Npgsql regarding > batching/pipelining - hope this isn't off-topic. Not at all. > Npgsql has supported batching for quite a while, similar to what this patch > proposes - with a single Sync message is sent at the end. PgJDBC does too. The benefits of it there are what prompted me to do this, not only so libpq users can use it directly but so psqlODBC, psycopg2, etc can benefit from it if they choose to expose it. > It has recently come to my attention that this implementation is problematic > because it forces the batch to occur within a transaction; in other words, > there's no option for a non-transactional batch. That's not strictly the case. If you explicitly BEGIN and COMMIT, those operations are honoured within the batch. What's missing is implicit transactions. Usually if you send a series of messages they will each get their own implicit transaction. If you batch them, the whole lot gets an implicit transaction. This is because the PostgreSQL v3 protocol conflates transaction delineation with protocol error recovery into a single Sync message type. It's very similar to the behaviour of multi-statements, where: psql -c "CREATE TABLE fred(x integer); DROP TABLE notexists;" doesn't create "fred", but psql -c "BEGIN; CREATE TABLE fred(x integer); COMMIT; BEGIN; DROP TABLE notexists; COMMMIT;" ... does. And in fact it's for almost the same reason. They're sent as a single SimpleQuery message by psql and split up client side, but the effect is the same as two separate Query messages followed by a Sync. It isn't simple to manage this client-side, because libpq doesn't know whether a given command string may contain transaction delimiting statements or not and can't reliably look for them without client-side parsing of the SQL. 
So it can't dispatch its own BEGIN/COMMIT around statements in a batch
that it thinks might be intended to be autocommit, and anyway that'd
result in sending 3 queries for every 1 client query, which would suck.

If the mythical v4 protocol ever happens, I'd want to split Sync into
two messages, one which is a protocol synchronisation message and
another that is a transaction delimiter. Or give it flags or whatever.

In the mean time:

> This can be a problem for
> several reasons: users may want to send off a batch of inserts, not caring
> whether one of them fails (e.g. because of a unique constraint violation).
> In other words, in some scenarios it may be appropriate for later batched
> statements to be executed when an earlier batched statement raised an error.
> If Sync is only sent at the very end, this isn't possible.

Right, and that remains the case even with explicit transaction
delineation, because the first ERROR causes processing of all
subsequent messages to be skipped.

The design I have in libpq allows for this by allowing the client to
delimit batches without ending batch mode, concurrently consuming a
stream of multiple batches. Each endbatch is a Sync. So a client that
wants autocommit-like behaviour can send a series of 1-query batches.
I think I'll need to document this a bit better since it's more subtle
than I properly explained.

> Another example
> of a problem (which actually happened) is that transactions acquire
> row-level locks, and so may trigger deadlocks if two different batches
> update the same rows in reverse order. Both of these issues wouldn't occur
> if the batch weren't implicitly transactional.

Same solution as above.

> My current plan is to modify the batch implementation based on whether we're
> in an (explicit) transaction or not.
> If we're in a transaction, then it
> makes perfect sense to send a single Sync at the end as is being proposed
> here - any failure would cause the transaction to fail anyway, so skipping
> all subsequent statements until the batch's end makes sense. However, if
> we're not in an explicit transaction, I plan to insert a Sync message after
> each individual Execute, making non-transactional batched statements more or
> less identical in behavior to non-transactional unbatched statements. Note
> that this means that a batch can generate multiple errors, not just one.

Yes, that's what I suggest, and basically what the libpq batch
interface does, though it expects the client to deal with the
transaction boundaries.

You will need to think hard about transaction boundaries as they
relate to multi-statements unless Npgsql parses out each statement
from multi-statement strings like PgJDBC does. Otherwise a user can
sneak in:

    somestatement; BEGIN; someotherstatement;

or

    somestatement; CoMMiT; otherstatement;

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On 14 October 2016 at 18:09, Shay Rojansky <roji@roji.org> wrote: >> > It has recently come to my attention that this implementation is >> > problematic >> > because it forces the batch to occur within a transaction; in other >> > words, >> > there's no option for a non-transactional batch. >> >> That's not strictly the case. If you explicitly BEGIN and COMMIT, >> those operations are honoured within the batch. > > > I wasn't precise in my formulation (although I think we understand each > other)... The problem I'm trying to address here is the fact that in the > "usual" batching implementation (i.e. where a single Sync message is sent at > the end of the batch), there's no support for batches which have no > transactions whatsoever (i.e. where each statement is auto-committed and > errors in earlier statements don't trigger skipping of later statements). Right, you can't use implicit transactions delimited by protocol message boundaries when batching. >> The design I have in libpq allows for this by allowing the client to >> delimit batches without ending batch mode, concurrently consuming a >> stream of multiple batches. Each endbatch is a Sync. So a client that >> wants autocommit-like behavour can send a series of 1-query batches. > > Ah, I see. libpq's API is considerably more low-level than what Npgsql needs > to provide. If I understand correctly, you allow users to specify exactly > where to insert Sync messages (if at all) They don't get total control, but they can cause a Sync to be emitted after any given statement when in batch mode. > so that any number of statements > arbitrarily interleaved with Sync messages can be sent without starting to > read any results. Right. > if so, then the user indeed has everything they need to > control the exact transactional behavior they want (including full > auto-commit) without compromising on performance in any way (i.e. by > increasing roundtrips). Right. 
> The only minor problem I can see is that PQsendEndBatch not only adds a Sync > message to the buffer, but also flushes it. It only does a non-blocking flush. > This means that you may be > forcing users to needlessly flush the buffer just because they wanted to > insert a Sync. In other words, users can't send the following messages in a > single buffer/packet: > Prepare1/Bind1/Describe1/Execute1/Sync1/Prepare2/Bind2/Describe2/Execute2/Sync2 > - they have to be split into different packets. Oh, right. That's true, but I'm not really sure we care. I suspect that the OS will tend to coalesce them anyway, since we're not actually waiting until the socket sends the message. At least when the socket isn't able to send as fast as input comes in. It might matter for local performance in incredibly high throughput applications, but I suspect there will be a great many other things that come first. Anyway, the client can already control this with TCP_CORK. > Of course, this is a > relatively minor performance issue (especially when compared to the overall > performance benefits provided by batching), and providing an API distinction > between adding a Sync and flushing the buffer may over-complicate the API. I > just thought I'd mention it. Unnecessary IMO. If we really want to add it later we'd probably do so by setting a "no flush on endbatch" mode and adding an explicit flush call. But I expect TCP_CORK will satisfy all realistic needs. > That's a good point. I definitely don't want to depend on client-side > parsing of SQL in any way (in fact a planned feature is to allow using > Npgsql without any sort of client-side parsing/manipulation of SQL). > However, the fact that BEGIN/COMMIT can be sent in batches doesn't appear > too problematic to me. > > When it's about to send a batch, Npgsql knows whether it's in an (explicit) > transaction or not (by examining the transaction indicator on the last > ReadyForQuery message it received). 
If it's not in an (explicit) > transaction, it automatically inserts a Sync message after every Execute. If > some statement happens to be a BEGIN, it will be executed as a normal > statement and so on. The only issue is that if an error occurs after a > sneaked-in BEGIN, all subsequent statements will fail, and all have the Sync > messages Npgsql inserted. The end result will be a series of errors that > will be raised up to the user, but this isn't fundamentally different from > the case of a simple auto-commit batch containing multiple failures (because > of unique constraint violation or whatever) - multiple errors is something > that will have to be handled in any case. I'm a bit hesitant about how this will interact with multi-statements containing embedded BEGIN or COMMIT, where a single protocol message contains multiple ; delimited queries. But at least at this time of night I can't give a concrete problematic example. -- Craig Ringer http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 14 October 2016 at 22:15, Shay Rojansky <roji@roji.org> wrote: > Unless I'm mistaken TCP_CORK is not necessarily going to work across all > platforms (e.g. Windows), although SO_LINGER (which is more standard) also > helps here. Yeah, true. You can also twiddle TCP_NODELAY on and off on other platforms. Anyway, my point is that it's not likely to be crucial, especially since even without socket options the host will often do packet combining if the queue isn't empty. > Unless I'm mistaken, in the extended protocol you can't combine multiple ; > delimited queries in a single Parse - that's a feature supported only by the > Query message of the simple protocol. But then, if you're in the simple > protocol Sync doesn't play any role, does it? I mean, a Query message > behaves as though it's implicitly followed by a Sync message - an error in a > previous Query message doesn't cause later messages to be skipped. Right, good point. So that concern isn't relevant. -- Craig Ringer http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Hello, Craig,

I'm sorry to be late to review your patch. I've just been able to read
the HTML doc first. Can I get the latest .patch file for reading and
running the code?

Here are some comments and questions. I tried to avoid the same points
as other reviewers, but there may be some overlap.

(1)
The example

    UPDATE mytable SET x = x + 1;

should be

    UPDATE mytable SET x = x + 1 WHERE id = 42;

(2)
"The server usually begins executing the batch before all commands in
the batch are queued and the end of batch command is sent."

Does this mean that the app developer cannot control or predict how many
TCP transmissions a batch is sent with? For example, if I want to insert
10 rows into a table in bulk, can I send those 10 rows (and the end of
batch command) efficiently in one TCP transmission, or are they split by
libpq into multiple TCP transmissions?

(3)
"To avoid deadlocks on large batches the client should be structured
around a nonblocking I/O loop using a function like select, poll, epoll,
WaitForMultipleObjectEx, etc."

Can't we use some (new) platform-independent API instead of using poll()
or WaitForMultipleObject()? e.g. some thin wrapper around pqWait(). It
seems a bit burdensome to have to use an OS-specific API just to wait
for libpq. Apart from that, it does not seem possible to wait for the
socket in 64-bit apps on Windows, because SOCKET is 64-bit while
PQsocket() returns int.

[winsock2.h]
/*
 * The new type to be used in all
 * instances which refer to sockets.
 */
typedef UINT_PTR SOCKET;

Regards
Takayuki Tsunakawa
On 18 November 2016 at 14:04, Tsunakawa, Takayuki
<tsunakawa.takay@jp.fujitsu.com> wrote:
> Hello, Craig,
>
> I'm sorry to be late to review your patch. I've just been able to read
> the HTML doc first. Can I get the latest .patch file for reading and
> running the code?

The latest is what's attached upthread and what's in the git repo also
referenced upthread.

I haven't been able to update in response to more recent review due to
other development commitments. At this point I doubt I'll be able to
update it again in time for v10, so anyone who wants to adopt it is
welcome.

> Here are some comments and questions. I tried to avoid the same points
> as other reviewers, but there may be some overlap.
>
> (1)
> The example
>     UPDATE mytable SET x = x + 1;
> should be
>     UPDATE mytable SET x = x + 1 WHERE id = 42;

Good catch.

> (2)
> "The server usually begins executing the batch before all commands in
> the batch are queued and the end of batch command is sent."
>
> Does this mean that the app developer cannot control or predict how
> many TCP transmissions a batch is sent with?

That's not what that sentence means, since the TCP layer is much lower
level, but what you say is also true. All the docs are saying there is
that there's no explicit control over when we start sending the batch
to the server. How that translates to individual TCP packets, etc., is
not its problem.

> For example, if I want to insert 10 rows into a table in bulk, can I
> send those 10 rows (and the end of batch command) efficiently in one
> TCP transmission, or are they split by libpq into multiple TCP
> transmissions?

libpq neither knows nor cares about individual TCP packets. It sends
things to the kernel and lets the kernel deal with that.

That said, you can use the socket options TCP_CORK and TCP_NODELAY to
get some control over how and when data is sent. If you know you're
about to send more, you might cork the socket to give the kernel a
hint that it should expect more data to send.
> (3)
> "To avoid deadlocks on large batches the client should be structured
> around a nonblocking I/O loop using a function like select, poll,
> epoll, WaitForMultipleObjectEx, etc."
>
> Can't we use some (new) platform-independent API instead of using
> poll() or WaitForMultipleObject()? e.g. some thin wrapper around
> pqWait(). It seems a bit burdensome to have to use an OS-specific API
> just to wait for libpq. Apart from that, it does not seem possible to
> wait for the socket in 64-bit apps on Windows, because SOCKET is
> 64-bit while PQsocket() returns int.

IMO this problem is out of scope for this patch. A wait abstraction
might be nice, but next thing we know we'll be reinventing APR or NSPR.
I think that's a totally different problem.

Not being able to get a win32 SOCKET from libpq seems like a bit of an
oversight, and it'd definitely be good to address that to make async
mode more usable on Windows. There's some other ugliness in PQsocket
already, per the comments there. I think it should be a separate patch,
though.
--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On 22 November 2016 at 15:14, Haribabu Kommi <kommi.haribabu@gmail.com> wrote: > > On Fri, Nov 18, 2016 at 7:18 PM, Craig Ringer <craig@2ndquadrant.com> wrote: >> >> The latest is what's attached upthread and what's in the git repo also >> referenced upthread. >> >> I haven't been able to update in response to more recent review due to >> other development commitments. At this point I doubt I'll be able to >> get update it again in time for v10, so anyone who wants to adopt it >> is welcome. > > > Currently patch status is marked as "returned with feedback" in the 11-2016 > commitfest. Anyone who wants to work on it can submit the updated patch > by taking care of all review comments and change the status of the patch > at any time. > > Thanks for the patch. Thanks. Sorry I haven't had time to work on it. Priorities. -- Craig Ringer http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Hi,
On 18 November 2016 at 08:18, Craig Ringer wrote:
>At this point I doubt I'll be able to
>get update it again in time for v10, so anyone who wants to adopt it
>is welcome.
I am interested in pipeline/batch support for ECPG, and found this thread.
I updated Craig's patch [1] so that it applies to HEAD. I also fixed a trivial typo.
First, I'm changing PQqueriesInBatch() to work as documented.
After that, I plan to reflect contents of reviews in the patch.
On Mon, Oct 3, 2016 at 11:52 PM, Daniel Verite wrote:
> Wouldn't pgbench benefit from it?
> It was mentioned some time ago [1], in relationship to the
> \into construct, how client-server latency was important enough to
> justify the use of a "\;" separator between statements, to send them
> as a group.
>
> But with the libpq batch API, maybe this could be modernized
> with meta-commands like this:
> \startbatch
> ...
> \endbatch
I'm planning to work on meta-commands to support batch mode once this patch has been committed.
[1]https://github.com/2ndQuadrant/postgres/tree/dev/libpq-async-batch
Regards,
Aya Iwata
FUJITSU
Attachment
On 22 November 2016 at 18:32, Craig Ringer <craig@2ndquadrant.com> wrote:
> On 22 November 2016 at 15:14, Haribabu Kommi
> <kommi.haribabu@gmail.com> wrote:
> > On Fri, Nov 18, 2016 at 7:18 PM, Craig Ringer <craig@2ndquadrant.com> wrote:
> >> The latest is what's attached upthread and what's in the git repo
> >> also referenced upthread.
> >>
> >> I haven't been able to update in response to more recent review due
> >> to other development commitments. At this point I doubt I'll be able
> >> to update it again in time for v10, so anyone who wants to adopt
> >> it is welcome.
> >
> > Currently patch status is marked as "returned with feedback" in the
> > 11-2016 commitfest. Anyone who wants to work on it can submit the
> > updated patch by taking care of all review comments and change the
> > status of the patch at any time.
> >
> > Thanks for the patch.
>
> Thanks. Sorry I haven't had time to work on it. Priorities.

Hi,

I am interested in this patch and addressed the various review comments
below. I have separated out the code and the test module into two
patches. So, if needed, the test patch can be enhanced further while the
code patch is committed.

> Renaming and refactoring new APIs
> +PQisInBatchMode 172
> +PQqueriesInBatch 173
> +PQbeginBatchMode 174
> +PQendBatchMode 175
> +PQsendEndBatch 176
> +PQgetNextQuery 177
> +PQbatchIsAborted 178
> This set of routines is a bit inconsistent. Why not just prefixing them
> with PQbatch? Like that for example:
> PQbatchStatus(): consists of disabled/inactive/none, active, error.
> This covers both PQbatchIsAborted() and PQisInBatchMode().
> PQbatchBegin()
> PQbatchEnd()
> PQbatchQueueSync() or PQbatchQueueFlush, same as PQsendEndBatch() to
> add and process a sync message into the queue.
Renamed and modified the batch status APIs as below:

PQisInBatchMode & PQbatchIsAborted ==> PQbatchStatus
PQqueriesInBatch ==> PQbatchQueueCount
PQbeginBatchMode ==> PQbatchBegin
PQendBatchMode ==> PQbatchEnd
PQsendEndBatch ==> PQbatchQueueSync
PQgetNextQuery ==> PQbatchQueueProcess

> PQbatchQueueCount(): returns N>0 if there are N entries, 0 if empty, -1 on failure
> PQbatchQueueProcess(): returns 1 if process can begin, 0 if not, -1 on failure (OOM)

I think it is still OK to keep the current behaviour, like other
functions present in the same file, e.g. "PQsendPrepare" and
"PQsendQueryGuts".

> PQqueriesInBatch() (new name: PQbatchQueueCount) doesn't work as documented.
> It says:
> "Returns the number of queries still in the queue for this batch"
> but in fact it's implemented as a Boolean.

Modified the logic to count the number of entries in the pending queue
and return the count.

> The changes in src/test/examples/ are not necessary anymore. You moved
> all the tests to test_libpq (for the best actually).

Removed these unnecessary changes from the src/test/examples folder and
corrected the path mentioned in the comments section of testlibpqbatch.c.

> + while (queue != NULL)
> + {
> +     PGcommandQueueEntry *prev = queue;
> +     queue = queue->next;
> +     free(prev);
> + }
> This should free prev->query.

Both prev->query and prev are now freed. This also applies to
"cmd_queue_recycle".

> Running directly make check in src/test/modules/test_libpq/ does not work

Modified the "check" rule in the makefile.

> You could just remove the VERBOSE flag in the tests, having a test more
> talkative is always better.

Removed the ifdef VERBOSE checks.

> But with the libpq batch API, maybe this could be modernized
> with meta-commands like this:
> \startbatch
> ...
> \endbatch

I think it is a separate patch candidate.

> It is possible to guess each one of those errors (occurring in the
> PQbatchQueueProcess API) with respectively
> PQgetResult == NULL, PQisInBatchMode() and PQqueriesInBatch().
> Definitely it should be mentioned in the docs that it is possible to
> make a difference between all those states.

Updated the documentation section of PQbatchQueueProcess() with these
details.

> + entry = PQmakePipelinedCommand(conn);
> + entry->queryclass = PGQUERY_SYNC;
> + entry->query = NULL;
> PQmakePipelinedCommand() returns NULL, and boom.

Corrected to return false if PQmakePipelinedCommand() returns NULL.

> + bool in_batch; /* connection is in batch (pipelined) mode */
> + bool batch_aborted; /* current batch is aborted, discarding until next Sync */
> Having only one flag would be fine. batch_aborted is set or used only
> when in_batch is used, so both have a strong link.

Yes, I agree that tracking the batch status via one flag is cleaner. So
I added a new enum:

typedef enum
{
    PQBATCH_MODE_OFF,
    PQBATCH_MODE_ON,
    PQBATCH_MODE_ABORTED
} PQBatchStatus;

and a "PQBatchStatus batch_status" member of pg_conn is used to track
the status of batch mode.

> /* OK, it's launched! */
> - conn->asyncStatus = PGASYNC_BUSY;
> + if (conn->in_batch)
> +     PQappendPipelinedCommand(conn, pipeCmd);
> + else
> +     conn->asyncStatus = PGASYNC_BUSY;
> No, it is put in the queue

Though it is put in the queue, technically the command is sent to the
server and the server might have started executing it. So I think it is
OK to say "launched".

> It may be a good idea to add a test for COPY and trigger a failure.

Added a new test - "copy_failure" - to trigger failures during copy
state. Please note that COPY is allowed in batch mode, but syncing the
batch or queuing any command while a copy is in progress is blocked,
due to the potential failure of the entire batch. I updated the
documentation for more clarity on this case.

> If I read the code correctly, it seems to me that it is possible to
> enable the batch mode, and then to use PQexec(); PQsendPrepare will
> just happily queue the command. Shouldn't PQexec() be prevented in
> batch mode? Similar remark for PQexecParams(), PQexecPrepared(),
> PQdescribePrepared() and PQprepare().
> In short everything calling PQexecFinish().

A check to verify whether the connection is in batch mode is already
present in PQexecStart(), so all the functions calling PQexecFinish()
will already be rejected in batch mode. I think no new check is needed.

> It may be a good idea to check for PG_PROTOCOL_MAJOR < 3 and issue an
> error for the new routines.

All the APIs which support asynchronous command processing can be
executed only if PG_PROTOCOL_MAJOR >= 3. So adding such a check to the
new routines is not really needed.

> + After entering batch mode the application dispatches requests
> + using normal asynchronous <application>libpq</application> functions like
> + <function>PQsendQueryParams</function>, <function>PQsendPrepare</function>,
> + etc.
> It may be better to list all the functions here, PQsendQuery...

Documentation updated with the list of functions: PQsendQueryParams,
PQsendPrepare, PQsendQueryPrepared, PQdescribePortal, PQdescribePrepared,
PQsendDescribePortal, PQsendDescribePrepared.

> 1. Typos in document part - diff-typos.txt file contains the typos specified here
> 2. git --check is complaining
> 3. s/funtionality/functionality/ and s/recognised/recognized/ typos in testlibpqbatch.c file
> 4. s/Batching less useful/Batching is less useful/ in libpq.sgml
> 5. + <para>
> +  Much like asynchronous query mode, there is no performance disadvantage to
> +  using batching and pipelining. It somewhat increased client application
> +  complexity and extra caution is required to prevent client/server network
> +  deadlocks, but can offer considerable performance improvements.
> + </para>
> I would reword that a bit "it increases client application complexity
> and extra caution is required to prevent client/server deadlocks but
> offers considerable performance improvements".
> 6. + ("ping time") is high, and when many small operations are being performed in
> Nit: should use <quote> here. Still not quoting it would be fine.
> 7. Postgres 10, not 9.6.

Corrected.
Thanks & Regards,
Vaishnavi
Fujitsu Australia
Attachment
On Fri, Feb 17, 2017 at 2:17 PM, Prabakaran, Vaishnavi
<VaishnaviP@fast.au.fujitsu.com> wrote:
> On 22 November 2016 at 18:32, Craig Ringer <craig@2ndquadrant.com> wrote:
> I am interested in this patch and addressed the various review comments
> below. I have separated out the code and the test module into two
> patches. So, if needed, the test patch can be enhanced further while
> the code patch is committed.

Cool. Nice to see a new version of this patch.

(You may want to let your replies breathe a bit, for example by adding
an extra newline to the paragraph you are answering to.)

>> Renaming and refactoring new APIs
>> +PQisInBatchMode 172
>> +PQqueriesInBatch 173
>> +PQbeginBatchMode 174
>> +PQendBatchMode 175
>> +PQsendEndBatch 176
>> +PQgetNextQuery 177
>> +PQbatchIsAborted 178
>> This set of routines is a bit inconsistent. Why not just prefixing
>> them with PQbatch? Like that for example:
>> PQbatchStatus(): consists of disabled/inactive/none, active, error.
>> This covers both PQbatchIsAborted() and PQisInBatchMode().
>> PQbatchBegin()
>> PQbatchEnd()
>> PQbatchQueueSync() or PQbatchQueueFlush, same as PQsendEndBatch() to
>> add and process a sync message into the queue.
>
> Renamed and modified the batch status APIs as below:
> PQisInBatchMode & PQbatchIsAborted ==> PQbatchStatus
> PQqueriesInBatch ==> PQbatchQueueCount
> PQbeginBatchMode ==> PQbatchBegin
> PQendBatchMode ==> PQbatchEnd
> PQsendEndBatch ==> PQbatchQueueSync
> PQgetNextQuery ==> PQbatchQueueProcess

Thanks. This seems a way cleaner interface to me. Others may have a
different opinion, so let's see if there are any.

>> PQbatchQueueCount(): returns N>0 if there are N entries, 0 if empty, -1 on failure
>> PQbatchQueueProcess(): returns 1 if process can begin, 0 if not, -1 on failure (OOM)
>
> I think it is still OK to keep the current behaviour, like other
> functions present in the same file, e.g. "PQsendPrepare" and
> "PQsendQueryGuts".
>
>> PQqueriesInBatch() (new name: PQbatchQueueCount) doesn't work as documented.
>> It says:
>> "Returns the number of queries still in the queue for this batch"
>> but in fact it's implemented as a Boolean.
>
> Modified the logic to count the number of entries in the pending queue
> and return the count.
>
>> The changes in src/test/examples/ are not necessary anymore. You moved
>> all the tests to test_libpq (for the best actually).
>
> Removed these unnecessary changes from the src/test/examples folder and
> corrected the path mentioned in the comments section of testlibpqbatch.c.

>> But with the libpq batch API, maybe this could be modernized
>> with meta-commands like this:
>> \startbatch
>> ...
>> \endbatch
>
> I think it is a separate patch candidate.

Definitely agreed on that. Let's not complicate things any further.

>> It may be a good idea to check for PG_PROTOCOL_MAJOR < 3 and issue an
>> error for the new routines.
>
> All the APIs which support asynchronous command processing can be
> executed only if PG_PROTOCOL_MAJOR >= 3. So adding such a check to the
> new routines is not really needed.

OK. Let's not complicate the patch more than necessary.

After an extra lookup at the patch, here are some comments.

In the docs, when listing any of the PQBATCH_MODE_* variables, you
should have a <literal> markup around them.

It would be cleaner to make a list of the potential values that can be
returned by PQbatchStatus() using <variablelist>.

+ while (queue != NULL)
+ {
+     PGcommandQueueEntry *prev = queue;
+
+     queue = queue->next;
+     if (prev->query)
+         free(prev->query);
+     free(prev);
+ }
+ conn->cmd_queue_recycle = NULL

This could be useful as a separate routine.

+/* PQmakePipelinedCommand
+ * Get a new command queue entry, allocating it if required. Doesn't add it to
+ * the tail of the queue yet, use PQappendPipelinedCommand once the command has
+ * been written for that. If a command fails once it's called this, it should
+ * use PQrecyclePipelinedCommand to put it on the freelist or release it.
+ *

This comment block is not project-like. Please use an empty line as the
first line, with only "/*".
The same comment applies to a bunch of the routines introduced by this patch. Not sure there is much point in having PQstartProcessingNewQuery. It only does two things in two places, so that's not worth the loss in readability. + if (conn->batch_status != PQBATCH_MODE_OFF) + { + pipeCmd = PQmakePipelinedCommand(conn); + + if (pipeCmd == NULL) + return 0; /* error msg already set */ + + last_query = &pipeCmd->query; + queryclass = &pipeCmd->queryclass; + } + else + { + last_query = &conn->last_query; + queryclass = &conn->queryclass; + } This setup should happen further down. conn->in_batch is undefined, causing the patch to fail to compile. And actually you should not need it. - conn->asyncStatus = PGASYNC_READY; + conn->asyncStatus = PGASYNC_READY_MORE; return; This should really not be changed, or it should be changed to a status dedicated to batching only if batch mode is activated. If you are planning for integrating this patch into Postres 10, please be sure that this is registered in the last commit fest that will begin next week: https://commitfest.postgresql.org/13/ I'll try to book a couple of cycles to look at it if you are able to register it into the CF and provide a new version. -- Michael
On Fri, Feb 17, 2017 at 2:17 PM, Prabakaran, Vaishnavi
<VaishnaviP@fast.au.fujitsu.com> wrote:
> On 22 November 2016 at 18:32, Craig Ringer<craig@2ndquadrant.com> wrote:
> I am interested in this patch and addressed various below comments from reviewers. And, I have separated out code and test module into 2 patches. So, If needed, test patch can be enhanced more, meanwhile code patch can be committed.
Cool. Nice to see a new version of this patch.
(You may want to let your replies breathe a bit, for example by adding an
extra newline to the paragraph you are answering to.)
>>Renaming and refactoring new APIs
>> +PQisInBatchMode 172
>>+PQqueriesInBatch 173
>>+PQbeginBatchMode 174
>>+PQendBatchMode 175
>>+PQsendEndBatch 176
>>+PQgetNextQuery 177
>>+PQbatchIsAborted 178
>>This set of routines is a bit inconsistent. Why not just prefixing them with PQbatch? Like that for example:
>> PQbatchStatus(): consists of disabled/inactive/none, active, error. This covers both PQbatchIsAborted() and PQisInBatchMode().
>>PQbatchBegin()
>>PQbatchEnd()
>>PQbatchQueueSync() or PQbatchQueueFlush, same as PQsendEndBatch() to add and process a sync message into the queue.
>
> Renamed and modified batch status APIs as below
> PQisInBatchMode & PQbatchIsAborted ==> PQbatchStatus
> PQqueriesInBatch ==> PQbatchQueueCount
> PQbeginBatchMode ==> PQbatchBegin
> PQendBatchMode ==> PQbatchEnd
> PQsendEndBatch ==> PQbatchQueueSync
> PQgetNextQuery ==> PQbatchQueueProcess
Thanks. This seems a way cleaner interface to me. Others may have a
different opinion so let's see if there are some.
>>PQbatchQueueCount(): returns N>0 if there are N entries, 0 if empty,-1 on failure
>>PQbatchQueueProcess(): returns 1 if process can begin, 0 if not, -1 on failure (OOM)
>
> I think it is still ok to keep the current behaviour like other ones present in the same file. E.g:"PQsendPrepare" "PQsendQueryGuts"
>
>>PQqueriesInBatch() (new name: PQbatchQueueCount) doesn't work as documented.
>>It says:
>>"Returns the number of queries still in the queue for this batch"
>>but in fact it's implemented as a Boolean.
>
> Modified the logic to count number of entries in pending queue and return the count
>
>>The changes in src/test/examples/ are not necessary anymore. You moved all the tests to test_libpq (for the best actually).
> Removed these unnecessary changes from src/test/examples folder and corrected the path mentioned in comments section of testlibpqbatch.c
>>But with the libpq batch API, maybe this could be modernized
>>with meta-commands like this:
>>\startbatch
>>...
>>\endbatch
>
> I think it is a separate patch candidate.
Definitely agreed on that. Let's not complicate things further more.
>> It may be a good idea to check for PG_PROTOCOL_MAJOR < 3 and issue an error for the new routines.
>
> All the APIs which supports asynchronous command processing can be executed only if PG_PROTOCOL_MAJOR >= 3. So, adding it to new routines are not really needed.
OK. Let's not complicate the patch more than necessary.
After an extra lookup at the patch, here are some comments.
In the docs when listing any of the PQBATCH_MODE_* variables, you
should have a markup <literal> around them. It would be cleaner to
make a list of the potential values that can be returned by
PQbatchStatus() using <variablelist>.
+ while (queue != NULL)
+ {
+ PGcommandQueueEntry *prev = queue;
+
+ queue = queue->next;
+ if (prev->query)
+ free(prev->query);
+ free(prev);
+ }
+ conn->cmd_queue_recycle = NULL
This could be useful as a separate routine.
+/* PQmakePipelinedCommand
+ * Get a new command queue entry, allocating it if required. Doesn't add it to
+ * the tail of the queue yet, use PQappendPipelinedCommand once the command has
+ * been written for that. If a command fails once it's called this, it should
+ * use PQrecyclePipelinedCommand to put it on the freelist or release it.
+ *
This comment block is not project-like. Please use an empty line as
first line with only "/*". The same comment applies to a bunch of the
routines introduced by this patch.
Not sure there is much point in having PQstartProcessingNewQuery. It
only does two things in two places, so that's not worth the loss in
readability.
+ if (conn->batch_status != PQBATCH_MODE_OFF)
+ {
+ pipeCmd = PQmakePipelinedCommand(conn);
+
+ if (pipeCmd == NULL)
+ return 0; /* error msg already set */
+
+ last_query = &pipeCmd->query;
+ queryclass = &pipeCmd->queryclass;
+ }
+ else
+ {
+ last_query = &conn->last_query;
+ queryclass = &conn->queryclass;
+ }
This setup should happen further down.
conn->in_batch is undefined, causing the patch to fail to compile. And
actually you should not need it.
- conn->asyncStatus = PGASYNC_READY;
+ conn->asyncStatus = PGASYNC_READY_MORE;
return;
This should really not be changed, or it should be changed to a status
dedicated to batching only if batch mode is activated.
If you are planning to integrate this patch into Postgres 10, please
be sure that this is registered in the last commit fest that will
begin next week:
https://commitfest.postgresql.org/13/
I'll try to book a couple of cycles to look at it if you are able to
register it into the CF and provide a new version.
Thanks for reviewing the patch.
On 8 March 2017 at 00:52, Daniel Verite <daniel@manitou-mail.org> wrote:
> Vaishnavi Prabakaran wrote:
>
>> Yes, I have created a new patch entry into the commitfest 2017-03 and
>> attached the latest patch with this e-mail.
>
> Please find attached a companion patch implementing the batch API in
> pgbench, exposed as \beginbatch and \endbatch meta-commands
> (without documentation).
>
> The idea for now is to make it easier to exercise the API and test
> how batching performs. I guess I'll submit the patch separately in
> a future CF, depending on when/if the libpq patch goes in.

That's great, thanks, and thanks also for the fixes. Any chance you
can attach your updated patch?

I looked at modifying psql to support batching when run
non-interactively, but it would've required major restructuring of its
control loop and I ran out of time. I didn't think of modifying
pgbench. Great to see.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Vaishnavi Prabakaran wrote:
> Yes, I have created a new patch entry into the commitfest 2017-03 and
> attached the latest patch with this e-mail.
Please find attached a companion patch implementing the batch API in
pgbench, exposed as \beginbatch and \endbatch meta-commands
(without documentation).
The idea for now is to make it easier to exercise the API and test
how batching performs. I guess I'll submit the patch separately in
a future CF, depending on when/if the libpq patch goes in.
While developing this, I noted a few things with 0001-v4:
1. lack of initialization for count in PQbatchQueueCount.
Trivial fix:
--- a/src/interfaces/libpq/fe-exec.c
+++ b/src/interfaces/libpq/fe-exec.c
@@ -1874,7 +1874,7 @@ PQisBusy(PGconn *conn)
int
PQbatchQueueCount(PGconn *conn)
{
- int count;
+ int count = 0;
PGcommandQueueEntry *entry;
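The bug is easy to see in isolation: the function walks the pending queue to count entries, so with `count` uninitialized an empty queue returns stack garbage. A self-contained sketch of the intended counting logic, using a mock entry type rather than libpq's own:

```c
#include <stddef.h>

/* Simplified mock of a pending-command queue entry. */
typedef struct Entry
{
	struct Entry *next;
} Entry;

/*
 * Count the entries in a singly linked queue, as PQbatchQueueCount
 * is meant to.  'count' must start at 0, otherwise an empty queue
 * (head == NULL, loop body never runs) returns an arbitrary value.
 */
int
queueCount(const Entry *head)
{
	int			count = 0;	/* the one-line fix: initialize */
	const Entry *e;

	for (e = head; e != NULL; e = e->next)
		count++;
	return count;
}
```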
2. misleading error message in PQexecStart. It gets called by a few other
functions than PQexec, such as PQprepare. As I understand it, the intent
here is to forbid the synchronous functions in batch mode, so this error
message should not single out PQexec.
@@ -1932,6 +2425,13 @@ PQexecStart(PGconn *conn)
if (!conn)
return false;
+ if (conn->asyncStatus == PGASYNC_QUEUED || conn->batch_status != PQBATCH_MODE_OFF)
+ {
+ printfPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("cannot PQexec in batch mode\n"));
+ return false;
+ }
+
3. In relation to #2, PQsendQuery() is not forbidden in batch mode
although I don't think it can work with it, as it's based on the old
protocol.
Vaishnavi Prabakaran wrote:

> if (PQbatchStatus(st->con) != PQBATCH_MODE_ON)
>
> But, it is better to use if (PQbatchStatus(st->con) ==
> PQBATCH_MODE_OFF) for this verification. Reason is there is one more state
> in batch mode - PQBATCH_MODE_ABORTED. So, if the batch mode is in aborted
> status, this check will still assume that the connection is not in batch
> mode.
> In a same way, "if(PQbatchStatus(st->con) == PQBATCH_MODE_ON)" check is
> better to be modified as "PQbatchStatus(st->con) != PQBATCH_MODE_OFF".

Agreed, these two tests need to be changed to account for the
PQBATCH_MODE_ABORTED case. Fixed in new patch.

> 2. @@ -2207,7 +2227,47 @@ doCustom(TState *thread, CState *st, StatsData *agg)
> + if (PQbatchStatus(st->con) != PQBATCH_MODE_OFF)
> + {
> + commandFailed(st, "already in batch mode");
> + break;
> + }
>
> This is not required as below PQbatchBegin() internally does this
> verification.
>
> + PQbatchBegin(st->con);

The point of this test is to specifically forbid a sequence like this:

\beginbatch
...(no \endbatch)
\beginbatch
...

and according to the doc for PQbatchBegin, it looks like the return
code won't tell us:

"Causes a connection to enter batch mode if it is currently idle or
already in batch mode.

int PQbatchBegin(PGconn *conn);

Returns 1 for success"

My understanding is that "already in batch mode" is not an error.

"Returns 0 and has no effect if the connection is not currently idle,
i.e. it has a result ready, is waiting for more input from the server,
etc. This function does not actually send anything to the server, it
just changes the libpq connection state"

> 3. It is better to check the return value of PQbatchBegin() rather than
> assuming success. E.g: PQbatchBegin() will return false if the connection
> is in copy in/out/both states.

Given point #2 above, I think both tests are needed:

if (PQbatchStatus(st->con) != PQBATCH_MODE_OFF)
{
    commandFailed(st, "already in batch mode");
    break;
}
if (PQbatchBegin(st->con) == 0)
{
    commandFailed(st, "failed to start a batch");
    break;
}

>> 3. In relation to #2, PQsendQuery() is not forbidden in batch mode
>> although I don't think it can work with it, as it's based on the old
>> protocol.
>
> It is actually forbidden and you can see the error message "cannot
> PQsendQuery in batch mode, use PQsendQueryParams" when you use
> PQsendQuery().

Sorry for blaming the innocent piece of code, looking closer it's
pgbench that does not display the message. With the simple query
protocol, sendCommand() does essentially:

r = PQsendQuery(st->con, sql);
... other stuff...
if (r == 0)
{
    if (debug)
        fprintf(stderr, "client %d could not send %s\n",
                st->id, command->argv[0]);
    st->ecnt++;
    return false;
}

So only in debug mode does it inform the user that it failed, and even
then, it does not display PQerrorMessage(st->con). In non-debug mode
it's opaque to the user. If it always fails, it appears to loop
forever. The caller has this comment that is relevant to the problem:

if (!sendCommand(st, command))
{
    /*
     * Failed. Stay in CSTATE_START_COMMAND state, to
     * retry. ??? What the point or retrying? Should
     * rather abort?
     */
    return;
}

I think it doesn't bother anyone up to now because the immediate
failure modes of PQsendQuery() are resource allocation or protocol
failures that tend to never happen. Anyway, currently \beginbatch
avoids the problem by checking whether querymode == QUERY_SIMPLE, to
fail gracefully instead of letting the endless loop happen.

Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Hi,
I notice that PQsetSingleRowMode() doesn't work when getting batch results.
The function is documented as:
" int PQsetSingleRowMode(PGconn *conn);
This function can only be called immediately after PQsendQuery or one
of its sibling functions, before any other operation on the connection
such as PQconsumeInput or PQgetResult"
But PQbatchQueueProcess() unconditionally clears conn->singleRowMode,
so whatever it was when sending the query gets lost, and besides
other queries might have been submitted in the meantime.
Also if trying to set that mode when fetching like this
while (PQbatchQueueProcess(conn)) {
    r = PQsetSingleRowMode(conn);
    if (r != 1) {
        fprintf(stderr, "PQsetSingleRowMode() failed");
    }
    ...
it might work the first time, but on the next iterations, conn->asyncStatus
might be PGASYNC_READY, which is a failure condition for
PQsetSingleRowMode(), so that won't do.
ISTM that the simplest fix would be that when in batch mode,
PQsetSingleRowMode() should register that the last submitted query
does request that mode, and when later PQbatchQueueProcess dequeues
the batch entry for the corresponding query, this flag would be popped off
and set as the current mode.
Please find attached the suggested fix, against the v5 of the patch.
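The mechanism suggested above — remember the single-row request on the queued entry at submit time, then pop it off as the entry is dequeued — can be modelled outside libpq. The following is a hedged, self-contained mock; none of these types or function names are the real libpq internals:

```c
#include <stdbool.h>
#include <stdlib.h>

/* Mock of a queued command carrying a per-query single-row flag. */
typedef struct QueuedCmd
{
	bool		singleRowMode;
	struct QueuedCmd *next;		/* toward the tail of the queue */
} QueuedCmd;

/* Mock connection: a FIFO of commands plus the mode currently in effect. */
typedef struct MockConn
{
	QueuedCmd  *head;			/* oldest pending command */
	QueuedCmd  *tail;			/* most recently submitted command */
	bool		singleRowMode;	/* mode for the result being fetched */
} MockConn;

/* Submit a command; single-row mode defaults to off for it. */
void
queueCommand(MockConn *conn)
{
	QueuedCmd  *c = calloc(1, sizeof(QueuedCmd));

	if (conn->tail)
		conn->tail->next = c;
	else
		conn->head = c;
	conn->tail = c;
}

/* As PQsetSingleRowMode would in batch mode: flag the last-sent query. */
bool
setSingleRowMode(MockConn *conn)
{
	if (conn->tail == NULL)
		return false;			/* nothing queued to apply it to */
	conn->tail->singleRowMode = true;
	return true;
}

/* As PQbatchQueueProcess would: pop the flag into the current mode. */
bool
processNext(MockConn *conn)
{
	QueuedCmd  *c = conn->head;

	if (c == NULL)
		return false;
	conn->singleRowMode = c->singleRowMode;		/* flag popped off */
	conn->head = c->next;
	if (conn->head == NULL)
		conn->tail = NULL;
	free(c);
	return true;
}
```

The point of the model is that the mode travels with each queue entry, so interleaved submissions cannot clobber a previously requested single-row mode, which matches the failure described above.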
On 13 March 2017 at 08:54, Vaishnavi Prabakaran
<vaishnaviprabakaran@gmail.com> wrote:
> Before going with this fix, I would like you to consider the option of
> asking batch processing users (via documentation) to set single-row mode
> before calling PQgetResult().
> Either way we need to fix the documentation part, letting users know how
> they can activate single-row mode while using batch processing.
> Let me know your thoughts.

Thanks for looking into this, it's clear that I didn't cover the combo
of single-row mode and batch mode adequately.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Vaishnavi Prabakaran wrote:

>> while (PQbatchQueueProcess(conn)) {
>>     r = PQsetSingleRowMode(conn);
>>     if (r != 1) {
>>         fprintf(stderr, "PQsetSingleRowMode() failed");
>>     }
>>     ...
>
> Thanks for investigating the problem, and could you kindly explain what
> "next iteration" you mean here? Because I don't see any problem in the
> following sequence of calls - PQbatchQueueProcess(), PQsetSingleRowMode(),
> PQgetResult()

I mean the next iteration of the above while statement. Referring to
the doc, that would be the "next batch entry":

"To get the result of the first batch entry the client must call
PQbatchQueueProcess. It must then call PQgetResult and handle the
results until PQgetResult returns null (or would return null if
called). The result from the next batch entry may then be retrieved
using PQbatchQueueProcess and the cycle repeated"

Attached is a bare-bones testcase showing the problem. As it is, it
works, retrieving results for three "SELECT 1" in the same batch. Now
in order to use the single-row fetch mode, consider adding this:

r = PQsetSingleRowMode(conn);
if (r != 1) {
    fprintf(stderr, "PQsetSingleRowMode() failed for i=%d\n", i);
}

When inserted after the call to PQbatchQueueProcess, which is what I
understand you're saying works for you, it fails for me when starting
to get the results of the 2nd query and after.

Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite
On Tue, Mar 14, 2017 at 4:19 AM, Daniel Verite <daniel@manitou-mail.org> wrote:
> I mean the next iteration of the above while statement. Referring
> to the doc, that would be the "next batch entry":
>
> "To get the result of the first batch entry the client must call
> PQbatchQueueProcess. It must then call PQgetResult and handle the
> results until PQgetResult returns null (or would return null if
> called). The result from the next batch entry may then be retrieved
> using PQbatchQueueProcess and the cycle repeated"
>
> Attached is a bare-bones testcase showing the problem.
> As it is, it works, retrieving results for three "SELECT 1"
> in the same batch. Now in order to use the single-row
> fetch mode, consider adding this:
>
> r = PQsetSingleRowMode(conn);
> if (r != 1) {
>     fprintf(stderr, "PQsetSingleRowMode() failed for i=%d\n", i);
> }
>
> When inserted after the call to PQbatchQueueProcess,
> which is what I understand you're saying works for you,
> it fails for me when starting to get the results of the 2nd query
> and after.

Thanks for explaining the issue in detail, and yes, I do see the issue
using your attached test file. And I think the problem is with the
"parseInput" call at the end of PQbatchQueueProcess(). I don't see the
need for "parseInput" to cover the scope of PQbatchQueueProcess();
anyway, we do it in PQgetResult(). This function call changes the
asyncStatus of the connection to READY (following the command-complete
message from the backend), so eventually PQsetSingleRowMode() fails.
So, attached is the alternative fix for this issue.

Please share your thoughts. I would also like to hear Craig's opinion
on it before applying this fix to the original patch, just to make
sure I am not missing anything here.
Vaishnavi Prabakaran wrote:
> So, attached the alternative fix for this issue.
> Please share me your thoughts.
I assume you prefer the alternative fix because it's simpler.
> "... PQsetSingleRowMode immediately after a successful call of
> PQsendQuery (or a sibling function). This mode selection is effective
> only for the currently executing query. Then call PQgetResult
> repeatedly..."
>
> "... PQsetSingleRowMode immediately after a successful call of
> PQbatchQueueProcess. This mode selection is effective only for the
> currently executing query. For more information on the use of
> PQsetSingleRowMode, refer to Section 33.6, "Retrieving Query Results
> Row-By-Row"."
> I would also like to hear Craig's opinion on it before applying this fix
> to the original patch, just to make sure am not missing anything here.
+1
The main question is whether the predicates enforced
by PQsetSingleRowMode() apply in batch mode in all cases
when it's legit to call that function. Two predicates
that may be problematic are:
if (conn->asyncStatus != PGASYNC_BUSY)
return 0;
and
if (conn->result)
return 0;
The general case with batch mode is that, from the doc:
"The client interleaves result processing with sending batch queries"
Note that I've not even tested that here,
I've tested
batching a bunch of queries in a first step and getting the results
in a second step.
I am not confident that the above predicates will be true
in all cases.
Also your alternative fix assumes that we add
a user-visible exception to PQsetSingleRowMode in batch mode,
whereby it must not be called as currently documented:
"call PQsetSingleRowMode immediately after a successful call of
PQsendQuery (or a sibling function)"
My gut feeling is that it's not the right direction, I prefer making
the single-row a per-query attribute internally and keep
PQsetSingleRowMode's contract unchanged from the
user's perspective.
Hi Vaishnavi,

On 3/19/17 9:32 PM, Vaishnavi Prabakaran wrote:
> On Fri, Mar 17, 2017 at 12:37 AM, Daniel Verite <daniel@manitou-mail.org>
>
> Please let me know if you think this is not enough but wanted to update
> section 33.6 also?

Daniel, any input here?

>> I would also like to hear Craig's opinion on it before applying this fix
>> to the original patch, just to make sure am not missing anything here.

Craig?

> Am going to include the test which you shared in the test patch. Please
> let me know if you want to cover anymore specific cases to gain
> confidence.

I have marked this submission "Waiting for Author" since it appears a
new patch is required based on input and adding a new test.

--
-David
david@pgmasters.net
On 24 March 2017 at 23:21, David Steele <david@pgmasters.net> wrote:
> Hi Vaishnavi,
>
> On 3/19/17 9:32 PM, Vaishnavi Prabakaran wrote:
>> On Fri, Mar 17, 2017 at 12:37 AM, Daniel Verite <daniel@manitou-mail.org>
>>
>> Please let me know if you think this is not enough but wanted to update
>> section 33.6 also?
>
> Daniel, any input here?
>
>>> I would also like to hear Craig's opinion on it before applying this fix
>>> to the original patch, just to make sure am not missing anything here.
>
> Craig?

I'm fairly confident that I overlooked single row mode entirely in the
original patch, though it's long enough ago that it's hard for me to
remember exactly.

I don't really have much of an opinion on the best handling of it.

I would expect to be setting single-row mode just before I requested
the *result* from the next pending query, since it relates to result
processing rather than query dispatch. But that's about all the
opinion I have here.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Sat, Mar 25, 2017 at 9:50 PM, Craig Ringer <craig@2ndquadrant.com> wrote:
> I'm fairly confident that I overlooked single row mode entirely in the
> original patch, though it's long enough ago that it's hard for me to
> remember exactly.
>
> I don't really have much of an opinion on the best handling of it.
>
> I would expect to be setting single-row mode just before I requested
> the *result* from the next pending query, since it relates to result
> processing rather than query dispatch. But that's about all the
> opinion I have here.

Yeah, I think that it makes sense to allow users to switch to single
row mode before requesting a result in the queue. It seems to me that
this should also be effective only during the fetching of one single
result set. When the client moves on to the next item in the queue we
should make necessary again a call to PQsetSingleRowMode().

Regarding the patch, I have spotted the following things in the last version:
- src/test/modules/Makefile should include test_libpq.
- Regression tests from 0002 are failing:
ok 1 - testlibpqbatch disallowed_in_batch
ok 2 - testlibpqbatch simple_batch
ok 3 - testlibpqbatch multi_batch
ok 4 - testlibpqbatch batch_abort
ok 5 - testlibpqbatch timings
not ok 6 - testlibpqbatch copyfailure
--
Michael
On 27 March 2017 at 15:26, Michael Paquier <michael.paquier@gmail.com> wrote:
> On Sat, Mar 25, 2017 at 9:50 PM, Craig Ringer <craig@2ndquadrant.com> wrote:
>> I'm fairly confident that I overlooked single row mode entirely in the
>> original patch, though it's long enough ago that it's hard for me to
>> remember exactly.
>>
>> I don't really have much of an opinion on the best handling of it.
>>
>> I would expect to be setting single-row mode just before I requested
>> the *result* from the next pending query, since it relates to result
>> processing rather than query dispatch. But that's about all the
>> opinion I have here.
>
> Yeah, I think that it makes sense to allow users to switch to single
> row mode before requesting a result in the queue. It seems to me that
> this should also be effective only during the fetching of one single
> result set. When the client moves on to the next item in the queue we
> should make necessary again a call to PQsetSingleRowMode().
>
> Regarding the patch, I have spotted the following things in the last version:
> - src/test/modules/Makefile should include test_libpq.
> - Regression tests from 0002 are failing:
> ok 1 - testlibpqbatch disallowed_in_batch
> ok 2 - testlibpqbatch simple_batch
> ok 3 - testlibpqbatch multi_batch
> ok 4 - testlibpqbatch batch_abort
> ok 5 - testlibpqbatch timings
> not ok 6 - testlibpqbatch copyfailure

There are only a few more days left of this commit fest.

Things are sounding pretty ready here, with some final cleanups
pending. It'd be cool to get this into Pg 10 :)

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Mon, Mar 27, 2017 at 4:42 PM, Craig Ringer <craig@2ndquadrant.com> wrote:
> On 27 March 2017 at 15:26, Michael Paquier <michael.paquier@gmail.com> wrote:
>> Yeah, I think that it makes sense to allow users to switch to single
>> row mode before requesting a result in the queue. It seems to me that
>> this should also be effective only during the fetching of one single
>> result set. When the client moves on to the next item in the queue we
>> should make necessary again a call to PQsetSingleRowMode().
>>
>> Regarding the patch, I have spotted the following things in the last version:
>> - src/test/modules/Makefile should include test_libpq.
>> - Regression tests from 0002 are failing:
>> ok 1 - testlibpqbatch disallowed_in_batch
>> ok 2 - testlibpqbatch simple_batch
>> ok 3 - testlibpqbatch multi_batch
>> ok 4 - testlibpqbatch batch_abort
>> ok 5 - testlibpqbatch timings
>> not ok 6 - testlibpqbatch copyfailure
>
> There are only a few more days left of this commit fest.
>
> Things are sounding pretty ready here, with some final cleanups
> pending. It'd be cool to get this into Pg 10 :)

Yes, that's one of the items I'd like to help with by the end of the CF.
--
Michael
Vaishnavi Prabakaran wrote: > Please let me know if you think this is not enough but wanted to update > section 33.6 also? Yes, if the right place to call PQsetSingleRowMode() is immediately after PQbatchQueueProcess(), I think we need to update "33.6. Retrieving Query Results Row-By-Row" with that information, otherwise what it says is just wrong when in batch mode. Also, in 33.5, I suggest to not use the "currently executing query" as a synonym for the "query currently being processed" to avoid any confusion between what the backend is executing and a prior query whose results are being collected by the client at the same moment. Best regards, -- Daniel Vérité PostgreSQL-powered mailer: http://www.manitou-mail.org Twitter: @DanielVerite
Vaishnavi Prabakaran wrote: > Am going to include the test which you shared in the test patch. Please let > me know if you want to cover anymore specific cases to gain confidence. One bit of functionality that does not work in batch mode and is left as is by the patch is the PQfn() fast path interface and the large object functions that use it. Currently, calling lo_* functions inside a batch will fail with a message that depends on whether the internal lo_initialize() has been successfully called before. If it hasn't, PQerrorMessage() will be: "Synchronous command execution functions are not allowed in batch mode" which is fine, but it comes by happenstance from lo_initialize() calling PQexec() to fetch the function OIDs from pg_catalog.pg_proc. OTOH, if lo_initialize() has succeeded before, a call to a large object function will fail with a different message: "connection in wrong state" which is emitted by PQfn() based on conn->asyncStatus != PGASYNC_IDLE Having a unified error message would be more user friendly. Concerning the doc, when saying in 33.5.2: "In batch mode only asynchronous operations are permitted" the server-side functions are indeed ruled out, since PQfn() is synchronous, but maybe we should be a bit more explicit about that? Chapter 34.3 (Large Objects / Client Interfaces) could also mention the incompatibility with batch mode. Best regards, -- Daniel Vérité PostgreSQL-powered mailer: http://www.manitou-mail.org Twitter: @DanielVerite
On Tue, Mar 28, 2017 at 1:57 PM, Vaishnavi Prabakaran <vaishnaviprabakaran@gmail.com> wrote: > Thanks Craig and Michael for confirming that "PQsetSingleRowMode" should be > called right after "PQbatchQueueProcess". > > Michael Paquier wrote: > >> It seems to me that >>this should also be effective only during the fetching of one single >>result set. When the client moves on to the next item in the queue we >>should make necessary again a call to PQsetSingleRowMode(). > > Yes, the current behavior(with V6 code patch) is exactly the same as you > described above. PQsetSingleRowMode() should be called each time after > "PQbatchQueueProcess" to set result processing to single-row mode. Okay, that's fine for me then. > Attached the latest patch and here is the RT run result: > ok 1 - testlibpqbatch disallowed_in_batch > ok 2 - testlibpqbatch simple_batch > ok 3 - testlibpqbatch multi_batch > ok 4 - testlibpqbatch batch_abort > ok 5 - testlibpqbatch timings > ok 6 - testlibpqbatch copyfailure > ok 7 - testlibpqbatch test_singlerowmode > ok > All tests successful. > Files=1, Tests=7, 5 wallclock secs ( 0.01 usr 0.00 sys + 1.79 cusr 0.35 > csys = 2.15 CPU) > Result: PASS ok 1 - testlibpqbatch disallowed_in_batch ok 2 - testlibpqbatch simple_batch ok 3 - testlibpqbatch multi_batch ok 4 - testlibpqbatch batch_abort ok 5 - testlibpqbatch timings not ok 6 - testlibpqbatch copyfailure Hm... Not here. I saw something like that a couple of days ago on my macos laptop and that was related to a variable not initialized. 
From 001_libpq_async_main.log:

2017-03-28 17:05:49.159 JST [31553] t/001_libpq_async.pl LOG: execute <unnamed>: copy batch_demo(id) to stdout;
2017-03-28 17:05:49.159 JST [31553] t/001_libpq_async.pl LOG: execute <unnamed>: copy batch_demo(itemno) FROM stdin;
2017-03-28 17:05:49.160 JST [31553] t/001_libpq_async.pl ERROR: unexpected message type 0x50 during COPY from stdin
2017-03-28 17:05:49.160 JST [31553] t/001_libpq_async.pl CONTEXT: COPY batch_demo, line 1

From regress_log_001_libpq_async :

ok 5 - testlibpqbatch timings
# Running: testlibpqbatch dbname=postgres 10000 copyfailure
dispatching SELECT query failed: cannot queue commands during COPY
COPYBUF: 5
Error status and message got from server due to COPY command failure are : PGRES_FATAL_ERROR ERROR: unexpected message type 0x50 during COPY from stdin
CONTEXT: COPY batch_demo, line 1

So it seems to me that you are still missing something..

 src/test/modules/Makefile                        |    1 +
 src/test/modules/test_libpq/.gitignore           |    5 +
 src/test/modules/test_libpq/Makefile             |   25 +
 src/test/modules/test_libpq/README               |    1 +
 src/test/modules/test_libpq/t/001_libpq_async.pl |   26 +
 src/test/modules/test_libpq/testlibpqbatch.c     | 1661 ++++++++++++++++++++++
 6 files changed, 1719 insertions(+)

Could you as well update src/tools/msvc/vcregress.pl, aka the routine modulescheck so as this new test is skipped. I am sure that nobody will scream if this test is not run on Windows, but the buildfarm will complain if the patch is let in its current state. -- Michael
Michael Paquier wrote: > # Running: testlibpqbatch dbname=postgres 10000 copyfailure > dispatching SELECT query failed: cannot queue commands during COPY > > COPYBUF: 5 > > Error status and message got from server due to COPY command failure > are : PGRES_FATAL_ERROR ERROR: unexpected message type 0x50 during > COPY from stdin > CONTEXT: COPY batch_demo, line 1 Same result here. BTW the doc says: "In batch mode only asynchronous operations are permitted, and COPY is not recommended as it most likely will trigger failure in batch processing" Yet it seems that the test assumes that it should work nonetheless. I don't quite understand what we expect of this test, given what's documented. Or what am I missing? While looking at the regress log, I noticed multiple spurious test_singlerowmode tests among the others. Should be fixed with:

--- a/src/test/modules/test_libpq/testlibpqbatch.c
+++ b/src/test/modules/test_libpq/testlibpqbatch.c
@@ -1578,6 +1578,7 @@ main(int argc, char **argv)
 	run_batch_abort = 0;
 	run_timings = 0;
 	run_copyfailure = 0;
+	run_singlerowmode = 0;
 	if (strcmp(argv[3], "disallowed_in_batch") == 0)
 		run_disallowed_in_batch = 1;
 	else if (strcmp(argv[3], "simple_batch") == 0)

There's also the fact that this test doesn't care whether it fails (compare with all the "goto fail" of the other tests). And it happens that PQsetSingleRowMode() fails after the call to PQbatchQueueProcess() that instantiates a PGresult of status PGRES_BATCH_END to indicate the end of the batch. To me this shows how this way of setting the single row mode is not ideal. In order to avoid the failure, the loop should know in advance what kind of result it's going to dequeue before calling PQgetResult(), which doesn't feel right. Best regards, -- Daniel Vérité PostgreSQL-powered mailer: http://www.manitou-mail.org Twitter: @DanielVerite
On Wed, Mar 29, 2017 at 12:40 PM, Vaishnavi Prabakaran <vaishnaviprabakaran@gmail.com> wrote: > Michael Paquier wrote: >>Could you as well update src/tools/msvc/vcregress.pl, aka the routine >>modulescheck so as this new test is skipped. I am sure that nobody >>will scream if this test is not run on Windows, but the buildfarm will >>complain if the patch is let in its current state. > > Thanks for the finding. Hmm, modulescheck will not pick up test_libpq as > "subdircheck" will fail for this module(because it does not have > "sql"/"expected" folders in it) and hence it will be skipped. > But, Modified "Mkvcbuild.pm" to exclude "test_libpq" module, as this test > anyways will not be run in Windows and also because it uses linux specific > APIs(gettimeofday , timersub). src/port/gettimeofday.c extends things on Windows, and we could consider to just drop the timing portion for simplicity (except Windows I am not seeing any platform missing timersub). But that's just a point of detail. Or we could just drop those tests from the final patch, and re-implement them after having some psql-level subcommands. That would be far better for portability. > There are 2 cases tested here: > > 1. Example for the case - Using COPY command in batch mode will trigger > failure. (Specified in documentation) > Failure can be seen only when processing the copy command's results. That > is, after dispatching COPY command, server goes into COPY state , but the > client dispatched another query in batch mode. > > Below error messages belongs to this case : > Error status and message got from server due to COPY command failure are : > PGRES_FATAL_ERROR ERROR: unexpected message type 0x50 during COPY from > stdin > CONTEXT: COPY batch_demo, line 1 > > message type 0x5a arrived from server while idle Such messages are really confusing to the user as they refer to the protocol going out of sync. 
I would think that something like "cannot process COPY results during a batch processing" would be cleaner. Isn't some more error handling necessary in GetCopyStart()? > Attached the latest code and test patches. For me the test is still broken: ok 1 - testlibpqbatch disallowed_in_batch ok 2 - testlibpqbatch simple_batch ok 3 - testlibpqbatch multi_batch ok 4 - testlibpqbatch batch_abort ok 5 - testlibpqbatch timings not ok 6 - testlibpqbatch copyfailure Still broken here. I can see that this patch is having enough safeguards in PQbatchBegin() and PQsendQueryStart(), but this issue is really pointing out at something... -- Michael
Vaishnavi Prabakaran wrote: > Hmm, With batch mode, after sending COPY command to server(and server > started processing the query and goes into COPY state) , client does not > immediately read the result , but it keeps sending other queries to the > server. By this time, server already encountered the error > scenario(Receiving different message during COPY state) and sent error > messages IOW, the test intentionally violates the protocol and then all goes wonky because of that. That's why I was wondering upthread what it's supposed to test. I mean, regression tests are meant to warn against a desirable behavior being unknowingly changed by new code into an undesirable behavior. Here we have the undesirable behavior to start with. What kind of regression could we fear from that? Best regards, -- Daniel Vérité PostgreSQL-powered mailer: http://www.manitou-mail.org Twitter: @DanielVerite
Hi, On 3/30/17 2:12 PM, Daniel Verite wrote: > Vaishnavi Prabakaran wrote: > >> Hmm, With batch mode, after sending COPY command to server(and server >> started processing the query and goes into COPY state) , client does not >> immediately read the result , but it keeps sending other queries to the >> server. By this time, server already encountered the error >> scenario(Receiving different message during COPY state) and sent error >> messages > > IOW, the test intentionally violates the protocol and then all goes wonky > because of that. > That's why I was wondering upthread what's it's supposed to test. > I mean, regression tests are meant to warn against a desirable behavior > being unknowingly changed by new code into an undesirable behavior. > Here we have the undesirable behavior to start with. > What kind of regression could we fear from that? The CF has been extended until April 7 but time is still growing short. Please respond with a new patch by 2017-04-04 00:00 AoE (UTC-12) or this submission will be marked "Returned with Feedback". Thanks, -- -David david@pgmasters.net
On 2017-04-03 14:10:47 +1000, Vaishnavi Prabakaran wrote: > > The CF has been extended until April 7 but time is still growing short. > > Please respond with a new patch by 2017-04-04 00:00 AoE (UTC-12) or this > > submission will be marked "Returned with Feedback". > > > > > Thanks for the information, attached the latest patch resolving one > compilation warning. And, please discard the test patch as it will be > re-implemented later separately. Hm. If the tests aren't ready yet, it seems we'll have to move this to the next CF. > + <sect1 id="libpq-batch-mode"> > + <title>Batch mode and query pipelining</title> > + > + <indexterm zone="libpq-batch-mode"> > + <primary>libpq</primary> > + <secondary>batch mode</secondary> > + </indexterm> > + > + <indexterm zone="libpq-batch-mode"> > + <primary>libpq</primary> > + <secondary>pipelining</secondary> > + </indexterm> > + > + <para> > + <application>libpq</application> supports queueing up multiple queries into > + a pipeline to be executed as a batch on the server. Batching queries allows > + applications to avoid a client/server round-trip after each query to get > + the results before issuing the next query. > + </para> "queueing .. into a pipeline" sounds weird to me - but I'm not a native speaker. Also batching != pipelining. > + <sect2> > + <title>When to use batching</title> > + > + <para> > + Much like asynchronous query mode, there is no performance disadvantage to > + using batching and pipelining. It increases client application complexity > + and extra caution is required to prevent client/server deadlocks but > + offers considerable performance improvements. > + </para> s/offers/can sometimes offer/ > + <sect2 id="libpq-batch-using"> > + <title>Using batch mode</title> > + > + <para> > + To issue batches the application must switch > + <application>libpq</application> into batch mode. s/libpq/a connection/? 
> Enter batch mode with <link > + linkend="libpq-PQbatchBegin"><function>PQbatchBegin(conn)</function></link> or test > + whether batch mode is active with <link > + linkend="libpq-PQbatchStatus"><function>PQbatchStatus(conn)</function></link>. In batch mode only <link > + linkend="libpq-async">asynchronous operations</link> are permitted, and > + <literal>COPY</literal> is not recommended as it most likely will trigger failure in batch processing. > + (The restriction on <literal>COPY</literal> is an implementation > + limit; the PostgreSQL protocol and server can support batched > <literal>COPY</literal>). Hm, I'm unconvinced that that's a useful parenthetical in the libpq docs. > + <para> > + The client uses libpq's asynchronous query functions to dispatch work, > + marking the end of each batch with <function>PQbatchQueueSync</function>. > + Concurrently, it uses <function>PQgetResult</function> and > + <function>PQbatchQueueProcess</function> to get results. "Concurrently" imo is a dangerous word, somewhat implying multi-threading. > + <note> > + <para> > + It is best to use batch mode with <application>libpq</application> in > + <link linkend="libpq-pqsetnonblocking">non-blocking mode</link>. If used in > + blocking mode it is possible for a client/server deadlock to occur. The > + client will block trying to send queries to the server, but the server will > + block trying to send results from queries it has already processed to the > + client. This only occurs when the client sends enough queries to fill its > + output buffer and the server's receive buffer before switching to > + processing input from the server, but it's hard to predict exactly when > + that'll happen so it's best to always use non-blocking mode. > + </para> > + </note> Such deadlocks are possible just as well with non-blocking mode, unless one can avoid sending queries and switching to receiving results anytime in the application code. 
> + <para> > + Batched operations will be executed by the server in the order the client > + sends them. The server will send the results in the order the statements > + executed. The server may begin executing the batch before all commands > + in the batch are queued and the end of batch command is sent. If any > + statement encounters an error the server aborts the current transaction and > + skips processing the rest of the batch. Query processing resumes after the > + end of the failed batch. > + </para> What if a batch contains transaction boundaries? > + <sect3 id="libpq-batch-results"> > + <title>Processing results</title> > + > + <para> > + The client <link linkend="libpq-batch-interleave">interleaves result > + processing with sending batch queries</link>, or for small batches may > + process all results after sending the whole batch. > + </para> That's a very long <link> text, is it not? > + <para> > + To get the result of the first batch entry the client must call <link > + linkend="libpq-PQbatchQueueProcess"><function>PQbatchQueueProcess</function></link>. It must then call What does 'QueueProcess' mean? Shouldn't it be 'ProcessQueue'? You're not enquing a process or processing, right? > + <function>PQgetResult</function> and handle the results until > + <function>PQgetResult</function> returns null (or would return null if > + called). What is that parenthetical referring to? IIRC we don't provide any external way to determine PQgetResult would return NULL. Have you checked how this API works with PQsetSingleRowMode()? > + <para> > + To enter single-row mode, call <function>PQsetSingleRowMode</function> immediately after a > + successful call of <function>PQbatchQueueProcess</function>. This mode selection is effective > + only for the query currently being processed. For more information on the use of <function>PQsetSingleRowMode > + </function>, refer to <xref linkend="libpq-single-row-mode">. Hah ;). 
> + <para> > + The client must not assume that work is committed when it > + <emphasis>sends</emphasis> a <literal>COMMIT</literal>, only when the > + corresponding result is received to confirm the commit is complete. > + Because errors arrive asynchronously the application needs to be able to > + restart from the last <emphasis>received</emphasis> committed change and > + resend work done after that point if something goes wrong. > + </para> That seems like a batch independent thing, right? If so, maybe make it a <note>? > + <listitem> > + <para> > + Causes a connection to enter batch mode if it is currently idle or > + already in batch mode. > + > +<synopsis> > +int PQbatchBegin(PGconn *conn); > +</synopsis> > + > + </para> > + <para> > + Returns 1 for success. Returns 0 and has no > + effect if the connection is not currently idle, i.e. it has a result > + ready, is waiting for more input from the server, etc. This function > + does not actually send anything to the server, it just changes the > + <application>libpq</application> connection state. > + > + </para> > + </listitem> > + </varlistentry> That function name sounds a bit too much like it'd be relevant for a single batch, not something that can send many batches. enterBatchMode? > + <varlistentry id="libpq-PQbatchEnd"> > + <term> > + <function>PQbatchEnd</function> > + <indexterm> > + <primary>PQbatchEnd</primary> > + </indexterm> > + </term> > + > + <listitem> > + <para> > + Causes a connection to exit batch mode if it is currently in batch mode > + with an empty queue and no pending results. > +<synopsis> > +int PQbatchEnd(PGconn *conn); > +</synopsis> > + </para> > + <para>Returns 1 for success. > + Returns 1 and takes no action if not in batch mode. 
If the connection has > + pending batch items in the queue for reading with > + <function>PQbatchQueueProcess</function>, the current statement isn't finished > + processing or there are results pending for collection with > + <function>PQgetResult</function>, returns 0 and does nothing. > + > + </para> > + </listitem> > + </varlistentry> "" > + <varlistentry id="libpq-PQbatchQueueSync"> > + <term> > + <function>PQbatchQueueSync</function> > + <function>PQbatchQueueProcess</function> As said above, I'm not a fan of these, because it sounds like you're queueing a sync/process. > /* ---------------- > @@ -1108,7 +1113,7 @@ pqRowProcessor(PGconn *conn, const char **errmsgp) > conn->next_result = conn->result; > conn->result = res; > /* And mark the result ready to return */ > - conn->asyncStatus = PGASYNC_READY; > + conn->asyncStatus = PGASYNC_READY_MORE; > } Uhm, isn't that an API/ABI breakage issue? > /* > - * Common startup code for PQsendQuery and sibling routines > + * PQmakePipelinedCommand > + * Get a new command queue entry, allocating it if required. Doesn't add it to > + * the tail of the queue yet, use PQappendPipelinedCommand once the command has > + * been written for that. If a command fails once it's called this, it should > + * use PQrecyclePipelinedCommand to put it on the freelist or release it. "command fails once it's called this"? > +/* > + * PQrecyclePipelinedCommand > + * Push a command queue entry onto the freelist. It must be a dangling entry > + * with null next pointer and not referenced by any other entry's next pointer. > + */ > +static void > +PQrecyclePipelinedCommand(PGconn *conn, PGcommandQueueEntry * entry) > +{ > + if (entry == NULL) > + return; > + if (entry->next != NULL) > + { > + fprintf(stderr, "tried to recycle non-dangling command queue entry"); > + abort(); Needs a libpq_gettext()? > +/* > + * PQbatchEnd > + * End batch mode and return to normal command mode. 
> + * > + * Has no effect unless the client has processed all results > + * from all outstanding batches and the connection is idle. > + * > + * Returns true if batch mode ended. > + */ > +int > +PQbatchEnd(PGconn *conn) > +{ > + if (!conn) > + return false; > + > + if (conn->batch_status == PQBATCH_MODE_OFF) > + return true; > + > + switch (conn->asyncStatus) > + { > + case PGASYNC_IDLE: > + printfPQExpBuffer(&conn->errorMessage, > + libpq_gettext_noop("internal error, IDLE in batch mode")); > + break; > + case PGASYNC_COPY_IN: > + case PGASYNC_COPY_OUT: > + case PGASYNC_COPY_BOTH: > + printfPQExpBuffer(&conn->errorMessage, > + libpq_gettext_noop("internal error, COPY in batch mode")); > + break; Why aren't you returning false here, > + case PGASYNC_READY: > + case PGASYNC_READY_MORE: > + case PGASYNC_BUSY: > + /* can't end batch while busy */ > + return false; but are here? > + case PGASYNC_QUEUED: > + break; > + } > + > +int > +PQbatchQueueSync(PGconn *conn) > +{ > + PGcommandQueueEntry *entry; > + > + if (!conn) > + return false; > + > + if (conn->batch_status == PQBATCH_MODE_OFF) > + return false; > + > + switch (conn->asyncStatus) > + { > + case PGASYNC_IDLE: > + printfPQExpBuffer(&conn->errorMessage, > + libpq_gettext_noop("internal error, IDLE in batch mode")); > + break; > + case PGASYNC_COPY_IN: > + case PGASYNC_COPY_OUT: > + case PGASYNC_COPY_BOTH: > + printfPQExpBuffer(&conn->errorMessage, > + libpq_gettext_noop("internal error, COPY in batch mode")); > + break; > + case PGASYNC_READY: > + case PGASYNC_READY_MORE: > + case PGASYNC_BUSY: > + case PGASYNC_QUEUED: > + /* can send sync to end this batch of cmds */ > + break; > + } Uhm, what is that switch actually achieving? We're not returning an error code, so ...? > + /* Should try to flush immediately if there's room */ > + PQflush(conn); "room"? Also, don't we need to process PQflush's return value? > +/* > + * PQbatchQueueProcess > + * In batch mode, start processing the next query in the queue. 
> + * > + * Returns true if the next query was popped from the queue and can > + * be processed by PQconsumeInput, PQgetResult, etc. > + * > + * Returns false if the current query isn't done yet, the connection > + * is not in a batch, or there are no more queries to process. > + */ > +int > +PQbatchQueueProcess(PGconn *conn) > +{ > + PGcommandQueueEntry *next_query; > + > + if (!conn) > + return false; > + > + if (conn->batch_status == PQBATCH_MODE_OFF) > + return false; > + > + switch (conn->asyncStatus) > + { > + case PGASYNC_COPY_IN: > + case PGASYNC_COPY_OUT: > + case PGASYNC_COPY_BOTH: > + printfPQExpBuffer(&conn->errorMessage, > + libpq_gettext_noop("internal error, COPY in batch mode")); > + break; > + case PGASYNC_READY: > + case PGASYNC_READY_MORE: > + case PGASYNC_BUSY: > + /* client still has to process current query or results */ > + return false; > + break; > + case PGASYNC_IDLE: > + printfPQExpBuffer(&conn->errorMessage, > + libpq_gettext_noop("internal error, IDLE in batch mode")); > + break; > + case PGASYNC_QUEUED: > + /* next query please */ > + break; > + } Once more, I'm very unconvinced by the switch. Unless you do anything with the errors, this seems pointless. > + if (conn->batch_status == PQBATCH_MODE_ABORTED && conn->queryclass != PGQUERY_SYNC) > + { > + /* > + * In an aborted batch we don't get anything from the server for each > + * result; we're just discarding input until we get to the next sync > + * from the server. The client needs to know its queries got aborted > + * so we create a fake PGresult to return immediately from > + * PQgetResult. > + */ > + conn->result = PQmakeEmptyPGresult(conn, > + PGRES_BATCH_ABORTED); > + if (!conn->result) > + { > + printfPQExpBuffer(&conn->errorMessage, > + libpq_gettext("out of memory")); > + pqSaveErrorResult(conn); > + } > + conn->asyncStatus = PGASYNC_READY; So we still return true in the OOM case? Greetings, Andres Freund
On 2017-04-03 14:10:47 +1000, Vaishnavi Prabakaran wrote:
> > The CF has been extended until April 7 but time is still growing short.
> > Please respond with a new patch by 2017-04-04 00:00 AoE (UTC-12) or this
> > submission will be marked "Returned with Feedback".
> >
> >
> Thanks for the information, attached the latest patch resolving one
> compilation warning. And, please discard the test patch as it will be
> re-implemented later separately.
Hm. If the tests aren't ready yet, it seems we'll have to move this to
the next CF.
On 2017-04-04 09:24:23 +1000, Vaishnavi Prabakaran wrote:
> Hi,
>
> On Tue, Apr 4, 2017 at 7:05 AM, Andres Freund <andres@anarazel.de> wrote:
>
> > On 2017-04-03 14:10:47 +1000, Vaishnavi Prabakaran wrote:
> > > > The CF has been extended until April 7 but time is still growing short.
> > > > Please respond with a new patch by 2017-04-04 00:00 AoE (UTC-12) or this
> > > > submission will be marked "Returned with Feedback".
> > >
> > > Thanks for the information, attached the latest patch resolving one
> > > compilation warning. And, please discard the test patch as it will be
> > > re-implemented later separately.
> >
> > Hm. If the tests aren't ready yet, it seems we'll have to move this to
> > the next CF.
>
> Thanks for your review and I will address your review comments and send the
> newer version of patch shortly.

Cool.

> Just quickly, Is it not ok to consider only the code patch for this CF
> without test patch?

I'd say no, it's not acceptable.  This is too much new code for it not
to be tested.

Andres
On Tue, Apr 4, 2017 at 8:26 AM, Andres Freund <andres@anarazel.de> wrote:
> On 2017-04-04 09:24:23 +1000, Vaishnavi Prabakaran wrote:
>> Just quickly, Is it not ok to consider only the code patch for this CF
>> without test patch?
>
> I'd say no, it's not acceptable. This is too much new code for it not
> to be tested.

Doesn't it depend, actually? I would think that the final patch may not
include all the tests implemented if:
- The thread on which a patch has been developed had a set of tests done
  and posted with it.
- Including them does not make sense if we have a way to run those tests
  more efficiently.
Sometimes a bunch of benchmarks or tests are run on a patch but keeping
them around in the final result does not make much sense. In the case of
this patch, it seems to me that we would have a far better portable set
of tests if we had a dedicated set of subcommands available at psql
level, particularly for Windows/MSVC. If that's a requirement for this
patch, so let it be. I am not saying that tests are not necessary. They
are, of course, but in this case having a bit more infrastructure would
be more helpful for users and the tests themselves. Note that I won't
complain either if this set of C tests is included at the end.
--
Michael
On 2017-04-04 08:57:33 +0900, Michael Paquier wrote:
> On Tue, Apr 4, 2017 at 8:26 AM, Andres Freund <andres@anarazel.de> wrote:
> > On 2017-04-04 09:24:23 +1000, Vaishnavi Prabakaran wrote:
> >> Just quickly, Is it not ok to consider only the code patch for this CF
> >> without test patch?
> >
> > I'd say no, it's not acceptable.  This is too much new code for it not
> > to be tested.
>
> Doesn't it depend actually?

Well, I didn't make a general statement, I made one about this patch.
This would add a significant bunch of untested code, and it'll likely
take years till it gets decent coverage outside.

> In the case of this patch, it seems to me that we would have a far
> better portable set of tests if we had a dedicated set of subcommands
> available at psql level, particularly for Windows/MSVC.

That's a really large scope creep imo.  Adding a bunch of user-facing
psql stuff doesn't compare in complexity to running a test across
platforms.  We can just do that from regress.c or such, if that ends up
being a problem.

> If that's a requirement for this patch so let it be. I am not saying
> that tests are not necessary. They are of course, but in this case
> having a bit more infrastructure would be more helpful for users and
> the tests themselves.

I'm not following.

Greetings,

Andres Freund
> +
> + <para>
> + <application>libpq</application> supports queueing up multiple queries into
> + a pipeline to be executed as a batch on the server. Batching queries allows
> + applications to avoid a client/server round-trip after each query to get
> + the results before issuing the next query.
> + </para>
"queueing .. into a pipeline" sounds weird to me - but I'm not a native
speaker. Also batching != pipelining.
> + <sect2>
> + <title>When to use batching</title>
> +
> + <para>
> + Much like asynchronous query mode, there is no performance disadvantage to
> + using batching and pipelining. It increases client application complexity
> + and extra caution is required to prevent client/server deadlocks but
> + offers considerable performance improvements.
> + </para>
s/offers/can sometimes offer/
> + <sect2 id="libpq-batch-using">
> + <title>Using batch mode</title>
> +
> + <para>
> + To issue batches the application must switch
> + <application>libpq</application> into batch mode.
s/libpq/a connection/?
> Enter batch mode with <link
> + linkend="libpq-PQbatchBegin"><function>PQbatchBegin(conn)</function></link> or test
> + whether batch mode is active with <link
> + linkend="libpq-PQbatchStatus"><function>PQbatchStatus(conn)</function></link>. In batch mode only <link
> + linkend="libpq-async">asynchronous operations</link> are permitted, and
> + <literal>COPY</literal> is not recommended as it most likely will trigger failure in batch processing.
> + (The restriction on <literal>COPY</literal> is an implementation
> + limit; the PostgreSQL protocol and server can support batched
> <literal>COPY</literal>).
Hm, I'm unconvinced that that's a useful parenthetical in the libpq
docs.
> + <para>
> + The client uses libpq's asynchronous query functions to dispatch work,
> + marking the end of each batch with <function>PQbatchQueueSync</function>.
> + Concurrently, it uses <function>PQgetResult</function> and
> + <function>PQbatchQueueProcess</function> to get results.
"Concurrently" imo is a dangerous word, somewhat implying multi-threading.
> + <note>
> + <para>
> + It is best to use batch mode with <application>libpq</application> in
> + <link linkend="libpq-pqsetnonblocking">non-blocking mode</link>. If used in
> + blocking mode it is possible for a client/server deadlock to occur. The
> + client will block trying to send queries to the server, but the server will
> + block trying to send results from queries it has already processed to the
> + client. This only occurs when the client sends enough queries to fill its
> + output buffer and the server's receive buffer before switching to
> + processing input from the server, but it's hard to predict exactly when
> + that'll happen so it's best to always use non-blocking mode.
> + </para>
> + </note>
Such deadlocks are possible just as well with non-blocking mode, unless
one can avoid sending queries and switching to receiving results anytime
in the application code.
> + <para>
> + Batched operations will be executed by the server in the order the client
> + sends them. The server will send the results in the order the statements
> + executed. The server may begin executing the batch before all commands
> + in the batch are queued and the end of batch command is sent. If any
> + statement encounters an error the server aborts the current transaction and
> + skips processing the rest of the batch. Query processing resumes after the
> + end of the failed batch.
> + </para>
What if a batch contains transaction boundaries?
> + <sect3 id="libpq-batch-results">
> + <title>Processing results</title>
> +
> + <para>
> + The client <link linkend="libpq-batch-interleave">interleaves result
> + processing with sending batch queries</link>, or for small batches may
> + process all results after sending the whole batch.
> + </para>
That's a very long <link> text, is it not?
> + <para>
> + To get the result of the first batch entry the client must call <link
> + linkend="libpq-PQbatchQueueProcess"><function>PQbatchQueueProcess</function></link>. It must then call
What does 'QueueProcess' mean? Shouldn't it be 'ProcessQueue'? You're
not enqueuing a process or processing, right?
> + <function>PQgetResult</function> and handle the results until
> + <function>PQgetResult</function> returns null (or would return null if
> + called).
What is that parenthetical referring to? IIRC we don't provide any
external way to determine PQgetResult would return NULL.
> + <para>
> + The client must not assume that work is committed when it
> + <emphasis>sends</emphasis> a <literal>COMMIT</literal>, only when the
> + corresponding result is received to confirm the commit is complete.
> + Because errors arrive asynchronously the application needs to be able to
> + restart from the last <emphasis>received</emphasis> committed change and
> + resend work done after that point if something goes wrong.
> + </para>
That seems like a batch independent thing, right? If so, maybe make it
a <note>?
> +
> +<synopsis>
> +int PQbatchBegin(PGconn *conn);
...
That function name sounds a bit too much like it'd be relevant for a
single batch, not something that can send many batches. enterBatchMode?
> +int PQbatchEnd(PGconn *conn);
...
""
> + <varlistentry id="libpq-PQbatchQueueSync">
> + <term>
> + <function>PQbatchQueueSync</function>
> + <function>PQbatchQueueProcess</function>
As said above, I'm not a fan of these, because it sounds like you're
queueing a sync/process.
> /* ----------------
> @@ -1108,7 +1113,7 @@ pqRowProcessor(PGconn *conn, const char **errmsgp)
> conn->next_result = conn->result;
> conn->result = res;
> /* And mark the result ready to return */
> - conn->asyncStatus = PGASYNC_READY;
> + conn->asyncStatus = PGASYNC_READY_MORE;
> }
Uhm, isn't that an API/ABI breakage issue?
> /*
> - * Common startup code for PQsendQuery and sibling routines
> + * PQmakePipelinedCommand
> + * Get a new command queue entry, allocating it if required. Doesn't add it to
> + * the tail of the queue yet, use PQappendPipelinedCommand once the command has
> + * been written for that. If a command fails once it's called this, it should
> + * use PQrecyclePipelinedCommand to put it on the freelist or release it.
"command fails once it's called this"?
> +/*
> + * PQrecyclePipelinedCommand
> + * Push a command queue entry onto the freelist. It must be a dangling entry
> + * with null next pointer and not referenced by any other entry's next pointer.
> + */
> +static void
> +PQrecyclePipelinedCommand(PGconn *conn, PGcommandQueueEntry * entry)
> +{
> + if (entry == NULL)
> + return;
> + if (entry->next != NULL)
> + {
> + fprintf(stderr, "tried to recycle non-dangling command queue entry");
> + abort();
Needs a libpq_gettext()?
> +/*
> + * PQbatchEnd
> + * End batch mode and return to normal command mode.
> + *
> + * Has no effect unless the client has processed all results
> + * from all outstanding batches and the connection is idle.
> + *
> + * Returns true if batch mode ended.
> + */
> +int
> +PQbatchEnd(PGconn *conn)
> +{
> + if (!conn)
> + return false;
> +
> + if (conn->batch_status == PQBATCH_MODE_OFF)
> + return true;
> +
> + switch (conn->asyncStatus)
> + {
> + case PGASYNC_IDLE:
> + printfPQExpBuffer(&conn->errorMessage,
> + libpq_gettext_noop("internal error, IDLE in batch mode"));
> + break;
> + case PGASYNC_COPY_IN:
> + case PGASYNC_COPY_OUT:
> + case PGASYNC_COPY_BOTH:
> + printfPQExpBuffer(&conn->errorMessage,
> + libpq_gettext_noop("internal error, COPY in batch mode"));
> + break;
Why aren't you returning false here,
> + case PGASYNC_READY:
> + case PGASYNC_READY_MORE:
> + case PGASYNC_BUSY:
> + /* can't end batch while busy */
> + return false;
but are here?
> +int
> +PQbatchQueueSync(PGconn *conn)
> +{
> + PGcommandQueueEntry *entry;
> +
> + if (!conn)
> + return false;
> +
> + if (conn->batch_status == PQBATCH_MODE_OFF)
> + return false;
> +
> + switch (conn->asyncStatus)
> + {
> + case PGASYNC_IDLE:
> + printfPQExpBuffer(&conn->errorMessage,
> + libpq_gettext_noop("internal error, IDLE in batch mode"));
> + break;
> + case PGASYNC_COPY_IN:
> + case PGASYNC_COPY_OUT:
> + case PGASYNC_COPY_BOTH:
> + printfPQExpBuffer(&conn->errorMessage,
> + libpq_gettext_noop("internal error, COPY in batch mode"));
> + break;
> + case PGASYNC_READY:
> + case PGASYNC_READY_MORE:
> + case PGASYNC_BUSY:
> + case PGASYNC_QUEUED:
> + /* can send sync to end this batch of cmds */
> + break;
> + }
Uhm, what is that switch actually achieving? We're not returning an
error code, so ...?
> + /* Should try to flush immediately if there's room */
> + PQflush(conn);
"room"?
Also, don't we need to process PQflush's return value?
> +/*
> + * PQbatchQueueProcess
> + * In batch mode, start processing the next query in the queue.
> + *
> + * Returns true if the next query was popped from the queue and can
> + * be processed by PQconsumeInput, PQgetResult, etc.
> + *
> + * Returns false if the current query isn't done yet, the connection
> + * is not in a batch, or there are no more queries to process.
> + */
> +int
> +PQbatchQueueProcess(PGconn *conn)
> +{
> + PGcommandQueueEntry *next_query;
> +
> + if (!conn)
> + return false;
> +
> + if (conn->batch_status == PQBATCH_MODE_OFF)
> + return false;
> +
> + switch (conn->asyncStatus)
> + {
> + case PGASYNC_COPY_IN:
> + case PGASYNC_COPY_OUT:
> + case PGASYNC_COPY_BOTH:
> + printfPQExpBuffer(&conn->errorMessage,
> + libpq_gettext_noop("internal error, COPY in batch mode"));
> + break;
> + case PGASYNC_READY:
> + case PGASYNC_READY_MORE:
> + case PGASYNC_BUSY:
> + /* client still has to process current query or results */
> + return false;
> + break;
> + case PGASYNC_IDLE:
> + printfPQExpBuffer(&conn->errorMessage,
> + libpq_gettext_noop("internal error, IDLE in batch mode"));
> + break;
> + case PGASYNC_QUEUED:
> + /* next query please */
> + break;
> + }
Once more, I'm very unconvinced by the switch. Unless you do anything
with the errors, this seems pointless.
> + if (conn->batch_status == PQBATCH_MODE_ABORTED && conn->queryclass != PGQUERY_SYNC)
> + {
> + /*
> + * In an aborted batch we don't get anything from the server for each
> + * result; we're just discarding input until we get to the next sync
> + * from the server. The client needs to know its queries got aborted
> + * so we create a fake PGresult to return immediately from
> + * PQgetResult.
> + */
> + conn->result = PQmakeEmptyPGresult(conn,
> + PGRES_BATCH_ABORTED);
> + if (!conn->result)
> + {
> + printfPQExpBuffer(&conn->errorMessage,
> + libpq_gettext("out of memory"));
> + pqSaveErrorResult(conn);
> + }
> + conn->asyncStatus = PGASYNC_READY;
So we still return true in the OOM case?
Hi,

On 2017-04-05 17:00:42 +1000, Vaishnavi Prabakaran wrote:
> Regarding test patch, I have corrected the test suite after David Steele's
> comments.
> Also, I would like to mention that a companion patch was submitted by David
> Steele up-thread.
>
> Attached the latest code and test patch.

My impression is that this'll need a couple more rounds of review.  Given
that this'll establish API we'll pretty much never be able to
change/remove, I think it'd be a bad idea to rush this into v10.
Therefore I propose moving this to the next CF.

- Andres
On 2020-Aug-31, Matthieu Garrigues wrote:

> It seems like this patch is nearly finished. I fixed all the remaining
> issues. I'm also asking a confirmation of the test scenarios you want
> to see in the next version of the patch.

Hmm, apparently nobody noticed that this patch is not registered in the
current commitfest.  Since it was submitted ahead of the deadline, I'm
going to add it there.

(The patch has also undergone a lot of review already; it's certainly
not a newcomer.)

--
Álvaro Herrera       https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Hi,
It seems like this patch is nearly finished. I fixed all the remaining
issues. I'm also asking
a confirmation of the test scenarios you want to see in the next
version of the patch.
> Hi,
>
> On 2020-07-10 19:01:49 -0400, Alvaro Herrera wrote:
> > Totally unasked for, here's a rebase of this patch series. I didn't do
> > anything other than rebasing to current master, solving a couple of very
> > trivial conflicts, fixing some whitespace complaints by git apply, and
> > running tests to verify everything works.
> >
> > I don't foresee working on this at all, so if anyone is interested in
> > seeing this feature in, I encourage them to read and address
> > Horiguchi-san's feedback.
>
> Nor am I planning to do so, but I do think its a pretty important
> improvement.
>
>
Fixed
>
>
> > +/*
> > + * PQrecyclePipelinedCommand
> > + * Push a command queue entry onto the freelist. It must be a dangling entry
> > + * with null next pointer and not referenced by any other entry's next pointer.
> > + */
>
> Dangling sounds a bit like it's already freed.
>
>
Fixed
>
>
> > +/*
> > + * PQbatchSendQueue
> > + * End a batch submission by sending a protocol sync. The connection will
> > + * remain in batch mode and unavailable for new synchronous command execution
> > + * functions until all results from the batch are processed by the client.
>
> I feel like the reference to the protocol sync is a bit too low level
> for an external API. It should first document what the function does
> from a user's POV.
>
> I think it'd also be good to document whether / whether not queries can
> already have been sent before PQbatchSendQueue is called or not.
>
Fixed
>
>
>
> > + if (conn->batch_status == PQBATCH_MODE_ABORTED && conn->queryclass != PGQUERY_SYNC)
> > + {
> > + /*
> > + * In an aborted batch we don't get anything from the server for each
> > + * result; we're just discarding input until we get to the next sync
> > + * from the server. The client needs to know its queries got aborted
> > + * so we create a fake PGresult to return immediately from
> > + * PQgetResult.
> > + */
> > + conn->result = PQmakeEmptyPGresult(conn,
> > + PGRES_BATCH_ABORTED);
> > + if (!conn->result)
> > + {
> > + printfPQExpBuffer(&conn->errorMessage,
> > + libpq_gettext("out of memory"));
> > + pqSaveErrorResult(conn);
> > + return 0;
>
> Is there any way an application can recover at this point? ISTM we'd be
> stuck in the previous asyncStatus, no?
>
conn->result is null when malloc(sizeof(PGresult)) returns null. It's
very unlikely unless
the server machine is out of memory, so the server will probably be
unresponsive anyway.
I'm leaving this as it is but if anyone has a solution simple to
implement I'll fix it.
>
>
> > +/* pqBatchFlush
> > + * In batch mode, data will be flushed only when the out buffer reaches the threshold value.
> > + * In non-batch mode, data will be flushed all the time.
> > + */
> > +static int
> > +pqBatchFlush(PGconn *conn)
> > +{
> > + if ((conn->batch_status == PQBATCH_MODE_OFF)||(conn->outCount >= OUTBUFFER_THRESHOLD))
> > + return(pqFlush(conn));
> > + return 0; /* Just to keep compiler quiet */
> > +}
>
> unnecessarily long line.
>
Fixed
>
> > +/*
> > + * Connection's outbuffer threshold is set to 64k as it is safe
> > + * in Windows as per comments in pqSendSome() API.
> > + */
> > +#define OUTBUFFER_THRESHOLD 65536
>
> I don't think the comment explains much. It's fine to send more than 64k
> with pqSendSome(), they'll just be send with separate pgsecure_write()
> invocations. And only on windows.
>
> It clearly makes sense to start sending out data at a certain
> granularity to avoid needing unnecessary amounts of memory, and to make
> more efficient use of latency / server side compute.
>
> It's not implausible that 64k is the right amount for that, I just don't
> think the explanation above is good.
>
Fixed
> > diff --git a/src/test/modules/test_libpq/testlibpqbatch.c b/src/test/modules/test_libpq/testlibpqbatch.c
> > new file mode 100644
> > index 0000000000..4d6ba266e5
> > --- /dev/null
> > +++ b/src/test/modules/test_libpq/testlibpqbatch.c
> > @@ -0,0 +1,1456 @@
> > +/*
> > + * src/test/modules/test_libpq/testlibpqbatch.c
> > + *
> > + *
> > + * testlibpqbatch.c
> > + * Test of batch execution functionality
> > + */
> > +
> > +#ifdef WIN32
> > +#include <windows.h>
> > +#endif
>
> ISTM that this shouldn't be needed in a test program like this?
> Shouldn't libpq abstract all of this away?
>
Fixed.
>
> > +static void
> > +simple_batch(PGconn *conn)
> > +{
>
> ISTM that all or at least several of these should include tests of
> transactional behaviour with pipelining (e.g. using both implicit and
> explicit transactions inside a single batch, using transactions across
> batches, multiple explicit transactions inside a batch).
>
@Andres, just to make sure I understood, here are the test scenarios I'll add:
Implicit and explicit multiple transactions:
start batch:
create_table
insert X
begin transaction
insert X
commit transaction
begin transaction
insert Y
commit transaction
end batch
Transaction across batches:
start batch:
create_table
begin transaction
insert X
end batch
start batch:
insert Y
commit transaction
end batch
>
>
> > + PGresult *res = NULL;
> > + const char *dummy_params[1] = {"1"};
> > + Oid dummy_param_oids[1] = {INT4OID};
> > +
> > + fprintf(stderr, "simple batch... ");
> > + fflush(stderr);
>
> Why do we need fflush()? IMO that shouldn't be needed in a use like
> this? Especially not on stderr, which ought to be unbuffered?
>
Removed.
>
> > + /*
> > + * Enter batch mode and dispatch a set of operations, which we'll then
> > + * process the results of as they come in.
> > + *
> > + * For a simple case we should be able to do this without interim
> > + * processing of results since our out buffer will give us enough slush to
> > + * work with and we won't block on sending. So blocking mode is fine.
> > + */
> > + if (PQisnonblocking(conn))
> > + {
> > + fprintf(stderr, "Expected blocking connection mode\n");
> > + goto fail;
> > + }
>
> Perhaps worth adding a helper for this?
>
> #define EXPECT(condition, explanation) \
>
Fixed (and saved 600 lines :).
Once I get your confirmation of the test scenarios, I'll implement
them and share another patch.
On 2020-Sep-21, Dave Cramer wrote:

Hello Dave,

> I am looking for this in the commitfest and can't find it. However there is
> an old commitfest entry
>
> https://commitfest.postgresql.org/13/1024/
>
> Do you have the link for the new one ?

Here you go: https://commitfest.postgresql.org/29/2724/

--
Álvaro Herrera       https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Matthieu Garrigues
On Mon, Sep 21, 2020 at 3:09 PM Dave Cramer <davecramer@postgres.rocks> wrote:
>>
> There was a comment upthread a while back that people should look at the comments made in https://www.postgresql.org/message-id/20180322.211148.187821341.horiguchi.kyotaro%40lab.ntt.co.jp by Horiguchi-San.
>
> From what I can tell this has not been addressed. The one big thing is the use of PQbatchProcessQueue vs just putting it in PQgetResult.
>
> The argument is that adding PQbatchProcessQueue is unnecessary and just adds another step. Looking at this, it seems like putting this inside PQgetResult would get my vote as it leaves the interface unchanged.
>
Ok. I'll merge PQbatchProcessQueue into PQgetResult. But just one
thing: I'll keep PQgetResult returning null between the results of two
batched queries so the user can know which result comes from which
query.
On Mon, Sep 21, 2020 at 3:39 PM Dave Cramer <davecramer@postgres.rocks> wrote:
>
> On Mon, 21 Sep 2020 at 09:21, Matthieu Garrigues
> <matthieu.garrigues@gmail.com> wrote:
>>
>> On Mon, Sep 21, 2020 at 3:09 PM Dave Cramer <davecramer@postgres.rocks> wrote:
>>
>> > There was a comment upthread a while back that people should look at the
>> > comments made in
>> > https://www.postgresql.org/message-id/20180322.211148.187821341.horiguchi.kyotaro%40lab.ntt.co.jp
>> > by Horiguchi-San.
>> >
>> > From what I can tell this has not been addressed. The one big thing is
>> > the use of PQbatchProcessQueue vs just putting it in PQgetResult.
>> >
>> > The argument is that adding PQbatchProcessQueue is unnecessary and just
>> > adds another step. Looking at this, it seems like putting this inside
>> > PQgetResult would get my vote as it leaves the interface unchanged.
>>
>> Ok. I'll merge PQbatchProcessQueue into PQgetResult. But just one
>> thing: I'll keep PQgetResult returning null between the result of two
>> batched query so the user can know which result comes from which query.
>
> Fair enough.
>
> There may be other things in his comments that need to be addressed. That
> was the big one that stuck out for me.
>
> Thanks for working on this!

Yes, I already addressed the other things in the v19 patch:
https://www.postgresql.org/message-id/flat/CAJkzx4T5E-2cQe3dtv2R78dYFvz+in8PY7A8MArvLhs_pg75gg@mail.gmail.com
Hi Dave,

I merged PQbatchProcessQueue into PQgetResult. One first init call to PQbatchProcessQueue was also required in PQsendQueue to have PQgetResult ready to read the first batch query.

Tests and documentation are updated accordingly.

Matthieu Garrigues

On Mon, Sep 21, 2020 at 3:39 PM Dave Cramer <davecramer@postgres.rocks> wrote:
> On Mon, 21 Sep 2020 at 09:21, Matthieu Garrigues <matthieu.garrigues@gmail.com> wrote:
>> Matthieu Garrigues
>>
>> On Mon, Sep 21, 2020 at 3:09 PM Dave Cramer <davecramer@postgres.rocks> wrote:
>>> There was a comment upthread a while back that people should look at the comments made in https://www.postgresql.org/message-id/20180322.211148.187821341.horiguchi.kyotaro%40lab.ntt.co.jp by Horiguchi-San.
>>>
>>> From what I can tell this has not been addressed. The one big thing is the use of PQbatchProcessQueue vs just putting it in PQgetResult.
>>>
>>> The argument is that adding PQbatchProcessQueue is unnecessary and just adds another step. Looking at this, it seems like putting this inside PQgetResult would get my vote as it leaves the interface unchanged.
>>
>> Ok. I'll merge PQbatchProcessQueue into PQgetResult. But just one thing: I'll keep PQgetResult returning null between the results of two batched queries so the user can know which result comes from which query.
>
> Fair enough.
>
> There may be other things in his comments that need to be addressed. That was the big one that stuck out for me.
>
> Thanks for working on this!
>
> Dave Cramer
> www.postgres.rocks
Attachment
On Mon, Sep 21, 2020 at 07:55:03PM +0200, Matthieu Garrigues wrote: > I merged PQbatchProcessQueue into PQgetResult. > One first init call to PQbatchProcessQueue was also required in > PQsendQueue to have > PQgetResult ready to read the first batch query. > > Tests and documentation are updated accordingly. The documentation is failing to build, and the patch does not build correctly on Windows. Could you address that? -- Michael
Attachment
On Thu, Oct 1, 2020 at 6:35 AM Michael Paquier <michael@paquier.xyz> wrote: > The documentation is failing to build, and the patch does not build > correctly on Windows. Could you address that? > -- > Michael Yes I'm on it. -- Matthieu
This patch fixes compilation on Windows and the documentation build.

Matthieu Garrigues

On Thu, Oct 1, 2020 at 8:41 AM Matthieu Garrigues <matthieu.garrigues@gmail.com> wrote:
> On Thu, Oct 1, 2020 at 6:35 AM Michael Paquier <michael@paquier.xyz> wrote:
>> The documentation is failing to build, and the patch does not build
>> correctly on Windows. Could you address that?
>> --
>> Michael
>
> Yes I'm on it.
>
> --
> Matthieu
Attachment
I started reading this patch. As a starting point I decided to post an updated copy (v22), wherein I reverted the overwriting of src/include/port/linux.h with the win32.h contents (?) and the inclusion of <windows.h> in libpq-fe.h. If those are needed, some different solution will have to be found. Trivial other changes (pgindent etc); nothing of substance. With the PQtrace() patch I posted at [1] the added test program crashes -- I don't know if the fault lies in this patch or the other one. [1] https://postgr.es/m/20201026162313.GA22502@alvherre.pgsql
Attachment
In v23 I've gone over docs; discovered that PQgetResults docs were missing the new values. Added those. No significant other changes yet.
On 2020-Nov-02, Alvaro Herrera wrote: > In v23 I've gone over docs; discovered that PQgetResults docs were > missing the new values. Added those. No significant other changes yet.
Attachment
Hi Dave, On 2020-Nov-03, Dave Cramer wrote: > On Mon, 2 Nov 2020 at 10:57, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote: > > > On 2020-Nov-02, Alvaro Herrera wrote: > > > > > In v23 I've gone over docs; discovered that PQgetResults docs were > > > missing the new values. Added those. No significant other changes yet. > > Thanks for looking at this. > > What else does it need to get it in shape to apply? I want to go over the code in depth to grok the design more fully. It would definitely help if you (and others) could think about the API being added: Does it fulfill the promises being made? Does it offer the guarantees that real-world apps want to have? I'm not much of an application writer myself -- particularly high-traffic apps that would want to use this. As a driver author I would welcome your insight in these questions.
I implemented a C++ async HTTP server using this new batch mode and it provides everything I needed to transparently batch sql requests. It gives a performance boost between x2 and x3 on this benchmark: https://www.techempower.com/benchmarks/#section=test&runid=3097dbae-5228-454c-ba2e-2055d3982790&hw=ph&test=query&a=2&f=zik0zj-zik0zj-zik0zj-zik0zj-zieepr-zik0zj-zik0zj-zik0zj-zik0zj-zik0zj-zik0zj I'll ask other users interested in this to review the API. Matthieu Garrigues On Tue, Nov 3, 2020 at 4:56 PM Dave Cramer <davecramer@postgres.rocks> wrote: > > > > On Tue, 3 Nov 2020 at 08:42, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote: >> >> Hi Dave, >> >> On 2020-Nov-03, Dave Cramer wrote: >> >> > On Mon, 2 Nov 2020 at 10:57, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote: >> > >> > > On 2020-Nov-02, Alvaro Herrera wrote: >> > > >> > > > In v23 I've gone over docs; discovered that PQgetResults docs were >> > > > missing the new values. Added those. No significant other changes yet. >> > >> > Thanks for looking at this. >> > >> > What else does it need to get it in shape to apply? >> >> I want to go over the code in depth to grok the design more fully. >> >> It would definitely help if you (and others) could think about the API >> being added: Does it fulfill the promises being made? Does it offer the >> guarantees that real-world apps want to have? I'm not much of an >> application writer myself -- particularly high-traffic apps that would >> want to use this. As a driver author I would welcome your insight in >> these questions. >> > > I'm sort of in the same boat as you. While I'm closer to the client. I don't personally write that much client code. > > I'd really like to hear from the users here. > > > Dave Cramer > www.postgres.rocks
Hi, On 2020-11-03 10:42:34 -0300, Alvaro Herrera wrote: > It would definitely help if you (and others) could think about the API > being added: Does it fulfill the promises being made? Does it offer the > guarantees that real-world apps want to have? I'm not much of an > application writer myself -- particularly high-traffic apps that would > want to use this. Somewhere earlier in this thread there was a patch with support for batching in pgbench. I think it'd be good to refresh that. Both because it shows at least some real-world-lite usage of the feature and because we need a way to stress it to see whether it has unnecessary bottlenecks. Greetings, Andres Freund
Hi David,

Thanks for the feedback. I did rework a bit the doc based on your remarks. Here is the v24 patch.

Matthieu Garrigues

On Tue, Nov 3, 2020 at 6:21 PM David G. Johnston <david.g.johnston@gmail.com> wrote:
> On Mon, Nov 2, 2020 at 8:58 AM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
>> On 2020-Nov-02, Alvaro Herrera wrote:
>>> In v23 I've gone over docs; discovered that PQgetResults docs were
>>> missing the new values. Added those. No significant other changes yet.
>
> Just reading the documentation of this patch, haven't been following the longer thread:
>
> Given the caveats around blocking mode connections why not just require non-blocking mode, in a similar fashion to how synchronous functions are disallowed?
>
> "Batched operations will be executed by the server in the order the client
> sends them. The server will send the results in the order the statements
> executed."
>
> Maybe:
>
> "The server executes statements, and returns results, in the order the client sends them."
>
> Using two sentences and relying on the user to mentally link the two "in the order" descriptions together seems to add unnecessary cognitive load.
>
> + The client <link linkend="libpq-batch-interleave">interleaves result
> + processing</link> with sending batch queries, or for small batches may
> + process all results after sending the whole batch.
>
> Suggest: "The client may choose to interleave result processing with sending batch queries, or wait until the complete batch has been sent."
>
> I would expect to process the results of a batch only after sending the entire batch to the server. That I don't have to is informative, but knowing when I should avoid doing so, and why, is informative as well. To the extreme, while you can use batch mode and interleave, if you just poll getResult after every command you will make the whole batch thing pointless. Directing the reader from here to the section "Interleaving Result Processing and Query Dispatch" seems worth considering. The dynamics of small sizes and sockets remain a bit unclear as to what will break (if anything, or is it just process memory on the server) if interleaving is not performed and sizes are large.
>
> I would suggest placing commentary about "all transactions subsequent to a failed transaction in a batch are ignored while previous completed transactions are retained" in the "When to Use Batching". Something like "Batching is less useful, and more complex, when a single batch contains multiple transactions (see Error Handling)."
>
> My imagined use case would be to open a batch, start a transaction, send all of its components, end the transaction, end the batch, check for batch failure and if it doesn't fail have the option to easily continue without processing individual pgResults (or if it does fail, have the option to extract the first error pgResult and continue, ignoring the rest, knowing that the transaction as a whole was reverted and the batch unapplied). I've never interfaced with libpq directly. Though given how the existing C API works, what is implemented here seems consistent.
>
> The "queueing up queries into a pipeline to be executed as a batch on the server" can be read as a client-side behavior where nothing is sent to the server until the batch has been completed. Reading further it becomes clear that all it basically is, is a server-side toggle that instructs the server to continue processing incoming commands even while prior commands have their results waiting to be ingested by the client.
>
> Batch seems like the user-visible term to describe this feature. Pipeline seems like an implementation detail that doesn't need to be mentioned in the documentation - especially given that pipeline doesn't get a mention beyond the first two paragraphs of the chapter and never without being linked directly to "batch". I would probably leave the indexterm and have a paragraph describing that batching is implemented using a query pipeline, so that people with the implementation detail on their mind can find this chapter, but the prose for the user should just stick to batching.
>
> Sorry, that all is a bit unfocused, but the documentation for the user of the API could be cleaned up a bit and some more words spent on what trade-offs are being made when using batching versus normal command-response processing. That said, while I don't see all of this purely a matter of style, I'm also not seeing anything demonstrably wrong with the documentation at the moment. Hopefully my perspective helps though, and depending on what happens next I may try and make my thoughts more concrete with an actual patch.
>
> David J.
Attachment
(Adding previous reviewers to CC) On 2020-Nov-03, David G. Johnston wrote: > Given the caveats around blocking mode connections why not just require > non-blocking mode, in a similar fashion to how synchronous functions are > disallowed? This is a very good question. Why indeed? Does anybody have a good answer to this? If not, I propose we just require that non-blocking mode is in use in order for batch mode to be used. I've been doing a review pass over this patch and have an updated version, which I intend to share later today (after I fix what appears to be a misunderstanding in the "singlerow" test in testlibpqbatch.c)
Here's a v25.

I made a few more changes to the docs per David's suggestions; I also reordered the sections quite a bit. It's now like this:

* Batch Mode
* Using Batch Mode
* Issuing Queries
* Processing Results
* Error Handling
* Interleaving Result Processing and Query Dispatch
* Ending Batch Mode
* Functions Associated with Batch Mode
* When to Use Batching

To me as a reader, this makes more sense, but if you disagree, I think we should discuss further changes. (For example, maybe we should move the "Functions" section at the end?) The original had "When to Use Batching" at the start, but it seemed to me that the points it's making are not as critical as understanding what it is.

I reworked the test program to better fit the TAP model; I found that if one test mechanism failed in whatever way, the connection would be in a weird state and cause the next test to also fail. I changed it so that it runs one test and exits; then the t/001_libpq_async.pl file (hmm, need to rename to 001_batch.pl I guess) calls it once for each test. I adapted the test code to our code style. I also removed the "timings" stuff; I think that's something better left to pgbench.

(I haven't looked at Daniel's pgbench stuff yet, but I will do that next.)

While looking at how the tests worked, I gave a hard stare at the new libpq code and cleaned it up also. There are a lot of minor changes, but nothing truly substantial. I moved the code around a lot, to keep things grouped together where they belong.

I'm not 100% clear on things like PGconn->batch_status and how PGconn->asyncStatus works. Currently everything seems to work correctly, but I'm worried that because we add new status values to asyncStatus, some existing code might not be doing everything correctly (for example when changing to/from ASYNC_BUSY in some cases, are we 100% sure we shouldn't be changing to ASYNC_QUEUED?)
While looking this over I noticed a thread from 2014[1] where Matt Newell tried to implement this stuff and apparently the main review comment he got before abandoning the patch was that the user would like a way to access the query that corresponded to each result. The current patch set does not address that need; the approach to this problem is that it's on the application's head to keep track of this. Honestly I don't understand why it would be otherwise ... I'm not sure that it makes sense to expect that the application is stupid enough that it doesn't keep track in which order it sent things, but bright enough to keep pointers to the queries it sent (??). So this seems okay to me. But added Heikki and Claudio to CC because of that old thread. [1] https://postgr.es/m/2059418.jZQvL3lH90@obsidian
Attachment
Alvaro Herrera wrote: > I adapted the test code to our code style. I also removed the "timings" > stuff; I think that's something better left to pgbench. > > (I haven't looked at Daniel's pgbench stuff yet, but I will do that > next.) The patch I posted in [1] was pretty simple, but at the time, query results were always discarded. Now that pgbench can instantiate variables from query results, a script can do: select 1 as var \gset select :var; This kind of sequence wouldn't work in batch mode since it sends queries before getting results of previous queries. So maybe \gset should be rejected when inside a batch section. Or alternatively pgbench should collect results before a variable is reinjected into a query, thereby stalling the pipeline. To do this only when necessary, it would have to track read-write dependencies among variables, which seems overly complicated though. [1] https://www.postgresql.org/message-id/b4e34135-2bd9-4b8a-94ca-27d760da26d7@manitou-mail.org Best regards, -- Daniel Vérité PostgreSQL-powered mailer: https://www.manitou-mail.org Twitter: @DanielVerite
On 2020-Nov-14, Daniel Verite wrote: > The patch I posted in [1] was pretty simple, but at the time, query > results were always discarded. Now that pgbench can instantiate > variables from query results, a script can do: > select 1 as var \gset > select :var; > This kind of sequence wouldn't work in batch mode since it > sends queries before getting results of previous queries. > > So maybe \gset should be rejected when inside a batch section. Hah. Hacking pgbench extensively is beyond what I'm willing to do for this feature at this time. Making \gset rejected in a batch section sounds simple enough and supports \beginbatch et al sufficiently to compare performance, so I'm OK with a patch that does that. That'd be a small extension to your previous patch, if I understand correctly. If you or others want to send patches to extend batch support with read-write tracking for variables, feel free, but I hereby declare that I'm not taking immediate responsibility for getting them committed.
Hi On Wed, Nov 18, 2020, at 09:51, Alvaro Herrera wrote: > On 2020-Nov-14, Daniel Verite wrote: > > > The patch I posted in [1] was pretty simple, but at the time, query > > results were always discarded. Now that pgbench can instantiate > > variables from query results, a script can do: > > select 1 as var \gset > > select :var; > > This kind of sequence wouldn't work in batch mode since it > > sends queries before getting results of previous queries. > > > > So maybe \gset should be rejected when inside a batch section. > > Hah. > > Hacking pgbench extensively is beyond what I'm willing to do for this > feature at this time. Making \gset rejected in a batch section sounds > simple enough and supports \beginbatch et al sufficiently to compare > performance, so I'm OK with a patch that does that. That'd be a small > extension to your previous patch, if I understand correctly. > > If you or others want to send patches to extend batch support with > read-write tracking for variables, feel free, but I hereby declare that > I'm not taking immediate responsibility for getting them committed. I think minimal support is entirely sufficient initially. Andres
Hi, Here's a new version with the pgbench support included. Best regards, -- Daniel Vérité PostgreSQL-powered mailer: https://www.manitou-mail.org Twitter: @DanielVerite
Attachment
On 2020-Nov-23, Daniel Verite wrote: > Hi, > > Here's a new version with the pgbench support included. Thanks, incorporated into my local copy.
Thanks David Johnston and Daniel Vérité, I have incorporated your changes into this patch, which is now v26. Also, it's been rebased on current sources. I've been using the new PQtrace() stuff to verify the behavior of the new feature. It's not perfect, but at least it doesn't crash immediately as it did when I tried a few weeks ago. There are imperfections that I think are due to bugs in the PQtrace implementation, not in this patch. -- Álvaro Herrera 39°49'30"S 73°17'W "El conflicto es el camino real hacia la unión"
Attachment
As you can see in an XXX comment in the libpq test program, the current implementation has the behavior that PQgetResult() returns NULL after a batch is finished and has reported PGRES_BATCH_END. I don't know if there's a hard reason to do that, but I'd like to suppress it because it seems weird and out of place. -- Álvaro Herrera 39°49'30"S 73°17'W
On 2021-Jan-21, Alvaro Herrera wrote: > As you can see in an XXX comment in the libpq test program, the current > implementation has the behavior that PQgetResult() returns NULL after a > batch is finished and has reported PGRES_BATCH_END. I don't know if > there's a hard reason to do that, but I'd like to supress it because it > seems weird and out of place. Hello Craig, a question for you since this API is your devising. I've been looking at changing the way this works to prevent those NULL returns from PQgetResult. That is, instead of having what seems like a "NULL separator" of query results, you'd just get the PGRES_BATCH_END to signify a batch end (not a NULL result after the BATCH_END); and the normal PGRES_COMMAND_OK or PGRES_TUPLES_OK etc when the result of a command has been sent. It's not working yet so I'm not sending an updated patch, but I wanted to know if you had a rationale for including this NULL return "separator" or was it just a convenience because of how the code grew together. Such a decision has obvious consequences for the test program (which right now expects that PQgetResult() returns NULL at each step); and naturally for libpq's internal state machine that controls how it all works. Thanks, -- Álvaro Herrera 39°49'30"S 73°17'W
So long as there is a way to "send A", "send B", "send C", "read results from A", "send D", and there's a way for the application to associate some kind of state (an application specific id or index, a pointer to an application query-queue struct, whatever) so it can match queries to results ... then I'm happy.
On 2021-Feb-16, Craig Ringer wrote: > FWIW I'm also thinking of revising the docs to mostly use the term > "pipeline" instead of "batch". Use "pipelining and batching" in the chapter > topic, and mention "batch" in the index, and add a para that explains how > to run batches on top of pipelining, but otherwise use the term "pipeline" > not "batch". Hmm, this is a good point. It means I have a lot of API renaming to do. I'll get on it now and come back with a proposal. -- Álvaro Herrera Valdivia, Chile
Here's a new version, where I've renamed everything to "pipeline". I think the docs could use some additional tweaks now in order to make a coherent story on pipeline mode, how it can be used in a batched fashion, etc.

Here's the renames I applied. It's mostly mechanical, except PQbatchSendQueue is now PQsendPipeline:

PQBatchStatus -> PGpipelineStatus (enum)
PQBATCH_MODE_OFF -> PQ_PIPELINE_OFF
PQBATCH_MODE_ON -> PQ_PIPELINE_ON
PQBATCH_MODE_ABORTED -> PQ_PIPELINE_ABORTED
PQbatchStatus -> PQpipelineStatus (function)
PQenterBatchMode -> PQenterPipelineMode
PQexitBatchMode -> PQexitPipelineMode
PQbatchSendQueue -> PQsendPipeline
PGRES_BATCH_END -> PGRES_PIPELINE_END
PGRES_BATCH_ABORTED -> PGRES_PIPELINE_ABORTED

Also, PQbatchStatus(conn) returns enum PGpipelineStatus (it previously returned int).

I'm tempted to rename PGASYNC_QUEUED to PGASYNC_PIPELINE_IDLE (not sure if PGASYNC_PIPELINE_READY fits better with the existing one).

In pgbench, I changed the metacommands to be \startpipeline and \endpipeline. There's a failing Assert() there which I commented out; needs fixed.

--
Álvaro Herrera 39°49'30"S 73°17'W
Attachment