Thread: Re: Add Pipelining support in psql
On Wed, 27 Nov 2024 at 10:50, Anthonin Bonnefoy <anthonin.bonnefoy@datadoghq.com> wrote: > With \bind, \parse, \bind_named and \close, it is possible to issue > queries from psql using the extended protocol. However, it wasn't > possible to send those queries using pipelining and the only way to > test pipelined queries was through pgbench's tap tests. Big +1. Not being able to use psql for even the most basic pipeline tests has definitely been an annoyance of mine. I played around quickly with this patch and it works quite well. A few things that would be nice improvements I think. Feel free to change the command names: 1. Add a \flush command that calls PQflush 2. Add a \flushrequest command that calls PQsendFlushRequest 3. Add a \getresult command so you can get a result from a query without having to close the pipeline To be clear, not having those additional commands isn't a blocker for this patch imho, but I'd definitely miss them if they weren't there when I would be using it.
On Thu, 28 Nov 2024 at 07:43, Michael Paquier <michael@paquier.xyz> wrote: > Hmm. The start, end and sync meta-commands are useful for testing. I > find the flush control a bit less interesting, TBH. > > What would you use these for? I guess mostly for interactively playing around with pipelining from psql. But I think \getresult would be useful for testing too. This would allow us to test that we can read part of the pipeline, without sending a sync and waiting for everything. To be clear \flushrequest and \flush would be necessary to make \getresult work reliably.
On Tue, 10 Dec 2024 at 11:44, Anthonin Bonnefoy <anthonin.bonnefoy@datadoghq.com> wrote: > num_queries (2nd element in the pipeline status prompt) is now used to > track queued queries that were not flushed (with a flush request or > sync) to the server. It used to count both unflushed queries and > flushed queries. I skimmed the code a bit, but haven't looked closely at it yet. I did try the patch out though. My thoughts below: I think that new prompt is super useful, so useful in fact that I'd suggest linking to it from the \startpipeline docs. I do think that the wording in the docs could be a bit more precise: 1. The columns are not necessarily queries, they are messages or commands. i.e. \parse and \bind_named both count as 1, even though they form one query together. 2. messages not followed by \sync and \flushrequest, could very well already "be sent to the server" (if the client buffer got full, or in case of manual \flush). The main thing that \sync and \flushrequest do is make sure that the server actually sends its own result back, even if its buffer is not full yet. The main feedback I have after playing around with this version is that I'd like to have a \getresult (without the s), to only get a single result. So that you can get results one by one, possibly interleaved with some other queries again. One thing I'm wondering is how useful the num_syncs count is in the pipeline prompt, since you never really wait for a sync. Regarding the usefulness of \flush. I agree it's not as useful as I thought, because indeed \getresults already flushes everything. But it's not completely useless either. The main way I was able to use it interactively in a somewhat interesting way was to send it after a long running query, and then while that's processing type up the next query after it. Something like the following: localhost jelte@postgres:5432-26274= #> \startpipeline Time: 0.000 ms localhost jelte@postgres:5432-26274= #|0,0,0|> select pg_sleep(5) \bind \g Time: 0.000 ms localhost jelte@postgres:5432-26274= #|0,1,0|*> \flush Time: 0.000 ms localhost jelte@postgres:5432-26274= #|0,1,0|*> select 1 \bind \g Time: 0.000 ms localhost jelte@postgres:5432-26274= #|0,2,0|*> \syncpipeline Time: 0.000 ms localhost jelte@postgres:5432-26274= #|1,0,2|*> \getresults pg_sleep ────────── (1 row) ?column? ────────── 1 (1 row) Time: 0.348 ms
On Tue, Feb 18, 2025 at 06:34:20PM +0100, Anthonin Bonnefoy wrote: > On Tue, Feb 18, 2025 at 8:23 AM Michael Paquier <michael@paquier.xyz> wrote: >> The tests in psql.sql are becoming really long. Perhaps it would be >> better to split that into its own file, say psql_pipeline.sql? The >> input file is already 2k lines, you are adding 15% more lines to that. > > Agreed, I wasn't sure if this was enough to warrant a dedicated test > file. This is now separated in psql_pipeline.sql. You have forgotten the expected output. Not a big issue as the input was sent. >> What is the reasoning here behind this restriction? \gx is a wrapper >> of \g with expanded mode on, but it is also possible to call \g with >> expanded=on, bypassing this restriction. > > The issue is that \gx enables expanded mode for the duration of the > query and immediately reset it in sendquery_cleanup. With pipelining, > the command is piped and displaying is done by either \endpipeline or > \getresults, so the flag change has no impact. Forbidding it was a way > to make it clearer that it won't have the expected effect. If we > wanted a similar feature, this would need to be done with something > like \endpipelinex or \getresultsx. Hmm, okay. If one wants one mode or the other it is always possible to force one with \pset expanded when getting the results. Not sure if there is any need for new specific commands for these two printing the results. Another option would be to authorize the command to run, but perhaps your option is just better as per the enforced behavior in the output. So fine by me. There is coverage so we'll know if there are arguments in favor of authorizing the command, if need be. > I've split the patch and created the 3 special variables: > PIPELINE_SYNC_COUNT, PIPELINE_COMMAND_COUNT, PIPELINE_RESULT_COUNT. Thanks. Looks sensible now. > For requested_results, I don't think there's value in exposing it > since it is used as an exit condition and thus will always be 0 > outside of ExecQueryAndProcessResults. I've been playing with this patch and this configuration: \set PROMPT1 '=(pipeline=%P,sync=%:PIPELINE_SYNC_COUNT:,cmd=%:PIPELINE_COMMAND_COUNT:,res=%:PIPELINE_RESULT_COUNT:)%#' That's long, but seeing the evolution of the pipeline status is pretty cool depending on the meta-commands used. While testing, I have been able to run into an assertion failure by adding some tests in psql.sql to check for the case of inactive branches for \if. For example: --- a/src/test/regress/sql/psql.sql +++ b/src/test/regress/sql/psql.sql @@ -1047,11 +1047,15 @@ select \if false \\ (bogus \else \\ 42 \endif \\ forty_two; \echo arg1 arg2 arg3 arg4 arg5 \echo arg1 \encoding arg1 + \endpipeline \errverbose And the report: +psql: mainloop.c:513: MainLoop: Assertion `conditional_active(cond_stack)' failed. We should have tests for all new six meta-commands in psql.sql. MainLoop() is wrong when in pipeline mode for inactive branches. -- Michael
Attachment
On Thu, Feb 20, 2025 at 9:02 AM Michael Paquier <michael@paquier.xyz> wrote: > You have forgotten the expected output. Not a big issue as the input > was sent. I was writing the mail with the missing file when you sent this mail. This is fixed. > While testing, I have been able to run into an assertion failure by > adding some tests in psql.sql to check for the case of inactive > branches for \if. For example: > --- a/src/test/regress/sql/psql.sql > +++ b/src/test/regress/sql/psql.sql > @@ -1047,11 +1047,15 @@ select \if false \\ (bogus \else \\ 42 \endif \\ forty_two; > \echo arg1 arg2 arg3 arg4 arg5 > \echo arg1 > \encoding arg1 > + \endpipeline > \errverbose > > And the report: > +psql: mainloop.c:513: MainLoop: Assertion `conditional_active(cond_stack)' failed. > > We should have tests for all new six meta-commands in psql.sql. > MainLoop() is wrong when in pipeline mode for inactive branches. Ha yeah, I forgot about the inactive branches. I've added the new commands and fixed the behaviour. A small issue I've noticed while testing: When a pipeline has at least one queue command, pqClearConnErrorState isn't called in PQsendQueryStart and errors are appended. For example: \startpipeline select 1 \bind \g select 1; PQsendQuery not allowed in pipeline mode select 1; PQsendQuery not allowed in pipeline mode PQsendQuery not allowed in pipeline mode This looks more like an issue on libpq's side as there's no way to reset or advance the errorReported from ExecQueryAndProcessResults (plus PQerrorMessage seems to ignore errorReported). I've added an additional test to track this behaviour for now as this would probably be better discussed in a dedicated thread.
Attachment
On Thu, Feb 20, 2025 at 10:29:33AM +0100, Anthonin Bonnefoy wrote: > Ha yeah, I forgot about the inactive branches. I've added the new > commands and fixed the behaviour. And I did not notice that it was as simple as forcing the status in the routines for the new meta-commands, as we do for the existing ones. Noted. > This looks more like an issue on libpq's side as there's no way to > reset or advance the errorReported from ExecQueryAndProcessResults > (plus PQerrorMessage seems to ignore errorReported). I've added an > additional test to track this behaviour for now as this would probably > be better discussed in a dedicated thread. I am not sure if we should change that, actually, as it does not feel completely wrong to stack these errors. That's a bit confusing, sure. Perhaps a new libpq API to retrieve stacked errors when we are in pipeline mode would be more adapted? The design would be different. Anyway, I've stared at the result processing code for a couple of hours, and the branches we're taking for the pipeline modes seem to be rather right the way you have implemented them. The docs, comments and tests needed quite a few tweaks and edits to be more consistent. There were some grammar mistakes, some frenchisms. I'm hoping that there won't be any issues, but let's be honest, I am definitely sure there will be some more tuning required. It comes down to if we want this set of features, and I do to be able to expand tests in core with the extended query protocol and pipelines, knowing that there is an ask for out-of-core projects. This one being reachable with a COPY gave me a huge smile: +message type 0x5a arrived from server while idle So let's take one step here, I have applied the main patch. I am really excited by the possibilities all this stuff offers. Attached are the remaining pieces, split here because they are different bullet points: - Tests for implicit transactions with various commands, with some edits. - Prompt support, with more edits. I'm putting these on standby for a few days, to let the buildfarm digest the main change. -- Michael