Thread: libpq pipelining
Hello,

Using the asynchronous interface of libpq, is it possible to pipeline multiple queries?

i.e.

PQsendQuery(query1)
PQsendQuery(query2)

followed by

query1_results = PQgetResult(...)
query2_results = PQgetResult(...)

I tried it but got an "another command is already in progress" error. So, maybe it's not supported, or maybe I'm doing something wrong.

Thanks,
Samuel
On Friday, June 26, 2020, Samuel Williams <space.ship.traveller@gmail.com> wrote:
> Hello,
>
> Using the asynchronous interface of libpq, is it possible to pipeline
> multiple queries?
>
> i.e.
>
> PQsendQuery(query1)
> PQsendQuery(query2)
>
> followed by
>
> query1_results = PQgetResult(...)
> query2_results = PQgetResult(...)
>
> I tried it but got "another command is already in progress" error.
The documentation seems to leave zero ambiguity:

> After successfully calling PQsendQuery, call PQgetResult one or more times to obtain the results. PQsendQuery cannot be called again (on the same connection) until PQgetResult has returned a null pointer, indicating that the command is done.

David J.
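For reference, the contract David quotes can be sketched in C roughly as follows. This is an untested sketch, not code from the thread: it assumes libpq headers are available and that connection parameters come from the usual PG* environment variables.

```c
#include <stdio.h>
#include <libpq-fe.h>

int main(void)
{
    /* Connection parameters are taken from PG* environment variables. */
    PGconn *conn = PQconnectdb("");
    if (PQstatus(conn) != CONNECTION_OK) {
        fprintf(stderr, "%s", PQerrorMessage(conn));
        return 1;
    }

    const char *queries[] = {"SELECT 1", "SELECT 2"};
    for (int i = 0; i < 2; i++) {
        if (!PQsendQuery(conn, queries[i])) {
            fprintf(stderr, "%s", PQerrorMessage(conn));
            break;
        }
        /* Drain every result for this query; only after PQgetResult
         * returns NULL may PQsendQuery be called again. */
        PGresult *res;
        while ((res = PQgetResult(conn)) != NULL) {
            if (PQresultStatus(res) == PGRES_TUPLES_OK)
                printf("query %d: %s\n", i, PQgetvalue(res, 0, 0));
            PQclear(res);
        }
    }
    PQfinish(conn);
    return 0;
}
```

Note that each PQsendQuery here still costs one round trip, which is exactly the limitation Samuel is describing.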
Thanks David,

You are correct. I was giving an example of what I was hoping to achieve, not what I expected to work with the current interface.

I found some discussion in the past relating to batch processing which appears to support some kind of pipelining: https://2ndquadrant.github.io/postgres/libpq-batch-mode.html

However it seems to be abandoned.

Kind regards,
Samuel
On Friday, June 26, 2020, Samuel Williams <space.ship.traveller@gmail.com> wrote:
> Thanks David,
>
> You are correct.
>
> I was giving an example of what I was hoping to achieve, not what I
> expected to work with the current interface.

What about, as it says, sending multiple statements in a single sendQuery and then polling for multiple results?

David J.
> What about, as it says, sending multiple statements in a single sendQuery and then polling for multiple results?

I tried this, and even in single-row streaming mode, I found that there are cases where the results would not be streamed until all the queries were sent.

From the user's point of view, they may generate a loop sending multiple queries and not care about the results, so pipeline/batch processing is ideal to avoid an RTT per loop iteration. If database access is slow, this can be a significant source of latency.

Kind regards,
Samuel
On Friday, June 26, 2020, Samuel Williams <space.ship.traveller@gmail.com> wrote:
> > What about, as it says, sending multiple statements in a single sendQuery and then polling for multiple results?
>
> I tried this, and even in single row streaming mode, I found that
> there are cases where the results would not be streamed until all the
> queries were sent.
>
> From the users point of view, they may generate a loop sending
> multiple queries and don't care about the result, so a pipeline/batch
> processing is ideal to avoid RTT per loop iteration, if database
> access is slow, this can be a significant source of latency
I don’t have any insight into the bigger picture, but I’d concur that no other option is documented, so what you desire is not possible.
David J.
Here is a short example: https://gist.github.com/ioquatix/2f08f78699418f65971035785c80cf18

It makes 10 queries in one "PQsendQuery" and sets single-row mode. But all the results come back at once, as shown by the timestamps.

Next I'm planning to investigate streaming large recordsets to see if it works better/incrementally.
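The libpq side of what the gist does presumably looks something like the sketch below. This is not the gist's actual code, just an illustration of the API sequence under discussion; the statements and `pg_sleep` calls are placeholders, and it assumes a live server reachable via PG* environment variables.

```c
#include <stdio.h>
#include <libpq-fe.h>

int main(void)
{
    PGconn *conn = PQconnectdb("");
    if (PQstatus(conn) != CONNECTION_OK)
        return 1;

    /* Several statements in one simple-query message... */
    PQsendQuery(conn, "SELECT pg_sleep(0.1), 1; SELECT pg_sleep(0.1), 2;");
    /* ...with single-row mode enabled immediately after the send, so
     * each row is delivered as its own PGRES_SINGLE_TUPLE result. */
    PQsetSingleRowMode(conn);

    PGresult *res;
    while ((res = PQgetResult(conn)) != NULL) {
        ExecStatusType st = PQresultStatus(res);
        if (st == PGRES_SINGLE_TUPLE)
            printf("row: %s\n", PQgetvalue(res, 0, 1));
        else if (st == PGRES_TUPLES_OK)
            printf("-- end of one statement's results --\n");
        PQclear(res);
    }
    PQfinish(conn);
    return 0;
}
```

Even with this structure on the client, the rows may still arrive in bursts, for the server-side buffering reasons Tom describes below in the thread.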
Samuel Williams <space.ship.traveller@gmail.com> writes:
> Here is a short example:
> https://gist.github.com/ioquatix/2f08f78699418f65971035785c80cf18
> It makes 10 queries in one "PQsendQuery" and sets single row mode. But
> all the results come back at once as shown by the timestamps.

That looks to be less about what libpq will do than what the Ruby interface code will do.

The volume of return data may also be an issue. I don't think the backend will flush data out to the client except when it (a) reaches an idle state or (b) fills the output buffer. Ten occurrences of a short query result aren't gonna be enough for (b) --- from memory, that buffer is probably 8KB.

regards, tom lane
I think libmariadb has a nicer interface for this. Essentially what you do is send your query, then read a result set (one result set per query), and then you stream individual rows using:

mysql_fetch_row_start
mysql_fetch_row_cont

Those methods don't seem to have an equivalent in libpq - you can use PQgetResult, but it buffers all the rows. Using single-row mode results in many results for each query (which seems like a big overhead).

The difference between this and MariaDB is that mysql_fetch_row still operates within one logical set of results, whereas single-row mode breaks the single logical set of results into lots of individual results.

Maybe the statement about efficiency is incorrect, but it would be nice if you could incrementally stream a single result set more easily.
Greetings,

* Samuel Williams (space.ship.traveller@gmail.com) wrote:
> It makes 10 queries in one "PQsendQuery" and sets single row mode. But
> all the results come back at once as shown by the timestamps.

If you have 10 queries that you want to make in a given transaction and you care about the latency, then really the best option is to wrap that all in a single pl/pgsql function on the server side and make one call.

> Next I'm planning to investigate streaming large recordsets to see if
> it works better/incrementally.

If you want to stream large data sets to/from PG, you should consider using COPY.

Thanks,

Stephen
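The COPY approach Stephen suggests can be sketched roughly as below: PQgetCopyData hands back one data row per call, so the client never buffers the whole set in a single PGresult. An untested sketch; `big_table` is a placeholder name and `conn` is assumed to be an open connection.

```c
#include <stdio.h>
#include <libpq-fe.h>

/* Stream each COPY data row to stdout as it arrives. */
static void stream_copy(PGconn *conn)
{
    PGresult *res = PQexec(conn, "COPY (SELECT * FROM big_table) TO STDOUT");
    if (PQresultStatus(res) == PGRES_COPY_OUT) {
        char *buf;
        int len;
        /* Third argument 0 = block until a row is available;
         * PQgetCopyData returns -1 at end of the copy stream. */
        while ((len = PQgetCopyData(conn, &buf, 0)) > 0) {
            fwrite(buf, 1, len, stdout);
            PQfreemem(buf);
        }
    }
    PQclear(res);
    /* Drain the final command-status result(s). */
    while ((res = PQgetResult(conn)) != NULL)
        PQclear(res);
}
```

The trade-off is that COPY delivers text (or binary) rows, not typed PGresult cells, so the client takes over parsing.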
Tom, I'm implementing a small abstraction layer for event-driven result streaming on PostgreSQL and MariaDB for Ruby, and I'll endeavor to report back with some numbers once I have enough of it working to benchmark something meaningful.

Thanks for your patience and help.

Kind regards,
Samuel

On Tue, 30 Jun 2020 at 02:06, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Samuel Williams <space.ship.traveller@gmail.com> writes:
> > Those methods don't seem to have an equivalent in libpq - you can use
> > PQgetResult but it buffers all the rows. Using single row mode results
> > in many results for each query (seems like a big overhead).
>
> Have you got any actual evidence for that? Sure, the overhead is
> more than zero, but does it mean anything in comparison to the other
> costs of data transmission?
>
> > Maybe the statement about efficiency is incorrect, but it would be
> > nice if you could incrementally stream a single result set more
> > easily.
>
> More easily than what? If we did not construct a PGresult then we would
> need some other abstraction for access to the returned row, dealing with
> error cases, etc etc. That would mean a lot of very duplicative API code
> in libpq, and a painful bunch of adjustments in client code.
>
> regards, tom lane
Hi,

Here are some initial numbers.

DB::Client
Warming up --------------------------------------
         db-postgres   281.000  i/100ms
          db-mariadb   399.000  i/100ms
              mysql2   533.000  i/100ms
                  pg   591.000  i/100ms
Calculating -------------------------------------
         db-postgres      2.725k (± 1.8%) i/s -     13.769k in   5.053750s
          db-mariadb      3.990k (± 2.4%) i/s -     19.950k in   5.002453s
              mysql2      5.153k (± 4.7%) i/s -     26.117k in   5.079570s
                  pg      5.772k (± 4.4%) i/s -     28.959k in   5.027423s

Comparison:
                  pg:     5771.7 i/s
              mysql2:     5152.8 i/s - 1.12x  (± 0.00) slower
          db-mariadb:     3990.3 i/s - 1.45x  (± 0.00) slower
         db-postgres:     2725.5 i/s - 2.12x  (± 0.00) slower

The db-* gems are event driven. However, that is of less interest right now.

This benchmark creates a table, inserts 1000 (or some fixed number of) rows, and then selects them all back out.

What I noticed is that `PQgetvalue` and `PQgetisnull` are not particularly efficient, at least via FFI: they require rows * columns * 2 FFI calls.

libmariadb provides `mysql_fetch_row`, which returns a `char **` per row, so it requires only rows FFI calls.

Does a similar method exist for libpq? e.g. `PGgetrow(index) -> char**` (array of strings, one for each column, may be nil to indicate null).

Kind regards,
Samuel
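The access pattern Samuel is measuring can be sketched as below. For an R-row, C-column PGresult this makes R * C calls each to PQgetisnull and PQgetvalue: the 2-calls-per-cell cost, which is cheap in C but adds up when every call crosses an FFI boundary. An untested sketch of the standard libpq cell-access loop, not a proposed fix.

```c
#include <stdio.h>
#include <libpq-fe.h>

/* Print every cell of a result, one PQgetisnull + PQgetvalue pair per cell. */
static void dump_result(const PGresult *res)
{
    int rows = PQntuples(res);
    int cols = PQnfields(res);
    for (int r = 0; r < rows; r++) {
        for (int c = 0; c < cols; c++)
            printf("%s\t", PQgetisnull(res, r, c) ? "NULL"
                                                  : PQgetvalue(res, r, c));
        printf("\n");
    }
}
```

A binding that moved this whole loop into C and returned one array per row would make rows FFI calls instead of rows * columns * 2, which is essentially the shape of the libmariadb interface being described.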