Thread: libpq pipelining
Hi,

The recent discussion about pipelining in the jdbc driver prompted me to look
at what it would take for libpq. I have a proof of concept patch working. The
results are even more promising than I expected.

While it's true that many applications and frameworks won't easily benefit, it
amazes me that this hasn't been explored before.

I developed a simple test application that creates a table with a single
auto-increment primary key column, then runs 4 simple queries x times each:

"INSERT INTO test() VALUES ()"
"SELECT * FROM test LIMIT 1"
"SELECT * FROM test"
"DELETE FROM test"

The parameters to testPipelinedSeries are (number of times to execute each
query, maximum number of queued queries); a sketch of such a loop is below.

Results against local server:

testPipelinedSeries(10,1) took 0.020884
testPipelinedSeries(10,3) took 0.020630, speedup 1.01
testPipelinedSeries(10,10) took 0.006265, speedup 3.33
testPipelinedSeries(100,1) took 0.042731
testPipelinedSeries(100,3) took 0.043035, speedup 0.99
testPipelinedSeries(100,10) took 0.037222, speedup 1.15
testPipelinedSeries(100,25) took 0.031223, speedup 1.37
testPipelinedSeries(100,50) took 0.032482, speedup 1.32
testPipelinedSeries(100,100) took 0.031356, speedup 1.36

Results against remote server through ssh tunnel (30-40ms rtt):

testPipelinedSeries(10,1) took 3.2461736
testPipelinedSeries(10,3) took 1.1008443, speedup 2.44
testPipelinedSeries(10,10) took 0.342399, speedup 7.19
testPipelinedSeries(100,1) took 26.25882588
testPipelinedSeries(100,3) took 8.8509234, speedup 3.04
testPipelinedSeries(100,10) took 3.2866285, speedup 9.03
testPipelinedSeries(100,25) took 2.1472847, speedup 17.57
testPipelinedSeries(100,50) took 1.957510, speedup 27.03
testPipelinedSeries(100,100) took 0.690682, speedup 37.47

I plan to write documentation, add regression testing, and do general cleanup
before asking for feedback on the patch itself. Any suggestions about
performance testing or api design would be nice. I haven't played with
changing the sync logic yet, but I'm guessing that an api to allow manual sync
instead of a sync per PQsendQuery will be needed. That could make things
tricky though with multi-statement queries, because currently the only way to
detect when results change from one query to the next is a ReadyForQuery
message.

Matt Newell
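The test loop itself is not shown; a minimal sketch of what the pipelined
series might look like, assuming only that the patch lets a client queue
further PQsendQuery calls before draining results, and that each of these
single-statement queries yields exactly one PGresult (timing and error
handling omitted):

#include <libpq-fe.h>

/* Sketch only: not the actual test program from the patch. */
static void
testPipelinedSeries(PGconn *conn, int times, int maxQueued)
{
    const char *queries[] = {
        "INSERT INTO test() VALUES ()",
        "SELECT * FROM test LIMIT 1",
        "SELECT * FROM test",
        "DELETE FROM test"
    };
    int total = times * 4;
    int sent = 0;
    int completed = 0;
    PGresult *res;

    while (completed < total)
    {
        /* Keep up to maxQueued queries in flight. */
        while (sent < total && sent - completed < maxQueued)
        {
            if (!PQsendQuery(conn, queries[sent % 4]))
                return;
            sent++;
        }

        /* Read the oldest pending result; with maxQueued == 1 this
         * degenerates into one round trip per query, which is the
         * non-pipelined baseline in the timings above. */
        if ((res = PQgetResult(conn)) == NULL)
            break;
        PQclear(res);
        completed++;
    }
}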
On 12/04/2014 03:11 AM, Matt Newell wrote:
> The recent discussion about pipelining in the jdbc driver prompted me to
> look at what it would take for libpq.

Great!

> I have a proof of concept patch working. The results are even more
> promising than I expected.
>
> While it's true that many applications and frameworks won't easily
> benefit, it amazes me that this hasn't been explored before.
>
> I developed a simple test application that creates a table with a single
> auto-increment primary key column, then runs 4 simple queries x times
> each:
> ...
>
> I plan to write documentation, add regression testing, and do general
> cleanup before asking for feedback on the patch itself. Any suggestions
> about performance testing or api design would be nice. I haven't played
> with changing the sync logic yet, but I'm guessing that an api to allow
> manual sync instead of a sync per PQsendQuery will be needed. That could
> make things tricky though with multi-statement queries, because currently
> the only way to detect when results change from one query to the next is
> a ReadyForQuery message.

A good API is crucial for this. It should make it easy to write an
application that does pipelining, and to handle all the error conditions
in a predictable way. I'd suggest that you write the documentation
first, before writing any code, so that we can discuss the API. It
doesn't have to be in SGML format yet, a plain-text description of the
API will do.

- Heikki
On 12/04/2014 05:08 PM, Heikki Linnakangas wrote:
> A good API is crucial for this. It should make it easy to write an
> application that does pipelining, and to handle all the error conditions
> in a predictable way. I'd suggest that you write the documentation
> first, before writing any code, so that we can discuss the API. It
> doesn't have to be in SGML format yet, a plain-text description of the
> API will do.

I strongly agree.

Applications need to be able to reliably predict what will happen if
there's an error in the middle of a pipeline.

Consideration of implicit transactions (autocommit), the whole pipeline
being one transaction, or multiple transactions is needed.

Apps need to be able to wait for the result of a query partway through a
pipeline, e.g. scheduling four queries, then waiting for the result of
the 2nd.

There are probably plenty of other wrinkly bits to think about.

--
Craig Ringer                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Thursday, December 04, 2014 10:30:46 PM Craig Ringer wrote:
> On 12/04/2014 05:08 PM, Heikki Linnakangas wrote:
> > A good API is crucial for this. It should make it easy to write an
> > application that does pipelining, and to handle all the error conditions
> > in a predictable way. I'd suggest that you write the documentation
> > first, before writing any code, so that we can discuss the API. It
> > doesn't have to be in SGML format yet, a plain-text description of the
> > API will do.
>
> I strongly agree.

First pass at the documentation changes attached, along with a new example
that demonstrates pipelining 3 queries, with the middle one resulting in a
PGRES_FATAL_ERROR response.

With the API I am proposing, only 2 new functions (PQgetFirstQuery,
PQgetLastQuery) are required to be able to match each result to the query
that caused it. Another function, PQgetNextQuery, allows iterating through
the pending queries, and PQgetQueryCommand permits getting the original
query text.

Adding the ability to set a user-supplied pointer on the PGquery struct
might make it much easier for some frameworks, and other users might want a
callback, but I don't think either is required.

> Applications need to be able to reliably predict what will happen if
> there's an error in the middle of a pipeline.

Yes, the API I am proposing makes it easy to get results for each submitted
query independently of the success or failure of previous queries in the
pipeline.

> Consideration of implicit transactions (autocommit), the whole pipeline
> being one transaction, or multiple transactions is needed.

The more I think about this the more confident I am that no extra work is
needed.

Unless we start doing some preliminary processing of the query inside of
libpq, our hands are tied wrt sending a sync at the end of each query. The
reason for this is that we rely on the ReadyForQuery message to indicate
the end of a query, so without the sync there is no way to tell if the next
result is from another statement in the current query, or the first
statement in the next query.

I also don't see a reason to need multiple queries without a sync. If the
user wants all queries to succeed or fail together it should be no problem
to start the pipeline with BEGIN and complete it with COMMIT (see the
sketch below). But I may be missing some detail...

> Apps need to be able to wait for the result of a query partway through a
> pipeline, e.g. scheduling four queries, then waiting for the result of
> the 2nd.

Right. With the API I am proposing the user does have to process each
result until it gets to the one it wants, but it's no problem doing that.
It would also be trivial to add a function

PGresult * PQgetNextQueryResult(PGquery *query);

that discards all results from previous queries. Very similar to how PQexec
disregards all results from previous async queries.

It would also be possible to queue the results and be able to retrieve them
out of order, but I think that adds unnecessary complexity and might also
make it easy for users to never retrieve and free some results.

> There are probably plenty of other wrinkly bits to think about.

Yup, I'm sure I'm still missing some significant things at this point...

Matt Newell
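Since each PQsendQuery carries its own sync, all-or-nothing behavior can be
had by pipelining an explicit transaction. A minimal sketch, assuming the
proposed PQgetLastQuery/PQgetFirstQuery API and an established conn; note
that PostgreSQL reports a ROLLBACK command tag for a COMMIT issued in an
aborted transaction:

#include <string.h>

PGquery  *commitQuery;
PGresult *res;

PQsendQuery(conn, "BEGIN");
PQsendQuery(conn, "INSERT INTO test(id) VALUES (1)");
PQsendQuery(conn, "UPDATE test SET id = 2 WHERE id = 1");
PQsendQuery(conn, "COMMIT");
commitQuery = PQgetLastQuery(conn);    /* proposed API */

while ((res = PQgetResult(conn)) != NULL)
{
    /* If anything inside the transaction failed, the COMMIT's
     * command tag comes back as "ROLLBACK". */
    if (PQgetFirstQuery(conn) == commitQuery &&
        strcmp(PQcmdStatus(res), "ROLLBACK") == 0)
        fprintf(stderr, "pipeline was rolled back\n");
    PQclear(res);
}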
On Thu, Dec 4, 2014 at 4:11 PM, Matt Newell <newellm@blur.com> wrote:
> With the API I am proposing, only 2 new functions (PQgetFirstQuery,
> PQgetLastQuery) are required to be able to match each result to the query
> that caused it. Another function, PQgetNextQuery, allows iterating
> through the pending queries, and PQgetQueryCommand permits getting the
> original query text.
>
> Adding the ability to set a user-supplied pointer on the PGquery struct
> might make it much easier for some frameworks, and other users might want
> a callback, but I don't think either is required.

With a pointer on PGquery you wouldn't need any of the above. Whoever wants
the query text sets it as the pointer; whoever wants some other struct sets
that as the pointer. You would only need to be careful about the lifetime
of the pointed-to struct, but that onus is on the application I'd say. The
API only needs to provide some guarantees about how long or short it holds
onto that pointer.

I'm thinking this would be somewhat necessary for a python wrapper, like
psycopg2 (the wrapper could build a dictionary based on query text, but
there's no guarantee that query text will be unique so it'd be very
tricky).
On Thursday, December 04, 2014 04:30:27 PM Claudio Freire wrote:
> On Thu, Dec 4, 2014 at 4:11 PM, Matt Newell <newellm@blur.com> wrote:
> > With the API I am proposing, only 2 new functions (PQgetFirstQuery,
> > PQgetLastQuery) are required to be able to match each result to the
> > query that caused it. Another function, PQgetNextQuery, allows
> > iterating through the pending queries, and PQgetQueryCommand permits
> > getting the original query text.
> >
> > Adding the ability to set a user-supplied pointer on the PGquery struct
> > might make it much easier for some frameworks, and other users might
> > want a callback, but I don't think either is required.
>
> With a pointer on PGquery you wouldn't need any of the above. Whoever
> wants the query text sets it as the pointer; whoever wants some other
> struct sets that as the pointer.

libpq already stores the (current) query text as it's used in some error
cases, so that's not really optional without breaking backwards
compatibility. Adding another pointer for the user to optionally utilize
should be no big deal though, if everyone agrees it's a good thing.

> You would only need to be careful about the lifetime of the pointed-to
> struct, but that onus is on the application I'd say. The API only needs
> to provide some guarantees about how long or short it holds onto that
> pointer.

Agreed.

> I'm thinking this would be somewhat necessary for a python wrapper, like
> psycopg2 (the wrapper could build a dictionary based on query text, but
> there's no guarantee that query text will be unique so it'd be very
> tricky).

While it might make some things simpler, I really don't think it's
absolutely necessary, since the wrapper can maintain a queue that
corresponds to libpq's internal queue of PGquerys: each time you call a
PQsendQuery* function you push your required state, and each time the
return value of PQgetFirstQuery changes you pop from the queue.
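A sketch of that shadow queue, assuming the proposed PGquery type and
PQgetFirstQuery function; AppState and both helpers are hypothetical
application-side code, not part of libpq:

#include <stdlib.h>
#include <libpq-fe.h>

typedef struct AppState
{
    void            *userData;
    struct AppState *next;
} AppState;

static AppState *head, *tail;   /* mirrors libpq's internal query queue */
static PGquery  *lastSeen;      /* last value returned by PQgetFirstQuery */

/* Push application state every time a query is sent. */
static int
sendWithState(PGconn *conn, const char *sql, void *userData)
{
    AppState *s = malloc(sizeof(AppState));

    s->userData = userData;
    s->next = NULL;
    if (tail)
        tail->next = s;
    else
        head = s;
    tail = s;
    return PQsendQuery(conn, sql);
}

/* After PQgetResult, pop when PQgetFirstQuery's return value changes. */
static void *
stateForCurrentResult(PGconn *conn)
{
    PGquery *q = PQgetFirstQuery(conn);

    if (lastSeen != NULL && q != lastSeen)
    {
        AppState *done = head;

        head = head->next;
        if (head == NULL)
            tail = NULL;
        free(done);
    }
    lastSeen = q;
    return head ? head->userData : NULL;
}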
On 12/04/2014 09:11 PM, Matt Newell wrote:
> With the API I am proposing, only 2 new functions (PQgetFirstQuery,
> PQgetLastQuery) are required to be able to match each result to the query
> that caused it. Another function, PQgetNextQuery, allows iterating
> through the pending queries, and PQgetQueryCommand permits getting the
> original query text.
>
> Adding the ability to set a user-supplied pointer on the PGquery struct
> might make it much easier for some frameworks, and other users might want
> a callback, but I don't think either is required.

I don't like exposing the PGquery struct to the application like that.
Access to all other libpq objects is done via functions. The application
can't (or shouldn't, anyway) directly access the fields of PGresult, for
example. It has to call PQnfields(), PQntuples() etc.

The user-supplied pointer seems quite pointless. It would make sense if
the pointer was passed to PQsendQuery(), and you'd get it back in
PGquery. You could then use it to tag the query when you send it with
whatever makes sense for the application, and use the tag in the result
to match it with the original query. But as it stands, I don't see the
point.

The original query string might be handy for some things, but for others
it's useless. It's not enough as a general method to identify the query
the result belongs to. A common use case for this is to execute the same
query many times with different parameters.

So I don't think you've quite nailed the problem of how to match the
results to the commands that originated them, yet. One idea is to add a
function that can be called after PQgetResult(), to get some identifier
of the original command. But there needs to be a mechanism to tag the
PQsendQuery() calls. Or you can assign each call a unique ID
automatically, and have a way to ask for that ID after calling
PQsendQuery().

The explanation of PQgetFirstQuery makes it sound pretty hard to match
up the result with the query. You have to pay attention to PQisBusy.

It would be good to make it explicit when you start a pipelined
operation. Currently, you get an error if you call PQsendQuery() twice
in a row, without reading the result in between. That's a good thing, to
catch application errors, when you're not trying to do pipelining.
Otherwise, if you forget to get the result of a query you've sent, and
then send another query, you'll merrily read the result of the first
query and think that it belongs to the second.

Are you trying to support "continuous pipelining", where you send new
queries all the time, and read results as they arrive, without ever
draining the pipe? Or are you just trying to do "batches", where you
send a bunch of queries, and wait for all the results to arrive, before
sending more? A batched API would be easier to understand and work with,
although a "continuous" pipeline could be more efficient for an
application that can take advantage of it.

>> Consideration of implicit transactions (autocommit), the whole pipeline
>> being one transaction, or multiple transactions is needed.
>
> The more I think about this the more confident I am that no extra work is
> needed.
>
> Unless we start doing some preliminary processing of the query inside of
> libpq, our hands are tied wrt sending a sync at the end of each query.
> The reason for this is that we rely on the ReadyForQuery message to
> indicate the end of a query, so without the sync there is no way to tell
> if the next result is from another statement in the current query, or the
> first statement in the next query.
>
> I also don't see a reason to need multiple queries without a sync. If the
> user wants all queries to succeed or fail together it should be no
> problem to start the pipeline with BEGIN and complete it with COMMIT. But
> I may be missing some detail...

True. It makes me a bit uneasy, though, to not be sure that the whole
batch is committed or rolled back as one unit. There are many ways the
user can shoot himself in the foot with that. Error handling would be a
lot simpler if you would only send one Sync for the whole batch. Tom
explained it better on this recent thread:
http://www.postgresql.org/message-id/32086.1415063405@sss.pgh.pa.us

Another thought is that for many applications, it would actually be OK
to not know which query each result belongs to. For example, if you
execute a bunch of inserts, you often just want to get back the total
number inserted, or maybe not even that. Or if you execute a "CREATE
TEMPORARY TABLE ... ON COMMIT DROP", followed by some insertions to it,
some more data manipulations, and finally a SELECT to get the results
back. All you want is the last result set.

If we could modify the wire protocol, we'd want to have a MiniSync
message that is like Sync except that it wouldn't close the current
transaction. The server would respond to it with a ReadyForQuery message
(which could carry an ID number, to match it up with the MiniSync
command). But I really wish we'd find a way to do this without changing
the wire protocol.

- Heikki
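For concreteness, a purely hypothetical sketch of the auto-assigned-ID
variant suggested above; PQlastQueryId and PQresultQueryId exist nowhere and
are invented here only to illustrate the shape of the idea:

int       selectId;
PGresult *res;

PQsendQuery(conn, "INSERT INTO test(id) VALUES (1)");
PQsendQuery(conn, "SELECT * FROM test");
selectId = PQlastQueryId(conn);          /* hypothetical: ID assigned at send */

while ((res = PQgetResult(conn)) != NULL)
{
    if (PQresultQueryId(conn) == selectId)   /* hypothetical: ID behind result */
        printf("select returned %d rows\n", PQntuples(res));
    PQclear(res);
}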
On Thursday, December 04, 2014 11:39:02 PM Heikki Linnakangas wrote:
> > Adding the ability to set a user-supplied pointer on the PGquery struct
> > might make it much easier for some frameworks, and other users might
> > want a callback, but I don't think either is required.
>
> I don't like exposing the PGquery struct to the application like that.
> Access to all other libpq objects is done via functions. The application
> can't (or shouldn't, anyway) directly access the fields of PGresult, for
> example. It has to call PQnfields(), PQntuples() etc.

Right, my patch doesn't expose it. I was thinking of adding two new
functions to get/set the user tag/pointer.

> The user-supplied pointer seems quite pointless. It would make sense if
> the pointer was passed to PQsendQuery(), and you'd get it back in
> PGquery. You could then use it to tag the query when you send it with
> whatever makes sense for the application, and use the tag in the result
> to match it with the original query.

That's exactly what I envisioned, but with a separate call to avoid having
to modify/duplicate the PQsendQuery functions:

PQsendQuery(conn,...)
query = PQgetLastQuery(conn);
PQquerySetUserPointer(query,userPtr);
...
result = PQgetResult(conn);
query = PQgetFirstQuery(conn);
userPtr = PQqueryGetUserPointer(query);

> But as it stands, I don't see the point.

I don't need it since it should be easy to keep track without it. It was
just an idea.

> The original query string might be handy for some things, but for others
> it's useless. It's not enough as a general method to identify the query
> the result belongs to. A common use case for this is to execute the same
> query many times with different parameters.

Right, I'm only saving the query text because that's how things were done
already. Since it's already there I didn't see a reason not to expose it.

> So I don't think you've quite nailed the problem of how to match the
> results to the commands that originated them, yet. One idea is to add a
> function that can be called after PQgetResult(), to get some identifier
> of the original command. But there needs to be a mechanism to tag the
> PQsendQuery() calls. Or you can assign each call a unique ID
> automatically, and have a way to ask for that ID after calling
> PQsendQuery().

PGquery IS the unique ID, and it is available after calling PQsendQuery by
calling PQgetLastQuery.

> The explanation of PQgetFirstQuery makes it sound pretty hard to match
> up the result with the query. You have to pay attention to PQisBusy.

It's not hard at all and is very natural to use, since the whole point of
an async api is to avoid blocking, so it's natural to only call PQgetResult
when it's not going to block.

PQgetFirstQuery should also be valid after calling PQgetResult, and then
you don't have to worry about PQisBusy, so I should probably change the
documentation to indicate that is the preferred usage, or maybe make that
the only guaranteed usage, and say the results are undefined if you call it
before calling PQgetResult. That usage also makes it consistent with
PQgetLastQuery being called immediately after PQsendQuery.

Another option would be a function to get the PGquery for any PGresult.
This would make things a bit more straightforward for the user, but more
complicated in the implementation, since multiple PGresults will share the
same PGquery. However it's nothing that a reference count wouldn't solve.

> It would be good to make it explicit when you start a pipelined
> operation. Currently, you get an error if you call PQsendQuery() twice
> in a row, without reading the result in between. That's a good thing, to
> catch application errors, when you're not trying to do pipelining.
> Otherwise, if you forget to get the result of a query you've sent, and
> then send another query, you'll merrily read the result of the first
> query and think that it belongs to the second.

Agreed, and I think this is the only behavior change currently. An easy fix
to restore existing behavior by default:

PQsetPipelining(PGconn *conn, int arg);

should work.

> Are you trying to support "continuous pipelining", where you send new
> queries all the time, and read results as they arrive, without ever
> draining the pipe? Or are you just trying to do "batches", where you
> send a bunch of queries, and wait for all the results to arrive, before
> sending more? A batched API would be easier to understand and work with,
> although a "continuous" pipeline could be more efficient for an
> application that can take advantage of it.

I don't see any reason to limit it to batches, though it can certainly be
used that way. My first test case does continuous pipelining and it
provides a huge throughput gain when there's any latency in the connection.
I can envision a lot of uses for the continuous approach.

> >> Consideration of implicit transactions (autocommit), the whole
> >> pipeline being one transaction, or multiple transactions is needed.
> >
> > The more I think about this the more confident I am that no extra work
> > is needed.
> >
> > Unless we start doing some preliminary processing of the query inside
> > of libpq, our hands are tied wrt sending a sync at the end of each
> > query. The reason for this is that we rely on the ReadyForQuery message
> > to indicate the end of a query, so without the sync there is no way to
> > tell if the next result is from another statement in the current query,
> > or the first statement in the next query.
> >
> > I also don't see a reason to need multiple queries without a sync. If
> > the user wants all queries to succeed or fail together it should be no
> > problem to start the pipeline with BEGIN and complete it with COMMIT.
> > But I may be missing some detail...
>
> True. It makes me a bit uneasy, though, to not be sure that the whole
> batch is committed or rolled back as one unit. There are many ways the
> user can shoot himself in the foot with that. Error handling would be a
> lot simpler if you would only send one Sync for the whole batch. Tom
> explained it better on this recent thread:
> http://www.postgresql.org/message-id/32086.1415063405@sss.pgh.pa.us

I've read Tom's email and every other one I could find related to
pipelining, and I simply don't see the problem. What's the advantage of
being able to pipeline multiple queries without sync and without an
explicit transaction, vs an explicit transaction with a sync per query? The
usage I have in mind for pipelining needs each query to succeed or fail
independently of any query before or after it, unless the caller explicitly
uses a transaction.

> Another thought is that for many applications, it would actually be OK
> to not know which query each result belongs to. For example, if you
> execute a bunch of inserts, you often just want to get back the total
> number inserted, or maybe not even that. Or if you execute a "CREATE
> TEMPORARY TABLE ... ON COMMIT DROP", followed by some insertions to it,
> some more data manipulations, and finally a SELECT to get the results
> back. All you want is the last result set.

Easy: grab the PGquery from only the last select, call PQgetResult until
you get a matching result, PQclear the rest. We could provide a function to
do just that as a convenience to the user, if it's going to be a common
use-case. It would be no different than how PQexec processes and discards
results.

Matt
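A sketch of that last-result-only pattern, assuming the proposed
PQgetLastQuery/PQgetFirstQuery API and an established conn (ON COMMIT DROP
is left out here because, with a sync per query, each queued query commits
independently):

PGquery  *lastQuery;
PGresult *res;

PQsendQuery(conn, "CREATE TEMPORARY TABLE t(id int)");
PQsendQuery(conn, "INSERT INTO t SELECT generate_series(1, 100)");
PQsendQuery(conn, "SELECT count(*) FROM t");
lastQuery = PQgetLastQuery(conn);

while ((res = PQgetResult(conn)) != NULL)
{
    /* Keep only the final SELECT's result; discard the rest. */
    if (PQgetFirstQuery(conn) == lastQuery &&
        PQresultStatus(res) == PGRES_TUPLES_OK)
        printf("count: %s\n", PQgetvalue(res, 0, 0));
    PQclear(res);
}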
> > The explanation of PQgetFirstQuery makes it sound pretty hard to match
> > up the result with the query. You have to pay attention to PQisBusy.
>
> PQgetFirstQuery should also be valid after calling PQgetResult, and then
> you don't have to worry about PQisBusy, so I should probably change the
> documentation to indicate that is the preferred usage, or maybe make that
> the only guaranteed usage, and say the results are undefined if you call
> it before calling PQgetResult. That usage also makes it consistent with
> PQgetLastQuery being called immediately after PQsendQuery.

I changed my second example to call PQgetFirstQuery after PQgetResult
instead of before, and that removes the need to call PQconsumeInput and
PQisBusy when you don't mind blocking. It makes the example super simple:

PQsendQuery(conn, "INSERT INTO test(id) VALUES (DEFAULT),(DEFAULT)
    RETURNING id");
query1 = PQgetLastQuery(conn);

/* Duplicate primary key error */
PQsendQuery(conn, "UPDATE test SET id=2 WHERE id=1");
query2 = PQgetLastQuery(conn);

PQsendQuery(conn, "SELECT * FROM test");
query3 = PQgetLastQuery(conn);

while( (result = PQgetResult(conn)) != NULL )
{
    curQuery = PQgetFirstQuery(conn);

    if (curQuery == query1)
        checkResult(conn,result,curQuery,PGRES_TUPLES_OK);
    if (curQuery == query2)
        checkResult(conn,result,curQuery,PGRES_FATAL_ERROR);
    if (curQuery == query3)
        checkResult(conn,result,curQuery,PGRES_TUPLES_OK);
}

Note that the curQuery == queryX check will work no matter how many
results a query produces.

Matt Newell
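checkResult is not spelled out in the thread; a plausible minimal version,
assuming the proposed PGquery type and PQgetQueryCommand accessor for the
query text:

#include <stdio.h>
#include <libpq-fe.h>

static void
checkResult(PGconn *conn, PGresult *result, PGquery *query,
            ExecStatusType expected)
{
    if (PQresultStatus(result) != expected)
        fprintf(stderr, "query \"%s\": got %s, expected %s (%s)\n",
                PQgetQueryCommand(query),          /* proposed API */
                PQresStatus(PQresultStatus(result)),
                PQresStatus(expected),
                PQerrorMessage(conn));
    PQclear(result);
}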
On 12/05/2014 02:30 AM, Matt Newell wrote:
>>> The explanation of PQgetFirstQuery makes it sound pretty hard to match
>>> up the result with the query. You have to pay attention to PQisBusy.
>>
>> PQgetFirstQuery should also be valid after calling PQgetResult, and then
>> you don't have to worry about PQisBusy, so I should probably change the
>> documentation to indicate that is the preferred usage, or maybe make
>> that the only guaranteed usage, and say the results are undefined if you
>> call it before calling PQgetResult. That usage also makes it consistent
>> with PQgetLastQuery being called immediately after PQsendQuery.
>
> I changed my second example to call PQgetFirstQuery after PQgetResult
> instead of before, and that removes the need to call PQconsumeInput and
> PQisBusy when you don't mind blocking. It makes the example super simple:
>
> PQsendQuery(conn, "INSERT INTO test(id) VALUES (DEFAULT),(DEFAULT)
>     RETURNING id");
> query1 = PQgetLastQuery(conn);
>
> /* Duplicate primary key error */
> PQsendQuery(conn, "UPDATE test SET id=2 WHERE id=1");
> query2 = PQgetLastQuery(conn);
>
> PQsendQuery(conn, "SELECT * FROM test");
> query3 = PQgetLastQuery(conn);
>
> while( (result = PQgetResult(conn)) != NULL )
> {
>     curQuery = PQgetFirstQuery(conn);
>
>     if (curQuery == query1)
>         checkResult(conn,result,curQuery,PGRES_TUPLES_OK);
>     if (curQuery == query2)
>         checkResult(conn,result,curQuery,PGRES_FATAL_ERROR);
>     if (curQuery == query3)
>         checkResult(conn,result,curQuery,PGRES_TUPLES_OK);
> }
>
> Note that the curQuery == queryX check will work no matter how many
> results a query produces.

Oh, that's how the PQgetLastQuery/PQgetNextQuery functions work! I didn't
understand that before. I'd suggest renaming them to something like
PQgetSentQuery() and PQgetResultQuery(). The first/last/next names made me
think that they're used to iterate a list of queries, but in fact they're
supposed to be used at very different stages.

- Heikki
On Friday, December 05, 2014 12:22:38 PM Heikki Linnakangas wrote:
> Oh, that's how the PQgetLastQuery/PQgetNextQuery functions work! I didn't
> understand that before. I'd suggest renaming them to something like
> PQgetSentQuery() and PQgetResultQuery(). The first/last/next names made
> me think that they're used to iterate a list of queries, but in fact
> they're supposed to be used at very different stages.
>
> - Heikki

Okay, I have renamed them with your suggestions, and added
PQsetPipelining/PQgetPipelining, defaulting to pipelining off. There should
be no behavior change unless pipelining is enabled.

Documentation should be mostly complete except the possible addition of an
example and maybe a general pipelining overview paragraph.

I have implemented async query support (that takes advantage of pipelining)
in Qt, along with a couple of test cases. This led to me discovering a bug
with my last patch where a PGquery object could be reused twice in a row. I
have fixed that.

I contemplated not reusing the PGquery objects at all, but that wouldn't
solve the problem, because it's very possible that malloc will return a
recently freed pointer of the same size anyway. Making the guarantee that a
PGquery won't be reused twice in a row should be sufficient, and the only
alternative is to add a unique id, but that would add further complexity
that I don't think is warranted.

Feedback is very welcome and appreciated.

Thanks,
Matt Newell
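For reference, the earlier matching pattern restated with the renamed API,
assuming PQgetLastQuery became PQgetSentQuery and PQgetFirstQuery became
PQgetResultQuery as suggested upthread, with pipelining now opt-in:

PGquery  *q1, *q2;
PGresult *res;

PQsetPipelining(conn, 1);       /* pipelining is off by default */

PQsendQuery(conn, "INSERT INTO test(id) VALUES (DEFAULT)");
q1 = PQgetSentQuery(conn);      /* was PQgetLastQuery */
PQsendQuery(conn, "SELECT * FROM test");
q2 = PQgetSentQuery(conn);

while ((res = PQgetResult(conn)) != NULL)
{
    PGquery *cur = PQgetResultQuery(conn);  /* was PQgetFirstQuery */

    if (cur == q1)
        ;   /* handle INSERT result */
    else if (cur == q2)
        ;   /* handle SELECT result */
    PQclear(res);
}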