Thread: Progress of asynchronous queries
I couldn't find anything in the libpq interfaces that identifies the current progress of an asynchronous query as a percentage of the estimated total query time. So, unless I missed something, I would like to file this as a feature request.

The idea behind this, of course, is to show a progress bar in the user interface of the client application as soon as the query takes longer than, say, 1 second.

Regards,

Adriaan van Os
On Fri, September 15, 2006 15:55, Adriaan van Os wrote:
> I couldn't find anything in the libpq interfaces that identifies the
> current progress of an asynchronous query as a percentage of the
> estimated total query time. So, unless I missed something, I would like
> to file this as a feature request.
>
> The idea behind this, of course, is to show a progress bar in the user
> interface of the client application as soon as the query takes longer
> than say 1 second.

One way to do this would be to use a cursor, and only receive batches of (say) a hundred or a thousand rows at a time. In that case you might not even want to bother with asynchronous execution at all.

It may take some extra work for your program to handle its data in chunks. Then again, for many programs it's not necessary to hold all of a result set in memory at the same time. In that case you may see reduced memory usage if you request one chunk of data through a cursor, process it, then request the next batch.

Doing that will take longer overall but give you faster initial response time. If you want to keep the overall time low, and you do a lot of processing per batch, you may still want to request your batches asynchronously, just so your program can process one batch of data while the server is retrieving the next.

Jeroen
Jeroen T. Vermeulen wrote:
> Adriaan van Os wrote:
>> I couldn't find anything in the libpq interfaces that identifies the
>> current progress of an asynchronous query as a percentage of the
>> estimated total query time. So, unless I missed something, I would like
>> to file this as a feature request.
>>
>> The idea behind this, of course, is to show a progress bar in the user
>> interface of the client application as soon as the query takes longer
>> than say 1 second.
>
> One way to do this would be to use a cursor, and only receive batches of
> (say) a hundred or a thousand rows at a time. In that case you might not
> even want to bother with asynchronous execution at all.
>
> It may take some extra work for your program to handle its data in
> chunks. Then again, for many programs it's not necessary to hold all of
> a result set in memory at the same time. In that case you may see
> reduced memory usage if you request one chunk of data through a cursor,
> process it, then request the next batch.
>
> Doing that will take longer overall but give you faster initial response
> time. If you want to keep the overall time low, and you do a lot of
> processing per batch, you may still want to request your batches
> asynchronously just so your program can process one batch of data while
> the server is retrieving the next.

Thanks for your reply and the idea to process data in chunks. Still, I feel that a workaround for some situations (one that may not work in others) is not, in general, a good replacement for a fundamental feature, as keeping the user informed is one of the principles of User Interface design <http://developer.apple.com/documentation/UserExperience/Conceptual/OSXHIGuidelines/XHIGHIDesign/chapter_5_section_2.html>. Besides, when more than one user is connected, multiple SQL commands may behave differently than a single SQL command (<http://www.postgresql.org/docs/8.1/static/transaction-iso.html>).

Regards,

Adriaan van Os
on 9/15/06 4:18 AM, jtv@xs4all.nl purportedly said:
>> I couldn't find anything in the libpq interfaces that identifies the
>> current progress of an asynchronous query as a percentage of the
>> estimated total query time. So, unless I missed something, I would like
>> to file this as a feature request.
>>
>> The idea behind this, of course, is to show a progress bar in the user
>> interface of the client application as soon as the query takes longer
>> than say 1 second.
>
> One way to do this would be to use a cursor, and only receive batches of
> (say) a hundred or a thousand rows at a time. In that case you might not
> even want to bother with asynchronous execution at all.
>
> It may take some extra work for your program to handle its data in
> chunks. Then again, for many programs it's not necessary to hold all of
> a result set in memory at the same time. In that case you may see
> reduced memory usage if you request one chunk of data through a cursor,
> process it, then request the next batch.
>
> Doing that will take longer overall but give you faster initial response
> time. If you want to keep the overall time low, and you do a lot of
> processing per batch, you may still want to request your batches
> asynchronously just so your program can process one batch of data while
> the server is retrieving the next.

Using a cursor would only be useful in cases where the data transfer is the main bottleneck. This may work in situations with fairly straightforward queries and exceedingly large datasets, but it will not work at all with complex queries that return small datasets, i.e. where the bottleneck is the query planning/execution phase.

I will add my vote to this feature request. After all, MySQL and MSSql can do this. I don't know about Oracle, but I would be surprised if it didn't.

Best,

Keary Suska
Esoteritech, Inc.
"Demystifying technology for your home or business"
On Fri, September 15, 2006 19:56, Adriaan van Os wrote:
> Besides, when more than one user is connected, multiple SQL commands may
> behave differently than a single SQL command
> (<http://www.postgresql.org/docs/8.1/static/transaction-iso.html>)

But you'd be doing this in a transaction anyway: you can't declare a cursor without starting a transaction first. Yes, you could deliberately declare "WITH HOLD" and keep using your cursor after committing or aborting the transaction. But even then, so far as I know, the cursor presents a snapshot view of its result set, so you get an effective isolation level of "serializable".

The number of users has nothing to do with the matter--if that were a real concern, you'd be using a serializable transaction anyway, so you wouldn't have to worry about it even if cursors did behave as "read committed."

Jeroen
Jeroen T. Vermeulen wrote:
> Adriaan van Os wrote:
>
>> Besides, when more than one user is connected, multiple SQL commands
>> may behave differently than a single SQL command
>> (<http://www.postgresql.org/docs/8.1/static/transaction-iso.html>)
>
> But you'd be doing this in a transaction anyway: you can't declare a
> cursor without starting a transaction first. Yes, you could deliberately
> declare "WITH HOLD" and keep using your cursor after committing or
> aborting the transaction. But even then, so far as I know, the cursor
> presents a snapshot view of its result set so you get an effective
> isolation level of "serializable" even then.

No, carefully read <http://www.postgresql.org/docs/8.1/static/transaction-iso.html>.

> The number of users has nothing to do with the matter

It does.

> if that were a real concern, you'd be using a serializable transaction
> anyway, so you wouldn't have to worry about it even if cursors did
> behave as "read committed."

I don't want to be forced to use serializable transaction mode, just because I want to know the progress of an SQL command.

Regards,

Adriaan van Os
On Sun, September 17, 2006 17:22, Adriaan van Os wrote:
>> But you'd be doing this in a transaction anyway: you can't declare a
>> cursor without starting a transaction first. Yes, you could
>> deliberately declare "WITH HOLD" and keep using your cursor after
>> committing or aborting the transaction. But even then, so far as I
>> know, the cursor presents a snapshot view of its result set so you get
>> an effective isolation level of "serializable" even then.
>
> No, carefully read
> <http://www.postgresql.org/docs/8.1/static/transaction-iso.html>.

I'm familiar with the document, thank you, but if you're not prepared to give any detail beyond "no" then I remain unconvinced. What part exactly do you disagree with? That cursors can't be declared without beginning a transaction first? That cursors present a snapshot view of their result set? That that amounts to an effective isolation level of "serializable"?

>> The number of users has nothing to do with the matter
>
> It does.

No. Number of concurrent sessions, yes, assuming you're working at "read committed" isolation level--and even then not for cursors, as I said.

>> if that were a real concern, you'd be using a serializable transaction
>> anyway, so you wouldn't have to worry about it even if cursors did
>> behave as "read committed."
>
> I don't want to be forced to use serializable transaction mode, just
> because I want to know the progress of an SQL command.

And that's not what I said. I said that if you have concerns over the consistency of your view of the database in the presence of other concurrent sessions, you'd be using "serializable" transactions in the first place--*regardless* of whether you want to report progress.

Jeroen
Jeroen T. Vermeulen wrote:
> On Sun, September 17, 2006 17:22, Adriaan van Os wrote:
>
>>> But you'd be doing this in a transaction anyway: you can't declare a
>>> cursor without starting a transaction first. Yes, you could
>>> deliberately declare "WITH HOLD" and keep using your cursor after
>>> committing or aborting the transaction. But even then, so far as I
>>> know, the cursor presents a snapshot view of its result set so you get
>>> an effective isolation level of "serializable" even then.
>>
>> No, carefully read
>> <http://www.postgresql.org/docs/8.1/static/transaction-iso.html>.
>
> I'm familiar with the document, thank you, but if you're not prepared to
> give any detail beyond "no" then I remain unconvinced. What part exactly
> do you disagree with? That cursors can't be declared without beginning a
> transaction first? That cursors present a snapshot view of their result
> set? That that amounts to an effective isolation level of
> "serializable"?

I now read that Postgres only supports FOR READ ONLY cursors. So, yes, for those cursors you are right and my remark doesn't apply. Still, the issue I raised was about tracing the progress of an SQL command in general, not about the specific case of using a cursor instead.

Regards,

Adriaan van Os
On Mon, September 18, 2006 14:51, Adriaan van Os wrote:
> Still, the issue I raised was about tracing the progress of an SQL
> command in general, not about the specific case of using a cursor
> instead.

True, and dealing with cursors is extra work in libpq. In the C++ API, OTOH, there is a class that lets you treat a cursor more or less as a regular query result. It's set to be replaced with a more modern version, but it's usable: chunks of result data are fetched on demand, i.e. as you access them.

If you had something like that, all you'd have to do is issue your query as before, process your data as before, and update your progress display after every n rows of processing. You wouldn't really notice any of the complexity underneath.

Jeroen
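[Editor's note: the "update your progress display after every n rows" idea reduces to a small helper along these lines. This is a sketch, not part of any library API; the function name and parameters are hypothetical, and the idea of taking the estimated total from the planner's row estimate is an assumption--that estimate is itself only approximate, which is one reason exact progress reporting is hard.]

```c
#include <stdio.h>

/* Report progress only every 'step' rows, as a percentage of an estimated
 * total row count.  Returns the percentage reported, or -1 when this row
 * count did not trigger a report.  The final row always triggers a report
 * so the display ends at 100%. */
int maybe_report(int rows_done, int estimated_total, int step)
{
    if (rows_done % step != 0 && rows_done != estimated_total)
        return -1;                            /* not a reporting point */
    int pct = (int)(100.0 * rows_done / estimated_total);
    printf("progress: %d%%\n", pct);          /* progress-bar hook */
    return pct;
}
```

A client would call this once per processed row (or once per fetched batch); throttling on `step` keeps the UI updates cheap relative to the row processing itself.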