Thread: Progress of asynchronous queries

Progress of asynchronous queries

From
Adriaan van Os
Date:
I couldn't find anything in the libpq interfaces that identifies the current progress of an 
asynchronous query as a percentage of the estimated total query time. So, unless I missed 
something, I would like to file this as a feature request.

The idea behind this, of course, is to show a progress bar in the user interface of the client 
application as soon as the query takes longer than say 1 second.
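
For reference, the asynchronous path in libpq today looks roughly like the sketch below (the connection string and query are placeholders); nothing in it yields a fraction of work done, only elapsed time:

#include <stdio.h>
#include <time.h>
#include <sys/select.h>
#include <libpq-fe.h>

int main(void)
{
    PGconn *conn = PQconnectdb("dbname=test");        /* placeholder conninfo */
    if (PQstatus(conn) != CONNECTION_OK)
        return 1;

    if (!PQsendQuery(conn, "SELECT count(*) FROM big_table"))   /* placeholder query */
        return 1;

    time_t start = time(NULL);
    int sock = PQsocket(conn);

    while (1)
    {
        if (!PQconsumeInput(conn))      /* read whatever has arrived so far */
            return 1;
        if (!PQisBusy(conn))            /* the result is complete */
            break;

        /* All the client can report is elapsed time; libpq gives no
         * "percent done" figure to drive a real progress bar. */
        printf("still running, %ld s elapsed\n", (long) (time(NULL) - start));

        fd_set readable;
        FD_ZERO(&readable);
        FD_SET(sock, &readable);
        struct timeval timeout = { 1, 0 };   /* wake up about once a second */
        select(sock + 1, &readable, NULL, NULL, &timeout);
    }

    PGresult *res;
    while ((res = PQgetResult(conn)) != NULL)
        PQclear(res);

    PQfinish(conn);
    return 0;
}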

Regards,

Adriaan van Os


Re: Progress of asynchronous queries

From
"Jeroen T. Vermeulen"
Date:
On Fri, September 15, 2006 15:55, Adriaan van Os wrote:
> I couldn't find anything in the libpq interfaces that identifies the
> current progress of an
> asynchronous query as a percentage of the estimated total query time. So,
> unless I missed
> something, I would like to file this as a feature request.
>
> The idea behind this, of course, is to show a progress bar in the user
> interface of the client
> application as soon as the query takes longer than say 1 second.

One way to do this would be to use a cursor, and only receive batches of
(say) a hundred or a thousand rows at a time.  In that case you might not
even want to bother with asynchronous execution at all.

It may take some extra work for your program to handle its data in chunks.
Then again, for many programs it's not necessary to hold all of a result
set in memory at the same time.  In that case you may see reduced memory
usage if you request one chunk of data through a cursor, process it, then
request the next batch.

Doing that will take longer overall but give you faster initial response
time.  If you want to keep the overall time low, and you do a lot of
processing per batch, you may still want to request your batches
asynchronously just so your program can process one batch of data while
the server is retrieving the next.
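
In libpq terms that might look roughly like the sketch below, using plain PQexec with DECLARE/FETCH; the table name is a placeholder, and turning the running row count into a percentage would still need a separate row-count estimate:

#include <stdio.h>
#include <stdlib.h>
#include <libpq-fe.h>

static void die(PGconn *conn)
{
    fprintf(stderr, "%s", PQerrorMessage(conn));
    PQfinish(conn);
    exit(1);
}

int main(void)
{
    PGconn *conn = PQconnectdb("dbname=test");      /* placeholder conninfo */
    if (PQstatus(conn) != CONNECTION_OK)
        die(conn);

    long done = 0;

    /* the cursor has to live inside a transaction (or be declared WITH HOLD) */
    PQclear(PQexec(conn, "BEGIN"));
    PQclear(PQexec(conn, "DECLARE batch CURSOR FOR SELECT * FROM big_table"));

    for (;;)
    {
        PGresult *res = PQexec(conn, "FETCH 1000 FROM batch");  /* one batch */
        if (PQresultStatus(res) != PGRES_TUPLES_OK)
            die(conn);
        int n = PQntuples(res);
        /* ... process the n rows in res here ... */
        PQclear(res);
        if (n == 0)
            break;                                   /* cursor exhausted */

        done += n;
        /* With a row-count estimate (say, a prior count(*) or
         * pg_class.reltuples) this could be shown as a percentage. */
        printf("%ld rows processed so far\n", done);
    }

    PQclear(PQexec(conn, "CLOSE batch"));
    PQclear(PQexec(conn, "COMMIT"));
    PQfinish(conn);
    return 0;
}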


Jeroen




Re: Progress of asynchronous queries

From
Adriaan van Os
Date:
Jeroen T. Vermeulen wrote:

> Adriaan van Os wrote:
>> I couldn't find anything in the libpq interfaces that identifies the
>> current progress of an
>> asynchronous query as a percentage of the estimated total query time. So,
>> unless I missed
>> something, I would like to file this as a feature request.
>>
>> The idea behind this, of course, is to show a progress bar in the user
>> interface of the client
>> application as soon as the query takes longer than say 1 second.
> 
> One way to do this would be to use a cursor, and only receive batches of
> (say) a hundred or a thousand rows at a time.  In that case you might not
> even want to bother with asynchronous execution at all.
> 
> It may take some extra work for your program to handle its data in chunks.
>  Then again, for many programs it's not necessary to hold all of a result
> set in memory at the same time.  In that case you may see reduced memory
> usage if you request one chunk of data through a cursor, process it, then
> request the next batch.
> 
> Doing that will take longer overall but give you faster initial response
> time.  If you want to keep the overall time low, and you do a lot of
> processing per batch, you may still want to request your batches
> asynchronously just so your program can process one batch of data while
> the server is retrieving the next.

Thanks for your reply and the idea to process data in chunks. Still, I feel that a workaround for
some situations (that may not work in others) is not in general a good replacement for a
fundamental feature, as keeping the user informed is one of the principles of User Interface design
<http://developer.apple.com/documentation/UserExperience/Conceptual/OSXHIGuidelines/XHIGHIDesign/chapter_5_section_2.html>.

Besides, when more than one user is connected, multiple SQL commands may behave differently from a
single SQL command (<http://www.postgresql.org/docs/8.1/static/transaction-iso.html>).

Regards,

Adriaan van Os


Re: Progress of asynchronous queries

From
Keary Suska
Date:
on 9/15/06 4:18 AM, jtv@xs4all.nl purportedly said:

>> I couldn't find anything in the libpq interfaces that identifies the
>> current progress of an
>> asynchronous query as a percentage of the estimated total query time. So,
>> unless I missed
>> something, I would like to file this as a feature request.
>> 
>> The idea behind this, of course, is to show a progress bar in the user
>> interface of the client
>> application as soon as the query takes longer than say 1 second.
> 
> One way to do this would be to use a cursor, and only receive batches of
> (say) a hundred or a thousand rows at a time.  In that case you might not
> even want to bother with asynchronous execution at all.
> 
> It may take some extra work for your program to handle its data in chunks.
>  Then again, for many programs it's not necessary to hold all of a result
> set in memory at the same time.  In that case you may see reduced memory
> usage if you request one chunk of data through a cursor, process it, then
> request the next batch.
> 
> Doing that will take longer overall but give you faster initial response
> time.  If you want to keep the overall time low, and you do a lot of
> processing per batch, you may still want to request your batches
> asynchronously just so your program can process one batch of data while
> the server is retrieving the next.

Using a cursor would only be useful in cases where the data transfer is the
main bottleneck. This may work in situations with fairly straightforward
queries and exceedingly large datasets, but will not work at all with
complex queries that return small datasets, i.e. where the bottleneck is
the query planning/execution phase.

I will add my vote to this feature request. After all, MySQL and MSSql can
do this. I don't know about Oracle, but I would be surprised if it didn't.

Best,

Keary Suska
Esoteritech, Inc.
"Demystifying technology for your home or business"




Re: Progress of asynchronous queries

From
"Jeroen T. Vermeulen"
Date:
On Fri, September 15, 2006 19:56, Adriaan van Os wrote:

> Besides, when more than one user is connected, multiple SQL commands may
> behave differently from a
> single SQL command
> (<http://www.postgresql.org/docs/8.1/static/transaction-iso.html>)

But you'd be doing this in a transaction anyway: you can't declare a
cursor without starting a transaction first.  Yes, you could deliberately
declare "WITH HOLD" and keep using your cursor after commiting or aborting
the transaction.  But even then, so far as I know, the cursor presents a
snapshot view of its result set so you get an effective isolation level of
"serializable" even then.

The number of users has nothing to do with the matter--if that were a real
concern, you'd be using a serializable transaction anyway, so you wouldn't
have to worry about it even if cursors did behave as "read committed."


Jeroen






Re: Progress of asynchronous queries

From
Adriaan van Os
Date:
Jeroen T. Vermeulen wrote:
> Adriaan van Os wrote:
> 
>> Besides, when more than one user is connected, multiple SQL commands may
>> behave differently from a
>> single SQL command
>> (<http://www.postgresql.org/docs/8.1/static/transaction-iso.html>)
> 
> But you'd be doing this in a transaction anyway: you can't declare a
> cursor without starting a transaction first.  Yes, you could deliberately
> declare "WITH HOLD" and keep using your cursor after committing or aborting
> the transaction.  But even then, so far as I know, the cursor presents a
> snapshot view of its result set so you get an effective isolation level of
> "serializable" even then.

No, carefully read <http://www.postgresql.org/docs/8.1/static/transaction-iso.html>.

> The number of users has nothing to do with the matter

It does.

> if that were a real
> concern, you'd be using a serializable transaction anyway, so you wouldn't
> have to worry about it even if cursors did behave as "read committed."

I don't want to be forced to use serializable transaction mode, just because I want to know the 
progress of an SQL command.

Regards,

Adriaan van Os



Re: Progress of asynchronous queries

From
"Jeroen T. Vermeulen"
Date:
On Sun, September 17, 2006 17:22, Adriaan van Os wrote:

>> But you'd be doing this in a transaction anyway: you can't declare a
>> cursor without starting a transaction first.  Yes, you could deliberately
>> declare "WITH HOLD" and keep using your cursor after committing or
>> aborting
>> the transaction.  But even then, so far as I know, the cursor presents a
>> snapshot view of its result set so you get an effective isolation level
>> of
>> "serializable" even then.
>
> No, carefully read
> <http://www.postgresql.org/docs/8.1/static/transaction-iso.html>.

I'm familiar with the document, thank you, but if you're not prepared to
give any detail beyond "no" then I remain unconvinced.  What part exactly
do you disagree with?  That cursors can't be declared without beginning a
transaction first?  That cursors present a snapshot view of their result
set?  That that amounts to an effective isolation level of "serializable?"


>> The number of users has nothing to do with the matter
>
> It does.

No.  Number of concurrent sessions, yes, assuming you're working at "read
committed" isolation level--and even then not for cursors, as I said.


>> if that were a real
>> concern, you'd be using a serializable transaction anyway, so you
>> wouldn't
>> have to worry about it even if cursors did behave as "read committed."
>
> I don't want to be forced to use serializable transaction mode, just
> because I want to know the
> progress of an SQL command.

And that's not what I said.  I said that if you have concerns over the
consistency of your view of the database in the presence of other
concurrent sessions, you'd be using "serializable" transactions in the
first place--*regardless* of whether you want to report progress.


Jeroen




Re: Progress of asynchronous queries

From
Adriaan van Os
Date:
Jeroen T. Vermeulen wrote:
> On Sun, September 17, 2006 17:22, Adriaan van Os wrote:
> 
>>> But you'd be doing this in a transaction anyway: you can't declare a
>>> cursor without starting a transaction first.  Yes, you could deliberately
>>> declare "WITH HOLD" and keep using your cursor after committing or
>>> aborting
>>> the transaction.  But even then, so far as I know, the cursor presents a
>>> snapshot view of its result set so you get an effective isolation level
>>> of
>>> "serializable" even then.
>> No, carefully read
>> <http://www.postgresql.org/docs/8.1/static/transaction-iso.html>.
> 
> I'm familiar with the document, thank you, but if you're not prepared to
> give any detail beyond "no" then I remain unconvinced.  What part exactly
> do you disagree with?  That cursors can't be declared without beginning a
> transaction first?  That cursors present a snapshot view of their result
> set?  That that amounts to an effective isolation level of "serializable?"

I now read that Postgres only supports FOR READ ONLY cursors. So, yes, for those cursors you are 
right and my remark doesn't apply.

Still, the issue I raised was about tracing the progress of a SQL command in general, not about the 
specific case of using a cursor instead.

Regards,

Adriaan van Os


Re: Progress of asynchronous queries

From
"Jeroen T. Vermeulen"
Date:
On Mon, September 18, 2006 14:51, Adriaan van Os wrote:

> Still, the issue I raised was about tracing the progress of a SQL command
> in general, not about the
> specific case of using a cursor instead.

True, and dealing with cursors is extra work in libpq.

In the C++ API OTOH there is a class that lets you treat a cursor more or
less as a regular query result.  It's set to be replaced with a more
modern version, but it's usable: chunks of result data are fetched on
demand, i.e. as you access them.  If you had something like that, all
you'd have to do is issue your query as before, process your data as
before, and update your progress display after every n rows of processing.
You wouldn't really notice any of the complexity underneath.
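
Lacking that class on the C side, a rough equivalent can be sketched directly on top of libpq; the wrapper and callback names below are made up purely for illustration:

#include <stdio.h>
#include <libpq-fe.h>

/* Hypothetical wrapper: run the query through a cursor, fetch chunks on
 * demand, and call a progress callback every "step" rows, so the caller
 * never sees the cursor underneath. */
typedef void (*progress_fn)(long rows_done);

static void run_with_progress(PGconn *conn, const char *query,
                              long step, progress_fn report)
{
    char decl[1024];
    long done = 0, since_report = 0;

    PQclear(PQexec(conn, "BEGIN"));
    snprintf(decl, sizeof decl, "DECLARE c CURSOR FOR %s", query);
    PQclear(PQexec(conn, decl));

    for (;;)
    {
        PGresult *res = PQexec(conn, "FETCH 1000 FROM c");   /* one chunk */
        int n = PQntuples(res);
        /* ... hand the rows in res to the application here, as before ... */
        PQclear(res);
        if (n == 0)
            break;                        /* no more rows */

        done += n;
        since_report += n;
        if (since_report >= step)         /* update the display every n rows */
        {
            report(done);
            since_report = 0;
        }
    }

    PQclear(PQexec(conn, "CLOSE c"));
    PQclear(PQexec(conn, "COMMIT"));
}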
 


Jeroen