Thread: Progress bar updates

Progress bar updates

From
Gregory Stark
Date:
Has anyone looked thought about what it would take to get progress bars from
clients like pgadmin? (Or dare I even suggest psql:)

My first thought would be a message like CancelQuery which would cause the
backend to peek into a static data structure and return a message that the
client could parse and display something intelligent. Various commands would
then stuff information into this data structure as they worked.

For a first cut this "data structure" could just be a float between 0 and 1.
Or perhaps it should be two integers, a "current" and an "estimated final".
That would let the client do more intelligent things when the estimates change
for the length of the whole job.

Later I could imagine elaborating into more complex structures for
representing multi-step processes or even whole query plans. I also see it
possibly being interesting to stuff this data structure into shared memory
handled just like how Tom handled the "current command". That would let you
see the other queries running on the server, how long they've been running,
and estimates for how long they'll continue to run.

I would suggest starting with utility functions like index builds or COPY
which would have to be specially handled anyways. Handling all optimizable
queries in a single generic implementation seems like something to tackle only
once the basic infrastructure is there and working for simple cases.

Of course the estimates would be not much better than guesses. But if you want
to say it's not worth having since they won't be perfectly accurate be
prepared to swear that you've never looked at the "% complete" that modern ftp
clients and web browsers display even though they too are, of course, widely
inaccurate. They nonetheless provide some feedback the user desperately wants
to be reassured that his job is making progress and isn't years away from
finishing.

--  Gregory Stark EnterpriseDB          http://www.enterprisedb.com



Re: Progress bar updates

From
"Luke Lonergan"
Date:
+1

Looked, talked it through, it's a very important feature to BI users.


- Luke

Sent from my GoodLink synchronized handheld (www.good.com)

-----Original Message-----
From:     Gregory Stark [mailto:gsstark@mit.edu]
Sent:    Tuesday, July 18, 2006 02:38 PM Eastern Standard Time
To:    pgsql-hackers@postgresql.org
Subject:    [HACKERS] Progress bar updates


Has anyone looked thought about what it would take to get progress bars from
clients like pgadmin? (Or dare I even suggest psql:)

My first thought would be a message like CancelQuery which would cause the
backend to peek into a static data structure and return a message that the
client could parse and display something intelligent. Various commands would
then stuff information into this data structure as they worked.

For a first cut this "data structure" could just be a float between 0 and 1.
Or perhaps it should be two integers, a "current" and an "estimated final".
That would let the client do more intelligent things when the estimates change
for the length of the whole job.

Later I could imagine elaborating into more complex structures for
representing multi-step processes or even whole query plans. I also see it
possibly being interesting to stuff this data structure into shared memory
handled just like how Tom handled the "current command". That would let you
see the other queries running on the server, how long they've been running,
and estimates for how long they'll continue to run.

I would suggest starting with utility functions like index builds or COPY
which would have to be specially handled anyways. Handling all optimizable
queries in a single generic implementation seems like something to tackle only
once the basic infrastructure is there and working for simple cases.

Of course the estimates would be not much better than guesses. But if you want
to say it's not worth having since they won't be perfectly accurate be
prepared to swear that you've never looked at the "% complete" that modern ftp
clients and web browsers display even though they too are, of course, widely
inaccurate. They nonetheless provide some feedback the user desperately wants
to be reassured that his job is making progress and isn't years away from
finishing.

--  Gregory Stark EnterpriseDB          http://www.enterprisedb.com


---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?
              http://www.postgresql.org/docs/faq




Re: Progress bar updates

From
"Dave Page"
Date:

> -----Original Message-----
> From: pgsql-hackers-owner@postgresql.org
> [mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Gregory Stark
> Sent: 18 July 2006 19:36
> To: pgsql-hackers@postgresql.org
> Subject: [HACKERS] Progress bar updates
>
>
> For a first cut this "data structure" could just be a float
> between 0 and 1.
> Or perhaps it should be two integers, a "current" and an
> "estimated final".
> That would let the client do more intelligent things when the
> estimates change
> for the length of the whole job.

Hi Greg,

I would vote for the latter so that we could give more meaningful
feedback - for example, when vacuuming you might give a scale of 0 to
<num tables>. In cases such as COPY where you mightn't have any idea of
an upper bound, then a simple heartbeat could be supplied so at least
the client could count rows (or 100's of rows) processed or whatever.

It would certainly allow us to present a nicer user experience in
pgAdmin :-)

Regards, Dave.


Re: Progress bar updates

From
Andreas Pflug
Date:
Gregory Stark wrote:
> Has anyone looked thought about what it would take to get progress bars from
> clients like pgadmin? (Or dare I even suggest psql:)
>   

Some weeks ago I proposed a PROGRESS parameter for COPY, to enable 
progress feedback via notices. tgl thinks nobody needs that...

Regards,
Andreas



Re: Progress bar updates

From
Neil Conway
Date:
On Tue, 2006-07-18 at 14:35 -0400, Gregory Stark wrote:
> My first thought would be a message like CancelQuery which would cause the
> backend to peek into a static data structure and return a message that the
> client could parse and display something intelligent.

I'm not quite sure what you're suggesting; presumably you'd need to open
another client connection to send the "status report" message to a
backend (since a backend will not be polling its input socket during
query execution). That just seems like the wrong approach -- stashing a
backend's current status into shared memory sounds more promising, IMHO,
and won't require changes to the FE/BE protocol.

> I would suggest starting with utility functions like index builds or COPY
> which would have to be specially handled anyways. Handling all optimizable
> queries in a single generic implementation seems like something to tackle only
> once the basic infrastructure is there and working for simple cases.
> 
> Of course the estimates would be not much better than guesses.

Estimating query progress for DDL should be reasonably doable, but I
think it would require some hard thought to get even somewhat accurate
estimates for SELECT queries -- and I'm not sure there's much point
doing this if we don't at least have an idea how we might implement
reasonably accurate progress reporting for every kind of query.

This paper is worth a read: 

Gang Luo, Jeffrey F.Naughton, Curt Ellmann and Michael Watzke:  Toward a
Progress Indicator for Database Queries. SIGMOD Conference 2004:
791-802.

Interestingly, they apparently implemented a prototype using PostgreSQL.

-Neil




Re: Progress bar updates

From
Tom Lane
Date:
Neil Conway <neilc@samurai.com> writes:
> I'm not quite sure what you're suggesting; presumably you'd need to open
> another client connection to send the "status report" message to a
> backend (since a backend will not be polling its input socket during
> query execution). That just seems like the wrong approach -- stashing a
> backend's current status into shared memory sounds more promising, IMHO,
> and won't require changes to the FE/BE protocol.

Yeah, I was about to make the same comment.  The new support for query
status in shared memory should make it pretty cheap to update a progress
indicator there, and then it'd be trivial to expose the indicator to
other backends via pg_stat_activity.

Sending the progress info directly to the connected client implies
protocol changes (fairly trivial ones) and client changes (possibly
highly nontrivial ones --- think about how you'd get the info out
through something like a webserver application with multiple layers
of software in the way).  In practice, if a query is taking long
enough for this feature to be interesting, making another connection and
looking to see what's happening is not a problem, and it's likely to be
the most practical way anyway for many clients.
        regards, tom lane


Re: Progress bar updates

From
Josh Berkus
Date:
Andreas,

> Some weeks ago I proposed a PROGRESS parameter for COPY, to enable
> progress feedback via notices. tgl thinks nobody needs that...

Well, *Tom* doesn't need it.  What mechanism did you propose to make this 
work?

-- 
Josh Berkus
PostgreSQL @ Sun
San Francisco


Re: Progress bar updates

From
Greg Stark
Date:
Tom Lane <tgl@sss.pgh.pa.us> writes:

> Neil Conway <neilc@samurai.com> writes:
> > I'm not quite sure what you're suggesting; presumably you'd need to open
> > another client connection to send the "status report" message to a
> > backend (since a backend will not be polling its input socket during
> > query execution). That just seems like the wrong approach -- stashing a
> > backend's current status into shared memory sounds more promising, IMHO,
> > and won't require changes to the FE/BE protocol.
> 
> Yeah, I was about to make the same comment.  The new support for query
> status in shared memory should make it pretty cheap to update a progress
> indicator there, and then it'd be trivial to expose the indicator to
> other backends via pg_stat_activity.

I think that would be a fine feature too. But I don't think that reduces the
desire clients have to be able to request updates on the status of their own
queries.

> In practice, if a query is taking long enough for this feature to be
> interesting, making another connection and looking to see what's happening
> is not a problem, and it's likely to be the most practical way anyway for
> many clients.

It would be the most practical way for a DBA to monitor an application. But
it's not going to be convenient for clients like pgadmin or psql. Even a web
server may want to, for example, stream ajax code updating a progress bar
until it has results and then stream the ajax to display the results. Having
to get the backend pid before your query and then open a second database
connection to monitor your first connection would be extra footwork for
nothing.

-- 
greg



Re: Progress bar updates

From
Hannu Krosing
Date:
Ühel kenal päeval, K, 2006-07-19 kell 05:18, kirjutas Greg Stark:
> Tom Lane <tgl@sss.pgh.pa.us> writes:
> 
> > Neil Conway <neilc@samurai.com> writes:
> > > I'm not quite sure what you're suggesting; presumably you'd need to open
> > > another client connection to send the "status report" message to a
> > > backend (since a backend will not be polling its input socket during
> > > query execution). That just seems like the wrong approach -- stashing a
> > > backend's current status into shared memory sounds more promising, IMHO,
> > > and won't require changes to the FE/BE protocol.
> > 
> > Yeah, I was about to make the same comment.  The new support for query
> > status in shared memory should make it pretty cheap to update a progress
> > indicator there, and then it'd be trivial to expose the indicator to
> > other backends via pg_stat_activity.
> 
> I think that would be a fine feature too. But I don't think that reduces the
> desire clients have to be able to request updates on the status of their own
> queries.

another \x command could be added to psql to do just that

> > In practice, if a query is taking long enough for this feature to be
> > interesting, making another connection and looking to see what's happening
> > is not a problem, and it's likely to be the most practical way anyway for
> > many clients.
> 
> It would be the most practical way for a DBA to monitor an application. But
> it's not going to be convenient for clients like pgadmin or psql. Even a web
> server may want to, for example, stream ajax code updating a progress bar
> until it has results and then stream the ajax to display the results. Having
> to get the backend pid before your query and then open a second database
> connection to monitor your first connection would be extra footwork for
> nothing.

You would have to do some extra work anyway. opening another connection
is not such a big deal.

-- 
----------------
Hannu Krosing
Database Architect
Skype Technologies OÜ
Akadeemia tee 21 F, Tallinn, 12618, Estonia

Skype me:  callto:hkrosing
Get Skype for free:  http://www.skype.com




Re: Progress bar updates

From
"Dave Page"
Date:

> -----Original Message-----
> From: pgsql-hackers-owner@postgresql.org
> [mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Greg Stark
> Sent: 19 July 2006 10:19
> To: Tom Lane
> Cc: Neil Conway; Gregory Stark; pgsql-hackers@postgresql.org
> Subject: Re: [HACKERS] Progress bar updates
>
> It would be the most practical way for a DBA to monitor an
> application. But
> it's not going to be convenient for clients like pgadmin or
> psql. Even a web
> server may want to, for example, stream ajax code updating a
> progress bar
> until it has results and then stream the ajax to display the
> results. Having
> to get the backend pid before your query and then open a
> second database
> connection to monitor your first connection would be extra
> footwork for
> nothing.

No to mention that we already get occasional complaints about the number
of connections pgAdmin can open (even though it's only one per database
for the main app, plus one per query tool or data editor window).

Regards, Dave.


Re: Progress bar updates

From
Andreas Pflug
Date:
Josh Berkus wrote:
> Andreas,
>
>   
>> Some weeks ago I proposed a PROGRESS parameter for COPY, to enable
>> progress feedback via notices. tgl thinks nobody needs that...
>>     
>
> Well, *Tom* doesn't need it.  What mechanism did you propose to make this 
> work?
>   
Extended the parser to accept that keyword, and emit notices when n 
lines were copied. I found that convenient when transferring a large 
amount of data, to estimate total runtime.
Patch was submitted a while ago to -hackers, together with compression 
that was torn down in a way not suitable to inspire me to continue.

Regards,
Andreas

Regards,
Andreas



Re: Progress bar updates

From
Tom Lane
Date:
Greg Stark <gsstark@mit.edu> writes:
> Tom Lane <tgl@sss.pgh.pa.us> writes:
>> In practice, if a query is taking long enough for this feature to be
>> interesting, making another connection and looking to see what's happening
>> is not a problem, and it's likely to be the most practical way anyway for
>> many clients.

> It would be the most practical way for a DBA to monitor an application. But
> it's not going to be convenient for clients like pgadmin or psql.

[ shrug... ]  Let me explain it to you this way: a progress counter
visible through pg_stat_activity is something that might possibly get
done in time for 8.2.  If you insist on having the other stuff right
off the bat as well, it won't get done this cycle.
        regards, tom lane


Re: Progress bar updates

From
Darcy Buskermolen
Date:
On Wednesday 19 July 2006 07:33, Tom Lane wrote:
> Greg Stark <gsstark@mit.edu> writes:
> > Tom Lane <tgl@sss.pgh.pa.us> writes:
> >> In practice, if a query is taking long enough for this feature to be
> >> interesting, making another connection and looking to see what's
> >> happening is not a problem, and it's likely to be the most practical way
> >> anyway for many clients.
> >
> > It would be the most practical way for a DBA to monitor an application.
> > But it's not going to be convenient for clients like pgadmin or psql.
>
> [ shrug... ]  Let me explain it to you this way: a progress counter
> visible through pg_stat_activity is something that might possibly get
> done in time for 8.2.  If you insist on having the other stuff right
> off the bat as well, it won't get done this cycle.

Having the progress, or estimated time of completion in pg_stat_activity 
sounds like a good starting point, the rest of the desired features can be 
bolted on top of this down the road

>
>             regards, tom lane
>
> ---------------------------(end of broadcast)---------------------------
> TIP 3: Have you checked our extensive FAQ?
>
>                http://www.postgresql.org/docs/faq

-- 
Darcy Buskermolen
Wavefire Technologies Corp.

http://www.wavefire.com
ph: 250.717.0200
fx: 250.763.1759


Re: Progress bar updates

From
"Andrew Hammond"
Date:
Neil Conway wrote:
> > I would suggest starting with utility functions like index builds or COPY
> > which would have to be specially handled anyways. Handling all optimizable
> > queries in a single generic implementation seems like something to tackle only
> > once the basic infrastructure is there and working for simple cases.
> >
> > Of course the estimates would be not much better than guesses.
>
> Estimating query progress for DDL should be reasonably doable, but I
> think it would require some hard thought to get even somewhat accurate
> estimates for SELECT queries -- and I'm not sure there's much point
> doing this if we don't at least have an idea how we might implement
> reasonably accurate progress reporting for every kind of query.

We already have EXPLAIN ANALYZE. Perhaps the right way to do this is
something that provides similar output. I could see something that
looks like EXPLAIN for the parts that have not yet executed, something
reasonable to show progress of the currently active part of the plan
(current time, rows, loops), and EXPLAIN ANALYZE output for the parts
which have been completed.

I can see how this might lead to dynamically re-planning queries. Going
backwards, perhaps there's something related to progress monitoring
that could be taken from the TelegraphCQ work?

Drew



Re: Progress bar updates

From
Christopher Kings-Lynne
Date:
> It would be the most practical way for a DBA to monitor an application. But
> it's not going to be convenient for clients like pgadmin or psql. Even a web
> server may want to, for example, stream ajax code updating a progress bar
> until it has results and then stream the ajax to display the results. Having
> to get the backend pid before your query and then open a second database
> connection to monitor your first connection would be extra footwork for
> nothing.

But that said, it CAN be coded and work just fine no?



Re: Progress bar updates

From
Agent M
Date:
Why make it so complicated?

There could be a guc to indicate that the client is interested in
progress updates. For the execution phase, elog(INFO,...) could be
emitted for each major plan node. (The client would probably run the
explain plan beforehand or it would be embedded in the elog).

During the downloading of the rows, the client would display the bar
relative to the number of estimated rows returned.

-M

On Jul 18, 2006, at 2:35 PM, Gregory Stark wrote:

>
> Has anyone looked thought about what it would take to get progress
> bars from
> clients like pgadmin? (Or dare I even suggest psql:)
>
> My first thought would be a message like CancelQuery which would cause
> the
> backend to peek into a static data structure and return a message that
> the
> client could parse and display something intelligent. Various commands
> would
> then stuff information into this data structure as they worked.

¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬
AgentM
agentm@themactionfaction.com
¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬ ¬



Re: Progress bar updates

From
Csaba Nagy
Date:
> We already have EXPLAIN ANALYZE. Perhaps the right way to do this is
> something that provides similar output. I could see something that
> looks like EXPLAIN for the parts that have not yet executed, something
> reasonable to show progress of the currently active part of the plan
> (current time, rows, loops), and EXPLAIN ANALYZE output for the parts
> which have been completed.

Now this is something that would really help testing a system, by
dynamically seeing the plans of queries which run too long. That
combined with the ability to see the values of bind parameters would be
a useful debug aid.

Cheers,
Csaba.




Re: Progress bar updates

From
"Luke Lonergan"
Date:
This is how netezza and teradata do it and it works very well.

In each of their cases you can see a graphical representation of the plan with progress for each stage.

For the command line it would be great to just dump the current status, which would provide a snapshot of the explain
analyze.


- Luke

Sent from my GoodLink synchronized handheld (www.good.com)

-----Original Message-----
From:     Csaba Nagy [mailto:nagy@ecircle-ag.com]
Sent:    Thursday, July 20, 2006 04:52 AM Eastern Standard Time
To:    Andrew Hammond
Cc:    postgres hackers
Subject:    Re: [HACKERS] Progress bar updates

> We already have EXPLAIN ANALYZE. Perhaps the right way to do this is
> something that provides similar output. I could see something that
> looks like EXPLAIN for the parts that have not yet executed, something
> reasonable to show progress of the currently active part of the plan
> (current time, rows, loops), and EXPLAIN ANALYZE output for the parts
> which have been completed.

Now this is something that would really help testing a system, by
dynamically seeing the plans of queries which run too long. That
combined with the ability to see the values of bind parameters would be
a useful debug aid.

Cheers,
Csaba.



---------------------------(end of broadcast)---------------------------
TIP 4: Have you searched our list archives?
              http://archives.postgresql.org