Thread: Long running update

Long running update

From
Akash Garg
Date:
Hello,
I currently running an update statement that updates every row in a
very large table.  This query will obviously take a long time to run.
My question -- is there any way to know how much time it will take
once it starts?  Even something that could help me approximate the
speed at which it's doing the update would be helpful.

Thanks,
Akash

Re: Long running update

From
Roman Neuhauser
Date:
# akash.garg@gmail.com / 2005-08-11 16:17:09 -0700:
> Hello,
> I currently running an update statement that updates every row in a
> very large table.  This query will obviously take a long time to run.
> My question -- is there any way to know how much time it will take
> once it starts?  Even something that could help me approximate the
> speed at which it's doing the update would be helpful.

    Yeah, it would be nice if select, insert, update, delete could
    be set to log their progress at configured intervals.

--
How many Vietnam vets does it take to screw in a light bulb?
You don't know, man.  You don't KNOW.
Cause you weren't THERE.             http://bash.org/?255991

Re: Long running update

From
Tom Lane
Date:
Akash Garg <akash.garg@gmail.com> writes:
> I currently running an update statement that updates every row in a
> very large table.  This query will obviously take a long time to run.
> My question -- is there any way to know how much time it will take
> once it starts?  Even something that could help me approximate the
> speed at which it's doing the update would be helpful.

If it's doing a seqscan then strace'ing the backend and noting which
blocks it's reading would give you a good hint.  If it's an indexscan
there's probably no very easy way to tell ...

            regards, tom lane

Re: Long running update

From
Greg Stark
Date:
Roman Neuhauser <neuhauser@sigpipe.cz> writes:

>     Yeah, it would be nice if select, insert, update, delete could
>     be set to log their progress at configured intervals.

One feature that would be handy for doing this would be if Postgres supported
READ DIRTY. Few databases support it, and Postgres currently doesn't, but it
would be extremely handy for peeking to see how much progress an update,
delete, or insert has made.

--
greg

Re: Long running update

From
Tom Lane
Date:
Greg Stark <gsstark@mit.edu> writes:
> One feature that would be handy for doing this would be if Postgres supported
> READ DIRTY. Few databases support it, and Postgres currently doesn't, but it
> would be extremely handy for peeking to see how much progress an update,
> delete, or insert has made.

Would it?  My first thought on reading the OP was to recommend
contrib/pgstattuple, but I refrained after thinking that if the table
is all that big, the last thing you need is someone doing a seqscan of
the whole table to see where you are.  Much less doing so repeatedly.

            regards, tom lane

Re: Long running update

From
Greg Stark
Date:
Tom Lane <tgl@sss.pgh.pa.us> writes:

> Would it?  My first thought on reading the OP was to recommend
> contrib/pgstattuple, but I refrained after thinking that if the table
> is all that big, the last thing you need is someone doing a seqscan of
> the whole table to see where you are.  Much less doing so repeatedly.

Well it's not a magic bullet, but there are lots of circumstances where it
would be useful.

I used to load huge data sets into Oracle from text files. These are purely
insert only, reading the table contents isn't really going to interfere much
with the load operation.

Sometimes I do large queries that seem to take longer than I expect. It would
be useful to be able to look at the data to be sure it isn't off doing
something entirely inappropriate like an unconstrained join or a join on the
wrong columns.

The application where it would be really invaluable is when you want a web
page that displays a progress bar or monitoring software for some running job
(triggered either from the web application or a cron job). Normally you
wouldn't be able to do this unless you were willing to have the job commit
periodically or jury rig up something effectively equivalent to READ DIRTY
involving temporary files or extra database connections. Any progress
information the job has available to it isn't available to the web
application or monitoring software.

I would say this doesn't replace having some sort of progress indication from
SQL queries. It's an entirely separate feature that happens to be useful for
some of the same uses. There are certainly pros and cons of each approach for
various use cases.

--
greg