Re: [Proposal] Progress bar for pg_dump/pg_restore - Mailing list pgsql-hackers

From Craig Ringer
Subject Re: [Proposal] Progress bar for pg_dump/pg_restore
Date
Msg-id CAMsr+YEwOVwVoibmCAEnJc5J_0gi9fD0kzqfVNZMZObcozdKtQ@mail.gmail.com
Whole thread Raw
In response to Re: [Proposal] Progress bar for pg_dump/pg_restore  (Taiki Kondo <tai-kondo@yk.jp.nec.com>)
Responses Re: [Proposal] Progress bar for pg_dump/pg_restore  (Jim Nasby <Jim.Nasby@BlueTreble.com>)
List pgsql-hackers
On 19 June 2015 at 16:45, Taiki Kondo <tai-kondo@yk.jp.nec.com> wrote:
> Hi, andres
>
> Thank you for your comment, and sorry for late response.
>
>> The question is how to actually get useful estimates. As there's no
>> progress report for indvidiual COPY and CREATE INDEX commands you'll, in
>> many cases, have very irregular progress updates. In many many cases
>> most of the time is spent on a very small subset of the total objects.
>
> When dumping, I think number of tuples can be got from pg_class.reltuples, therefore I want pg_dump to run "select
reltuples"to get it, and then pg_dump will calculate estimated time to execute "COPY FROM" command in getting each
tuples.

It'd need to be a bit smarter than that, since it'd have to take some
account of average tuple size, etc, but it's an interesting idea to
use the stats to guestimate copy times.

> For restoring, I think it's better to record above information (number of tuples) into pg_dump file to estimate time
torestore tables.
 

Since we generally suggest that people use a pg_dump and pg_restore
from the server version they're going to be restoring to, that should
be OK. It'd create some new entries in the pg_restore file manifest
that older pg_restore versions wouldn't understand.

> And, I also understood your concern about "CREATE INDEX", but we have no way to get progress information of "CREATE
INDEX".
> At present, I think it may be good to refer to the same time as estimated time to execute "COPY TO".

You could probably get a handwave-quality estimate by looking at the
average column widths for the cols included in the index plus the
number of tuples in the table. It'd be useless for expression indexes,
partial indexes, etc, but you can't have everything...

Interesting idea to explore.



pgsql-hackers by date:

Previous
From: Craig Ringer
Date:
Subject: Re: Is Postgres database server works fine if there is a change in system time?
Next
From: Craig Ringer
Date:
Subject: Re: proposal: row_to_array function