Thread: pg_restore oddity?

pg_restore oddity?

From
Mario Weilguni
Date:
There's a IMO a problem with pg_restore, it should be easy to fix (I 
hope - and I could try to fix it and send a patch).
* I've a dump taken from a 8.1 database
* I'm using gist and ltree
* I'm restoring to a 8.2 database

Problem:
I cannot use "-1" for performance, because some gist stuff has changed 
and the restore fails. But there seems to be no option for pg_restore to 
use transactions for data restore, so it's very very slow (one million 
records, each obviously in it's own transaction - because a separate 
session "select count(1) from logins" shows a growing number).

It would be nice to use transactions for the data stuff itself, but not 
for schema changes or functions. I know I can use separate pg_restore 
runs for schema and data, but it's complicated IMHO.

I see several options:
* Use transactions for data, maybe with a separate command line option
* Use transactions everytime, and place savepoints to recover from errors?

Any ideas what I could do?

Regards
Mario



Re: pg_restore oddity?

From
"Heikki Linnakangas"
Date:
Mario Weilguni wrote:
> I cannot use "-1" for performance, because some gist stuff has changed
> and the restore fails. But there seems to be no option for pg_restore to
> use transactions for data restore, so it's very very slow (one million
> records, each obviously in it's own transaction - because a separate
> session "select count(1) from logins" shows a growing number).

By default, pg_dump/pg_restore uses a COPY command for each table, and
each COPY executes as a single transaction, so you shouldn't see the row
count growing like that. Is the dump file in --inserts format?

> It would be nice to use transactions for the data stuff itself, but not
> for schema changes or functions. I know I can use separate pg_restore
> runs for schema and data, but it's complicated IMHO.

pg_restore -s foo
pg_restore -a -1 foo

doesn't seem too complicated to me. Am I missing something?

--  Heikki Linnakangas EnterpriseDB   http://www.enterprisedb.com


Re: pg_restore oddity?

From
Mario Weilguni
Date:
Heikki Linnakangas schrieb:
> Mario Weilguni wrote:
>   
>> I cannot use "-1" for performance, because some gist stuff has changed
>> and the restore fails. But there seems to be no option for pg_restore to
>> use transactions for data restore, so it's very very slow (one million
>> records, each obviously in it's own transaction - because a separate
>> session "select count(1) from logins" shows a growing number).
>>     
>
> By default, pg_dump/pg_restore uses a COPY command for each table, and
> each COPY executes as a single transaction, so you shouldn't see the row
> count growing like that. Is the dump file in --inserts format?
>
>   

You are right, it was my fault. I was confused about the pg_dump syntax, 
and used "-d" (the "-d" because pg_restore needs it for the destination 
database, not the dump itself), so it was using "--inserts".

Everything is working fine. I've done dump/restores cycles a hundreds 
times, and now such a mistake. I can't believe it.
Seems like I need to take some vacations.

Thanks for the help!




Re: pg_restore oddity?

From
"Florian G. Pflug"
Date:
Heikki Linnakangas wrote:
> Mario Weilguni wrote:
>> I cannot use "-1" for performance, because some gist stuff has changed
>> and the restore fails. But there seems to be no option for pg_restore to
>> use transactions for data restore, so it's very very slow (one million
>> records, each obviously in it's own transaction - because a separate
>> session "select count(1) from logins" shows a growing number).
> 
> By default, pg_dump/pg_restore uses a COPY command for each table, and
> each COPY executes as a single transaction, so you shouldn't see the row
> count growing like that. Is the dump file in --inserts format?
> 
>> It would be nice to use transactions for the data stuff itself, but not
>> for schema changes or functions. I know I can use separate pg_restore
>> runs for schema and data, but it's complicated IMHO.
> 
> pg_restore -s foo
> pg_restore -a -1 foo
> 
> doesn't seem too complicated to me. Am I missing something?

Doesn't pg_restore create the indices *after* loading the data if you let it 
restore the schema *and* the data in one step? The above workaround would 
disable that optimization, thereby making the data-restore phase much more costly.

Now that I think about it, I remember that I've often whished that we not only 
had --schema-only and --data-only, but also --schema-unconstrained-only and 
--constraints-only.

regards, Florian Pflug