Re: pg_dump restore time and Foreign Keys - Mailing list pgsql-hackers

From Simon Riggs
Subject Re: pg_dump restore time and Foreign Keys
Date
Msg-id 1212672470.19964.71.camel@ebony.site
Whole thread Raw
In response to Re: pg_dump restore time and Foreign Keys  ("Heikki Linnakangas" <heikki@enterprisedb.com>)
Responses Re: pg_dump restore time and Foreign Keys  ("Heikki Linnakangas" <heikki@enterprisedb.com>)
List pgsql-hackers
On Thu, 2008-06-05 at 16:01 +0300, Heikki Linnakangas wrote:
> Simon Riggs wrote:
> > On Thu, 2008-06-05 at 10:19 +0300, Heikki Linnakangas wrote:
> >> Simon Riggs wrote:
> >>> I'm guessing that the WITHOUT CHECK option would not be acceptable as an
> >>> unprotected trap for our lazy and wicked users. :-)
> >> Yes, that sounds scary.
> >>
> >> Instead, I'd suggest finding ways to speed up the ALTER TABLE ADD 
> >> FOREIGN KEY. 
> > 
> > I managed a suggestion for improving it for integers only, but if
> > anybody has any other ideas, I'm all ears. 
> 
> Well, one idea would be to allow adding multiple foreign keys in one 
> command, and checking them all at once with one SQL query instead of one 
> per foreign key. Right now we need one seq scan over the table per 
> foreign key, by checking all references at once we would only need one 
> seq scan to check them all.

No need. Just parallelise the restore with concurrent psql. Which would
speed up the index creation also. Does Greg have plans for further work?

> >> Or speeding up COPY into a table with foreign keys already 
> >> defined. For example, you might want to build an in-memory hash table of 
> >> the keys in the target table, instead of issuing a query on each INSERT, 
> >> if the target table isn't huge.
> > 
> > No, that's not the problem, but I agree that is a problem also.
> 
> It is related, because if we can make COPY into a table with foreign 
> keys fast enough, we could rearrange dumps so that foreign keys are 
> created before loading data. That would save the seqscan over the table 
> altogether.

True.

> Thinking about this idea a bit more, instead of loading the whole target 
> table into memory, it would probably make more sense to keep a hash 
> table as just a cache of the most recent keys that have been referenced.

If you can think of a way of improving hash joins generally, then it
will work for this specific case also.

-- Simon Riggs           www.2ndQuadrant.comPostgreSQL Training, Services and Support



pgsql-hackers by date:

Previous
From: "Heikki Linnakangas"
Date:
Subject: Re: pg_dump restore time and Foreign Keys
Next
From: "Heikki Linnakangas"
Date:
Subject: Re: pg_dump restore time and Foreign Keys