Re: pg_dump restore time and Foreign Keys - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: pg_dump restore time and Foreign Keys
Date
Msg-id 4847EB8E.20600@enterprisedb.com
In response to Re: pg_dump restore time and Foreign Keys  (Simon Riggs <simon@2ndquadrant.com>)
List pgsql-hackers
Simon Riggs wrote:
> On Thu, 2008-06-05 at 16:01 +0300, Heikki Linnakangas wrote:
>> Well, one idea would be to allow adding multiple foreign keys in one 
>> command, and checking them all at once with one SQL query instead of one 
>> per foreign key. Right now we need one seq scan over the table per 
>> foreign key; by checking all references at once we would only need one 
>> seq scan to check them all.
> 
> No need. Just parallelise the restore with concurrent psql. Which would
> speed up the index creation also.

True, you could do that.
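
For illustration, the single-scan idea quoted above can be sketched in 
Python. This is a toy model, not backend code: the function name, the 
row-as-dict layout, and the in-memory key sets are all assumptions made 
for the example. It only shows that one pass over the referencing rows 
can test every foreign key at once, instead of one pass per key.

```python
def check_fks_one_pass(rows, fks):
    """Check several single-column foreign keys in one pass over `rows`.

    rows: iterable of dicts (hypothetical stand-in for the referencing table)
    fks:  list of (column_name, set_of_referenced_keys) pairs (hypothetical)
    Returns a list of (row_index, column_name, offending_value).
    """
    violations = []
    for i, row in enumerate(rows):          # the single "seq scan"
        for col, referenced in fks:         # every FK is tested per row
            val = row[col]
            # A NULL referencing value satisfies a single-column FK.
            if val is not None and val not in referenced:
                violations.append((i, col, val))
    return violations


orders = [
    {"cust": 1, "prod": 10},
    {"cust": 2, "prod": 99},     # 99 has no matching product
    {"cust": None, "prod": 10},  # NULL customer is allowed
]
result = check_fks_one_pass(orders, [("cust", {1, 2}), ("prod", {10})])
```

The per-key alternative would loop over `orders` once for each entry 
in the foreign key list, which is the cost the proposal tries to avoid.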

>  Does Greg have plans for further work?

I believe he's busy with other stuff at the moment.

>> Thinking about this idea a bit more, instead of loading the whole target 
>> table into memory, it would probably make more sense to keep a hash 
>> table as just a cache of the most recent keys that have been referenced.
> 
> If you can think of a way of improving hash joins generally, then it
> will work for this specific case also.

Individual RI checks performed on inserts/COPY don't do a hash join. The 
bulk check done by ALTER TABLE ADD FOREIGN KEY does, but that's a 
different issue.

This hash table would be a specific trick to speed up RI checks. If 
you're I/O bound anyway, it wouldn't help, and you'd already be better 
off creating the foreign key first and loading the data after that.
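
A minimal sketch of such a cache, in Python for illustration only. The 
class name, the `lookup` callable (standing in for the per-row probe of 
the referenced table), and the least-recently-used eviction policy are 
assumptions for the example, not anything that exists in the backend. 
The sketch also ignores row locking and concurrent deletes of 
referenced rows, which a real implementation would have to handle.

```python
from collections import OrderedDict


class RecentKeyCache:
    """Bounded cache of referenced keys that were recently found to exist."""

    def __init__(self, lookup, capacity=1024):
        self._lookup = lookup        # hypothetical probe: key -> bool
        self._capacity = capacity
        self._seen = OrderedDict()   # key -> None, oldest entry first
        self.probes = 0              # number of probes that missed the cache

    def exists(self, key):
        if key in self._seen:
            self._seen.move_to_end(key)     # mark as most recently used
            return True
        self.probes += 1
        if not self._lookup(key):
            return False                    # missing keys are not cached
        self._seen[key] = None
        if len(self._seen) > self._capacity:
            self._seen.popitem(last=False)  # evict least recently used key
        return True


parent_keys = {1, 2, 3}
cache = RecentKeyCache(parent_keys.__contains__, capacity=2)
results = [cache.exists(k) for k in [1, 1, 2, 1, 3, 2, 4]]
```

With a capacity of 2, the seven checks above need five probes of the 
referenced table rather than seven. The saving depends entirely on how 
often the incoming rows repeat recently seen keys.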

--   Heikki Linnakangas  EnterpriseDB   http://www.enterprisedb.com

