Thread: Adding foreign key performance
I recalled seeing a thread on -HACKERS about some major improvements to the speed of adding an FK to an existing table in7.4. Naturally I was curious and decided to give it a whirl. My findings are not too good. In fact, they are bad. Could it be this patch never made it in? Anyway, here's the info. Machine: Linux 2.4.18 [stock rh8], p3 500, 512mb, 4x18GB scsi raid 0 Two tables: members and watchedmembers with 1045720 and 829994 rows respectivly. freshly vacuum analyze'd for each PG: 7.4b4, 10k shared buff, 256mb effective cache: 485706ms 7.3.4 [same settings]: 412304.76 ms Now the odd thing during that operation was that the machine was about oh, 50-70% _idle_ during the whole time. Then I started thinking more about it and realized hearing if you bump sort_mem up ridiculously high during a foreign keyadd it helps. So I did. Bumped it up to 256MB. [again, vacuum analyze'd each beforehand] 7.3.4: 328912ms [cpu pegged] 7.4b4: 298383ms [cpu pegged] Quite an improvement I'd say. Perhaps we should make note of this somewhere? Performance guide? Docs? And this leads to the place we'd get a huge benefit: Restoring backups.. If there were some way to bump up sort_mem whiledoing the restore.. things would be much more pleasant. [Although, even better would be to disable FK stuff while restoringa backup and assume the backup is "sane"] How we'd go about doing that is the subject of much debate. Perhaps add the functionality to pg_restore? ie, pg_restore -s 256MB mybackup.db? It would just end up issuing a set sort_mem=256000.. What do you guys think? -- Jeff Trout <jeff@jefftrout.com> http://www.jefftrout.com/ http://www.stuarthamm.net/
On Tue, 28 Oct 2003, Jeff wrote: > I recalled seeing a thread on -HACKERS about some major improvements to > the speed of adding an FK to an existing table in 7.4. Naturally I was > curious and decided to give it a whirl. My findings are not too good. In > fact, they are bad. > > Could it be this patch never made it in? I think it went in between b4 and b5, can you try with b5?
Jeff <threshar@torgo.978.org> writes: > I recalled seeing a thread on -HACKERS about some major improvements to the speed of adding an FK to an existing tablein 7.4. Naturally I was curious and decided to give it a whirl. My findings are not too good. In fact, they are bad. > 7.4b4, 10k shared buff, 256mb effective cache: 485706ms You are testing the wrong version. beta5 has the ADD FOREIGN KEY improvement. regards, tom lane
On Tue, 28 Oct 2003 09:16:45 -0500 Jeff <threshar@torgo.978.org> wrote: > 7.3.4: 328912ms [cpu pegged] > 7.4b4: 298383ms [cpu pegged] > Just loaded up delicious 7.4b5 and wow... sort_mem 8192: 137038ms [lots of tmp file activity] sort_mem 256000: 83109ms That's some good work there Lou, You'll make sargent for that someday. -- Jeff Trout <jeff@jefftrout.com> http://www.jefftrout.com/ http://www.stuarthamm.net/
FWIW: I'm fiddling with that right now, and the FK think was quick... a few seconds... the tables in question have 1400 records, 343000 records and 7200 records... I'm running Beta5... John. On Tuesday 28 October 2003 10:21, Stephan Szabo wrote: > On Tue, 28 Oct 2003, Jeff wrote: > > I recalled seeing a thread on -HACKERS about some major improvements to > > the speed of adding an FK to an existing table in 7.4. Naturally I was > > curious and decided to give it a whirl. My findings are not too good. In > > fact, they are bad. > > > > Could it be this patch never made it in? > > I think it went in between b4 and b5, can you try with b5? > > > ---------------------------(end of broadcast)--------------------------- > TIP 5: Have you checked our extensive FAQ? > > http://www.postgresql.org/docs/faqs/FAQ.html
On Tue, 28 Oct 2003, Jeff wrote: > On Tue, 28 Oct 2003 09:16:45 -0500 > Jeff <threshar@torgo.978.org> wrote: > > > > 7.3.4: 328912ms [cpu pegged] > > 7.4b4: 298383ms [cpu pegged] > > > > Just loaded up delicious 7.4b5 and wow... > > sort_mem 8192: 137038ms [lots of tmp file activity] > sort_mem 256000: 83109ms Hmm, 298383 -> 83109 (since those are the 256k numbers). Not as much as I'd have hoped, but I'll take a factor of 3.
John K. Herreshoff wrote: > FWIW: I'm fiddling with that right now, and the FK think was quick... a few > seconds... the tables in question have 1400 records, 343000 records and 7200 > records... I'm running Beta5... Did those tables have analyze statistics? Can you try it without statistics (I think you have to drop the tables to erase the statistics). -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania 19073
On Tue, 28 Oct 2003 10:32:36 -0800 (PST) Stephan Szabo <sszabo@megazone.bigpanda.com> wrote: > Hmm, 298383 -> 83109 (since those are the 256k numbers). Not as > much as I'd have hoped, but I'll take a factor of 3. Yes. those are the numbers for 256MB of sort_mem. It seemed to saturate the IO so once I get more disks in here it should hopefully speed up. -- Jeff Trout <jeff@jefftrout.com> http://www.jefftrout.com/ http://www.stuarthamm.net/
I'm not sure about the analyze stats... Where would I find that (in postgresql.conf I suppose) I'll go see what I have set up, and get back to you in 30 minutes or less... John. On Tuesday 28 October 2003 13:34, Bruce Momjian wrote: > John K. Herreshoff wrote: > > FWIW: I'm fiddling with that right now, and the FK think was quick... a > > few seconds... the tables in question have 1400 records, 343000 records > > and 7200 records... I'm running Beta5... > > Did those tables have analyze statistics? Can you try it without > statistics (I think you have to drop the tables to erase the > statistics).
John K. Herreshoff wrote: > I'm not sure about the analyze stats... Where would I find that (in > postgresql.conf I suppose) I'll go see what I have set up, and get back to > you in 30 minutes or less... They are in pg_statistic. If you have ever anaylzed the table, there are stats. I am interested in the non-analyze case because that's how the data will load into a fresh db via pg_dump. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania 19073
That did not take long... about 13 minutes to reload the tables from an *.mdb file, and a second or two for each of the 'alter table foo add foreign key...' lines. I tried to drop a 'referencing' table, and the database would not let me, said that something depended on it ;o) Is there some way to name the foreign key so that it can be dropped later, or is there a way to drop the foreign key using information already in the database? John. On Tuesday 28 October 2003 13:34, Bruce Momjian wrote: > John K. Herreshoff wrote: > > FWIW: I'm fiddling with that right now, and the FK think was quick... a > > few seconds... the tables in question have 1400 records, 343000 records > > and 7200 records... I'm running Beta5... > > Did those tables have analyze statistics? Can you try it without > statistics (I think you have to drop the tables to erase the > statistics).
>>>>> "J" == Jeff <threshar@torgo.978.org> writes: J> And this leads to the place we'd get a huge benefit: Restoring J> backups.. If there were some way to bump up sort_mem while doing J> the restore.. things would be much more pleasant. [Although, even There was a rather substantial thread on this about the time when 7.4b1 was released. J> better would be to disable FK stuff while restoring a backup and J> assume the backup is "sane"] How we'd go about doing that is the J> subject of much debate. If you're restoring from a pg_dump -Fc (compressed dump) it already happens for you. The indexes and foreign keys are not added until the very end, from what I recall. J> Perhaps add the functionality to pg_restore? ie, pg_restore -s J> 256MB mybackup.db? It would just end up issuing a set J> sort_mem=256000.. This was essentially my proposal, though I had better speed enhancement by increasing the number of checkpoint buffers. -- =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= Vivek Khera, Ph.D. Khera Communications, Inc. Internet: khera@kciLink.com Rockville, MD +1-240-453-8497 AIM: vivekkhera Y!: vivek_khera http://www.khera.org/~vivek/
John K. Herreshoff wrote: > That did not take long... about 13 minutes to reload the tables from an *.mdb > file, and a second or two for each of the 'alter table foo add foreign > key...' lines. I tried to drop a 'referencing' table, and the database would > not let me, said that something depended on it ;o) > > Is there some way to name the foreign key so that it can be dropped later, or > is there a way to drop the foreign key using information already in the > database? You have to use ALTER TABLE DROP CONSTRAINT perhaps. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania 19073
On Tue, 28 Oct 2003 14:22:04 -0500 Vivek Khera <khera@kcilink.com> wrote: > If you're restoring from a pg_dump -Fc (compressed dump) it already > happens for you. The indexes and foreign keys are not added until the > very end, from what I recall. > This happens with regular dumps - at the end is a pile of alter table's that create the indices, FK's and triggers. Is the -Fc method different? -- Jeff Trout <jeff@jefftrout.com> http://www.jefftrout.com/ http://www.stuarthamm.net/
>>Just loaded up delicious 7.4b5 and wow... >> >>sort_mem 8192: 137038ms [lots of tmp file activity] >>sort_mem 256000: 83109ms > > > Hmm, 298383 -> 83109 (since those are the 256k numbers). Not as > much as I'd have hoped, but I'll take a factor of 3. Hi Jeff, Could you let us know the load times when you have done: 1. A full ANALYZE 2. A delete all from pg_statistic So we can see if ANALYZE stats make much difference? Chris
On Wed, 29 Oct 2003 09:47:28 +0800 Christopher Kings-Lynne <chriskl@familyhealth.com.au> wrote: > >>Just loaded up delicious 7.4b5 and wow... > >> > >>sort_mem 8192: 137038ms [lots of tmp file activity] > >>sort_mem 256000: 83109ms > > > 1. A full ANALYZE > 2. A delete all from pg_statistic > I had previously analyze'd before I ran those numbers. But I did it again with and without stats. With: Run 1 Time: 80157.21 ms Run 2 Time: 80763.59 ms Killed statistics: Time: 80571.71 ms Time: 80759.18 ms Chances are it is going to seq scan regardless so the stats are rather useless. Perhaps in other scenarios it would help. -- Jeff Trout <jeff@jefftrout.com> http://www.jefftrout.com/ http://www.stuarthamm.net/