Thread: Optimizing huge inserts/copy's
Hi all, Does anybody have any thoughts on optimizing a huge insert, involving something like 3 million records all at once? Should I drop my indices before doing the copy, and then create them after? I keep a tab-delimited file as a buffer, copy it, then do it again about 400 times. Each separate buffer is a few thousand records. We do this at night, so it's not the end of the world if it takes 8 hours, but I would be very grateful for some good ideas... Thanks W __________________________________________________ Do You Yahoo!? Yahoo! Mail - Free email you can access from anywhere! http://mail.yahoo.com/
Hi, there, 1. use copy ... from '.....'; 2. write a PL/pgSQL function and pass multiple records as an array. However, if your table have a foreign key constraint, it cannot be speed up, I have same question as you, my table invloving 9-13 million rows, I don't know how can I add a foreign key them also? Webb Sprague wrote: > Hi all, > > Does anybody have any thoughts on optimizing a huge > insert, involving something like 3 million records all > at once? Should I drop my indices before doing the > copy, and then create them after? I keep a > tab-delimited file as a buffer, copy it, then do it > again about 400 times. Each separate buffer is a few > thousand records. > > We do this at night, so it's not the end of the world > if it takes 8 hours, but I would be very grateful for > some good ideas... > > Thanks > W > > __________________________________________________ > Do You Yahoo!? > Yahoo! Mail - Free email you can access from anywhere! > http://mail.yahoo.com/ -- Jie LIANG Internet Products Inc. 10350 Science Center Drive Suite 100, San Diego, CA 92121 Office:(858)320-4873 jliang@ipinc.com www.ipinc.com
On Tue, 29 Aug 2000, Jie Liang wrote: > Hi, there, > > 1. use copy ... from '.....'; > 2. write a PL/pgSQL function and pass multiple records as an array. > > However, if your table have a foreign key constraint, it cannot be speed > up, > > I have same question as you, my table invloving 9-13 million rows, I > don't > know how can I add a foreign key them also? I haven't tried it on really large tables, but does it turn out faster to use ALTER TABLE ADD CONSTRAINT to add the foreign key constraint after the data is loaded and the indexes are created?
Hi, there, I tried different ways, include vaccum table , ensure index works, it still is as slow as ~100rows per minute. Stephan Szabo wrote: > On Tue, 29 Aug 2000, Jie Liang wrote: > > > Hi, there, > > > > 1. use copy ... from '.....'; > > 2. write a PL/pgSQL function and pass multiple records as an array. > > > > However, if your table have a foreign key constraint, it cannot be speed > > up, > > > > I have same question as you, my table invloving 9-13 million rows, I > > don't > > know how can I add a foreign key them also? > > I haven't tried it on really large tables, but does it turn out faster to > use ALTER TABLE ADD CONSTRAINT to add the foreign key constraint after the > data is loaded and the indexes are created? -- Jie LIANG Internet Products Inc. 10350 Science Center Drive Suite 100, San Diego, CA 92121 Office:(858)320-4873 jliang@ipinc.com www.ipinc.com
I am experimenting with this too. If I have any indexes at all, the copy's get VERY SLOW as the table gets big. Delete ALL your indexes, do your copy's, and then create your indexes again. Good luck. --- Jie Liang <jliang@ipinc.com> wrote: > Hi, there, > > I tried different ways, include vaccum table , > ensure index works, it > still is as slow as ~100rows per minute. > > > Stephan Szabo wrote: > > > On Tue, 29 Aug 2000, Jie Liang wrote: > > > > > Hi, there, > > > > > > 1. use copy ... from '.....'; > > > 2. write a PL/pgSQL function and pass multiple > records as an array. > > > > > > However, if your table have a foreign key > constraint, it cannot be speed > > > up, > > > > > > I have same question as you, my table invloving > 9-13 million rows, I > > > don't > > > know how can I add a foreign key them also? > > > > I haven't tried it on really large tables, but > does it turn out faster to > > use ALTER TABLE ADD CONSTRAINT to add the foreign > key constraint after the > > data is loaded and the indexes are created? > > -- > Jie LIANG > > Internet Products Inc. > > 10350 Science Center Drive > Suite 100, San Diego, CA 92121 > Office:(858)320-4873 > > jliang@ipinc.com > www.ipinc.com > > > __________________________________________________ Do You Yahoo!? Yahoo! Mail - Free email you can access from anywhere! http://mail.yahoo.com/
Hi, I knew that if no constarint, it populate very quick, my question is: when two tables have been reloaded, then I want to add a foreign key constraint to it, say: tableA has primary key column (id) tableB has a column (id) references it, so I say: ALTER TABLE tableB ADD CONSTRAINT distfk FOREIGN KEY (id) REFERENCES tableA(id) ON DELETE CASCADE ; It just seems takes forever. Thanks anyway. Webb Sprague wrote: > I am experimenting with this too. If I have any > indexes at all, the copy's get VERY SLOW as the table > gets big. Delete ALL your indexes, do your copy's, > and then create your indexes again. > > Good luck. > --- Jie Liang <jliang@ipinc.com> wrote: > > Hi, there, > > > > I tried different ways, include vaccum table , > > ensure index works, it > > still is as slow as ~100rows per minute. > > > > > > Stephan Szabo wrote: > > > > > On Tue, 29 Aug 2000, Jie Liang wrote: > > > > > > > Hi, there, > > > > > > > > 1. use copy ... from '.....'; > > > > 2. write a PL/pgSQL function and pass multiple > > records as an array. > > > > > > > > However, if your table have a foreign key > > constraint, it cannot be speed > > > > up, > > > > > > > > I have same question as you, my table invloving > > 9-13 million rows, I > > > > don't > > > > know how can I add a foreign key them also? > > > > > > I haven't tried it on really large tables, but > > does it turn out faster to > > > use ALTER TABLE ADD CONSTRAINT to add the foreign > > key constraint after the > > > data is loaded and the indexes are created? > > > > -- > > Jie LIANG > > > > Internet Products Inc. > > > > 10350 Science Center Drive > > Suite 100, San Diego, CA 92121 > > Office:(858)320-4873 > > > > jliang@ipinc.com > > www.ipinc.com > > > > > > > > __________________________________________________ > Do You Yahoo!? > Yahoo! Mail - Free email you can access from anywhere! > http://mail.yahoo.com/ -- Jie LIANG Internet Products Inc. 10350 Science Center Drive Suite 100, San Diego, CA 92121 Office:(858)320-4873 jliang@ipinc.com www.ipinc.com
On Wed, 30 Aug 2000, Jie Liang wrote: > Hi, > > I knew that if no constarint, it populate very quick, my question is: > when two tables have been > reloaded, then I want to add a foreign key constraint to it, say: > tableA has primary key column (id) > tableB has a column (id) references it, so I say: > ALTER TABLE tableB ADD CONSTRAINT distfk FOREIGN KEY (id) REFERENCES > tableA(id) ON DELETE CASCADE ; Yeah, the alter table has to check that the constraint is valid. There might be a faster way than the current "scan through table calling trigger function" mechanism, although doing most of them starts pulling logic for the obeying constraint into multiple places.
Webb Sprague <wsprague100@yahoo.com> writes: > I am experimenting with this too. If I have any > indexes at all, the copy's get VERY SLOW as the table > gets big. Delete ALL your indexes, do your copy's, > and then create your indexes again. Do you have a lot of equal index keys in the data you're inserting? I've recently been swatting some performance problems in the btree index code for the case of large numbers of equal keys. regards, tom lane
Jie Liang <jliang@ipinc.com> writes: > Hi, there, > > I tried different ways, include vaccum table , ensure index works, it > still is as slow as ~100rows per minute. > PGFSYNC=no in postmaster.init? Well, this might be Debian Linux specific, pardon me if it is. I have just begun playing with Postgres. Still learning, myself. IOW, disable fsync after every statement and your OS will do much better work clustering writes. That means more inserts/sec for you. In one of my tests I was able to insert at ~1000/sec rate. Then I made an experiment, enabled pgfsync _and_ indexes. The inserting speed dropped to 10/sec. Very interesting. Regards, -- Zlatko