Thread: Deletes and large tables
Greetings!

We have a table with more than 250 million rows. I am trying to delete the
first 100,000 rows (based on a bigint primary key), and I had to cancel
after 4 hours of the system not actually finishing the delete. I wrote a
script to delete individual rows 10,000 at a time using transactions, and
am finding each individual delete takes on the order of 0.1 seconds to 2-3
seconds. There are 4 indexes on the table, one of which is very "hashlike"
(ie, the values for sequential rows are spread throughout the index).

Is this considered "normal" for delete speed? Currently using 8.0.1.
The drives are capable of 40MB/sec sustained transfers, but we are only
achieving about 2-10kB/sec, and are mostly CPU bound.

Regards!
Ed
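The batching approach described above amounts to something like the
following (a minimal sketch only; big_table and its bigint primary key id
are placeholder names, not the actual schema):

    -- Delete in chunks of 10,000 rows by primary-key range,
    -- one transaction per chunk.
    BEGIN;
    DELETE FROM big_table WHERE id >= 0 AND id < 10000;
    COMMIT;

    BEGIN;
    DELETE FROM big_table WHERE id >= 10000 AND id < 20000;
    COMMIT;
    -- ...and so on up to the target row.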
Edmund Dengler wrote:
> Greetings!
>
> We have a table with more than 250 million rows. I am trying to delete the
> first 100,000 rows (based on a bigint primary key), and I had to cancel
> after 4 hours of the system not actually finishing the delete. I wrote a
> script to delete individual rows 10,000 at a time using transactions, and
> am finding each individual delete takes on the order of 0.1 seconds to 2-3
> seconds. There are 4 indexes on the table, one of which is very "hashlike"
> (ie, the values for sequential rows are spread throughout the index).

I don't suppose it's off checking foreign keys in a lot of tables, is it?

--
  Richard Huxton
  Archonet Ltd
Hi,

Do you have any foreign keys pointing to this table? That's the problem I
hit when I wanted to delete all rows from a table with 5 FKs. You may
search my name in the list archive and find the thread on this matter.

Ciao
/David

Edmund Dengler wrote:
> Greetings!
>
> We have a table with more than 250 million rows. I am trying to delete the
> first 100,000 rows (based on a bigint primary key), and I had to cancel
> after 4 hours of the system not actually finishing the delete. I wrote a
> script to delete individual rows 10,000 at a time using transactions, and
> am finding each individual delete takes on the order of 0.1 seconds to 2-3
> seconds. There are 4 indexes on the table, one of which is very "hashlike"
> (ie, the values for sequential rows are spread throughout the index).
>
> Is this considered "normal" for delete speed? Currently using 8.0.1.
> Drives are capable of 40MB/sec sustained transfers, but only achieving
> about 2-10kB/sec, and mostly CPU bound.
>
> Regards!
> Ed
Greetings!

This table has 3 foreign keys, but that should not matter during deletes.
In addition, the tables being referred to are small, and should be in
cache.

There are no tables depending on it for references, so no dependent
triggers should be running.

Also, if this were a foreign key issue, I would expect it to be I/O bound,
not CPU bound.

Regards!
Ed

On Fri, 10 Jun 2005, Richard Huxton wrote:
> Edmund Dengler wrote:
> > Greetings!
> >
> > We have a table with more than 250 million rows. I am trying to delete the
> > first 100,000 rows (based on a bigint primary key), and I had to cancel
> > after 4 hours of the system not actually finishing the delete. [...]
>
> I don't suppose it's off checking foreign keys in a lot of tables, is it?
>
> --
>   Richard Huxton
>   Archonet Ltd
> This table has 3 foreign keys, but that should not matter during deletes.
> In addition, the tables being referred to are small, and should be in
> cache.

I'm talking about FKs that point *to* this table, not FKs defined on this
table that point to other tables. If table A is referenced by 10 other
tables, 10 referential checks are needed.

> There are no tables depending on it for references, so no dependent
> triggers should be running.
>
> Also, if this were a foreign key issue, I would expect it to be I/O bound,
> not CPU bound.

Maybe... I'm honestly not sure.

Like I said in my previous mail, I hit a similar problem (maybe not the
same). It was taking 10 minutes to delete 10k lines in a table. I turned
on some logging in postgresql.conf and saw that for each row deleted, 4
SELECTs were issued to check FKs. I dropped those FKs, and after that the
delete took less than a second.

Hope it helps
/David
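Those per-row SELECTs are the referential-integrity triggers at work. For
each row deleted from the referenced table, the RI check against a
referencing table has roughly this shape (schematic only; parent, child,
and parent_id are placeholder names, not the real RI query text). Without
an index on the referencing column, each such check is a sequential scan
of the child table:

    -- Roughly the check run per deleted parent row:
    SELECT 1 FROM ONLY child x WHERE x.parent_id = $1 FOR UPDATE OF x;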
Edmund Dengler <edmundd@eSentire.com> writes:
> This table has 3 foreign keys, but that should not matter during deletes.

Richard was inquiring about foreign keys linking *into* this table, not
out of it. In particular, foreign keys that don't have indexes on the
referencing columns; those will incur seqscans to check for appearances
of the value you are deleting.

			regards, tom lane
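One way to spot such constraints is to ask the catalogs which foreign keys
reference the table (a sketch; big_table is a placeholder name):

    -- List foreign keys that point *into* big_table.
    SELECT conrelid::regclass AS referencing_table,
           conname,
           pg_get_constraintdef(oid) AS definition
    FROM pg_constraint
    WHERE contype = 'f'
      AND confrelid = 'big_table'::regclass;

Any referencing column that turns up without an index of its own is a
candidate for the per-row seqscans Tom describes.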
Just did a sanity check. I dumped the DB schema, and there is indeed a
foreign key reference into the table. Interestingly, the table pointing in
has no index on the referencing column, but it is a relatively small table
with entries only near the end of the large table. So it looks like I was
getting CPU bound because of a sequential scan of the smaller table for
every delete. Dropped the constraint, and deletes are now much faster.

Regards!
Ed

On Fri, 10 Jun 2005, David Gagnon wrote:
> I'm talking about FKs that point *to* this table, not FKs defined on this
> table that point to other tables. If table A is referenced by 10 other
> tables, 10 referential checks are needed.
>
> Like I said in my previous mail, I hit a similar problem (maybe not the
> same). It was taking 10 minutes to delete 10k lines in a table. I turned
> on some logging in postgresql.conf and saw that for each row deleted, 4
> SELECTs were issued to check FKs. I dropped those FKs, and after that the
> delete took less than a second.
>
> Hope it helps
> /David
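For anyone hitting the same thing, the two usual remedies are to index the
referencing column (so each FK check becomes an index probe) or, as above,
to drop the constraint around the bulk delete and re-add it afterwards. A
sketch with placeholder names; the real table, column, and constraint names
will differ:

    -- Option 1: keep the FK but make its check cheap.
    CREATE INDEX small_table_big_id_idx ON small_table (big_id);

    -- Option 2: drop the FK for the bulk delete, then restore it.
    ALTER TABLE small_table DROP CONSTRAINT small_table_big_id_fkey;
    -- ... run the bulk DELETE on the big table ...
    ALTER TABLE small_table
      ADD CONSTRAINT small_table_big_id_fkey
      FOREIGN KEY (big_id) REFERENCES big_table (id);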