Thread: delete taking long time
<div dir="ltr"><div class="gmail_default" style="font-family:'comic sans ms',sans-serif">HI,</div><div class="gmail_default"style="font-family:'comic sans ms',sans-serif"><br /></div><div class="gmail_default" style="font-family:'comicsans ms',sans-serif">I have the following table:</div><div class="gmail_default" style="font-family:'comicsans ms',sans-serif"><br /></div><div class="gmail_default" style="font-family:'comic sans ms',sans-serif"><pclass=""><span class=""> Table "public.connection"</span><p class=""><span class=""> Column | Type | Modifiers </span><p class=""><span class="">-------------------+--------------------------------+-----------</span><pclass=""><span class=""> uid | character varying(18) | not null</span><p class=""><span class=""> ts | timestamp(6) withouttime zone | not null</span><p class=""><span class=""> host_origin | inet | not null</span><pclass=""><span class=""> port_origin | integer | not null</span><p class=""><spanclass=""> host_destination | inet | not null</span><p class=""><span class=""> port_destination | integer | not null</span><p class=""><span class=""> protocol | character varying(12) | </span><p class=""><span class=""> service | character varying(12) | </span><p class=""><span class=""> duration | interval second(6) | </span><p class=""><span class=""> origin_bytes | bigint | </span><p class=""><span class=""> response_bytes | bigint | </span><p class=""><span class=""> connection_state | character varying(8) | </span><pclass=""><span class=""> local_origin | boolean | </span><p class=""><span class=""> local_response | boolean | </span><p class=""><span class=""> missed_bytes | bigint | </span><p class=""><span class=""> history | text | </span><pclass=""><span class=""> origin_packets | bigint | </span><p class=""><span class=""> origin_ip_bytes | bigint | </span><p class=""><span class=""> response_packets | bigint | </span><p class=""><span class=""> response_ip_bytes | bigint | </span><pclass=""><span class="">Indexes:</span><p class=""><span class=""> "connection_pkey" PRIMARY KEY, btree (uid)</span><pclass=""><span class="">Foreign-key constraints:</span><p class=""><span class=""> "connection_protocol_fkey"FOREIGN KEY (protocol) REFERENCES protocol(name)</span><p class=""><span class=""> "connection_service_fkey"FOREIGN KEY (service) REFERENCES service(name)</span><p class=""><span class="">Referenced by:</span><pclass=""><span class=""> TABLE "dns" CONSTRAINT "dns_uid_fkey" FOREIGN KEY (uid) REFERENCES connection(uid)</span><pclass=""><span class=""> TABLE "files" CONSTRAINT "files_uid_fkey" FOREIGN KEY (uid) REFERENCESconnection(uid)</span><p class=""><span class=""> TABLE "http" CONSTRAINT "http_uid_fkey" FOREIGN KEY (uid)REFERENCES connection(uid)</span><p class=""><span class=""> TABLE "notice" CONSTRAINT "notice_uid_fkey" FOREIGNKEY (uid) REFERENCES connection(uid)</span><p class=""><span class=""> TABLE "snmp" CONSTRAINT "snmp_uid_fkey"FOREIGN KEY (uid) REFERENCES connection(uid)</span><p class=""><span class=""> TABLE "ssl" CONSTRAINT "ssl_uid_fkey"FOREIGN KEY (uid) REFERENCES connection(uid)</span><p class=""><span class=""> TABLE "weird" CONSTRAINT"weird_uid_fkey" FOREIGN KEY (uid) REFERENCES connection(uid)</span><p class=""><span class=""><br /></span><pclass=""><span class="">I am trying to delete the connections with date 2016-03-10 by using the following:</span><pclass=""><span class=""><br /></span><p class=""><span class=""> </span><p class=""><span class="">deletefrom connection where uid in (select uid from connection where ts > '2016-03-10 00:30:00');</span><p class=""><spanclass="">There are around 800.000 records matching this rule, and seems to be taking an awful lot of time -4 hours and counting. What could be the reason for such a performance hit and how could I optimise this for future cases?</span><pclass=""><span class="">Regards.</span></div></div>
> > I am trying to delete the connections with date 2016-03-10 by using the > following: > > > delete from connection where uid in (select uid from connection where ts > > '2016-03-10 00:30:00'); try to rewrite that to : delete from connection where ts > '2016-03-10 00:30:00'; It's simpler - and (maybe) faster. > There are around 800.000 records matching this rule, and seems to be taking > an awful lot of time - 4 hours and counting. What could be the reason for > such a performance hit and how could I optimise this for future cases? > > Regards. the db has to touch such many rows, and has to write the transaction log. And update every index. And it has to check the referenced tables for the constraints. Do you have proper indexes? How large is shared_buffers set? Can you show us the explain (analyse)? Regards, Andreas -- Andreas Kretschmer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>
> I am trying to delete the connections with date 2016-03-10 by using the
> following:
>
>
> delete from connection where uid in (select uid from connection where ts >
> '2016-03-10 00:30:00');
try to rewrite that to :
delete from connection where ts > '2016-03-10 00:30:00';
It's simpler - and (maybe) faster.
> There are around 800.000 records matching this rule, and seems to be taking
> an awful lot of time - 4 hours and counting. What could be the reason for
> such a performance hit and how could I optimise this for future cases?
>
> Regards.
the db has to touch such many rows, and has to write the transaction log. And
update every index. And it has to check the referenced tables for the
constraints. Do you have proper indexes?
On Tuesday, March 15, 2016, David G. Johnston <david.g.johnston@gmail.com> wrote:
On Tuesday, March 15, 2016, Andreas Kretschmer <andreas@a-kretschmer.de> wrote:
>
> I am trying to delete the connections with date 2016-03-10 by using the
> following:
>
>
> delete from connection where uid in (select uid from connection where ts >
> '2016-03-10 00:30:00');
try to rewrite that to :
delete from connection where ts > '2016-03-10 00:30:00';
It's simpler - and (maybe) faster.It also gives a different answer...
> > Given the lack of indexes on the one table that is shown I suspect this is > the most likely cause (FK + indexes) > right, there should be an index on ts. If it is a 9.5 AND the rows are ordered on ts a BRIN-index would be fine. If not - a normal index. How many rows contains the table? -- Andreas Kretschmer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>>From: David G. Johnston
>>Sent: Tuesday, March 15, 2016 6:25 PM
> There are around 800.000 records matching this rule, and seems to be taking
> an awful lot of time - 4 hours and counting. What could be the reason for
> such a performance hit and how could I optimise this for future cases?
the db has to touch such many rows, and has to write the transaction log. And
update every index. And it has to check the referenced tables for the
constraints. Do you have proper indexes?
Given the lack of indexes on the one table that is shown I suspect this is the most likely cause (FK + indexes)
David J.
<<
There are SEVEN FKs against that table…I would bet that’s 50% of the duration. The lack of an index, perhaps an issue, but
With that many FK references plus that many rows…the transaction log could easily blow out and start paging to disk.
When deleting more than perhaps 20k rows, I will normally write a delete loop, grabbing roughly 20-50k rows at time (server capacity
dependent), deleting that set, grabbing another set, etc. That allows the set to commit, releasing pressure on the tran log.
You can easily experiment and see how long 10k rows take to delete. If still long, the FKs are the issue…you may need to script them out,
drop them, run the deletes, then rebuild them.
Mike S.
<div dir="ltr"><div class="gmail_default" style="font-family:'comic sans ms',sans-serif"><div dir="ltr" style="font-family:arial,sans-serif;font-size:12.8px"><divclass="gmail_default" style="font-family:'comic sans ms',sans-serif">HiAndreas,</div><div class="gmail_default" style="font-family:'comic sans ms',sans-serif"><br /></div><divclass="gmail_default" style="font-family:'comic sans ms',sans-serif">shared_buffers is 512 MB. </div><div class="gmail_default"style="font-family:'comic sans ms',sans-serif">I had to reduce the ts to only 10 minutes, just to beable to finish promptly. Results are bellow.</div><div class="gmail_default" style="font-family:'comic sans ms',sans-serif"><br/></div><div class="gmail_default" style="font-family:'comic sans ms',sans-serif"><p>select count(uid)from connection where uid in (select uid from connection where ts > '2016-03-10 01:00:00' and ts < '2016-03-1001:10:00');<p>count <p>-------<p> 2156<p><p>(1 row)<p><br /><p><p>explain (analyze) delete from connection whereuid in (select uid from connection where ts > '2016-03-10 01:00:00' and ts < '2016-03-10 01:10:00');<p><br /><p>-------------------------------------------------------------------------------------------------------------------------------------------------<p> Delete onconnection (cost=0.43..174184.31 rows=7756 width=12) (actual time=529.739..529.739 rows=0 loops=1)<p> -> NestedLoop (cost=0.43..174184.31 rows=7756 width=12) (actual time=0.036..526.295 rows=2156 loops=1)<p> -> SeqScan on connection connection_1 (cost=0.00..115684.55 rows=7756 width=24) (actual time=0.020..505.012 rows=2156 loops=1)<p> Filter: ((ts > '2016-03-10 01:00:00'::timestamp without time zone) AND (ts < '2016-03-10 01:10:00'::timestampwithout time zone))<p> Rows Removed by Filter: <a href="tel:3108811" target="_blank" value="+13108811">3108811</a><p> -> Index Scan using connection_pkey on connection (cost=0.43..7.53 rows=1 width=24)(actual time=0.009..0.010 rows=1 loops=2156)<p> Index Cond: ((uid)::text = (connection_1.uid)::text)<p> Planningtime: 0.220 ms<p> Trigger for constraint dns_uid_fkey: time=133.046 calls=2156<p> Triggerfor constraint files_uid_fkey: time=39780.799 calls=2156<p> Trigger for constraint http_uid_fkey: time=99300.851calls=2156<p> Trigger for constraint notice_uid_fkey: time=128.653 calls=2156<p> Trigger for constraint snmp_uid_fkey:time=59.491 calls=2156<p> Trigger for constraint ssl_uid_fkey: time=74.052 calls=2156<p> Trigger for constraintweird_uid_fkey: time=25868.651 calls=2156<p> Execution time: 165880.419 ms<p><p>(16 rows)<p><br /><p><span class="im"style="font-family:arial,sans-serif;font-size:12.8px">><br />> Given the lack of indexes on the one tablethat is shown I suspect this is<br />> the most likely cause (FK + indexes)<br />><br /><br /></span><span style="font-family:arial,sans-serif;font-size:12.8px">>right, there should be an index on ts. If it is a 9.5 AND the rowsare ordered</span><br style="font-family:arial,sans-serif;font-size:12.8px" /><span style="font-family:arial,sans-serif;font-size:12.8px">>on ts a BRIN-index would be fine.</span><br style="font-family:arial,sans-serif;font-size:12.8px"/><span style="font-family:arial,sans-serif;font-size:12.8px">> Ifnot - a normal index. How many rows contains the table?</span><br /></div></div><div class="" style="font-family:arial,sans-serif;font-size:12.8px"></div><divclass="" style="font-family:arial,sans-serif;font-size:12.8px"><pclass=""><span class=""> count </span><p class=""><span class="">---------</span><pclass=""><span class=""> 3108811</span><p class=""><span class="">(1 row)</span></div><div class=""style="font-family:arial,sans-serif;font-size:12.8px"><p class="MsoNormal" style="font-size:12.8px"><span style="font-size:11pt;font-family:Calibri,sans-serif">>There are SEVEN FKs against that table…I would bet that’s 50% ofthe duration. The lack of an index, </span><p class="MsoNormal" style="font-size:12.8px"><span style="font-size:11pt;font-family:Calibri,sans-serif">>perhaps an issue, but<u></u><u></u></span><p class="MsoNormal"style="font-size:12.8px"><span style="font-size:11pt;font-family:Calibri,sans-serif">> With that manyFK references plus that many rows…the transaction log could easily blow out and start </span><p class="MsoNormal" style="font-size:12.8px"><spanstyle="font-size:11pt;font-family:Calibri,sans-serif">> paging to disk.</span><p class="MsoNormal"style="font-size:12.8px"><span style="font-size:11pt;font-family:Calibri,sans-serif"><br /></span><p class="MsoNormal"style="font-size:12.8px">I think you are right, fk seem to take the biggest chunk of time from the holedelete operation. I made a test with 10.000 rows, it took 12 minutes. 20.000 rows took about 25 minutes to delete. <pclass="MsoNormal" style="font-size:12.8px"><br /><p class="MsoNormal" style="font-size:12.8px">Regards to all.</div></div></div><divclass="gmail_extra"><br /><div class="gmail_quote">On 16 March 2016 at 01:12, Andreas Kretschmer<span dir="ltr"><<a href="mailto:andreas@a-kretschmer.de" target="_blank">andreas@a-kretschmer.de</a>></span>wrote:<br /><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px#ccc solid;padding-left:1ex"><span class=""><br /> ><br /> > I am trying to delete the connectionswith date 2016-03-10 by using the<br /> > following:<br /> ><br /> ><br /> > delete from connectionwhere uid in (select uid from connection where ts ><br /> > '2016-03-10 00:30:00');<br /><br /></span>tryto rewrite that to :<br /><br /> delete from connection where ts > '2016-03-10 00:30:00';<br /><br /> It'ssimpler - and (maybe) faster.<br /><span class=""><br /><br /><br /> > There are around 800.000 records matching thisrule, and seems to be taking<br /> > an awful lot of time - 4 hours and counting. What could be the reason for<br/> > such a performance hit and how could I optimise this for future cases?<br /> ><br /> > Regards.<br /><br/></span>the db has to touch such many rows, and has to write the transaction log. And<br /> update every index. Andit has to check the referenced tables for the<br /> constraints. Do you have proper indexes?<br /><br /> How large isshared_buffers set?<br /><br /> Can you show us the explain (analyse)?<br /><br /><br /> Regards, Andreas<br /><span class="HOEnZb"><fontcolor="#888888">--<br /> Andreas Kretschmer<br /><a href="http://www.2ndQuadrant.com/" rel="noreferrer"target="_blank">http://www.2ndQuadrant.com/</a><br /> PostgreSQL Development, 24x7 Support, Remote DBA, Training& Services<br /></font></span></blockquote></div><br /></div>
ivo liondov <ivo.liondov@gmail.com> wrote: > > explain (analyze) delete from connection where uid in (select uid from > connection where ts > '2016-03-10 01:00:00' and ts < '2016-03-10 01:10:00'); > > > ------------------------------------------------------------------------------------------------------------------------------------------------- > > Delete on connection (cost=0.43..174184.31 rows=7756 width=12) (actual time= > 529.739..529.739 rows=0 loops=1) > > -> Nested Loop (cost=0.43..174184.31 rows=7756 width=12) (actual time= > 0.036..526.295 rows=2156 loops=1) > > -> Seq Scan on connection connection_1 (cost=0.00..115684.55 rows= > 7756 width=24) (actual time=0.020..505.012 rows=2156 loops=1) > > Filter: ((ts > '2016-03-10 01:00:00'::timestamp without time > zone) AND (ts < '2016-03-10 01:10:00'::timestamp without time zone)) there is no index on the ts-column. > > Rows Removed by Filter: 3108811 > > -> Index Scan using connection_pkey on connection (cost=0.43..7.53 > rows=1 width=24) (actual time=0.009..0.010 rows=1 loops=2156) > > Index Cond: ((uid)::text = (connection_1.uid)::text) > > Planning time: 0.220 ms > > Trigger for constraint dns_uid_fkey: time=133.046 calls=2156 > > Trigger for constraint files_uid_fkey: time=39780.799 calls=2156 > > Trigger for constraint http_uid_fkey: time=99300.851 calls=2156 > > Trigger for constraint notice_uid_fkey: time=128.653 calls=2156 > > Trigger for constraint snmp_uid_fkey: time=59.491 calls=2156 > > Trigger for constraint ssl_uid_fkey: time=74.052 calls=2156 > > Trigger for constraint weird_uid_fkey: time=25868.651 calls=2156 > > Execution time: 165880.419 ms i guess there are no indexes for this tables and the relevant columns > I think you are right, fk seem to take the biggest chunk of time from the hole > delete operation. I made a test with 10.000 rows, it took 12 minutes. 20.000 > rows took about 25 minutes to delete. create the missing indexes now and come back with the new duration. Andreas -- Really, I'm not out to destroy Microsoft. That will just be a completely unintentional side effect. (Linus Torvalds) "If I was god, I would recompile penguin with --enable-fly." (unknown) Kaufbach, Saxony, Germany, Europe. N 51.05082°, E 13.56889°
I agree with Andreas (indexes) - 10 minutes to delete 10k rows is about 9.5 minutes too long. Either the "select" part of the query can't find the rows quickly or the FK burden is crushing the life out of it. If every involved table has an index on their Primary Key then the 10k row delete should take maybe 30-60 seconds. Highly dependent on how many FK rows are involved. And...from a db design perspective, a table referenced by 7 FKs shouldn't be having this type of delete run against it...it's just too expensive if it must happen routinely. This is where de-normalization might be called for, to collapse some of those references, or a shift to stored functions that maintain integrity versus the declared foreign keys maintaining it. Mike S. -----Original Message----- From: Andreas Kretschmer Sent: Wednesday, March 16, 2016 4:58 AM ivo liondov <ivo.liondov@gmail.com> wrote: > > explain (analyze) delete from connection where uid in (select uid from > connection where ts > '2016-03-10 01:00:00' and ts < '2016-03-10 > 01:10:00'); > > > ---------------------------------------------------------------------- > ---------------------------------------------------------------------- > ----- > > Delete on connection (cost=0.43..174184.31 rows=7756 width=12) > (actual time= > 529.739..529.739 rows=0 loops=1) > > -> Nested Loop (cost=0.43..174184.31 rows=7756 width=12) (actual > time= > 0.036..526.295 rows=2156 loops=1) > > -> Seq Scan on connection connection_1 > (cost=0.00..115684.55 rows= > 7756 width=24) (actual time=0.020..505.012 rows=2156 loops=1) > > Filter: ((ts > '2016-03-10 01:00:00'::timestamp without > time > zone) AND (ts < '2016-03-10 01:10:00'::timestamp without time zone)) there is no index on the ts-column. > > Rows Removed by Filter: 3108811 > > -> Index Scan using connection_pkey on connection > (cost=0.43..7.53 > rows=1 width=24) (actual time=0.009..0.010 rows=1 loops=2156) > > Index Cond: ((uid)::text = (connection_1.uid)::text) > > Planning time: 0.220 ms > > Trigger for constraint dns_uid_fkey: time=133.046 calls=2156 > > Trigger for constraint files_uid_fkey: time=39780.799 calls=2156 > > Trigger for constraint http_uid_fkey: time=99300.851 calls=2156 > > Trigger for constraint notice_uid_fkey: time=128.653 calls=2156 > > Trigger for constraint snmp_uid_fkey: time=59.491 calls=2156 > > Trigger for constraint ssl_uid_fkey: time=74.052 calls=2156 > > Trigger for constraint weird_uid_fkey: time=25868.651 calls=2156 > > Execution time: 165880.419 ms i guess there are no indexes for this tables and the relevant columns > I think you are right, fk seem to take the biggest chunk of time from > the hole delete operation. I made a test with 10.000 rows, it took 12 > minutes. 20.000 rows took about 25 minutes to delete. create the missing indexes now and come back with the new duration. Andreas -- Really, I'm not out to destroy Microsoft. That will just be a completely unintentional side effect. (Linus Torvalds) "If I was god, I would recompile penguin with --enable-fly." (unknown) Kaufbach, Saxony, Germany, Europe. N 51.05082°, E 13.56889° -- Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-sql
<div dir="ltr"><div class="gmail_default" style="font-family:'comic sans ms',sans-serif">My bad, I assumed postgres createsindexes on foreign keys automatically. </div><div class="gmail_default" style="font-family:'comic sans ms',sans-serif">Icreated indexes on all fk and on ts and OMG:</div><div class="gmail_default" style="font-family:'comic sansms',sans-serif"><br /></div><div class="gmail_default" style="font-family:'comic sans ms',sans-serif"><p class=""><spanclass="">bro=# select count(ts) from connection where ts >'2016-03-10 00:00:00';</span><p class=""><spanclass=""> count </span><p class=""><span class="">--------</span><p class=""><span class=""> 546997</span><pclass=""><span class="">(1 row)</span><p class=""><span class=""></span><br /><p class=""><spanclass="">bro=# select now(); delete from connection where uid in (select uid from connection where ts >'2016-03-10 00:00:00' ); select now();</span><p class=""><span class=""> now </span><p class=""><spanclass="">-------------------------------</span><p class=""><span class=""> 2016-03-16 14:02:43.172617+00</span><pclass=""><span class="">(1 row)</span><p class=""><span class=""></span><br /><p class=""><spanclass="">DELETE 546997</span><p class=""><span class=""> now </span><p class=""><spanclass="">-------------------------------</span><p class=""><span class=""> 2016-03-16 14:04:25.608695+00</span><pclass=""><span class="">(1 row)</span><p class=""><span class=""><br /></span><p class=""><spanclass="">Took less than 2 minutes. Thanks for the help.</span></div></div><div class="gmail_extra"><br /><divclass="gmail_quote">On 16 March 2016 at 11:57, Andreas Kretschmer <span dir="ltr"><<a href="mailto:akretschmer@spamfence.net"target="_blank">akretschmer@spamfence.net</a>></span> wrote:<br /><blockquote class="gmail_quote"style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span class="">ivo liondov <<ahref="mailto:ivo.liondov@gmail.com">ivo.liondov@gmail.com</a>> wrote:<br /><br /> ><br /> > explain (analyze)delete from connection where uid in (select uid from<br /> > connection where ts > '2016-03-10 01:00:00' andts < '2016-03-10 01:10:00');<br /> ><br /> ><br /> > -------------------------------------------------------------------------------------------------------------------------------------------------<br />><br /> > Delete on connection (cost=0.43..174184.31 rows=7756 width=12) (actual time=<br /> > 529.739..529.739rows=0 loops=1)<br /> ><br /> > -> Nested Loop (cost=0.43..174184.31 rows=7756 width=12) (actualtime=<br /> > 0.036..526.295 rows=2156 loops=1)<br /> ><br /> > -> Seq Scan on connection connection_1 (cost=0.00..115684.55 rows=<br /> > 7756 width=24) (actual time=0.020..505.012 rows=2156 loops=1)<br /> ><br/> > Filter: ((ts > '2016-03-10 01:00:00'::timestamp without time<br /> > zone) AND (ts <'2016-03-10 01:10:00'::timestamp without time zone))<br /><br /><br /></span>there is no index on the ts-column.<br /><spanclass=""><br /><br /><br /><br /><br /><br /><br /><br /> ><br /> > Rows Removed by Filter: <ahref="tel:3108811" value="+13108811">3108811</a><br /> ><br /> > -> Index Scan using connection_pkeyon connection (cost=0.43..7.53<br /> > rows=1 width=24) (actual time=0.009..0.010 rows=1 loops=2156)<br/> ><br /> > Index Cond: ((uid)::text = (connection_1.uid)::text)<br /> ><br /> > Planning time: 0.220 ms<br /> ><br /> > Trigger for constraint dns_uid_fkey: time=133.046 calls=2156<br /> ><br/> > Trigger for constraint files_uid_fkey: time=39780.799 calls=2156<br /> ><br /> > Trigger for constrainthttp_uid_fkey: time=99300.851 calls=2156<br /> ><br /> > Trigger for constraint notice_uid_fkey: time=128.653calls=2156<br /> ><br /> > Trigger for constraint snmp_uid_fkey: time=59.491 calls=2156<br /> ><br/> > Trigger for constraint ssl_uid_fkey: time=74.052 calls=2156<br /> ><br /> > Trigger for constraintweird_uid_fkey: time=25868.651 calls=2156<br /> ><br /> > Execution time: 165880.419 ms<br /><br /></span>iguess there are no indexes for this tables and the relevant columns<br /><span class=""><br /><br /> > I thinkyou are right, fk seem to take the biggest chunk of time from the hole<br /> > delete operation. I made a test with10.000 rows, it took 12 minutes. 20.000<br /> > rows took about 25 minutes to delete. <br /><br /></span>create themissing indexes now and come back with the new duration.<br /><br /><br /> Andreas<br /><span class="HOEnZb"><font color="#888888">--<br/> Really, I'm not out to destroy Microsoft. That will just be a completely<br /> unintentional sideeffect. (Linus Torvalds)<br /> "If I was god, I would recompile penguin with --enable-fly." (unknown)<br /> Kaufbach, Saxony, Germany, Europe. N 51.05082°, E 13.56889°<br /><br /><br />--<br /> Sent via pgsql-sql mailing list (<a href="mailto:pgsql-sql@postgresql.org">pgsql-sql@postgresql.org</a>)<br />To make changes to your subscription:<br /><a href="http://www.postgresql.org/mailpref/pgsql-sql" rel="noreferrer" target="_blank">http://www.postgresql.org/mailpref/pgsql-sql</a><br/></font></span></blockquote></div><br /></div>
<div dir="ltr"><div class="gmail_default" style="font-family:'comic sans ms',sans-serif"><span style="font-family:arial,sans-serif;font-size:12.8px">>And...from a db design perspective, a table referenced by 7 FKsshouldn't be</span><br style="font-family:arial,sans-serif;font-size:12.8px" /><span style="font-family:arial,sans-serif;font-size:12.8px">>having this type</span><br style="font-family:arial,sans-serif;font-size:12.8px"/><span style="font-family:arial,sans-serif;font-size:12.8px">> ofdelete run against it...it's just too expensive if it must happen</span><br style="font-family:arial,sans-serif;font-size:12.8px"/><span style="font-family:arial,sans-serif;font-size:12.8px">> routinely. </span><br/></div><div class="gmail_default" style="font-family:'comic sans ms',sans-serif"><span style="font-family:arial,sans-serif;font-size:12.8px"><br/></span></div><div class="gmail_default" style="font-family:'comicsans ms',sans-serif"><span style="font-family:arial,sans-serif;font-size:12.8px">The connectiontable should never have any data deleted, only insertions are performed.</span></div><div class="gmail_default"><spanstyle="font-size:12.8px">I had logged using ssh to the db server, forgot to plugin my laptop andit went to sleep mode while running an insert script from external data source - a bro ids file. The shell logged me outand the insert statement never completed ( about 10 % of the data never got inserted). I should know better.</span></div><divclass="gmail_default"><span style="font-size:12.8px"><br /></span></div><div class="gmail_default"><spanstyle="font-size:12.8px">Kind regards to all of you.</span></div></div><div class="gmail_extra"><br/><div class="gmail_quote">On 16 March 2016 at 12:31, Mike Sofen <span dir="ltr"><<a href="mailto:msofen@runbox.com"target="_blank">msofen@runbox.com</a>></span> wrote:<br /><blockquote class="gmail_quote"style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">I agree with Andreas (indexes) -10 minutes to delete 10k rows is about 9.5<br /> minutes too long.<br /> Either the "select" part of the query can't findthe rows quickly or the FK<br /> burden is crushing the life out of it.<br /><br /> If every involved table has an indexon their Primary Key then the 10k row<br /> delete should take<br /> maybe 30-60 seconds. Highly dependent on how manyFK rows are involved.<br /><br /> And...from a db design perspective, a table referenced by 7 FKs shouldn't be<br />having this type<br /> of delete run against it...it's just too expensive if it must happen<br /> routinely. This is where<br/> de-normalization might be called for, to collapse some of those references,<br /> or a shift to<br /> stored functionsthat maintain integrity versus the declared foreign keys<br /> maintaining it.<br /><br /> Mike S.<br /><div class="HOEnZb"><divclass="h5"><br /> -----Original Message-----<br /> From: Andreas Kretschmer<br /> Sent: Wednesday, March16, 2016 4:58 AM<br /><br /> ivo liondov <<a href="mailto:ivo.liondov@gmail.com">ivo.liondov@gmail.com</a>> wrote:<br/><br /> ><br /> > explain (analyze) delete from connection where uid in (select uid from<br /> > connectionwhere ts > '2016-03-10 01:00:00' and ts < '2016-03-10<br /> > 01:10:00');<br /> ><br /> ><br />> ----------------------------------------------------------------------<br /> > ----------------------------------------------------------------------<br/> > -----<br /> ><br /> > Delete on connection (cost=0.43..174184.31 rows=7756 width=12)<br /> > (actual time=<br /> > 529.739..529.739 rows=0 loops=1)<br/> ><br /> > -> Nested Loop (cost=0.43..174184.31 rows=7756 width=12) (actual<br /> > time=<br/> > 0.036..526.295 rows=2156 loops=1)<br /> ><br /> > -> Seq Scan on connection connection_1 <br/> > (cost=0.00..115684.55 rows=<br /> > 7756 width=24) (actual time=0.020..505.012 rows=2156 loops=1)<br/> ><br /> > Filter: ((ts > '2016-03-10 01:00:00'::timestamp without<br /> > time<br/> > zone) AND (ts < '2016-03-10 01:10:00'::timestamp without time zone))<br /><br /><br /> there is no indexon the ts-column.<br /><br /><br /><br /><br /><br /><br /><br /><br /> ><br /> > Rows Removedby Filter: 3108811<br /> ><br /> > -> Index Scan using connection_pkey on connection <br /> >(cost=0.43..7.53<br /> > rows=1 width=24) (actual time=0.009..0.010 rows=1 loops=2156)<br /> ><br /> > Index Cond: ((uid)::text = (connection_1.uid)::text)<br /> ><br /> > Planning time: 0.220 ms<br /> ><br/> > Trigger for constraint dns_uid_fkey: time=133.046 calls=2156<br /> ><br /> > Trigger for constraintfiles_uid_fkey: time=39780.799 calls=2156<br /> ><br /> > Trigger for constraint http_uid_fkey: time=99300.851calls=2156<br /> ><br /> > Trigger for constraint notice_uid_fkey: time=128.653 calls=2156<br /> ><br/> > Trigger for constraint snmp_uid_fkey: time=59.491 calls=2156<br /> ><br /> > Trigger for constraintssl_uid_fkey: time=74.052 calls=2156<br /> ><br /> > Trigger for constraint weird_uid_fkey: time=25868.651calls=2156<br /> ><br /> > Execution time: 165880.419 ms<br /><br /> i guess there are no indexes forthis tables and the relevant columns<br /><br /><br /> > I think you are right, fk seem to take the biggest chunk oftime from<br /> > the hole delete operation. I made a test with 10.000 rows, it took 12<br /> > minutes. 20.000 rowstook about 25 minutes to delete.<br /><br /> create the missing indexes now and come back with the new duration.<br /><br/><br /> Andreas<br /> --<br /> Really, I'm not out to destroy Microsoft. That will just be a completely<br /> unintentionalside effect. (Linus Torvalds)<br /> "If I was god, I would recompile penguin with--enable-fly." (unknown)<br /> Kaufbach, Saxony, Germany, Europe. N 51.05082°, E 13.56889°<br /><br /><br/> --<br /> Sent via pgsql-sql mailing list (<a href="mailto:pgsql-sql@postgresql.org">pgsql-sql@postgresql.org</a>)<br/> To make changes to your subscription:<br /><a href="http://www.postgresql.org/mailpref/pgsql-sql"rel="noreferrer" target="_blank">http://www.postgresql.org/mailpref/pgsql-sql</a><br/><br /><br /><br /> --<br /> Sent via pgsql-sql mailinglist (<a href="mailto:pgsql-sql@postgresql.org">pgsql-sql@postgresql.org</a>)<br /> To make changes to your subscription:<br/><a href="http://www.postgresql.org/mailpref/pgsql-sql" rel="noreferrer" target="_blank">http://www.postgresql.org/mailpref/pgsql-sql</a><br/></div></div></blockquote></div><br /></div>
ivo liondov <ivo.liondov@gmail.com> wrote: > Took less than 2 minutes. Thanks for the help. you are welcome. Andreas -- Really, I'm not out to destroy Microsoft. That will just be a completely unintentional side effect. (Linus Torvalds) "If I was god, I would recompile penguin with --enable-fly." (unknown) Kaufbach, Saxony, Germany, Europe. N 51.05082°, E 13.56889°