Thread: DELETE with filter on ctid
We have a query which generates a small set of rows (~1,000) which are to be used in a DELETE on the same table. The problem we have is that we need to join on 5 different columns and it takes far too long.
I have a solution but I'm not sure it's the right one. Instead of joining on 5 columns in the DELETE the join uses the ctid column.BEGIN;
CREATE INDEX gregs_table_ctid_idx ON gregs_table(ctid);
DELETE FROM gregs_table gt
USING (SELECT ctid FROM gregs_table WHERE ...) as s
WHERE gt.ctid=s.ctid;
DROP INDEX gregs_table_ctid_idx;
COMMIT;
The difference to me is a 20+ minute to a ~5 second transaction. The table is loaded using COPY, never INSERT, never UPDATE'd. COPY, SELECT and DELETE is its life. PostgreSQL 8.2.1 on RedHat ES 4.0 is the target platform.
Any possible issues with using ctid in the DELETE and transaction? I understand ctid is "useless" in the long run as the documentation points out but for the short term and within a transaction it seems to work well.
Thoughts?
Greg
--
Greg Spiegelberg
614.318.4314, office
614.431.8388, fax
ISOdx Product Development Manager
Cranel, Inc.
"Spiegelberg, Greg" <gspiegelberg@cranel.com> writes: > We have a query which generates a small set of rows (~1,000) which are > to be used in a DELETE on the same table. The problem we have is that > we need to join on 5 different columns and it takes far too long. I > have a solution but I'm not sure it's the right one. Instead of joining > on 5 columns in the DELETE the join uses the ctid column. > BEGIN; > CREATE INDEX gregs_table_ctid_idx ON gregs_table(ctid); > DELETE FROM gregs_table gt > USING (SELECT ctid FROM gregs_table WHERE ...) as s > WHERE gt.ctid=s.ctid; > DROP INDEX gregs_table_ctid_idx; > COMMIT; Forget the index, it's useless here (hint: ctid is a physical address). I'm wondering though why you don't just transpose the subquery's WHERE condition into the DELETE's WHERE? Or is this example oversimplified? regards, tom lane
Spiegelberg, Greg wrote: > We have a query which generates a small set of rows (~1,000) which are > to be used in a DELETE on the same table. The problem we have is that > we need to join on 5 different columns and it takes far too long. You may have encountered the same problem I did: You *must* run ANALYZE on a temporary table before you use in another query. It's surprising that this is true even for very small tables (a few hundred to a few thousand rows), but it is. Ihad a case where I created a "scratch" table like yours, and the before/after ANALYZE performance was the difference between30 seconds and a few milliseconds for the same query. Craig
Tom et al, Sometimes it takes a look from someone on the outside to get the job done right. Below is, I believe, everything pertinent to this problem. First is the table in question, second is the problematic and original query, and final is the transaction that I have working today with the CTID implementation. I would welcome any feedback. TIA, Greg cranel=# \d sid2.data_id_table Table "sid2.data_id_table" Column | Type | Modifiers -------------+---------+--------------- point_id | bigint | dtype_id | bigint | segment_id | bigint | key1_id | bigint | not null key2_id | bigint | data_id | bigint | not null deleted | boolean | default false removed | boolean | default false added | boolean | default false persist | boolean | default false Indexes: "data_id_table_data_id_indx" btree (data_id) "data_id_table_dtype_id_indx" btree (dtype_id) "data_id_table_dtype_ss_id_indx" btree (dtype_id, point_id) "data_id_table_key1_id_indx" btree (key1_id) "data_id_table_key2_id_indx" btree (key2_id) "data_id_table_mod_dtype_ss_id_indx" btree (segment_id, dtype_id, point_id) "data_id_table_segment_id_indx" btree (segment_id) "data_id_table_point_id_indx" btree (point_id) cranel=# explain analyze DELETE FROM sid2.data_id_table AS dd USING public.points AS ss, (SELECT markeddel.* FROM (SELECT d.* FROM sid2.data_id_table d,public.points s WHERE s.systems_id=2 AND s.id<2 AND s.permpoint=FALSE AND s.id=d.point_id AND d.persist=FALSE AND d.dtype_id=3) AS markeddel JOIN (SELECT DISTINCT ON (d.key1_id,d.key2_id) d.* FROM sid2.data_id_table d,public.points s WHERE s.systems_id=2 AND s.id<=2 AND s.id=d.point_id AND d.dtype_id=3 ORDER BY d.key1_id,d.key2_id,d.point_id DESC) AS rollup ON (markeddel.key1_id,markeddel.key2_id)=(rollup.key1_id,rollup.key2_id) WHERE markeddel.point_id<>rollup.point_id) ru WHERE ss.systems_id=2 AND ss.id<2 AND ss.permpoint=FALSE AND ss.id=dd.point_id AND dd.persist=FALSE AND dd.dtype_id=3 AND (dd.point_id,dd.key1_id,dd.key2_id)=(ru.point_id,ru.key1_id,ru.key2_id); QUERY PLAN ------------------------------------------------------------------------ ------------------------------------------------------------------------ ------------------------------------ Nested Loop (cost=1037.06..1130.46 rows=1 width=6) (actual time=33291.639..678047.543 rows=564 loops=1) Join Filter: ((dd.point_id = d.point_id) AND (d.point_id <> rollup.point_id)) -> Merge Join (cost=1028.10..1117.47 rows=1 width=70) (actual time=1775.971..3721.991 rows=156750 loops=1) Merge Cond: ((rollup.key1_id = dd.key1_id) AND (rollup.key2_id = dd.key2_id)) -> Unique (cost=629.66..659.24 rows=3944 width=52) (actual time=896.293..1571.591 rows=156779 loops=1) -> Sort (cost=629.66..639.52 rows=3944 width=52) (actual time=896.285..1080.444 rows=157342 loops=1) Sort Key: d.key1_id, d.key2_id, d.point_id -> Nested Loop (cost=0.00..394.10 rows=3944 width=52) (actual time=8.846..529.901 rows=157352 loops=1) -> Seq Scan on points s (cost=0.00..1.72 rows=1 width=8) (actual time=0.064..0.096 rows=2 loops=1) Filter: ((systems_id = 2) AND (id <= 2)) -> Index Scan using data_id_table_point_id_indx on data_id_table d (cost=0.00..339.79 rows=4207 width=52) (actual time=4.649..155.174 rows=78676 loops=2) Index Cond: (s.id = d.point_id) Filter: (dtype_id = 3) -> Sort (cost=398.44..398.64 rows=82 width=46) (actual time=879.658..1109.830 rows=156750 loops=1) Sort Key: dd.key1_id, dd.key2_id -> Nested Loop (cost=0.00..395.83 rows=82 width=46) (actual time=5.197..549.873 rows=156750 loops=1) -> Nested Loop (cost=0.00..3.45 rows=1 width=16) (actual time=0.055..0.107 rows=1 loops=1) Join Filter: (ss.id = s.id) -> Seq Scan on points ss (cost=0.00..1.72 rows=1 width=8) (actual time=0.037..0.052 rows=1 loops=1) Filter: ((systems_id = 2) AND (id < 2) AND (NOT permpoint)) -> Seq Scan on points s (cost=0.00..1.72 rows=1 width=8) (actual time=0.006..0.039 rows=1 loops=1) Filter: ((systems_id = 2) AND (id < 2) AND (NOT permpoint)) -> Index Scan using data_id_table_point_id_indx on data_id_table dd (cost=0.00..339.79 rows=4207 width=30) (actual time=5.135..342.406 rows=156750 loops=1) Index Cond: (ss.id = dd.point_id) Filter: ((NOT persist) AND (dtype_id = 3)) -> Bitmap Heap Scan on data_id_table d (cost=8.96..12.97 rows=1 width=24) (actual time=4.289..4.290 rows=1 loops=156750) Recheck Cond: ((d.key1_id = rollup.key1_id) AND (d.key2_id = rollup.key2_id)) Filter: ((NOT persist) AND (dtype_id = 3)) -> BitmapAnd (cost=8.96..8.96 rows=1 width=0) (actual time=4.280..4.280 rows=0 loops=156750) -> Bitmap Index Scan on data_id_table_key1_id_indx (cost=0.00..4.32 rows=4 width=0) (actual time=0.020..0.020 rows=31 loops=156750) Index Cond: (d.key1_id = rollup.key1_id) -> Bitmap Index Scan on data_id_table_key2_id_indx (cost=0.00..4.38 rows=13 width=0) (actual time=4.254..4.254 rows=26187 loops=156750) Index Cond: (d.key2_id = rollup.key2_id) Total runtime: 678063.873 ms (34 rows) cranel=# \timing Timing is on. cranel=# BEGIN; BEGIN Time: 0.340 ms cranel=# CREATE INDEX data_id_table_ctid_idx ON sid2.data_id_table(ctid); CREATE INDEX Time: 648.911 ms cranel=# explain analyze DELETE FROM sid2.data_id_table AS dd USING public.points AS ss, (SELECT markeddel.ctid FROM (SELECT d.ctid,d.* FROM sid2.data_id_table d,public.points s WHERE s.systems_id=2 AND s.id<2 AND s.permpoint=FALSE AND s.id=d.point_id AND d.persist=FALSE AND d.dtype_id=3) AS markeddel JOIN (SELECT DISTINCT ON (d.key1_id,d.key2_id) d.* FROM sid2.data_id_table d,public.points s WHERE s.systems_id=2 AND s.id<=2 AND s.id=d.point_id AND d.dtype_id=3 ORDER BY d.key1_id,d.key2_id,d.point_id DESC) AS rollup ON (markeddel.key1_id,markeddel.key2_id)=(rollup.key1_id,rollup.key2_id) WHERE markeddel.point_id<>rollup.point_id) ru WHERE ss.systems_id=2 AND ss.id<2 AND ss.permpoint=FALSE AND ss.id=dd.point_id AND dd.persist=FALSE AND dd.dtype_id=3 AND dd.ctid=ru.ctid; QUERY PLAN ------------------------------------------------------------------------ ------------------------------------------------------------------------ ------------------------------------------ Nested Loop (cost=1259.33..1378.37 rows=1 width=6) (actual time=1807.429..2625.722 rows=562 loops=1) -> Nested Loop (cost=1259.33..1378.08 rows=1 width=14) (actual time=1807.372..2619.592 rows=562 loops=1) -> Merge Join (cost=1259.33..1377.66 rows=1 width=6) (actual time=1807.228..2606.901 rows=562 loops=1) Merge Cond: ((rollup.key1_id = d.key1_id) AND (rollup.key2_id = d.key2_id)) Join Filter: (d.point_id <> rollup.point_id) -> Unique (cost=629.66..659.24 rows=3944 width=52) (actual time=911.409..1271.121 rows=156779 loops=1) -> Sort (cost=629.66..639.52 rows=3944 width=52) (actual time=911.403..1024.775 rows=157342 loops=1) Sort Key: d.key1_id, d.key2_id, d.point_id -> Nested Loop (cost=0.00..394.10 rows=3944 width=52) (actual time=6.036..548.119 rows=157352 loops=1) -> Seq Scan on points s (cost=0.00..1.72 rows=1 width=8) (actual time=0.114..0.137 rows=2 loops=1) Filter: ((systems_id = 2) AND (id <= 2)) -> Index Scan using data_id_table_point_id_indx on data_id_table d (cost=0.00..339.79 rows=4207 width=52) (actual time=3.216..155.284 rows=78676 loops=2) Index Cond: (s.id = d.point_id) Filter: (dtype_id = 3) -> Sort (cost=629.66..639.52 rows=3944 width=30) (actual time=875.213..980.618 rows=156750 loops=1) Sort Key: d.key1_id, d.key2_id -> Nested Loop (cost=0.00..394.10 rows=3944 width=30) (actual time=5.864..553.290 rows=156750 loops=1) -> Seq Scan on points s (cost=0.00..1.72 rows=1 width=8) (actual time=0.022..0.053 rows=1 loops=1) Filter: ((systems_id = 2) AND (id < 2) AND (NOT permpoint)) -> Index Scan using data_id_table_point_id_indx on data_id_table d (cost=0.00..339.79 rows=4207 width=30) (actual time=5.831..355.139 rows=156750 loops=1) Index Cond: (s.id = d.point_id) Filter: ((NOT persist) AND (dtype_id = 3)) -> Index Scan using data_id_table_ctid_idx on data_id_table dd (cost=0.00..0.41 rows=1 width=14) (actual time=0.017..0.019 rows=1 loops=562) Index Cond: (dd.ctid = d.ctid) Filter: ((NOT persist) AND (dtype_id = 3)) -> Index Scan using points_pkey on points ss (cost=0.00..0.28 rows=1 width=8) (actual time=0.005..0.007 rows=1 loops=562) Index Cond: ((ss.id < 2) AND (ss.id = dd.point_id)) Filter: ((systems_id = 2) AND (NOT permpoint)) Total runtime: 2641.820 ms (29 rows) Time: 2652.940 ms cranel=# DROP INDEX data_id_table_ctid_idx; DROP INDEX Time: 33.653 ms cranel=# DELETE FROM sid2.data_id_table AS dd WHERE dd.point_id=2 AND dd.dtype_id=3 AND dd.deleted AND NOT dd.persist; DELETE 0 Time: 0.960 ms cranel=# COMMIT; Time: 20.500 ms -----Original Message----- From: Tom Lane [mailto:tgl@sss.pgh.pa.us] Sent: Monday, April 09, 2007 4:55 PM To: Spiegelberg, Greg Cc: pgsql-performance@postgresql.org Subject: Re: [PERFORM] DELETE with filter on ctid "Spiegelberg, Greg" <gspiegelberg@cranel.com> writes: > We have a query which generates a small set of rows (~1,000) which are > to be used in a DELETE on the same table. The problem we have is that > we need to join on 5 different columns and it takes far too long. I > have a solution but I'm not sure it's the right one. Instead of joining > on 5 columns in the DELETE the join uses the ctid column. > BEGIN; > CREATE INDEX gregs_table_ctid_idx ON gregs_table(ctid); > DELETE FROM gregs_table gt > USING (SELECT ctid FROM gregs_table WHERE ...) as s > WHERE gt.ctid=s.ctid; > DROP INDEX gregs_table_ctid_idx; > COMMIT; Forget the index, it's useless here (hint: ctid is a physical address). I'm wondering though why you don't just transpose the subquery's WHERE condition into the DELETE's WHERE? Or is this example oversimplified? regards, tom lane
Craig, I'm not using a TEMP TABLE in this DELETE however I have tried an ANALYZE prior to the DELETE but it hardly makes a dent in the time. Please look at the other follow-up email I just sent for full details. Greg -----Original Message----- From: Craig A. James [mailto:cjames@modgraph-usa.com] Sent: Monday, April 09, 2007 5:58 PM To: Spiegelberg, Greg Cc: pgsql-performance@postgresql.org Subject: Re: [PERFORM] DELETE with filter on ctid Spiegelberg, Greg wrote: > We have a query which generates a small set of rows (~1,000) which are > to be used in a DELETE on the same table. The problem we have is that > we need to join on 5 different columns and it takes far too long. You may have encountered the same problem I did: You *must* run ANALYZE on a temporary table before you use in another query. It's surprising that this is true even for very small tables (a few hundred to a few thousand rows), but it is. I had a case where I created a "scratch" table like yours, and the before/after ANALYZE performance was the difference between 30 seconds and a few milliseconds for the same query. Craig
"Spiegelberg, Greg" <gspiegelberg@cranel.com> writes: > Below is, I believe, everything pertinent to this problem. First is the > table in question, second is the problematic and original query, and > final is the transaction that I have working today with the CTID > implementation. So the basic issue here is that data_id_table hasn't got a primary key you could use as a join key? I won't lecture you about that, but a lot of people think it's bad practice not to have a recognizable primary key. The slow query's problem seems to be mostly that the rowcount estimates are horribly bad, leading to inappropriate choices of nestloop joins. Are the statistics up-to-date? You might try increasing the stats target for data_id_table in particular. A really brute-force test would be to see what happens with that query if you just set enable_nestloop = 0. As for the CTID query, my initial reaction that you shouldn't need an index was wrong; looking into the code I see * There is currently no special support for joins involving CTID; in * particular nothing corresponding to best_inner_indexscan(). Since it's * not very useful to store TIDs of one table in another table, there * doesn't seem to be enough use-case to justify adding a lot of code * for that. Maybe we should revisit that sometime, though I'm still not entirely convinced by this example. regards, tom lane