Thread: DELETE with filter on ctid

DELETE with filter on ctid

From
"Spiegelberg, Greg"
Date:
We have a query which generates a small set of rows (~1,000) which are to be used in a DELETE on the same table.  The problem we have is that we need to join on 5 different columns and it takes far too long. 
I have a solution but I'm not sure it's the right one.  Instead of joining on 5 columns in the DELETE the join uses the ctid column.
 
BEGIN;
CREATE INDEX gregs_table_ctid_idx ON gregs_table(ctid);
DELETE FROM gregs_table gt
   USING (SELECT ctid FROM gregs_table WHERE ...) as s
   WHERE gt.ctid=s.ctid;
DROP INDEX gregs_table_ctid_idx;
COMMIT;
 
The difference to me is a 20+ minute to a ~5 second transaction.  The table is loaded using COPY, never INSERT, never UPDATE'd.  COPY, SELECT and DELETE is its life.  PostgreSQL 8.2.1 on RedHat ES 4.0 is the target platform.
 
Any possible issues with using ctid in the DELETE and transaction?  I understand ctid is "useless" in the long run as the documentation points out but for the short term and within a transaction it seems to work well.
 
Thoughts?
 
Greg
 
 
--
 Greg Spiegelberg
 614.318.4314, office
 614.431.8388, fax
 ISOdx Product Development Manager
 Cranel, Inc.
 
 

Re: DELETE with filter on ctid

From
Tom Lane
Date:
"Spiegelberg, Greg" <gspiegelberg@cranel.com> writes:
> We have a query which generates a small set of rows (~1,000) which are
> to be used in a DELETE on the same table.  The problem we have is that
> we need to join on 5 different columns and it takes far too long.  I
> have a solution but I'm not sure it's the right one.  Instead of joining
> on 5 columns in the DELETE the join uses the ctid column.

> BEGIN;
> CREATE INDEX gregs_table_ctid_idx ON gregs_table(ctid);
> DELETE FROM gregs_table gt
>    USING (SELECT ctid FROM gregs_table WHERE ...) as s
>    WHERE gt.ctid=s.ctid;
> DROP INDEX gregs_table_ctid_idx;
> COMMIT;

Forget the index, it's useless here (hint: ctid is a physical address).
I'm wondering though why you don't just transpose the subquery's WHERE
condition into the DELETE's WHERE?  Or is this example oversimplified?

            regards, tom lane

Re: DELETE with filter on ctid

From
"Craig A. James"
Date:
Spiegelberg, Greg wrote:
> We have a query which generates a small set of rows (~1,000) which are
> to be used in a DELETE on the same table.  The problem we have is that
> we need to join on 5 different columns and it takes far too long.

You may have encountered the same problem I did:  You *must* run ANALYZE on a temporary table before you use in another
query. It's surprising that this is true even for very small tables (a few hundred to a few thousand rows), but it is.
Ihad a case where I created a "scratch" table like yours, and the before/after ANALYZE performance was the difference
between30 seconds and a few milliseconds for the same query. 

Craig

Re: DELETE with filter on ctid

From
"Spiegelberg, Greg"
Date:
Tom et al,

Sometimes it takes a look from someone on the outside to get the job
done right.

Below is, I believe, everything pertinent to this problem.  First is the
table in question, second is the problematic and original query, and
final is the transaction that I have working today with the CTID
implementation.

I would welcome any feedback.

TIA,
Greg



cranel=# \d sid2.data_id_table
        Table "sid2.data_id_table"
   Column    |  Type   |   Modifiers
-------------+---------+---------------
 point_id    | bigint  |
 dtype_id    | bigint  |
 segment_id  | bigint  |
 key1_id     | bigint  | not null
 key2_id     | bigint  |
 data_id     | bigint  | not null
 deleted     | boolean | default false
 removed     | boolean | default false
 added       | boolean | default false
 persist     | boolean | default false
Indexes:
    "data_id_table_data_id_indx" btree (data_id)
    "data_id_table_dtype_id_indx" btree (dtype_id)
    "data_id_table_dtype_ss_id_indx" btree (dtype_id, point_id)
    "data_id_table_key1_id_indx" btree (key1_id)
    "data_id_table_key2_id_indx" btree (key2_id)
    "data_id_table_mod_dtype_ss_id_indx" btree (segment_id, dtype_id,
point_id)
    "data_id_table_segment_id_indx" btree (segment_id)
    "data_id_table_point_id_indx" btree (point_id)

cranel=# explain analyze
DELETE FROM sid2.data_id_table AS dd
 USING public.points AS ss,
       (SELECT markeddel.*
          FROM (SELECT d.*
                  FROM sid2.data_id_table d,public.points s
                 WHERE s.systems_id=2 AND s.id<2 AND s.permpoint=FALSE
AND s.id=d.point_id AND d.persist=FALSE
                   AND d.dtype_id=3) AS markeddel
               JOIN
               (SELECT DISTINCT ON (d.key1_id,d.key2_id) d.*
                  FROM sid2.data_id_table d,public.points s
                 WHERE s.systems_id=2 AND s.id<=2 AND s.id=d.point_id
AND d.dtype_id=3
                 ORDER BY d.key1_id,d.key2_id,d.point_id DESC) AS rollup
               ON
(markeddel.key1_id,markeddel.key2_id)=(rollup.key1_id,rollup.key2_id)
         WHERE markeddel.point_id<>rollup.point_id) ru
 WHERE ss.systems_id=2 AND ss.id<2 AND ss.permpoint=FALSE AND
ss.id=dd.point_id
   AND dd.persist=FALSE AND dd.dtype_id=3
   AND
(dd.point_id,dd.key1_id,dd.key2_id)=(ru.point_id,ru.key1_id,ru.key2_id);

QUERY PLAN

------------------------------------------------------------------------
------------------------------------------------------------------------
------------------------------------
 Nested Loop  (cost=1037.06..1130.46 rows=1 width=6) (actual
time=33291.639..678047.543 rows=564 loops=1)
   Join Filter: ((dd.point_id = d.point_id) AND (d.point_id <>
rollup.point_id))
   ->  Merge Join  (cost=1028.10..1117.47 rows=1 width=70) (actual
time=1775.971..3721.991 rows=156750 loops=1)
         Merge Cond: ((rollup.key1_id = dd.key1_id) AND (rollup.key2_id
= dd.key2_id))
         ->  Unique  (cost=629.66..659.24 rows=3944 width=52) (actual
time=896.293..1571.591 rows=156779 loops=1)
               ->  Sort  (cost=629.66..639.52 rows=3944 width=52)
(actual time=896.285..1080.444 rows=157342 loops=1)
                     Sort Key: d.key1_id, d.key2_id, d.point_id
                     ->  Nested Loop  (cost=0.00..394.10 rows=3944
width=52) (actual time=8.846..529.901 rows=157352 loops=1)
                           ->  Seq Scan on points s  (cost=0.00..1.72
rows=1 width=8) (actual time=0.064..0.096 rows=2 loops=1)
                                 Filter: ((systems_id = 2) AND (id <=
2))
                           ->  Index Scan using
data_id_table_point_id_indx on data_id_table d  (cost=0.00..339.79
rows=4207 width=52) (actual time=4.649..155.174 rows=78676 loops=2)
                                 Index Cond: (s.id = d.point_id)
                                 Filter: (dtype_id = 3)
         ->  Sort  (cost=398.44..398.64 rows=82 width=46) (actual
time=879.658..1109.830 rows=156750 loops=1)
               Sort Key: dd.key1_id, dd.key2_id
               ->  Nested Loop  (cost=0.00..395.83 rows=82 width=46)
(actual time=5.197..549.873 rows=156750 loops=1)
                     ->  Nested Loop  (cost=0.00..3.45 rows=1 width=16)
(actual time=0.055..0.107 rows=1 loops=1)
                           Join Filter: (ss.id = s.id)
                           ->  Seq Scan on points ss  (cost=0.00..1.72
rows=1 width=8) (actual time=0.037..0.052 rows=1 loops=1)
                                 Filter: ((systems_id = 2) AND (id < 2)
AND (NOT permpoint))
                           ->  Seq Scan on points s  (cost=0.00..1.72
rows=1 width=8) (actual time=0.006..0.039 rows=1 loops=1)
                                 Filter: ((systems_id = 2) AND (id < 2)
AND (NOT permpoint))
                     ->  Index Scan using data_id_table_point_id_indx on
data_id_table dd  (cost=0.00..339.79 rows=4207 width=30) (actual
time=5.135..342.406 rows=156750 loops=1)
                           Index Cond: (ss.id = dd.point_id)
                           Filter: ((NOT persist) AND (dtype_id = 3))
   ->  Bitmap Heap Scan on data_id_table d  (cost=8.96..12.97 rows=1
width=24) (actual time=4.289..4.290 rows=1 loops=156750)
         Recheck Cond: ((d.key1_id = rollup.key1_id) AND (d.key2_id =
rollup.key2_id))
         Filter: ((NOT persist) AND (dtype_id = 3))
         ->  BitmapAnd  (cost=8.96..8.96 rows=1 width=0) (actual
time=4.280..4.280 rows=0 loops=156750)
               ->  Bitmap Index Scan on data_id_table_key1_id_indx
(cost=0.00..4.32 rows=4 width=0) (actual time=0.020..0.020 rows=31
loops=156750)
                     Index Cond: (d.key1_id = rollup.key1_id)
               ->  Bitmap Index Scan on data_id_table_key2_id_indx
(cost=0.00..4.38 rows=13 width=0) (actual time=4.254..4.254 rows=26187
loops=156750)
                     Index Cond: (d.key2_id = rollup.key2_id)
 Total runtime: 678063.873 ms
(34 rows)

cranel=# \timing
Timing is on.
cranel=# BEGIN;
BEGIN
Time: 0.340 ms

cranel=# CREATE INDEX data_id_table_ctid_idx ON
sid2.data_id_table(ctid);
CREATE INDEX
Time: 648.911 ms

cranel=# explain analyze
DELETE FROM sid2.data_id_table AS dd
 USING public.points AS ss,
       (SELECT markeddel.ctid
          FROM (SELECT d.ctid,d.*
                  FROM sid2.data_id_table d,public.points s
                 WHERE s.systems_id=2
                   AND s.id<2
                   AND s.permpoint=FALSE
                   AND s.id=d.point_id
                   AND d.persist=FALSE
                   AND d.dtype_id=3) AS markeddel
               JOIN
               (SELECT DISTINCT ON (d.key1_id,d.key2_id) d.*
                  FROM sid2.data_id_table d,public.points s
                 WHERE s.systems_id=2
                   AND s.id<=2
                   AND s.id=d.point_id
                   AND d.dtype_id=3
                 ORDER BY d.key1_id,d.key2_id,d.point_id DESC) AS rollup
               ON
(markeddel.key1_id,markeddel.key2_id)=(rollup.key1_id,rollup.key2_id)
         WHERE markeddel.point_id<>rollup.point_id) ru
 WHERE ss.systems_id=2
   AND ss.id<2
   AND ss.permpoint=FALSE
   AND ss.id=dd.point_id
   AND dd.persist=FALSE
   AND dd.dtype_id=3
   AND dd.ctid=ru.ctid;

QUERY PLAN

------------------------------------------------------------------------
------------------------------------------------------------------------
------------------------------------------
 Nested Loop  (cost=1259.33..1378.37 rows=1 width=6) (actual
time=1807.429..2625.722 rows=562 loops=1)
   ->  Nested Loop  (cost=1259.33..1378.08 rows=1 width=14) (actual
time=1807.372..2619.592 rows=562 loops=1)
         ->  Merge Join  (cost=1259.33..1377.66 rows=1 width=6) (actual
time=1807.228..2606.901 rows=562 loops=1)
               Merge Cond: ((rollup.key1_id = d.key1_id) AND
(rollup.key2_id = d.key2_id))
               Join Filter: (d.point_id <> rollup.point_id)
               ->  Unique  (cost=629.66..659.24 rows=3944 width=52)
(actual time=911.409..1271.121 rows=156779 loops=1)
                     ->  Sort  (cost=629.66..639.52 rows=3944 width=52)
(actual time=911.403..1024.775 rows=157342 loops=1)
                           Sort Key: d.key1_id, d.key2_id, d.point_id
                           ->  Nested Loop  (cost=0.00..394.10 rows=3944
width=52) (actual time=6.036..548.119 rows=157352 loops=1)
                                 ->  Seq Scan on points s
(cost=0.00..1.72 rows=1 width=8) (actual time=0.114..0.137 rows=2
loops=1)
                                       Filter: ((systems_id = 2) AND (id
<= 2))
                                 ->  Index Scan using
data_id_table_point_id_indx on data_id_table d  (cost=0.00..339.79
rows=4207 width=52) (actual time=3.216..155.284
rows=78676 loops=2)
                                       Index Cond: (s.id = d.point_id)
                                       Filter: (dtype_id = 3)
               ->  Sort  (cost=629.66..639.52 rows=3944 width=30)
(actual time=875.213..980.618 rows=156750 loops=1)
                     Sort Key: d.key1_id, d.key2_id
                     ->  Nested Loop  (cost=0.00..394.10 rows=3944
width=30) (actual time=5.864..553.290 rows=156750 loops=1)
                           ->  Seq Scan on points s  (cost=0.00..1.72
rows=1 width=8) (actual time=0.022..0.053 rows=1 loops=1)
                                 Filter: ((systems_id = 2) AND (id < 2)
AND (NOT permpoint))
                           ->  Index Scan using
data_id_table_point_id_indx on data_id_table d  (cost=0.00..339.79
rows=4207 width=30) (actual time=5.831..355.139 rows=156750 loops=1)
                                 Index Cond: (s.id = d.point_id)
                                 Filter: ((NOT persist) AND (dtype_id =
3))
         ->  Index Scan using data_id_table_ctid_idx on data_id_table dd
(cost=0.00..0.41 rows=1 width=14) (actual time=0.017..0.019 rows=1
loops=562)
               Index Cond: (dd.ctid = d.ctid)
               Filter: ((NOT persist) AND (dtype_id = 3))
   ->  Index Scan using points_pkey on points ss  (cost=0.00..0.28
rows=1 width=8) (actual time=0.005..0.007 rows=1 loops=562)
         Index Cond: ((ss.id < 2) AND (ss.id = dd.point_id))
         Filter: ((systems_id = 2) AND (NOT permpoint))
 Total runtime: 2641.820 ms
(29 rows)
Time: 2652.940 ms

cranel=# DROP INDEX data_id_table_ctid_idx;
DROP INDEX
Time: 33.653 ms

cranel=# DELETE FROM sid2.data_id_table AS dd WHERE dd.point_id=2 AND
dd.dtype_id=3 AND dd.deleted AND NOT dd.persist;
DELETE 0
Time: 0.960 ms

cranel=# COMMIT;
Time: 20.500 ms






-----Original Message-----
From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
Sent: Monday, April 09, 2007 4:55 PM
To: Spiegelberg, Greg
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] DELETE with filter on ctid

"Spiegelberg, Greg" <gspiegelberg@cranel.com> writes:
> We have a query which generates a small set of rows (~1,000) which are
> to be used in a DELETE on the same table.  The problem we have is that
> we need to join on 5 different columns and it takes far too long.  I
> have a solution but I'm not sure it's the right one.  Instead of
joining
> on 5 columns in the DELETE the join uses the ctid column.

> BEGIN;
> CREATE INDEX gregs_table_ctid_idx ON gregs_table(ctid);
> DELETE FROM gregs_table gt
>    USING (SELECT ctid FROM gregs_table WHERE ...) as s
>    WHERE gt.ctid=s.ctid;
> DROP INDEX gregs_table_ctid_idx;
> COMMIT;

Forget the index, it's useless here (hint: ctid is a physical address).
I'm wondering though why you don't just transpose the subquery's WHERE
condition into the DELETE's WHERE?  Or is this example oversimplified?

            regards, tom lane

Re: DELETE with filter on ctid

From
"Spiegelberg, Greg"
Date:
Craig,

I'm not using a TEMP TABLE in this DELETE however I have tried an
ANALYZE prior to the DELETE but it hardly makes a dent in the time.

Please look at the other follow-up email I just sent for full details.

Greg


-----Original Message-----
From: Craig A. James [mailto:cjames@modgraph-usa.com]
Sent: Monday, April 09, 2007 5:58 PM
To: Spiegelberg, Greg
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] DELETE with filter on ctid

Spiegelberg, Greg wrote:
> We have a query which generates a small set of rows (~1,000) which are

> to be used in a DELETE on the same table.  The problem we have is that

> we need to join on 5 different columns and it takes far too long.

You may have encountered the same problem I did:  You *must* run ANALYZE
on a temporary table before you use in another query.  It's surprising
that this is true even for very small tables (a few hundred to a few
thousand rows), but it is.  I had a case where I created a "scratch"
table like yours, and the before/after ANALYZE performance was the
difference between 30 seconds and a few milliseconds for the same query.

Craig

Re: DELETE with filter on ctid

From
Tom Lane
Date:
"Spiegelberg, Greg" <gspiegelberg@cranel.com> writes:
> Below is, I believe, everything pertinent to this problem.  First is the
> table in question, second is the problematic and original query, and
> final is the transaction that I have working today with the CTID
> implementation.

So the basic issue here is that data_id_table hasn't got a primary key
you could use as a join key?  I won't lecture you about that, but a lot
of people think it's bad practice not to have a recognizable primary key.

The slow query's problem seems to be mostly that the rowcount estimates
are horribly bad, leading to inappropriate choices of nestloop joins.
Are the statistics up-to-date?  You might try increasing the stats target
for data_id_table in particular.  A really brute-force test would be to
see what happens with that query if you just set enable_nestloop = 0.

As for the CTID query, my initial reaction that you shouldn't need an
index was wrong; looking into the code I see

 * There is currently no special support for joins involving CTID; in
 * particular nothing corresponding to best_inner_indexscan().    Since it's
 * not very useful to store TIDs of one table in another table, there
 * doesn't seem to be enough use-case to justify adding a lot of code
 * for that.

Maybe we should revisit that sometime, though I'm still not entirely
convinced by this example.

            regards, tom lane