Thread: BUG #13640: Delete isn't using available Index Only Scan

BUG #13640: Delete isn't using available Index Only Scan

From
finzelj@gmail.com
Date:
The following bug has been logged on the website:

Bug reference:      13640
Logged by:          Jeremy Finzel
Email address:      finzelj@gmail.com
PostgreSQL version: 9.3.9
Operating system:   Linux
Description:

I have a query running in Postgres 9.3.9 where I want to delete some records
from a temp table based on using an EXISTS clause that matches a specific
partial index condition I created. The following related query uses an Index
Only Scan on this partial index (abbreviated as 'conditions' below):

EXPLAIN
SELECT l.id
FROM temp_table l
WHERE NOT EXISTS
        (SELECT 1
          FROM customers cx
          WHERE cx.id = l.customer_id
            AND ( conditions ));

                                          QUERY PLAN

----------------------------------------------------------------------------------------------
 Nested Loop Anti Join  (cost=0.42..252440.38 rows=43549 width=4)
   ->  Seq Scan on temp_table l  (cost=0.00..1277.98 rows=87098 width=8)
   ->  Index Only Scan using customers__bad on customers cx
(cost=0.42..3.35     rows=1 width=4)
         Index Cond: (id = l.customer_id)
(4 rows)

Here is the actual delete query SQL. This doesn't but I am convinced should
use the same Index Only Scan as above, and I wonder if it's a bug? Notice
the higher cost:

DELETE
  FROM temp_table l
  WHERE EXISTS(SELECT 1
          FROM cnu.customers cx
          WHERE cx.id = l.customer_id
            AND ( conditions ));


                                           QUERY PLAN

------------------------------------------------------------------------------------------------
 Delete on temp_table l  (cost=0.42..495426.94 rows=43549 width=12)
   ->  Nested Loop Semi Join  (cost=0.42..495426.94 rows=43549 width=12)
         ->  Seq Scan on temp_table l  (cost=0.00..1277.98 rows=87098
width=10)
         ->  Index Scan using customers__bad on customers cx
(cost=0.42..6.67 rows=1 width=10)
               Index Cond: (id = l.customer_id)
(5 rows)

To show that it should be possible on delete to get the same plan, I had to
do this, and it gave me the plan I wanted, and was twice as fast as the
query above that uses an Index Scan instead of Index Only Scan:

WITH the_right_records AS
(SELECT l.id
FROM temp_table l
WHERE NOT EXISTS
        (SELECT 1
          FROM cnu.customers cx
          WHERE cx.id = l.customer_id
            AND ( conditions ))

DELETE FROM temp_table t
WHERE NOT EXISTS (SELECT 1
                  FROM the_right_records x
                  WHERE x.id = t.id);

                                              QUERY PLAN

------------------------------------------------------------------------------------------------------
 Delete on temp_table t  (cost=253855.72..256902.88 rows=43549 width=34)
   CTE the_right_records
     ->  Nested Loop Anti Join  (cost=0.42..252440.38 rows=43549 width=4)
           ->  Seq Scan on temp_table l  (cost=0.00..1277.98 rows=87098
width=8)
           ->  Index Only Scan using customers__bad on customers cx
(cost=0.42..3.35 rows=1 width=4)
                 Index Cond: (id = l.customer_id)
   ->  Hash Anti Join  (cost=1415.34..4462.50 rows=43549 width=34)
         Hash Cond: (t.id = x.id)
         ->  Seq Scan on temp_table t  (cost=0.00..1277.98 rows=87098
width=10)
         ->  Hash  (cost=870.98..870.98 rows=43549 width=32)
               ->  CTE Scan on the_right_records x  (cost=0.00..870.98
rows=43549 width=32)
(11 rows)

I've noticed this same behavior in other examples.

Thank you.

Re: BUG #13640: Delete isn't using available Index Only Scan

From
Tom Lane
Date:
finzelj@gmail.com writes:
> Here is the actual delete query SQL. This doesn't but I am convinced should
> use the same Index Only Scan as above, and I wonder if it's a bug?

No, unfortunately not: the DELETE requires that each table scan return
CTID (for possible EvalPlanQual rechecks), and index-only scans can't
return columns that aren't in the index.

Although come to think of it, we necessarily get the TID from the index
AM, so in principle it should be possible for an index-only scan to
provide that column.  But that's an unimplemented feature, not a bug.

            regards, tom lane