Thread: DELETE ... WHERE ctid IN (...) vs. Iteration

From
Jonathan Gardner
Date:

I guess this may have come up before, but now that IN performs better
in 7.4, it may be time to revisit this topic.

Compare these two algorithms (in plpgsql):

(a)
DELETE FROM foo WHERE ctid IN (
    SELECT foo.ctid
    FROM ... WHERE ...
);

(b)
FOR result IN SELECT foo.ctid FROM ... WHERE ... LOOP
    DELETE FROM foo WHERE ctid = result.ctid;  -- result is a RECORD, so name the field
END LOOP;

My poor understanding of how the IN operator works leads me to believe
that for a large set of data in the IN group, a hash is used and a
table scan is done on foo.  However, for a small set of data in the IN
group, no table scan is performed.

I assume that (a) works at O(ln(N)) for large N, and O(N) for small N,
while (b) works at O(N) universally. Therefore, (a) is the superior
algorithm. I welcome criticism and correction.
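
One way to test this rather than guess is to time both forms against the same data. The following is only a minimal sketch: the table layout, row count, predicate, and the function name delete_by_loop are all hypothetical, and it assumes a server newer than 7.4, since generate_series and dollar quoting are not available there. It shows the plan and run time for (a), then the total run time for (b), and makes no claim about which one wins.

```sql
-- Hypothetical test table; names, row count, and predicate are assumptions.
CREATE TABLE foo (id integer PRIMARY KEY, payload text);
INSERT INTO foo
    SELECT g, 'row ' || g::text FROM generate_series(1, 100000) AS g;
ANALYZE foo;

-- (a) Single statement. EXPLAIN ANALYZE really executes the DELETE,
-- so run it inside a transaction and roll back to keep the rows.
BEGIN;
EXPLAIN ANALYZE
DELETE FROM foo WHERE ctid IN (
    SELECT foo.ctid FROM foo WHERE id % 10 = 0
);
ROLLBACK;

-- (b) Row-by-row loop, wrapped in a plpgsql function so it can be timed.
CREATE FUNCTION delete_by_loop() RETURNS integer AS $$
DECLARE
    result RECORD;
    n integer := 0;
BEGIN
    FOR result IN SELECT foo.ctid FROM foo WHERE id % 10 = 0 LOOP
        DELETE FROM foo WHERE ctid = result.ctid;
        n := n + 1;
    END LOOP;
    RETURN n;  -- number of DELETE statements issued by the loop
END;
$$ LANGUAGE plpgsql;

BEGIN;
EXPLAIN ANALYZE SELECT delete_by_loop();
ROLLBACK;
```

The plan printed for (a) shows directly whether the planner chose a hashed subplan, a join, or something else for the ctid comparison, which is the part the complexity estimate above depends on. Repeating the run with different predicates (a handful of matching rows versus most of the table) would cover both the small-N and large-N cases.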

--
Jonathan Gardner
jgardner@jonathangardner.net
Live Free, Use Linux!