Re: Does VACUUM reorder tables on clustered indices - Mailing list pgsql-sql

From Tom Lane
Subject Re: Does VACUUM reorder tables on clustered indices
Date
Msg-id 29359.1135005570@sss.pgh.pa.us
Whole thread Raw
In response to Re: Does VACUUM reorder tables on clustered indices  (Andrew Sullivan <ajs@crankycanuck.ca>)
List pgsql-sql
Andrew Sullivan <ajs@crankycanuck.ca> writes:
> On Sun, Dec 18, 2005 at 10:08:22PM -0500, Tom Lane wrote:
>> Just for the record, that behavior is seriously broken: it violates
>> MVCC if any of the deleted tuples are still visible to anyone else.

> Does it remove tuples that VACUUM FULL wouldn't?

Yes.  CLUSTER works on SnapshotNow, so it will remove committed-dead
tuples even if there are still open transactions that could see them.
Of course, said transactions couldn't be actively using the table
while the CLUSTER runs, because it takes an exclusive table lock.
But they *could* look at it afterwards.  Offhand I think you'd only
be likely to notice the difference if the open transactions were
SERIALIZABLE --- in READ COMMITTED mode, by the time they could look
at the clustered table, they'd likely be using a snapshot that postdates
the DELETE.

[ experiments a bit... ]  Hmm.  Actually, it's far worse than I
thought.  It looks like CLUSTER puts the tuples into the new table with
its own xid, which means that concurrent serializable transactions will
see the new table as completely empty!

<< session 1 >>

regression=# select * from int4_tbl;    f1
-------------          0     123456    -123456 2147483647-2147483647
(5 rows)

regression=# create index fooi on int4_tbl(f1);
CREATE INDEX
regression=# begin isolation level serializable;
BEGIN
regression=# select 2+2; -- establish transaction snapshot?column?
----------       4
(1 row)

<< session 2 >>

regression=# delete from int4_tbl where f1 = -123456;
DELETE 1
regression=# cluster fooi on int4_tbl;
CLUSTER

<< back to session 1 >>

regression=# select * from int4_tbl;f1
----
(0 rows)

regression=# commit;
COMMIT
regression=# select * from int4_tbl;    f1
--------------2147483647          0     123456 2147483647
(4 rows)

        regards, tom lane


pgsql-sql by date:

Previous
From: Andrew Sullivan
Date:
Subject: Re: How to Force Transactions to Process Serially on A Table
Next
From: "Ken Winter"
Date:
Subject: Re: Rule causes baffling error