Thread: Does VACUUM reorder tables on clustered indices
Does VACUUMing reorder tables on clustered indices or is it only the CLUSTER command that can do that? / r church ---------------------------------------------------------------- This message was sent using IMP, the Internet Messaging Program.
On Sun, 18 Dec 2005, frank church wrote: > > Does VACUUMing reorder tables on clustered indices or is it only the CLUSTER > command that can do that? Cluster does that. Vacuum only cleans dead tuples from the tables. -- 18:02:25 up 4 days, 9:57, 5 users, load average: 1.59, 1.57, 1.62 --------------------------------------------------------- Lic. Martín Marqués | SELECT 'mmarques' || Centro de Telemática | '@' || 'unl.edu.ar'; Universidad Nacional | DBA, Programador, del Litoral | Administrador ---------------------------------------------------------
Martin Marques escribió: > On Sun, 18 Dec 2005, frank church wrote: > > > > >Does VACUUMing reorder tables on clustered indices or is it only the > >CLUSTER > >command that can do that? > > Cluster does that. Vacuum only cleans dead tuples from the tables. Note that while reordering, CLUSTER also gets rid of dead tuples, so if you cluster you don't need to vacuum. -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
Alvaro Herrera <alvherre@commandprompt.com> writes: > Note that while reordering, CLUSTER also gets rid of dead tuples, so if > you cluster you don't need to vacuum. Just for the record, that behavior is seriously broken: it violates MVCC if any of the deleted tuples are still visible to anyone else. regards, tom lane
Tom Lane escribió: > Alvaro Herrera <alvherre@commandprompt.com> writes: > > Note that while reordering, CLUSTER also gets rid of dead tuples, so if > > you cluster you don't need to vacuum. > > Just for the record, that behavior is seriously broken: it violates > MVCC if any of the deleted tuples are still visible to anyone else. Well, FWIW, the fact that the CLUSTER command exists at all is broken IMHO ... but meanwhile, are you suggesting that CLUSTER should be fixed to retain tuples that are visible to running transactions? -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
On Sun, Dec 18, 2005 at 10:08:22PM -0500, Tom Lane wrote: > > Just for the record, that behavior is seriously broken: it violates > MVCC if any of the deleted tuples are still visible to anyone else. Does it remove tuples that VACUUM FULL wouldn't? I always thought it did essentially the same thing? A -- Andrew Sullivan | ajs@crankycanuck.ca I remember when computers were frustrating because they *did* exactly what you told them to. That actually seems sort of quaint now. --J.D. Baldwin
Andrew Sullivan <ajs@crankycanuck.ca> writes: > On Sun, Dec 18, 2005 at 10:08:22PM -0500, Tom Lane wrote: >> Just for the record, that behavior is seriously broken: it violates >> MVCC if any of the deleted tuples are still visible to anyone else. > Does it remove tuples that VACUUM FULL wouldn't? Yes. CLUSTER works on SnapshotNow, so it will remove committed-dead tuples even if there are still open transactions that could see them. Of course, said transactions couldn't be actively using the table while the CLUSTER runs, because it takes an exclusive table lock. But they *could* look at it afterwards. Offhand I think you'd only be likely to notice the difference if the open transactions were SERIALIZABLE --- in READ COMMITTED mode, by the time they could look at the clustered table, they'd likely be using a snapshot that postdates the DELETE. [ experiments a bit... ] Hmm. Actually, it's far worse than I thought. It looks like CLUSTER puts the tuples into the new table with its own xid, which means that concurrent serializable transactions will see the new table as completely empty! << session 1 >> regression=# select * from int4_tbl; f1 ------------- 0 123456 -123456 2147483647-2147483647 (5 rows) regression=# create index fooi on int4_tbl(f1); CREATE INDEX regression=# begin isolation level serializable; BEGIN regression=# select 2+2; -- establish transaction snapshot?column? ---------- 4 (1 row) << session 2 >> regression=# delete from int4_tbl where f1 = -123456; DELETE 1 regression=# cluster fooi on int4_tbl; CLUSTER << back to session 1 >> regression=# select * from int4_tbl;f1 ---- (0 rows) regression=# commit; COMMIT regression=# select * from int4_tbl; f1 --------------2147483647 0 123456 2147483647 (4 rows) regards, tom lane
On Sun, Dec 18, 2005 at 07:01:39PM -0300, Alvaro Herrera wrote: > Martin Marques escribi?: > > On Sun, 18 Dec 2005, frank church wrote: > > > > > > > >Does VACUUMing reorder tables on clustered indices or is it only the > > >CLUSTER > > >command that can do that? > > > > Cluster does that. Vacuum only cleans dead tuples from the tables. > > Note that while reordering, CLUSTER also gets rid of dead tuples, so if > you cluster you don't need to vacuum. It also does a REINDEX... -- Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com Pervasive Software http://pervasive.com work: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461
Hi, Utilize <b>CLUSTER;</b> (after vacuum) to reorder the data. Regards ----- Original Message ----- From: "Jim C. Nasby" <jnasby@pervasive.com> To: "Martin Marques" <martin@bugs.unl.edu.ar>; "frank church" <pgsql@adontendev.net>; <pgsql-sql@postgresql.org> Sent: Tuesday, December 20, 2005 10:41 PM Subject: Re: [SQL] Does VACUUM reorder tables on clustered indices > On Sun, Dec 18, 2005 at 07:01:39PM -0300, Alvaro Herrera wrote: >> Martin Marques escribi?: >> > On Sun, 18 Dec 2005, frank church wrote: >> > >> > > >> > >Does VACUUMing reorder tables on clustered indices or is it only the >> > >CLUSTER >> > >command that can do that? >> > >> > Cluster does that. Vacuum only cleans dead tuples from the tables. >> >> Note that while reordering, CLUSTER also gets rid of dead tuples, so if >> you cluster you don't need to vacuum. > > It also does a REINDEX... > -- > Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com > Pervasive Software http://pervasive.com work: 512-231-6117 > vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461 > > ---------------------------(end of broadcast)--------------------------- > TIP 6: explain analyze is your friend > >
On Wed, Dec 21, 2005 at 12:34:12AM +0100, ipv@tinet.org wrote: > Hi, > > Utilize <b>CLUSTER;</b> (after vacuum) to reorder the data. Why would you vacuum when cluster is just going to wipe out the dead tuples anyway? > >>Note that while reordering, CLUSTER also gets rid of dead tuples, so if > >>you cluster you don't need to vacuum. > > > >It also does a REINDEX... -- Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com Pervasive Software http://pervasive.com work: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461
jnasby@pervasive.com ("Jim C. Nasby") writes: > On Wed, Dec 21, 2005 at 12:34:12AM +0100, ipv@tinet.org wrote: >> Hi, >> >> Utilize <b>CLUSTER;</b> (after vacuum) to reorder the data. > > Why would you vacuum when cluster is just going to wipe out the dead > tuples anyway? There is one reason to VACUUM before running CLUSTER... That is that VACUUM will be *guaranteed* to draw all the pages into memory. Subsequently, you can be certain that the pages are in cache, and that the CLUSTER should need to do minimal I/O to read data into memory. If I'm considering clustering the Slony-I "sl_log_1" table, forcing it into memory *is* something I'll consider doing in order to minimize the time that would-be writers are blocked from writing... -- let name="cbbrowne" and tld="cbbrowne.com" in String.concat "@" [name;tld];; http://cbbrowne.com/info/multiplexor.html Never criticize anybody until you have walked a mile in their shoes, because by that time you will be a mile away and have their shoes. -- email sig, Brian Servis
Chris Browne wrote: > jnasby@pervasive.com ("Jim C. Nasby") writes: > > On Wed, Dec 21, 2005 at 12:34:12AM +0100, ipv@tinet.org wrote: > >> Hi, > >> > >> Utilize <b>CLUSTER;</b> (after vacuum) to reorder the data. > > > > Why would you vacuum when cluster is just going to wipe out the dead > > tuples anyway? > > There is one reason to VACUUM before running CLUSTER... > > That is that VACUUM will be *guaranteed* to draw all the pages into memory. > > Subsequently, you can be certain that the pages are in cache, and that > the CLUSTER should need to do minimal I/O to read data into memory. > > If I'm considering clustering the Slony-I "sl_log_1" table, forcing it > into memory *is* something I'll consider doing in order to minimize > the time that would-be writers are blocked from writing... Why don't you just do SELECT * FROM tab WHERE col != 'lkjasdflkjadsf'. That should pull things into memory without the VACUUM overhead. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073
On Wed, Dec 21, 2005 at 06:36:45PM -0500, Chris Browne wrote: > If I'm considering clustering the Slony-I "sl_log_1" table, forcing it > into memory *is* something I'll consider doing in order to minimize > the time that would-be writers are blocked from writing... Given what Tom Lane recently reported (and assuming I understood his remarks), I think it's a Mighty Bad Idea to CLUSTER sl_log_1. A -- Andrew Sullivan | ajs@crankycanuck.ca In the future this spectacle of the middle classes shocking the avant- garde will probably become the textbook definition of Postmodernism. --Brad Holland