Re: SELECT * FROM LIMIT 1; is really slow - Mailing list pgsql-hackers
From Tom Lane
Subject Re: SELECT * FROM LIMIT 1; is really slow
Date
Msg-id 5438.1085623573@sss.pgh.pa.us
Whole thread Raw
In response to Re: SELECT * FROM LIMIT 1; is really slow  (Christopher Kings-Lynne <chriskl@familyhealth.com.au>)
Responses Re: SELECT * FROM LIMIT 1; is really slow
List pgsql-hackers
Christopher Kings-Lynne <chriskl@familyhealth.com.au> writes:
>> It wouldn't; you'd need vacuum full to collapse out the dead space.
>> You could also try CLUSTER which is faster than VACUUM FULL when most
>> of the data has to be moved anyway.  (Some days I think we should dump
>> VACUUM FULL, because it's optimized for a case that's no longer very
>> interesting...)

> Out of interest, is CLUSTER another fast way of truly removing OIDs from 
> a table, after going SET WITHOUT OIDS?

I think not --- IIRC, CLUSTER just copies the tuples verbatim.  It'd do
fine for getting rid of dead tuples and unused space, but not for making
any actual changes in the individual tuples.

What would be useful for that (as of 7.5) is a dummy ALTER TABLE TYPE,
viz
ALTER TABLE tab ALTER COLUMN col TYPE same-type-it-already-has

There not being any special case to short-circuit this, the thing will
sit there and reconstruct the tuples and write them into a fresh table
file (and rebuild the indexes, too).  Net effect is to collapse out any
dropped columns (or OIDs) as well as losing dead tuples and free space.

I was just noticing today that the ALTER TABLE documentation suggests
a dummy UPDATE and VACUUM FULL to get rid of the space occupied by a
dropped column.  On reflection the ALTER TYPE method is likely to be
an order of magnitude faster.  Will update the docs.

A further thought along this line: if you have to do an UPDATE that
affects every row of a large table, the traditional way is, eg,
UPDATE tab SET col = col + 1

which leaves you with N live rows, N dead rows, and lots of pain to get
back down to a less-than-twice-normal-size table.  (Traditional way is
VACUUM FULL; CLUSTER is better, but still painful.)  As of 7.5 you could
hack this with
ALTER TABLE tab ALTER COLUMN col TYPE same-type USING col + 1

which will have the effect of rewriting a fresh table file containing
the updated rows, and dropping the old file at commit.  Sweet.  (Peak
disk usage is 2x normal either way, but you don't pay through the nose
to get back down to 1x.)  I'm not real sure where to document this
trick but it seems like we ought to mention it someplace.
        regards, tom lane


pgsql-hackers by date:

Previous
From: Christopher Kings-Lynne
Date:
Subject: Re: tablespaces and DB administration
Next
From: Tom Lane
Date:
Subject: Re: SELECT * FROM LIMIT 1; is really slow