Re: degenerate performance on one server of 3 - Mailing list pgsql-performance

From Erik Aronesty
Subject Re: degenerate performance on one server of 3
Date
Msg-id ccd588d90906051850w6d413ae4t4b482ba34cf3a20e@mail.gmail.com
In response to Re: degenerate performance on one server of 3  (Scott Carey <scott@richrelevance.com>)
List pgsql-performance
> See ALTER TABLE and CREATE TABLE  (and the Index variants).
>
> ALTER TABLE foo SET (fillfactor=90);

I'll try that.
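For anyone following along, a minimal sketch of what that looks like in practice (table and index names here are hypothetical; note that fillfactor only applies to newly written pages, so existing data isn't repacked until the table is rewritten):

```sql
-- Reserve ~10% free space in each 8k heap page for future HOT updates
ALTER TABLE items SET (fillfactor = 90);

-- Indexes have their own fillfactor storage parameter
ALTER INDEX items_pkey SET (fillfactor = 90);

-- Existing pages are unaffected; a table rewrite (e.g. CLUSTER or
-- VACUUM FULL) is needed for current data to honor the new setting.
```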

> This will leave on average, 10% of every 8k block empty and allow updates to
> columns to more likely live within the same block.

Good for the items table.

Probably bad for the cookies table, with 6 million rows, and thousands
of inserts and deletes every day, but few updates.

Maybe I need another way of doing it. That table gets bloated fast. A VACUUM FULL takes 3 and a half hours, which would be an unacceptable amount of downtime if I didn't have working mirrors of everything.

> Creating a new table as a select from the old and renaming, OR doing a
> CLUSTER and REINDEX is almost always faster than VACUUM FULL for such large
> tables.  But there are different implications on how long other queries are
> locked out of access to the table.  CLUSTER will generally lock out other
> queries for a long time, but the end result (especially combined with a
> reasonable fillfactor setting) ends up best for long term performance and
> reduction in bloat.

I'll try it on the other mirror server, which has the same specs and size, and see if CLUSTER/REINDEX is faster.
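For reference, the CLUSTER approach being suggested is roughly this (table and index names hypothetical; CLUSTER takes an ACCESS EXCLUSIVE lock for the duration of the rewrite, which is the lockout being discussed):

```sql
-- Rewrite the table in index order, compacting away dead tuples
-- and rebuilding all of its indexes as a side effect
CLUSTER items USING items_pkey;

-- Refresh planner statistics after the rewrite
ANALYZE items;
```

Once a clustering index is set, later runs can simply be `CLUSTER items;`, and combined with a lower fillfactor the table stays compact for longer.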

>>    - copy all rows to new table
>>    - lock for a millisecond while renaming tables
>>    - drop old table.
>>
>> Locking a whole table for a very long time is scary for admins.
>>
>
> You can do the above manually in a single transaction, however any updates
> or inserts during that time may be lost.
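A sketch of doing that copy-and-rename manually in one transaction (names hypothetical, and per the caveat above: taking the lock up front is what prevents lost writes, at the cost of blocking writers for the whole copy; sequences, foreign keys, and views referencing the old table need separate care):

```sql
BEGIN;

-- Block concurrent writes for the duration of the swap,
-- so no updates or inserts can be lost mid-copy
LOCK TABLE cookies IN ACCESS EXCLUSIVE MODE;

-- Clone structure, defaults, constraints, and indexes
CREATE TABLE cookies_new (LIKE cookies INCLUDING ALL);

-- Copy the live rows; dead tuples are left behind
INSERT INTO cookies_new SELECT * FROM cookies;

-- Swap the tables; the rename itself is near-instant
DROP TABLE cookies;
ALTER TABLE cookies_new RENAME TO cookies;

COMMIT;
```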

Postgres already keeps multiple row versions around for transactions, so for a lockless vacuum full to work, some row versions would have to live in the "new table". I think that could be done at the expense of some performance degradation, since you'd have to figure out which table to look at (for reads: try the new one, nothing there, fall back to the old one; for copies: if there's a newer update, put the copy "under" it), some wacky logic like that.

I don't know Postgres's internals well enough to do it for all cases, but I know my own DB well enough to get it to work for me: have 2 tables with triggered timestamps, then juggle the queries that hit the tables (check table a and table b, use the row with the newer timestamp for reads, while a is being copied to b without overwriting newer rows... something like that).
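The "triggered timestamps" part of that scheme could be set up along these lines (a rough sketch only; table, column, and function names are all made up for illustration):

```sql
-- Track last modification time on each row
ALTER TABLE cookies ADD COLUMN updated_at timestamptz NOT NULL DEFAULT now();

-- Trigger function: stamp the row on every insert or update
CREATE OR REPLACE FUNCTION touch_updated_at() RETURNS trigger AS $$
BEGIN
    NEW.updated_at := now();
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER cookies_touch
    BEFORE INSERT OR UPDATE ON cookies
    FOR EACH ROW EXECUTE PROCEDURE touch_updated_at();
```

With the same trigger on both copies, reads can compare `updated_at` across the two tables and take the newer row, and the background copy can skip any destination row with a newer timestamp.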

Not sure whether I'd rather have a 7-hour performance degraded
"table-copy" (which would reindex and recluster too) or a 3.5 hour
table-locked vacuum (which doesn't reindex or re-cluster).
