Re: Performance query about large tables, lots of concurrent access

From: Karl Wright
Msg-id: 4679B6C7.3080207@metacarta.com
In response to: Re: Performance query about large tables, lots of concurrent access (Scott Marlowe <smarlowe@g2switchworks.com>)
Responses: Re: Performance query about large tables, lots of concurrent access (Scott Marlowe <smarlowe@g2switchworks.com>)
List: pgsql-performance
Scott Marlowe wrote:
> Karl Wright wrote:
>
>> Shaun Thomas wrote:
>>
>>> On Wednesday 20 June 2007 12:55:20 pm Karl Wright wrote:
>>>
>>>
>>>> I am afraid that I did answer this.  My largest tables
>>>> are the ones continually being updated.  The smaller
>>>> ones are updated only infrequently.
>>>
>>>
>>>
>>> You know, it actually sounds like you're getting whacked by the same
>>> problem that got us a while back.  It sounds like you weren't
>>> vacuuming frequently enough initially, and then only tried vacuuming
>>> later, after you noticed performance degrading.
>>>
>>> Unfortunately, what that means is that for several weeks or months,
>>> Postgres has not been reusing rows in your (admittedly) active and
>>> large tables; it just appends at the end, and lets dead rows slowly
>>> bloat the table larger and larger.  Indexes, too, will suffer from
>>> dead pages.  As frightening/sickening as this sounds, you may need to
>>> dump/restore the really huge table, or VACUUM FULL to put it on a
>>> crash diet, and then maintain a strict once- or twice-daily vacuum
>>> schedule to keep it under control.
>>>
>>
>> A nice try, but I had just completed a VACUUM on this database three
>> hours prior to starting the VACUUM that I gave up on after 27 hours.
>> So I don't see how much more frequently I could do it.  (The one I did
>> earlier finished in six hours - but to accomplish that I had to shut
>> down EVERYTHING else that machine was doing.)
>
>
> So, have you ever run vacuum full or reindex on this database?
>

No.  However, this database has only existed since last Thursday afternoon.

> You are aware of the difference between how vacuum and vacuum full work,
> right?
>
> vacuum := mark deleted tuples as available, leave in table
> vacuum full := compact tables to remove deleted tuples.
>
> While you should generally avoid vacuum full, if you've let your
> database get so bloated that the majority of space in your tables is now
> empty / deleted tuples, you likely need to vacuum full / reindex it.
>

If the database is continually growing, should VACUUM FULL ever be
necessary?  I'd expect new rows to reuse the space that a plain VACUUM
frees.
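
(To make the terms concrete, I take the distinction to come down to
roughly the following; the table name is just a placeholder, not one of
my real tables:)

    -- Plain VACUUM: marks dead tuples as reusable free space inside the
    -- table file.  It does not shrink the file, but it also does not
    -- take an exclusive lock, so normal reads and writes can continue.
    VACUUM VERBOSE mytable;

    -- VACUUM FULL: compacts the table and returns the reclaimed space
    -- to the operating system, but it holds an exclusive lock on the
    -- table for the duration.
    VACUUM FULL mytable;

    -- Indexes bloat independently and are not fully repaired by either
    -- form of VACUUM; REINDEX rebuilds them from scratch.
    REINDEX TABLE mytable;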

> For instance, on my tiny little 31 Gigabyte reporting database, the main
> table takes up about 17 Gigs.  This query gives you some idea how many
> bytes each row is taking on average:
>
> select relname, relpages::float*8192 as size, reltuples,
> (relpages::double precision*8192)/reltuples::double precision as
> bytes_per_row from pg_class where relname = 'businessrequestsummary';
>        relname         |    size     |  reltuples  |  bytes_per_row
> ------------------------+-------------+-------------+-----------------
>  businessrequestsummary | 17560944640 | 5.49438e+07 | 319.61656229454
>
> Note that these numbers are updated by running analyze...
>
> What does it say about your DB?
>

I wish I could tell you.  Like I said, I had to abandon this project to
test out an upgrade procedure involving pg_dump and pg_restore.  (The
upgrade also seems to take a very long time - over 6 hours so far.)
When it is back online I can provide further information.
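
When I can run Scott's query I will probably use a slightly more
defensive variant, since a freshly restored table can report
reltuples = 0 until its first ANALYZE (same table name as in Scott's
example, purely for illustration):

    -- Bytes-per-row check with a NULLIF guard, so reltuples = 0 yields
    -- NULL instead of a division-by-zero error.
    SELECT relname,
           relpages::bigint * 8192 AS size_bytes,
           reltuples,
           (relpages::double precision * 8192)
               / NULLIF(reltuples, 0)::double precision AS bytes_per_row
    FROM pg_class
    WHERE relname = 'businessrequestsummary';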

Karl
