Re: strategies for dealing with frequently updated tables - Mailing list pgsql-general

From Andy Colson
Subject Re: strategies for dealing with frequently updated tables
Date
Msg-id 4F4C4A0A.4050505@squeakycode.net
In response to strategies for dealing with frequently updated tables  (Dave Vitek <dvitek@grammatech.com>)
Responses Re: strategies for dealing with frequently updated tables
List pgsql-general
On 02/27/2012 06:55 PM, Dave Vitek wrote:
> Hi all,
>
> I have a relation where a tuple typically undergoes a lifecycle something like:
> 1) Created
> 2) Updated maybe thousands of times (no updates to indexed columns though)
> 3) Rarely or never modified again
>
> The following query takes about 100 minutes (3 seconds per tuple):
> SELECT count(id) from T
>
> (2) is causing a lot of auto vacuum/analyze activity (which is OK). HOT seems to get used for about 90% of the updates, but there are enough updates that don't use it to cause issues. I'm using pg version 9.0.3 on a 32-bit Windows XP machine with 3GB of RAM. The .conf file is using default settings.
>
> Table Info:
> n_live_tup 1799
> n_dead_tup 191
> pg_relation_size 2343mb
> indexsize 10mb
> toastsize 552kb
> toastindexsize 16kb
>
> This reports under 10kb for most tuples:
> psql -A -c "select * from T where id=123" | wc -c
>
> auto-vacuum and auto-analyze both ran yesterday with default settings. There are only one or two new tuples since yesterday. The database is fairly old (was probably created using pg_restore about when 9.0.3 came out).
>
> Here is the output from VACUUM VERBOSE:
> INFO: vacuuming "public.T"
> 0 pages are entirely empty.
> CPU 0.00s/0.00u sec elapsed 0.56 sec.
> VACUUM
>
>
> I imagine CLUSTERing the table would make things happier, but I'm hoping for a permanent solution that avoids periodic downtime.
>
> One thought was to partition so rows that are still changing live in a separate table from the more stable rows. I imagine the cardinality of (2) rarely exceeds 10. Can I still get into performance trouble with a table that small after enough updates? Anyone have other ideas?
>
> - Dave
>

I'm a little confused... what's the problem?


> The following query takes about 100 minutes (3 seconds per tuple):
> SELECT count(id) from T

so table T has about 2,000 rows?   (100 * 60 / 3)
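Sanity-checking that arithmetic (the 100-minute figure and the 3 seconds per tuple are quoted from Dave's message; this is just a back-of-the-envelope check, not a measurement):

```python
# Row-count estimate implied by the reported timings
# (figures quoted from the original message, not measured here).
total_seconds = 100 * 60      # "about 100 minutes" for SELECT count(id)
seconds_per_tuple = 3         # "3 seconds per tuple"

estimated_rows = total_seconds / seconds_per_tuple
print(estimated_rows)         # 2000.0 -- close to the reported n_live_tup of 1799
```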

> Table Info:
> n_live_tup 1799

Oh, so the table has 1,800 rows?


> This reports under 10kb for most tuples:
> psql -A -c "select * from T where id=123" | wc -c

10 kB?  What?  So that's one row?  And that one row is about 10,000 bytes?


So you have a large row, who knows how many, and the problem is "SELECT count(id) from T" is slow?
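Those stats are the real red flag: 2343 MB of heap for ~1,800 live rows of ~10 kB each. A rough calculation (all figures taken from the quoted stats, so treat it as an estimate) suggests the table is almost entirely dead space:

```python
# Bloat estimate from the quoted table statistics (not measured here).
relation_bytes = 2343 * 1024 * 1024   # pg_relation_size: 2343 MB
live_rows = 1799                      # n_live_tup
bytes_per_row = 10 * 1024             # "under 10kb for most tuples"

live_bytes = live_rows * bytes_per_row
bloat_fraction = 1 - live_bytes / relation_bytes
print(f"{bloat_fraction:.1%} of the heap is not live data")  # roughly 99%
```

That would explain why a sequential scan over 1,800 rows takes 100 minutes: count(id) has to read the whole 2.3 GB heap, dead space included.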

You say:

> 2) Updated maybe thousands of times (no updates to indexed columns though)

Are the updates slow?  How slow, and how fast do you need them to be?  Is ID a unique key?  Or does "update ... where id=123" update more than one row?

What does the actual table look like?  Looks like it has lots of indexes, but I'm just guessing!  Are you sure you need all those indexes?  Updating a row has to update all those indexes... I assume, but you give no details.

Also... in the end, you're on Windows, so you probably can't tell me if you are IO bound, but you probably need a better IO subsystem.  And stop running on RAID 5 (which I'm assuming, since you don't mention anything).

Wow.  Re-reading that, I seem to be angry.  Please don't take it that way.  Blood sugar is high, makes me an angry person.  Others will be along to scold me.  Bad Andy!  Bad!  Go to your room until you find your happy face.

-Andy

ps: based on you're email gramma, I almost changed all your to you're, just to see if it annoyed you.  Sounds like you work for grammar technology.


