Re: splitting up tables based on read/write frequency of columns - Mailing list pgsql-general

From David G Johnston
Subject Re: splitting up tables based on read/write frequency of columns
Date
Msg-id 1421860186723-5834911.post@n5.nabble.com
Whole thread Raw
In response to splitting up tables based on read/write frequency of columns  (Jonathan Vanasco <postgres@2xlp.com>)
List pgsql-general
Jonathan Vanasco-7 wrote
> This is really a theoretical/anecdotal question, as I'm not at a scale yet
> where this would measurable.  I want to investigate while this is fresh in
> my mind...
>
> I recall reading that unless a row has columns that are TOASTed, an
> `UPDATE` is essentially an `INSERT + DELETE`, with the previous row marked
> for vacuuming.
>
> A few of my tables have the following characteristics:
>     - The Primary Key has many other tables/columns that FKEY onto it.
>     - Many columns (30+) of small data size
>     - Most columns (90%) are 1 WRITE(UPDATE) for 1000 READS
>     - Some columns (10%) do a bit of internal bookkeeping and are 1
> WRITE(UPDATE) for 50 READS
>
> Has anyone done testing/benchmarking on potential efficiency/savings by
> consolidating the frequent UPDATE columns into their own table?

Consider another in-database attempt to mitigate the need to do this
manually:

HOT (heap-only-tuple)

http://pgsql.tapoueh.org/site/html/misc/hot.html

I haven't done any bench-marking but I do currently use this idea to
segregate read-only fields from read-write fields.  Likely there is also a
model reason why these fields have different update frequencies and so can
both logically and physically be partitioned out.

The only harm I see is that it can make using the schema more difficult -
though that can be somewhat mitigated by using (updateable) views in front
of the partitioned tables.

If you can logically partition them I would go for it; if it is a purely
physical concern then I'd probably ponder it for another couple of days and
then go for it anyway.  The main additional thing to ponder is the cost of
additional parsing and the additional join.  Neither is super expensive but
if only doing this for physical reasons you need to evaluate your planned
usage patterns.  With a logical split you are more likely to find situations
where you do not even care about one table or the other and so can avoid the
join entirely.

David J.



--
View this message in context:
http://postgresql.nabble.com/splitting-up-tables-based-on-read-write-frequency-of-columns-tp5834646p5834911.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.


pgsql-general by date:

Previous
From: Raymond O'Donnell
Date:
Subject: Re: Fwd: Ask for a question
Next
From: Pierre Hsieh
Date:
Subject: Re: Fwd: Ask for a question