Re: What's the cost of a few extra columns? - Mailing list pgsql-performance

From Jim C. Nasby
Subject Re: What's the cost of a few extra columns?
Date
Msg-id 20051011001249.GJ39569@pervasive.com
Whole thread Raw
In response to What's the cost of a few extra columns?  ("Announce" <truthhurts@insightbb.com>)
Responses Re: What's the cost of a few extra columns?
List pgsql-performance
What you're describing is known as vertical partitioning (think of
splitting a table vertically), and can be a good technique for
increasing performance when used properly. The key is to try and get the
average row size down, since that means more rows per page which means
less I/O. Some things to consider:

First rule of performance tuning: don't. In other words, you should be
able to verify with benchmark numbers that a) you need to do this and b)
how much it's actually helping.

How will splitting the table affect *_tstmp, especially mod_tstmp?

How will you handle inserts and joining these two tables together? Will
you always do a left join (preferably via a view), or will you have a
trigger/rule that inserts into production_info whenever a row is
inserted into productions?

On Sun, Oct 09, 2005 at 10:03:33PM -0500, Announce wrote:
> What's goin on pg-people?
>
> I have a table PRODUCTIONS that is central to the DB and ties a lot of other
> information together:
>
> PRODUCTIONS (table)
> ----------------------------------
> prod_id        primary key
> type_id        foreign key
> level_id        foreign key
> tour_id        foreign key
> show_id        foreign key
> venue_id        foreign key
> title            varchar(255); not null indexed
> version        char;
> details        text
> open_date        date
> close_date        date
> preview_open    date
> preview_close    date
> perform_tot        int
> preview_tot        int
> park_info        text
> phone_nos        text
> some_other_info    text
> seating_info    text
> this            text
> that            text
> create_tstmp    timestamptz; NOW()
> mod_tstmp        timestamptz;triggered
> delete_tstmp    timestamptz;default null
> is_complete        bool
>
>
> As it stands now, there are approximately 25-30 columns on the table. Since
> this table is very central to the database, would it be more efficient to
> break some of the columns (especially the TEXT ones) out into a separate
> INFO table since some queries on the web will not care about all of these
> text columns anyway? I know that pg can handle A LOT more columns and if
> there IS no performance hit for keeping them all on the same table, I would
> like to do that because the relation between PRODUCTIONS and the INFO will
> always be 1-to-1.
>
> My implementation of this INFO table would look a little somethin' like
> this:
>
> PROD_INFO (table)
> -------------------------------
> prod_id        pkey/fkey
> open_date        date
> close_date        date
> preview_open    date
> preview_close    date
> perform_tot        int
> preview_tot        int
> park_info        text
> phone_nos        text
> some_other_info    text
> seating_info    text
> this            text
> that            text
> (the rest would stay in in the original PRODUCTIONS table)
>
>
> I am open to ANY suggestions, criticisms, mockery, etc.
>
> Thanks,
>
> Aaron
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 5: don't forget to increase your free space map settings
>

--
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461

pgsql-performance by date:

Previous
From: Michael Stone
Date:
Subject: Re: XFS External Log on Pg 7.4.8 Pg_xlog drives?
Next
From: "Announce"
Date:
Subject: Re: What's the cost of a few extra columns?