Re: Selecting across Multiple Tables - Mailing list pgsql-general

From will trillich
Subject Re: Selecting across Multiple Tables
Date
Msg-id 20020110114120.A28002@serensoft.com
Whole thread Raw
In response to Re: Selecting across Multiple Tables  (Antonio Fiol Bonnín <fiol@w3ping.com>)
List pgsql-general
On Wed, Dec 05, 2001 at 02:43:07PM +0100, Antonio Fiol Bonnín wrote:
> I'm afraid that performance shoud be substantially better if you had a
> single table with all the attributes. Except for situations like:
>
> Contact_name, Company_name, Phone_number, Billing_address
>
> In this case, billing address is something "company-specific", and not
> "contact-specific". So I'd see something like:
>
> Table1: Contact_name, Phone_number, id_company
> Table2: id_company, Company_name, Billing_address
>
> If and only if you intend to have multiple contacts for each company. If
> you only intend to have one contact for each company, performance (and
> file size) will be better if you stick everything in the same table.
>
> Another situation that could lead to good results by splitting tables is
> the following:
>
> Original Table: Report_number, magnitudeA, magnitudeB, magnitudeC
>
> If reports usually only contain one magnitude, either A, B, or C, you
> may think of splitting that into three tables:
>
> Table A: Report_number, magnitudeA
> Table B: Report_number, magnitudeB
> Table C: Report_number, magnitudeC
>
> You may substantially improve disk usage, but it depends mostly on how
> your data is organised. However, performance does not seem to me that it
> may get improved by splitting tables, in general, as you will need (in
> general, again) to re-join the information to access it.
>
> Each case is different, and it depends much on your numbers.
>
> As a general advice:
>
> EXPLAIN SELECT ...
> And then analyze precisely what is happening. A long (high cost) seq
> scan is bad. But an index scan can also be bad, if it is not on the best
> possible index. I have recently optimised a request that used to take
> about 50 sec, and now it is sub-second, just by finding (and creating)
> the missing index.
>
> In your case, I suppose I should not need to say that if you have the
> split tables you should index all of them on the ID you use to access
> the data.

very helpful nudgings...  nice general advice!  we need more like
this. nice work, Antonio!

> I hope that helps, but I could be completely wrong.

:)

--
DEBIAN NEWBIE TIP #99 from Greg Wiley <greg@orthogony.com>
:
Wondering WHERE TO SPECIFY BASH DEFAULTS? In case you are in X
and launching a term, .bashrc is automatically loaded (instead
of .bash_profile, .profile, .login) when bash is not a login
shell.

Also see http://newbieDoc.sourceForge.net/ ...

pgsql-general by date:

Previous
From: Doug McNaught
Date:
Subject: Re: Performance tips
Next
From: Bruce Momjian
Date:
Subject: Re: Q about function