Re: Q on views and performance - Mailing list pgsql-performance

From Kynn Jones
Subject Re: Q on views and performance
Date
Msg-id c2350ba40802230559l280eab73k8e79a8bda7a4994a@mail.gmail.com
Whole thread Raw
In response to Re: Q on views and performance  ("Dean Gibson (DB Administrator)" <postgresql@ultimeth.com>)
Responses Re: Q on views and performance
List pgsql-performance
On Fri, Feb 22, 2008 at 8:48 PM, Dean Gibson (DB Administrator) <postgresql@ultimeth.com> wrote:
On 2008-02-22 12:49, Kynn Jones wrote:
> Of course, I expect that using views V<int1> and V<int2>... would
> result in a loss in performance relative to a version that used bona
> fide tables T<int1> and T<int2>.  My question is, how can I minimize
> this performance loss?

That used to be my thoughts too, but I have found over the years that
the PostgreSQL execution planner is able to "flatten" SELECTs using
VIEWs, ALMOST ALWAYS in a way that does not adversely affect
performance, and often gives an IMPROVEMENT in performance, probably
because by using VIEWs I am stating the query problem in a better way
than if I try to guess the best way to optimize a SELECT.

I have at least a 10:1 ratio of VIEWs to TABLEs.  Occasionally, with
some query that is slow, I will try to rewrite it without VIEWs.  This
ALMOST NEVER results in an improvement in performance, and when it does,
I am able to find another way to write the VIEW and SELECT to recapture
the gain.

Since you have experience working with views, let me ask you this.  The converse strategy to the one I described originally would be to create the individual tables T1, T2, T3, ..., T100, but instead of keeping around the original (and now redundant) table T, replace it with a view V made up of the union of T1, T2, T3, ..., T100.  The problem with this alternative is that one cannot index V, or define a primary key constraint for it, because it's a view.  This means that a search in V, even for a primary key value, would be *have to be* very inefficient (i.e. I don't see how even the very clever PostgreSQL implementers could get around this one!), because the engine would have to search *all* the underlying tables, T1 through T100, even if it found the desired record in T1, since it has no way of knowing that the value is unique all across V.

Is there a way around this?

kynn

pgsql-performance by date:

Previous
From: "Kynn Jones"
Date:
Subject: Re: Q on views and performance
Next
From: "Robins Tharakan"
Date:
Subject: Re: Q on views and performance