Re: Q on views and performance - Mailing list pgsql-performance

From Dean Gibson (DB Administrator)
Subject Re: Q on views and performance
Date
Msg-id 47C03709.5010205@ultimeth.com
Whole thread Raw
In response to Re: Q on views and performance  ("Kynn Jones" <kynnjo@gmail.com>)
Responses Re: Q on views and performance
Re: Q on views and performance
List pgsql-performance
On 2008-02-23 05:59, Kynn Jones wrote:
On Fri, Feb 22, 2008 at 8:48 PM, Dean Gibson (DB Administrator) <postgresql@ultimeth.com> wrote:
...

Since you have experience working with views, let me ask you this.  The converse strategy to the one I described originally would be to create the individual tables T1, T2, T3, ..., T100, but instead of keeping around the original (and now redundant) table T, replace it with a view V made up of the union of T1, T2, T3, ..., T100.  The problem with this alternative is that one cannot index V, or define a primary key constraint for it, because it's a view.  This means that a search in V, even for a primary key value, would be *have to be* very inefficient (i.e. I don't see how even the very clever PostgreSQL implementers could get around this one!), because the engine would have to search *all* the underlying tables, T1 through T100, even if it found the desired record in T1, since it has no way of knowing that the value is unique all across V.

Is there a way around this?

kynn

Oh, I wouldn't create separate tables and do a UNION of them, I'd think that would be inefficient.

I didn't look in detail at your previous eMail, but I will now:

1. You haven't told us the distribution of "zipk", or what the tables are indexed on, or what type of performance you are expecting.  Your initial examples don't help much unless you actually have performance numbers or EXPLAIN output for them, since adding the third JOIN significantly changes the picture, as does changing one of the JOINs to a LEFT JOIN.

2. In your actual (Q1** and Q2**) examples, why is one JOIN an INNER JOIN and the other one a LEFT JOIN?  Given your description of Q1 at the top of your message, that doesn't make sense to me.

3. Why not write:

CREATE VIEW txt AS
  SELECT a1.word AS word1, a1.type AS type1, a2.word AS word2, a2.type AS type2
    FROM T a1 [LEFT] JOIN T a2 USING( zipk );  -- Use "LEFT" if appropriate
SELECT word1, word1
  FROM S JOIN txt ON word = word1

  WHERE type1 = <int1> AND type2 = <int2>;


If either of those (either with or without the "LEFT") are not equivalent to your problem, how about just:

SELECT a1.word AS word1, a2.word AS word2
  FROM S JOIN T a1 USING( word)
    [LEFT] JOIN T a2 USING( zipk )
  -- Use "LEFT" if appropriate
  WHERE a1.type = <int1> AND a2.type = <int2>;

Show us (using EXPLAIN) what the query planner thinks of each of these.

-- 
Mail to my list address MUST be sent via the mailing list.
All other mail to my list address will bounce.

pgsql-performance by date:

Previous
From: "Robins Tharakan"
Date:
Subject: Re: Q on views and performance
Next
From: "Dean Gibson (DB Administrator)"
Date:
Subject: Re: Q on views and performance