Thread: Queries joining views

Queries joining views

From

Alban Hertroys

Date:

21 August 2006, 13:48:55

Is there a trick to make this work a bit faster?

We have a number of views that join tables, and we have queries that
join those views. Some relatively large tables are involved.

We added indexes that match our query constraints as much as possible,
and that does work if we explicitly query the tables with all the
involved joins, instead of the views. However, if we query the views,
the planner starts using a filter instead of the desired index...

What we see basically is that adding one view to the query makes it go
from 12ms to 130ms...


zorgweb_solaris=> explain analyze SELECT

insrel.owner,insrel.rnumber,insrel.dnumber,insrel.snumber,insrel.dir,insrel.otype,insrel.number,medical_care_container.number,product.number


FROM mm_medical_care_container_table medical_care_container,mm_insrel
insrel,mm_product_table product WHERE
medical_care_container.number=558332 AND
(medical_care_container.number=insrel.dnumber AND
product.number=insrel.snumber AND insrel.dir<>1);

                      QUERY PLAN


---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  Nested Loop  (cost=114.23..203.24 rows=3 width=42) (actual
time=10.137..12.171 rows=1 loops=1)
    ->  Index Scan using mm_medical_care_container_table_pkey on
mm_medical_care_container_table medical_care_container  (cost=0.00..5.64
rows=1 width=4) (actual time=0.048..0.052 rows=1 loops=1)
          Index Cond: (number = 558332)
    ->  Nested Loop  (cost=114.23..197.57 rows=3 width=38) (actual
time=10.077..12.106 rows=1 loops=1)
          ->  Merge Join  (cost=114.23..186.13 rows=3 width=24) (actual
time=10.025..12.049 rows=1 loops=1)
                Merge Cond: ("outer".number = "inner".snumber)
                ->  Index Scan using mm_product_table_pkey on
mm_product_table product  (cost=0.00..67.90 rows=1571 width=4) (actual
time=0.025..9.460 rows=1571 loops=1)
                ->  Sort  (cost=114.23..114.31 rows=30 width=20) (actual
time=0.144..0.145 rows=2 loops=1)
                      Sort Key: mm_insrel_table.snumber
                      ->  Bitmap Heap Scan on mm_insrel_table
(cost=2.11..113.50 rows=30 width=20) (actual time=0.092..0.103 rows=2
loops=1)
                            Recheck Cond: ((dnumber = 558332) AND (dir
<> 1))
                            ->  Bitmap Index Scan on
mm_insrel_dnumber_dir_not_one_idx  (cost=0.00..2.11 rows=30 width=0)
(actual time=0.070..0.070 rows=2 loops=1)
                                  Index Cond: (dnumber = 558332)
          ->  Index Scan using mm_object_pkey on mm_object
(cost=0.00..3.80 rows=1 width=18) (actual time=0.042..0.046 rows=1 loops=1)
                Index Cond: ("outer".number = mm_object.number)
  Total runtime: 12.765 ms

zorgweb_solaris=> explain analyze SELECT
insrel.rnumber,insrel.dnumber,insrel.snumber,insrel.dir,insrel.number,medical_care_container.number,product.number

FROM mm_medical_care_container medical_care_container,mm_insrel
insrel,mm_product product WHERE medical_care_container.number=558332 AND
(medical_care_container.number=insrel.dnumber AND
product.number=insrel.snumber AND insrel.dir<>1);

                       QUERY PLAN


----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  Nested Loop  (cost=0.00..172.69 rows=1 width=28) (actual
time=53.987..129.419 rows=1 loops=1)
    ->  Nested Loop  (cost=0.00..168.88 rows=1 width=28) (actual
time=53.940..129.365 rows=1 loops=1)
          ->  Merge Join  (cost=0.00..165.07 rows=1 width=28) (actual
time=53.890..129.310 rows=1 loops=1)
                Merge Cond: ("outer".number = "inner".number)
                ->  Nested Loop  (cost=0.00..2796.82 rows=30 width=28)
(actual time=44.088..117.487 rows=2 loops=1)
                      ->  Nested Loop  (cost=0.00..2682.38 rows=30
width=24) (actual time=44.034..117.375 rows=2 loops=1)
                            ->  Index Scan using mm_insrel_full_idx on
mm_insrel_table  (cost=0.00..2512.97 rows=30 width=20) (actual
time=43.975..117.246 rows=2 loops=1)
                                  Index Cond: (dnumber = 558332)
                            ->  Index Scan using
mm_medical_care_container_table_pkey on mm_medical_care_container_table
  (cost=0.00..5.64 rows=1 width=4) (actual time=0.044..0.045 rows=1 loops=2)
                                  Index Cond: (558332 = number)
                      ->  Index Scan using mm_object_pkey on mm_object
(cost=0.00..3.80 rows=1 width=4) (actual time=0.041..0.043 rows=1 loops=2)
                            Index Cond: (mm_object.number = "outer".snumber)
                ->  Index Scan using mm_product_table_pkey on
mm_product_table  (cost=0.00..67.90 rows=1571 width=4) (actual
time=0.023..9.443 rows=1571 loops=1)
          ->  Index Scan using mm_object_pkey on mm_object
(cost=0.00..3.80 rows=1 width=4) (actual time=0.040..0.042 rows=1 loops=1)
                Index Cond: ("outer".number = mm_object.number)
    ->  Index Scan using mm_object_pkey on mm_object  (cost=0.00..3.80
rows=1 width=4) (actual time=0.038..0.042 rows=1 loops=1)
          Index Cond: (number = 558332)
  Total runtime: 130.149 ms


zorgweb_solaris=> \d mm_insrel;
     View "public.mm_insrel"
  Column  |  Type   | Modifiers
---------+---------+-----------
  number  | integer |
  otype   | integer |
  owner   | text    |
  snumber | integer |
  dnumber | integer |
  rnumber | integer |
  dir     | integer |
View definition:
  SELECT mm_object.number, mm_object.otype, mm_object."owner",
mm_insrel_table.snumber, mm_insrel_table.dnumber,
mm_insrel_table.rnumber, mm_insrel_table.dir
    FROM mm_insrel_table
    JOIN mm_object USING (number);

zorgweb_solaris=> \d mm_medical_care_container
View "public.mm_medical_care_container"
  Column |  Type   | Modifiers
--------+---------+-----------
  number | integer |
  otype  | integer |
  owner  | text    |
View definition:
  SELECT mm_object.number, mm_object.otype, mm_object."owner"
    FROM mm_medical_care_container_table
    JOIN mm_object USING (number);

zorgweb_solaris=> \d mm_product
            View "public.mm_product"
          Column         |  Type   | Modifiers
------------------------+---------+-----------
  number                 | integer |
  otype                  | integer |
  owner                  | text    |
  created                | bigint  |
  lastmodified           | bigint  |
  start_time             | bigint  |
  end_time               | bigint  |
  title                  | text    |
  details                | text    |
  only_collectively      | boolean |
  term_of_notice         | text    |
  max_number_paying_kids | integer |
  contract_term          | text    |
  advance_declarations   | text    |
  free_care_choice       | text    |
  export_to_rivm         | boolean |
  export_to_kwiz         | boolean |
  export_to_independer   | boolean |
  show_in_frontend       | boolean |
  path                   | text    |
  type_notes             | text    |
View definition:
  SELECT mm_object.number, mm_object.otype, mm_object."owner",
mm_product_table.created, mm_product_table.lastmodified,
mm_product_table.start_time, mm_product_table.end_time, mm_p
roduct_table.title, mm_product_table.details,
mm_product_table.only_collectively, mm_product_table.term_of_notice,
mm_product_table.max_number_paying_kids, mm_product_table.contra
ct_term, mm_product_table.advance_declarations,
mm_product_table.free_care_choice, mm_product_table.export_to_rivm,
mm_product_table.export_to_kwiz, mm_product_table.export_to_ind
epender, mm_product_table.show_in_frontend, mm_product_table.path,
mm_product_table.type_notes
    FROM mm_product_table
    JOIN mm_object USING (number);


--
Alban Hertroys
alban@magproductions.nl

magproductions b.v.

T: ++31(0)534346874
F: ++31(0)534346876
M:
I: www.magproductions.nl
A: Postbus 416
    7500 AK Enschede

// Integrate Your World //

Re: Queries joining views

From

"John D. Burger"

Date:

21 August 2006, 14:06:06

Alban Hertroys wrote:

> We have a number of views that join tables, and we have queries that
> join those views. Some relatively large tables are involved.
> We added indexes that match our query constraints as much as possible,
> and that does work if we explicitly query the tables with all the
> involved joins, instead of the views. However, if we query the views,
> the planner starts using a filter instead of the desired index...

Anecdotally, I had a situation recently where I got different plans
depending on whether I queried a join of a view against itself, or
"macro expanded" the view by hand.  I was =very= surprised at this.

- John D. Burger
   MITRE

Re: Queries joining views

From

Tom Lane

Date:

21 August 2006, 14:48:53

Alban Hertroys <alban@magproductions.nl> writes:
> Is there a trick to make this work a bit faster?

Have you really shown us the right queries for those explain results?
I don't see where the second plan is testing "dir <> 1" at all.
It looks like the first one is faster because it's using a partial
index that has predicate dir <> 1, while the second one is using
a much larger full index.  But I don't see where the second plan
is applying that restriction, so I wonder if you forgot it in the
query.

            regards, tom lane

Re: Queries joining views

From

Tom Lane

Date:

21 August 2006, 14:58:26

"John D. Burger" <john@mitre.org> writes:
> Anecdotally, I had a situation recently where I got different plans
> depending on whether I queried a join of a view against itself, or
> "macro expanded" the view by hand.  I was =very= surprised at this.

Me too, at least if you didn't do any hand optimization but just stuck
the view definition in as a sub-select.  Can you provide a reproducible
case?

            regards, tom lane

Re: Queries joining views

From

"John D. Burger"

Date:

21 August 2006, 16:10:59

Tom Lane wrote:

>> Anecdotally, I had a situation recently where I got different plans
>> depending on whether I queried a join of a view against itself, or
>> "macro expanded" the view by hand.  I was =very= surprised at this.
>
> Me too, at least if you didn't do any hand optimization but just stuck
> the view definition in as a sub-select.  Can you provide a reproducible
> case?

No surprise, I cannot reproduce this, especially since the DDL has
evolved since then.  But you are almost certainly right, my "macro
expansion" must have done something more than simply dropping in the
view definition as is.  Presumably something minor (to me) but a
show-stopper for the planner.

Sorry for the distraction. :(

- John D. Burger
   MITRE

Re: Queries joining views

From

DelGurth

Date:

21 August 2006, 17:58:29

Alban Hertroys <alban ( at ) magproductions ( dot ) nl> writes:
> Is there a trick to make this work a bit faster?

Have you really shown us the right queries for those explain results?
I don't see where the second plan is testing "dir <> 1" at all.
It looks like the first one is faster because it's using a partial
index that has predicate dir <> 1, while the second one is using
a much larger full index. But I don't see where the second plan
is applying that restriction, so I wonder if you forgot it in the
query.

regards, tom lane

He has really shown the right queries. But I see the table definition if mm_insrel_table (including the indexes) is not in the e-mail, so you don't see why the dir <> 1 is not in the query plan. Here is the table definition, with the indexes. As you can see we tried some indexes, to see if we could get the queries on the views to become faster.

zorgweb_solaris=> \d mm_insrel_table
Table "public.mm_insrel_table"
Column | Type   | Modifiers
---------+---------+-----------
number | integer | not null
snumber | integer | not null
dnumber | integer | not null
rnumber | integer | not null
dir     | integer |
Indexes:
    "mm_insrel_table_pkey" PRIMARY KEY, btree (number)
    "mm_insrel_dir_not_one_idx" btree (dnumber, snumber) WHERE dir <> 1
    "mm_insrel_dnumber_dir_not_one_idx" btree (dnumber) WHERE dir <> 1
    "mm_insrel_full_idx" btree (snumber, dnumber, rnumber, dir) WHERE dir <> 1
    "mm_insrel_relation_idx" btree (snumber, dnumber, rnumber)
Foreign-key constraints:
    "mm_insrel_table_dnumber_fkey" FOREIGN KEY (dnumber) REFERENCES mm_object(number)
    "mm_insrel_table_rnumber_fkey" FOREIGN KEY (rnumber) REFERENCES mm_object(number)
    "mm_insrel_table_snumber_fkey" FOREIGN KEY (snumber) REFERENCES mm_object(number)

I hope this explains you why the dir <> 1 is not in the view query. Why the other query plan thinks it needs to recheck the condition is not clear to me, but I'm not an expert on PostgreSQL query plans.

Regards,
Wessel van Norel

Re: Queries joining views

From

Tom Lane

Date:

21 August 2006, 18:32:45

DelGurth <delgurth@gmail.com> writes:
> As you can see we tried some indexes, to see if we could
> get the queries on the views to become faster.

> Indexes:
>     "mm_insrel_table_pkey" PRIMARY KEY, btree (number)
>     "mm_insrel_dir_not_one_idx" btree (dnumber, snumber) WHERE dir <> 1
>     "mm_insrel_dnumber_dir_not_one_idx" btree (dnumber) WHERE dir <> 1
>     "mm_insrel_full_idx" btree (snumber, dnumber, rnumber, dir) WHERE dir <>
> 1
>     "mm_insrel_relation_idx" btree (snumber, dnumber, rnumber)

Hmph ... it certainly appears to be choosing the wrong index in the
second case.  I wonder why --- can you show the relpages and reltuples
stats from pg_class for these indexes?

It might be interesting also to examine the output of just

    explain select * from mm_insrel_table where dnumber=558332 and dir<>1

with different subsets of these indexes in place.  I'd like to see what
it's deriving as the cost estimates for these indexes.  If you can get
an EXPLAIN selecting each one of the indexes, that would help diagnose
what's going on.

Dunno if you know this trick already, but it's possible to experiment
with different index subsets without physically dropping and recreating
the indexes.  Try

    begin;
    drop [unwanted indexes]
    explain ...
    rollback;

This will hold exclusive lock on the table until you rollback, so if
it's a production database you want to be quick about it --- maybe put
the whole thing in a SQL script.  But it sure beats rebuilding indexes.

BTW, what PG version is this exactly?

            regards, tom lane

Re: Queries joining views

From

DelGurth

Date:

21 August 2006, 19:38:15

BTW, what PG version is this exactly?

Our PG version is the version downloadable from http://www.sunfreeware.com/programlistsparc10.html#postgresql , so 8.0.1 for solaris sparc.

Sorry I was wrong on this point, it's 8.1.4

-bash-3.00$ pg_config --version
PostgreSQL 8.1.4

And it's the version from blastwave.org: http://www.blastwave.org/packages.php/postgresql

Regards,
Wessel van Norel

Re: Queries joining views

From

DelGurth

Date:

21 August 2006, 19:44:29

On 8/21/06, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Hmph ... it certainly appears to be choosing the wrong index in the
second case. I wonder why --- can you show the relpages and reltuples
stats from pg_class for these indexes?

I'm personally not aware how to do that, perhaps Alban will (tell me how to) do that tomorrow.

It might be interesting also to examine the output of just

    explain select * from mm_insrel_table where dnumber=558332 and dir<>1

with different subsets of these indexes in place.  I'd like to see what
it's deriving as the cost estimates for these indexes.  If you can get
an EXPLAIN selecting each one of the indexes, that would help diagnose
what's going on.

Ok. Did that (with your trick, thanks!). The output is attached to this e-mail. The script I used to drop the indexes dropped them in the order it was using them (partially by accident, partially because I assumed it would work in that order).

I'm not sure if you want to see more permutations, if so please tell me.

BTW, what PG version is this exactly?

Our PG version is the version downloadable from http://www.sunfreeware.com/programlistsparc10.html#postgresql , so 8.0.1 for solaris sparc.

(it's a sun T2000 test machine).

regards, tom lane

Regards,
Wessel van Norel

Attachment

index_differences.txt

Re: Queries joining views

From

Tom Lane

Date:

21 August 2006, 21:10:15

DelGurth <delgurth@gmail.com> writes:
> On 8/21/06, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> It might be interesting also to examine the output of just
>> explain select * from mm_insrel_table where dnumber=558332 and dir<>1
>> with different subsets of these indexes in place.

> Ok. Did that (with your trick, thanks!). The output is attached to
> this e-mail.

Well, no smoking gun there, it clearly knows that mm_insrel_full_idx
is much more expensive for this query than the other two...

Looking back at Alban's original post, I finally see what the planner
is up to:

          ->  Merge Join  (cost=0.00..165.07 rows=1 width=28) (actual time=53.890..129.310 rows=1 loops=1)
                Merge Cond: ("outer".number = "inner".number)
                ->  Nested Loop  (cost=0.00..2796.82 rows=30 width=28) (actual time=44.088..117.487 rows=2 loops=1)
                      ->  Nested Loop  (cost=0.00..2682.38 rows=30 width=24) (actual time=44.034..117.375 rows=2
loops=1)
                            ->  Index Scan using mm_insrel_full_idx on mm_insrel_table  (cost=0.00..2512.97 rows=30
width=20)(actual time=43.975..117.246 rows=2 loops=1) 
                                  Index Cond: (dnumber = 558332)

The reason it's choosing this indexscan is that that will give it data
sorted by mm_insrel_table.number, which it can feed into the mergejoin
without an extra sort step.  Now sorting 30 rows is not going to take
nearly as much time as the indexscan eats up, so this still doesn't
make sense ---  until you notice that it's estimating the top merge join
at considerably less than the cost of its inputs (165.07, vss 2796.82
just for this input).  That means it thinks it won't have to run the
inputs to completion in order to finish the mergejoin, and so it's
picking a sub-plan that has zero start cost.

What this means is that the planner thinks the range of "number" values
in mm_product_table (the other side of the mergejoin) is much less than
the range in mm_insrel_table.  Is that the case?  Perhaps your ANALYZE
stats for these tables are out of date.  If not I'd like to see the
pg_stats entries for the two "number" columns.

            regards, tom lane

Re: Queries joining views

From

Alban Hertroys

Date:

22 August 2006, 06:50:52

Tom Lane wrote:
> DelGurth <delgurth@gmail.com> writes:
>> As you can see we tried some indexes, to see if we could
>> get the queries on the views to become faster.
>
>> Indexes:
>>     "mm_insrel_table_pkey" PRIMARY KEY, btree (number)
>>     "mm_insrel_dir_not_one_idx" btree (dnumber, snumber) WHERE dir <> 1
>>     "mm_insrel_dnumber_dir_not_one_idx" btree (dnumber) WHERE dir <> 1
>>     "mm_insrel_full_idx" btree (snumber, dnumber, rnumber, dir) WHERE dir <>
>> 1
>>     "mm_insrel_relation_idx" btree (snumber, dnumber, rnumber)
>
> Hmph ... it certainly appears to be choosing the wrong index in the
> second case.  I wonder why --- can you show the relpages and reltuples
> stats from pg_class for these indexes?

Here they are:

               relname              | relpages | reltuples
-----------------------------------+----------+-----------
  mm_insrel_dir_not_one_idx         |      899 |    323628
  mm_insrel_dnumber_dir_not_one_idx |      899 |    323628
  mm_insrel_table_pkey              |     1237 |    323628
  mm_insrel_relation_idx            |     1849 |    323628
  mm_insrel_full_idx                |     1260 |    323628


--
Alban Hertroys
alban@magproductions.nl

magproductions b.v.

T: ++31(0)534346874
F: ++31(0)534346876
M:
I: www.magproductions.nl
A: Postbus 416
    7500 AK Enschede

// Integrate Your World //

Re: Queries joining views

From

Alban Hertroys

Date:

22 August 2006, 07:13:08

Tom Lane wrote:
> DelGurth <delgurth@gmail.com> writes:
>> On 8/21/06, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> It might be interesting also to examine the output of just
>>> explain select * from mm_insrel_table where dnumber=558332 and dir<>1
>>> with different subsets of these indexes in place.
>
>> Ok. Did that (with your trick, thanks!). The output is attached to
>> this e-mail.
>
> Well, no smoking gun there, it clearly knows that mm_insrel_full_idx
> is much more expensive for this query than the other two...
>
> Looking back at Alban's original post, I finally see what the planner
> is up to:
>
>           ->  Merge Join  (cost=0.00..165.07 rows=1 width=28) (actual time=53.890..129.310 rows=1 loops=1)
>                 Merge Cond: ("outer".number = "inner".number)
>                 ->  Nested Loop  (cost=0.00..2796.82 rows=30 width=28) (actual time=44.088..117.487 rows=2 loops=1)
>                       ->  Nested Loop  (cost=0.00..2682.38 rows=30 width=24) (actual time=44.034..117.375 rows=2
loops=1)
>                             ->  Index Scan using mm_insrel_full_idx on mm_insrel_table  (cost=0.00..2512.97 rows=30
width=20)(actual time=43.975..117.246 rows=2 loops=1) 
>                                   Index Cond: (dnumber = 558332)
>
> The reason it's choosing this indexscan is that that will give it data
> sorted by mm_insrel_table.number, which it can feed into the mergejoin
> without an extra sort step.  Now sorting 30 rows is not going to take
> nearly as much time as the indexscan eats up, so this still doesn't
> make sense ---  until you notice that it's estimating the top merge join
> at considerably less than the cost of its inputs (165.07, vss 2796.82
> just for this input).  That means it thinks it won't have to run the
> inputs to completion in order to finish the mergejoin, and so it's
> picking a sub-plan that has zero start cost.

I see. Obviously that's not right for our case, so we'll have to figure
out why it thinks that.

> What this means is that the planner thinks the range of "number" values
> in mm_product_table (the other side of the mergejoin) is much less than
> the range in mm_insrel_table.  Is that the case?  Perhaps your ANALYZE

Very much so. The mm_product_table only contains about 1500 products,
while the mm_insrel_table contains a record for every relation in the
application (This is MMBase; it handles all relations with a relation
table). That's about 330,000 records.

> stats for these tables are out of date.  If not I'd like to see the
> pg_stats entries for the two "number" columns.

Especially during optimization sessions like this one we tend to analyse
rather frequently. All these indices were created yesterday (except for
the primary key index) and the corresponding tables were analyzed after
each index creation (aboutish). Data hasn't changed since a while -
we're in the middle of a test migration from mysql[1].

As for the stats, I included the one for mm_object as well, as every
view contains at least a join with that table - thus it contains 1284556
records... I suspect we're in for another few painful surprises there.

zorgweb_solaris=> select * from pg_stats where attname = 'number' and
tablename IN ('mm_insrel_table', 'mm_product_table', 'mm_object');
-[ RECORD 1
]-----+------------------------------------------------------------------------------------
schemaname        | public
tablename         | mm_product_table
attname           | number
null_frac         | 0
avg_width         | 4
n_distinct        | -1
most_common_vals  |
most_common_freqs |
histogram_bounds  | {2930,3244,3558,3872,4186,4500,4814,5128,5442,5756,6070}
correlation       | 0.993398
-[ RECORD 2
]-----+------------------------------------------------------------------------------------
schemaname        | public
tablename         | mm_insrel_table
attname           | number
null_frac         | 0
avg_width         | 4
n_distinct        | -1
most_common_vals  |
most_common_freqs |
histogram_bounds  |
{615920,689286,750855,812003,872741,933041,1004672,1068250,1134894,1198559,1261685}
correlation       | 0.664637
-[ RECORD 3
]-----+------------------------------------------------------------------------------------
schemaname        | public
tablename         | mm_object
attname           | number
null_frac         | 0
avg_width         | 4
n_distinct        | -1
most_common_vals  |
most_common_freqs |
histogram_bounds  |
{287,124412,256534,375896,505810,643940,770327,899229,1028933,1153260,1262338}
correlation       | 0.858558

As a side note, I'm in the progress of rewriting MMBase code to use
explicit joins where applicable. AFAIK the planner can handle those better.

>             regards, tom lane

[1]: With MySQL(4) there was no way for us to enhance performance any
more. There are a number of tree-like structures in our data model, and
MySQL just lacks the features to cope with that. With PostgreSQL we at
least can make use of the ltree contrib package - quite an improvement
in performance so far. Even though the ltrees are applied on text
columns with text2ltree() conversions in both the queries and the GIST
index.

We also figured we could use some triggers to generate data that could
improve query performance (moving conversions from SELECT-time to
INSERT-time), but unfortunately MMBase's caches are in the way there.

Regards, Alban.
--
Alban Hertroys
alban@magproductions.nl

magproductions b.v.

T: ++31(0)534346874
F: ++31(0)534346876
M:
I: www.magproductions.nl
A: Postbus 416
    7500 AK Enschede

// Integrate Your World //

Re: Queries joining views

From

Alban Hertroys

Date:

22 August 2006, 07:17:14

Tom Lane wrote:
> Looking back at Alban's original post, I finally see what the planner
> is up to:
>
>           ->  Merge Join  (cost=0.00..165.07 rows=1 width=28) (actual time=53.890..129.310 rows=1 loops=1)
>                 Merge Cond: ("outer".number = "inner".number)
>                 ->  Nested Loop  (cost=0.00..2796.82 rows=30 width=28) (actual time=44.088..117.487 rows=2 loops=1)
>                       ->  Nested Loop  (cost=0.00..2682.38 rows=30 width=24) (actual time=44.034..117.375 rows=2
loops=1)
>                             ->  Index Scan using mm_insrel_full_idx on mm_insrel_table  (cost=0.00..2512.97 rows=30
width=20)(actual time=43.975..117.246 rows=2 loops=1) 
>                                   Index Cond: (dnumber = 558332)
>
> The reason it's choosing this indexscan is that that will give it data
> sorted by mm_insrel_table.number, which it can feed into the mergejoin
> without an extra sort step.  Now sorting 30 rows is not going to take
> nearly as much time as the indexscan eats up, so this still doesn't
> make sense ---  until you notice that it's estimating the top merge join
> at considerably less than the cost of its inputs (165.07, vss 2796.82
> just for this input).  That means it thinks it won't have to run the
> inputs to completion in order to finish the mergejoin, and so it's
> picking a sub-plan that has zero start cost.

I'm thinking that removing the indexes it's erroneously using now could
help performance, as it can no longer use that index. It may however
pick the primary key index (likely), or - if we remove even that one - a
sequential scan... Experimenting will answer that.

Thanks for your answers so far, at least now we know what's going on.

Regards,
--
Alban Hertroys
alban@magproductions.nl

magproductions b.v.

T: ++31(0)534346874
F: ++31(0)534346876
M:
I: www.magproductions.nl
A: Postbus 416
    7500 AK Enschede

// Integrate Your World //

Re: Queries joining views

From

Alban Hertroys

Date:

22 August 2006, 07:31:24

Alban Hertroys wrote:
> Tom Lane wrote:
> I'm thinking that removing the indexes it's erroneously using now could
> help performance, as it can no longer use that index. It may however
> pick the primary key index (likely), or - if we remove even that one - a
> sequential scan... Experimenting will answer that.

Well, look at the attachhed explain output of:

zorgweb_solaris=> BEGIN;
DROP INDEX mm_insrel_full_idx; DROP INDEX mm_insrel_relation_idx;
EXPLAIN ANALYZE
SELECT
insrel.rnumber,insrel.dnumber,insrel.snumber,insrel.dir,insrel.number,medical_care_container.number,product.number
FROM mm_medical_care_container medical_care_container,
         mm_insrel insrel,
         mm_product product
  WHERE medical_care_container.number=558332
  AND (medical_care_container.number=insrel.dnumber
  AND product.number=insrel.snumber AND insrel.dir<>1);
ROLLBACK;

Only 13ms as opposed to 130-ish :)

--
Alban Hertroys
alban@magproductions.nl

magproductions b.v.

T: ++31(0)534346874
F: ++31(0)534346876
M:
I: www.magproductions.nl
A: Postbus 416
    7500 AK Enschede

// Integrate Your World //
                                                                                  QUERY PLAN
                                                  

-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=118.34..211.13 rows=1 width=28) (actual time=10.943..12.988 rows=1 loops=1)
   ->  Nested Loop  (cost=118.34..205.49 rows=1 width=28) (actual time=10.903..12.942 rows=1 loops=1)
         ->  Nested Loop  (cost=118.34..201.67 rows=1 width=28) (actual time=10.857..12.891 rows=1 loops=1)
               ->  Merge Join  (cost=118.34..190.23 rows=3 width=28) (actual time=10.807..12.837 rows=1 loops=1)
                     Merge Cond: ("outer".number = "inner".snumber)
                     ->  Index Scan using mm_product_table_pkey on mm_product_table  (cost=0.00..67.90 rows=1571
width=4)(actual time=0.040..9.542 rows=1571 loops=1) 
                     ->  Sort  (cost=118.34..118.41 rows=30 width=24) (actual time=0.867..0.868 rows=2 loops=1)
                           Sort Key: mm_insrel_table.snumber
                           ->  Nested Loop  (cost=2.11..117.60 rows=30 width=24) (actual time=0.804..0.827 rows=2
loops=1)
                                 ->  Index Scan using mm_object_pkey on mm_object  (cost=0.00..3.80 rows=1 width=4)
(actualtime=0.055..0.061 rows=1 loops=1) 
                                       Index Cond: (number = 558332)
                                 ->  Bitmap Heap Scan on mm_insrel_table  (cost=2.11..113.50 rows=30 width=20) (actual
time=0.723..0.734rows=2 loops=1) 
                                       Recheck Cond: ((dnumber = 558332) AND (dir <> 1))
                                       ->  Bitmap Index Scan on mm_insrel_dnumber_dir_not_one_idx  (cost=0.00..2.11
rows=30width=0) (actual time=0.705..0.705 rows=2 loops=1) 
                                             Index Cond: (dnumber = 558332)
               ->  Index Scan using mm_object_pkey on mm_object  (cost=0.00..3.80 rows=1 width=4) (actual
time=0.040..0.043rows=1 loops=1) 
                     Index Cond: ("outer".number = mm_object.number)
         ->  Index Scan using mm_object_pkey on mm_object  (cost=0.00..3.80 rows=1 width=4) (actual time=0.038..0.040
rows=1loops=1) 
               Index Cond: ("outer".number = mm_object.number)
   ->  Index Scan using mm_medical_care_container_table_pkey on mm_medical_care_container_table  (cost=0.00..5.64
rows=1width=4) (actual time=0.033..0.036 rows=1 loops=1) 
         Index Cond: (558332 = number)
 Total runtime: 13.799 ms
(22 rows)

Re: Queries joining views

From

Tom Lane

Date:

22 August 2006, 09:57:04

Alban Hertroys <alban@magproductions.nl> writes:
> zorgweb_solaris=> select * from pg_stats where attname = 'number' and
> tablename IN ('mm_insrel_table', 'mm_product_table', 'mm_object');

> tablename         | mm_product_table
> histogram_bounds  | {2930,3244,3558,3872,4186,4500,4814,5128,5442,5756,6070}

> tablename         | mm_insrel_table
> {615920,689286,750855,812003,872741,933041,1004672,1068250,1134894,1198559,1261685}

> tablename         | mm_object
> histogram_bounds  |
> {287,124412,256534,375896,505810,643940,770327,899229,1028933,1153260,1262338}

OK, so here's our problem: according to those stats, the ranges of
"number" in mm_product_table and mm_insrel_table don't overlap at all.
So the cost model for mergejoin predicts that a mergejoin on "number"
will have to read all of mm_product_table but only the first record from
mm_insrel_table, and given the difference in size of the two tables,
that looks like a pretty good deal.

Given that the plan is not actually very fast, I suppose that the
histogram is not telling the whole truth --- probably there are a few
outlying records in one table or the other causing there to be a more
significant overlap than the planner expects.  If so, you can probably
fix it by increasing the statistics target for that table.

            regards, tom lane

Re: Queries joining views

From

Alban Hertroys

Date:

22 August 2006, 10:08:40

Tom Lane wrote:
> Alban Hertroys <alban@magproductions.nl> writes:
>> zorgweb_solaris=> select * from pg_stats where attname = 'number' and
>> tablename IN ('mm_insrel_table', 'mm_product_table', 'mm_object');
>
>> tablename         | mm_product_table
>> histogram_bounds  | {2930,3244,3558,3872,4186,4500,4814,5128,5442,5756,6070}
>
>> tablename         | mm_insrel_table
>> {615920,689286,750855,812003,872741,933041,1004672,1068250,1134894,1198559,1261685}
>
>> tablename         | mm_object
>> histogram_bounds  |
>> {287,124412,256534,375896,505810,643940,770327,899229,1028933,1153260,1262338}
>
> OK, so here's our problem: according to those stats, the ranges of
> "number" in mm_product_table and mm_insrel_table don't overlap at all.

That's correct, the numbers are generated by a global sequence.
Insrel.number can never match a product.number.

However, mm_product.number always matches either mm_insrel.snumber or
mm_insrel.dnumber (source and destination respectively). The other way
around this isn't the case; then snumber and dnumber match number-fields
in other tables (they always do).

> So the cost model for mergejoin predicts that a mergejoin on "number"
> will have to read all of mm_product_table but only the first record from
> mm_insrel_table, and given the difference in size of the two tables,
> that looks like a pretty good deal.
>
> Given that the plan is not actually very fast, I suppose that the
> histogram is not telling the whole truth --- probably there are a few
> outlying records in one table or the other causing there to be a more
> significant overlap than the planner expects.  If so, you can probably
> fix it by increasing the statistics target for that table.

That's a bit odd, as the number fields of the different tables are
globally unique by definition.

>             regards, tom lane

Regards,
--
Alban Hertroys
alban@magproductions.nl

magproductions b.v.

T: ++31(0)534346874
F: ++31(0)534346876
M:
I: www.magproductions.nl
A: Postbus 416
    7500 AK Enschede

// Integrate Your World //

Re: Queries joining views

From

Tom Lane

Date:

22 August 2006, 10:22:29

Alban Hertroys <alban@magproductions.nl> writes:
> However, mm_product.number always matches either mm_insrel.snumber or
> mm_insrel.dnumber (source and destination respectively). The other way
> around this isn't the case; then snumber and dnumber match number-fields
> in other tables (they always do).

Oh, then we are looking at the wrong things: we should be comparing the
histograms of the fields that are being used as the join keys in this
query.  I had thought they were both "number", but I must be confused.

            regards, tom lane

Re: Queries joining views

From

Alban Hertroys

Date:

22 August 2006, 10:59:35

Tom Lane wrote:
> Alban Hertroys <alban@magproductions.nl> writes:
>> However, mm_product.number always matches either mm_insrel.snumber or
>> mm_insrel.dnumber (source and destination respectively). The other way
>> around this isn't the case; then snumber and dnumber match number-fields
>> in other tables (they always do).
>
> Oh, then we are looking at the wrong things: we should be comparing the
> histograms of the fields that are being used as the join keys in this
> query.  I had thought they were both "number", but I must be confused.

The design is certainly a bit confusing until you get used to it. It
usually takes new devs here a while to find their way around MMBase
(www.mmbase.org) and its peculiarities. It doesn't help that the
documentation is in rather bad English.

>             regards, tom lane

So this is what we're looking for, right? I can't say I understand how
to interpret this, let alone come to conclusions. I'm afraid I totally
depend on your interpretation here...

zorgweb_solaris=> select * from pg_stats where (attname in ('snumber',
'dnumber') and tablename = 'mm_insrel_table') or (attname = 'number' and
tablename = 'mm_product_table');
-[ RECORD 1
]-----+-----------------------------------------------------------------------------------------------------------
schemaname        | public
tablename         | mm_product_table
attname           | number
null_frac         | 0
avg_width         | 4
n_distinct        | -1
most_common_vals  |
most_common_freqs |
histogram_bounds  | {2930,3244,3558,3872,4186,4500,4814,5128,5442,5756,6070}
correlation       | 0.993398
-[ RECORD 2
]-----+-----------------------------------------------------------------------------------------------------------
schemaname        | public
tablename         | mm_insrel_table
attname           | snumber
null_frac         | 0
avg_width         | 4
n_distinct        | 14336
most_common_vals  | {4300,5210,5366,2994,3724,4118,2982,3058,3072,3460}
most_common_freqs |
{0.00266667,0.002,0.002,0.00166667,0.00166667,0.00166667,0.00133333,0.00133333,0.00133333,0.00133333}
histogram_bounds  |
{135,3768,4780,14822,57048,92958,125442,158954,433002,502836,610034}
correlation       | 0.083602
-[ RECORD 3
]-----+-----------------------------------------------------------------------------------------------------------
schemaname        | public
tablename         | mm_insrel_table
attname           | dnumber
null_frac         | 0
avg_width         | 4
n_distinct        | 11028
most_common_vals  | {1117583,279,415,291,343,389,635,839,1043,319}
most_common_freqs |
{0.00433333,0.00333333,0.003,0.00266667,0.00266667,0.00266667,0.00266667,0.00266667,0.00266667,0.00233333}
histogram_bounds  |
{147,717,3770,263126,327054,429524,461026,490094,518872,544098,1117603}
correlation       | 0.0571927

Regards,
--
Alban Hertroys
alban@magproductions.nl

magproductions b.v.

T: ++31(0)534346874
F: ++31(0)534346876
M:
I: www.magproductions.nl
A: Postbus 416
    7500 AK Enschede

// Integrate Your World //

Re: Queries joining views

From

Tom Lane

Date:

22 August 2006, 11:48:50

Alban Hertroys <alban@magproductions.nl> writes:
> tablename         | mm_product_table
> attname           | number
> histogram_bounds  | {2930,3244,3558,3872,4186,4500,4814,5128,5442,5756,6070}

> tablename         | mm_insrel_table
> attname           | snumber
> histogram_bounds  |
> {135,3768,4780,14822,57048,92958,125442,158954,433002,502836,610034}

Hmm ... if I'm not still confused, these are the two columns being
mergejoined in your slow query (would you double-check that?).
But the numbers don't seem to add up.  Given those stats the estimate
should be that something over 20% of the mm_insrel_table has to be
scanned to complete the join (since 6070 falls into the third decile
of the other histogram).  But we saw from Alban's original post that
the planner must be estimating well under 10% of the table needs to
be scanned.  Either we're still confused about which columns are being
joined, or there's some weird bug in the computation.

            regards, tom lane

Re: Queries joining views

From

Alban Hertroys

Date:

22 August 2006, 12:07:50

Tom Lane wrote:
> Alban Hertroys <alban@magproductions.nl> writes:
>> tablename         | mm_product_table
>> attname           | number
>> histogram_bounds  | {2930,3244,3558,3872,4186,4500,4814,5128,5442,5756,6070}
>
>> tablename         | mm_insrel_table
>> attname           | snumber
>> histogram_bounds  |
>> {135,3768,4780,14822,57048,92958,125442,158954,433002,502836,610034}
>
> Hmm ... if I'm not still confused, these are the two columns being
> mergejoined in your slow query (would you double-check that?).

That's correct.

I read up some on the meaning of the pg_stats values, and I noticed that
  mm_insrel_tables' snumber and dnumber columns seem to have a rather
bad correlation. I think this could be improved by clustering on an
index over (number, snumber, dnumber); is that correct?

> But the numbers don't seem to add up.  Given those stats the estimate
> should be that something over 20% of the mm_insrel_table has to be
> scanned to complete the join (since 6070 falls into the third decile
> of the other histogram).  But we saw from Alban's original post that
> the planner must be estimating well under 10% of the table needs to
> be scanned.  Either we're still confused about which columns are being
> joined, or there's some weird bug in the computation.

Since the start of this thread the insrel table has grown to 339195
records (it was closer to 330,000), maybe that changed the statistics a
bit. To be sure, attached is the query plan of the problematic query
again at this moment.

The other table involved, mm_medical_care_container_table, has the
following stats on number:

zorgweb_solaris=> select * from pg_stats where attname = 'number' and
tablename = 'mm_medical_care_container_table';
-[ RECORD 1
]-----+-------------------------------------------------------------------------------
schemaname        | public
tablename         | mm_medical_care_container_table
attname           | number
null_frac         | 0
avg_width         | 4
n_distinct        | -1
most_common_vals  |
most_common_freqs |
histogram_bounds  |
{418768,436686,455444,473600,490610,508680,527182,545038,562786,578528,595132}
correlation       | 0.339138

I sure hope we get this mystery unveiled...

Regards,
--
Alban Hertroys
alban@magproductions.nl

magproductions b.v.

T: ++31(0)534346874
F: ++31(0)534346876
M:
I: www.magproductions.nl
A: Postbus 416
    7500 AK Enschede

// Integrate Your World //
zorgweb_solaris=> explain analyze SELECT
zorgweb_solaris->
insrel.rnumber,insrel.dnumber,insrel.snumber,insrel.dir,insrel.number,medical_care_container.number,product.number
zorgweb_solaris-> FROM mm_medical_care_container medical_care_container,mm_insrel
zorgweb_solaris-> insrel,mm_product product WHERE medical_care_container.number=558332 AND
zorgweb_solaris-> (medical_care_container.number=insrel.dnumber AND
zorgweb_solaris(> product.number=insrel.snumber AND insrel.dir<>1);
                                                                                             QUERY PLAN
                                                  

----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=0.00..178.61 rows=1 width=28) (actual time=56.089..137.304 rows=1 loops=1)
   ->  Nested Loop  (cost=0.00..174.67 rows=1 width=28) (actual time=56.041..137.250 rows=1 loops=1)
         ->  Merge Join  (cost=0.00..170.73 rows=1 width=28) (actual time=55.986..137.189 rows=1 loops=1)
               Merge Cond: ("outer".number = "inner".number)
               ->  Nested Loop  (cost=0.00..2966.75 rows=30 width=28) (actual time=46.161..125.315 rows=2 loops=1)
                     ->  Nested Loop  (cost=0.00..2848.50 rows=30 width=24) (actual time=46.108..125.205 rows=2
loops=1)
                           ->  Index Scan using mm_insrel_full_idx on mm_insrel_table  (cost=0.00..2678.20 rows=30
width=20)(actual time=46.042..125.067 rows=2 loops=1) 
                                 Index Cond: (dnumber = 558332)
                           ->  Index Scan using mm_medical_care_container_table_pkey on mm_medical_care_container_table
(cost=0.00..5.67 rows=1 width=4) (actual time=0.048..0.049 rows=1 loops=2) 
                                 Index Cond: (558332 = number)
                     ->  Index Scan using mm_object_pkey on mm_object  (cost=0.00..3.93 rows=1 width=4) (actual
time=0.041..0.043rows=1 loops=2) 
                           Index Cond: (mm_object.number = "outer".snumber)
               ->  Index Scan using mm_product_table_pkey on mm_product_table  (cost=0.00..67.90 rows=1571 width=4)
(actualtime=0.024..9.462 rows=1571 loops=1) 
         ->  Index Scan using mm_object_pkey on mm_object  (cost=0.00..3.93 rows=1 width=4) (actual time=0.043..0.045
rows=1loops=1) 
               Index Cond: ("outer".number = mm_object.number)
   ->  Index Scan using mm_object_pkey on mm_object  (cost=0.00..3.93 rows=1 width=4) (actual time=0.038..0.041 rows=1
loops=1)
         Index Cond: (number = 558332)
 Total runtime: 138.132 ms
(18 rows)

Re: Queries joining views

From

Tom Lane

Date:

22 August 2006, 12:15:10

Alban Hertroys <alban@magproductions.nl> writes:
> I sure hope we get this mystery unveiled...

I'm confused too.  Would it be possible for you to send me a dump of
your database?  I need the full schema definitions of these tables,
indexes, and views, but you could null out all but the various
"number" columns being used in the joins and query conditions,
so there shouldn't be any privacy issues.

            regards, tom lane

Re: Queries joining views

From

Tom Lane

Date:

22 August 2006, 16:47:05

Alban Hertroys <alban@magproductions.nl> writes:
>> I'm confused too.  Would it be possible for you to send me a dump of
>> your database?

> Attached is a cleaned out database, the full schema is included, but
> only the relevant tables contain any data.

Thanks.  After digging through it a bit, I understand what's happening,
but I'm not seeing any simple fix.  The problem is basically that
you've got

create or replace view mm_product as
 SELECT ... FROM mm_product_table JOIN mm_object mmop USING (number);

and then the problem query has WHERE mm_product.number = insrel.snumber
which causes the planner to conclude that mm_product_table.number,
mm_object.number, and mm_insrel_table.snumber are all basically
interchangeable.  In particular it ends up performing the join between
mm_product_table.number and mm_object.number as though
mm_product_table.number were being joined to mm_insrel_table.snumber.
Which is fine, except that it's thinking that the statistics for
mm_object.number are applicable in this context, and they're completely
misleading.  After the join to mm_insrel_table, the statistics of the
variable are really like mm_insrel_table.number --- in particular the
fraction of the table that has to be visited is much larger than it
would've been for mm_object as a whole.

This is a problem we've understood in a generic form for awhile:
a join or selection might change the statistics of a variable,
and so the info stored in the catalogs ought to be modified somehow
to predict what will happen at upper join levels.  We've not seen
it in this particular form before, though.

I'm not sure if there's a whole lot you can do about it in the near term
other than refactor your schema to avoid having different tables joining
to different subranges of mm_object.number.  (You don't necessarily have
to get rid of mm_object --- just try assigning its keys from a serial,
or something, so that there's no correlation to the ranges of keys in
other tables.)

We might be able to do something about actually solving the statistical
problem in 8.3, but I fear it's too late to think about it for 8.2.

            regards, tom lane

Re: Queries joining views

From

Tom Lane

Date:

22 August 2006, 16:52:59

I wrote:
> Thanks.  After digging through it a bit, I understand what's happening,
> but I'm not seeing any simple fix.

I forgot to mention that although I could reproduce your bad plan in
8.1, CVS HEAD doesn't fall into the trap.  I don't believe we've done
anything to fix the fundamental problem however --- it may just be a
side effect of the changes in the indexscan cost model that cause it
to not go for the bogus plan.

            regards, tom lane

Re: [HACKERS] Queries joining views

From

Alvaro Herrera

Date:

22 August 2006, 16:53:43

Tom Lane wrote:

> We might be able to do something about actually solving the statistical
> problem in 8.3, but I fear it's too late to think about it for 8.2.

I take it you mean you already have a very concrete idea on how to solve
it.  Come on, illuminate us poor dumb souls.

--
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

Re: [HACKERS] Queries joining views

From

Tom Lane

Date:

22 August 2006, 17:30:54

Alvaro Herrera <alvherre@commandprompt.com> writes:
> Tom Lane wrote:
>> We might be able to do something about actually solving the statistical
>> problem in 8.3, but I fear it's too late to think about it for 8.2.

> I take it you mean you already have a very concrete idea on how to solve
> it.  Come on, illuminate us poor dumb souls.

No, I don't :-( ... that was intended to suggest that we might think of
a solution given months to work on it rather than days.

            regards, tom lane

Re: Queries joining views

From

Alban Hertroys

Date:

23 August 2006, 09:08:16

Tom Lane wrote:
> Alban Hertroys <alban@magproductions.nl> writes:
>>> I'm confused too.  Would it be possible for you to send me a dump of
>>> your database?
>
>> Attached is a cleaned out database, the full schema is included, but
>> only the relevant tables contain any data.
>
> Thanks.  After digging through it a bit, I understand what's happening,
> but I'm not seeing any simple fix.  The problem is basically that
> you've got
>
> create or replace view mm_product as
>  SELECT ... FROM mm_product_table JOIN mm_object mmop USING (number);
>
> and then the problem query has WHERE mm_product.number = insrel.snumber
> which causes the planner to conclude that mm_product_table.number,
> mm_object.number, and mm_insrel_table.snumber are all basically
> interchangeable.  In particular it ends up performing the join between
> mm_product_table.number and mm_object.number as though
> mm_product_table.number were being joined to mm_insrel_table.snumber.

It's even worse, I guess, as the mm_insrel view joins mm_insrel_table
with mm_object again. So basically the query performs a self-join on
mm_object with a detour through mm_insrel_table and mm_product_table...

> Which is fine, except that it's thinking that the statistics for
> mm_object.number are applicable in this context, and they're completely
> misleading.  After the join to mm_insrel_table, the statistics of the
> variable are really like mm_insrel_table.number --- in particular the
> fraction of the table that has to be visited is much larger than it
> would've been for mm_object as a whole.

I don't entirely understand what you're saying here.

Mm_object is always larger than any other table in the database, as
every table joins with (different) records in it to determine it's otype
and owner. So I don't understand how a fraction of any of those tables
could be larger than mm_object as a whole...

In fact, originally the schema used inheritance; every table inherited
(directly or indirectly) from mm_object. As this resulted in unions,
which caused much more performance problems than the current
view-approach, I implemented the current approach.
In fact, this approach was lent from what MMBase uses for the MSSQL layer.

Well, as I implemented the way the views are defined, there is room for
changes in that area. Suggestions are welcome.

> This is a problem we've understood in a generic form for awhile:
> a join or selection might change the statistics of a variable,
> and so the info stored in the catalogs ought to be modified somehow
> to predict what will happen at upper join levels.  We've not seen
> it in this particular form before, though.
>
> I'm not sure if there's a whole lot you can do about it in the near term
> other than refactor your schema to avoid having different tables joining
> to different subranges of mm_object.number.  (You don't necessarily have
> to get rid of mm_object --- just try assigning its keys from a serial,
> or something, so that there's no correlation to the ranges of keys in
> other tables.)

Unfortunately the number key is required to correlate to the number keys
in other tables. That's the whole point of that table. It's also already
generated from a sequence...

I am looking at a view options at the moment:

1.) Cluster mm_object on an index over otype - I'm not sure how that
would influence the statistics; if it doesn't then this wouldn't change
much.

2.) Change mm_object into a view over the tables that now join with it.
I'll have to devise some way to get the otype and owner columns into the
other tables.

3.) An extension to option 2; Creating seperate tables, only containing
the relevant sections from mm_object, combining them into a view-version
of mm_object. Like this:

CREATE TABLE mm_product_object (
    number integer PRIMARY KEY,
    otype integer,
    owner text
);
CREATE TABLE mm_insrel_object (
    number integer PRIMARY KEY,
    otype integer,
    owner text
);

(I recall seeing an inheritance-like statement that makes copies of
table definitions - seems useful in this case)

CREATE OR REPLACE VIEW mm_object AS
    SELECT * FROM mm_product_object
    UNION ALL
    SELECT * FROM mm_insrel_object;

It remains to be seen that MMBase can handle mm_object being a view, but
  (if not) it probably will work if it's an updatable view.

I'm leaning to option 3, but if option 1 helps it saves me a lot of trouble.

> We might be able to do something about actually solving the statistical
> problem in 8.3, but I fear it's too late to think about it for 8.2.

Well, I had hoped for a suitable workaround, and I believe I may have a
few options now. Waiting for the next PostgreSQL release never really
was an option for us (deadline is somewhere next week). So it doesn't
really matter to us that there won't be a solution until 8.3, or maybe
even later.

Thanks for the help so far, glad to be able to point out an actual problem.

Regards,
--
Alban Hertroys
alban@magproductions.nl

magproductions b.v.

T: ++31(0)534346874
F: ++31(0)534346876
M:
I: www.magproductions.nl
A: Postbus 416
    7500 AK Enschede

// Integrate Your World //

Re: Queries joining views

From

Tom Lane

Date:

23 August 2006, 09:51:06

Alban Hertroys <alban@magproductions.nl> writes:
> Mm_object is always larger than any other table in the database, as
> every table joins with (different) records in it to determine it's otype
> and owner. So I don't understand how a fraction of any of those tables
> could be larger than mm_object as a whole...

No, I said a larger fraction, not a larger absolute number of tuples.
The problem is that because mm_product contains only very small values
of "number", a mergejoin looks like a great way to join it to mm_object:
only the first 5% of mm_object will need to be scanned.  The bug
consists in applying that 5% number to mm_insrel, for which it's not
correct.

            regards, tom lane