Home > mailing lists

Inheritance, unique keys and performance - Mailing list pgsql-performance

From	Julian Scarfe
Subject	Inheritance, unique keys and performance
Date	January 12, 2008 15:34:32
Msg-id	01a301c85552$24ab6270$0600a8c0@Wilbur Whole thread Raw
Responses	Re: Inheritance, unique keys and performance
List	pgsql-performance

Tree view

Useing 8.1.9 I'm running into some performance issues with inheritance.
I've abstracted the situation below because otherwise there's lots of
spurious stuff, but I'll gladly provide the real EXPLAIN ANALYZE output if
necessary.

I have a superclass and a dozen subclasses, of which I've picked one as an
example here.  The 'point' type in the class is not, as far as I can see,
relevant to the issue -- it could be any type.

create table superclass (
  id   integer PRIMARY KEY,
  node point,
  ...
)

create table subclass (
...
) INHERITS (superclass);
CREATE UNIQUE INDEX subclass_id ON subclass USING btree (id);

create table some_table (
  node point,
  ...
)

I perform a query on some_table using an expensive function with two of its
three point parameters looked up in the subclass (as an id -> node lookup
table). The first two point parameters to expensive_function are effectively
constants looked up once. I can structure it using scalar subqueries q1, or
as a join q2. Both are quick, the join is a little quicker.

-- q1 scalar subqueries
select *
  from some_table
  where
  and expensive_function(
    (select node from subclass where id = 101),
    (select node from subclass where id = 102),
    some_table.node);

-- q2 join
select *
  from some_table, subclass g1, subclass g2
  where expensive_function(g1.node, g2.node, some_table.node)
and  g1.id = 101
and  g2.id = 102;

Now what if I use the superclass?  The scalar subquery strategy q3 is fine.
The result of the subquery is unique because it's a scalar subquery, and the
planner knows that:

-- q3 scalar subqueries using superclass
select *
  from some_table
  where
  and expensive_function(
    (select node from superclass where id = 101),
    (select node from superclass where id = 102),
    some_table.node);

But the join q4 is a disaster.

-- q4 join join using superclass
select *
  from some_table, superclass g1, superclass g2
  where expensive_function(g1.node, g2.node, some_table.node)
and  g1.id = 101
and  g2.id = 102;

And I *think* I can see why -- I hope I'm not trying to be too smart here ;)
: superclass.id is not guaranteed to be unique, and the planner must cater
for multiple rows where g1.id = 101, and multiple rows where g2.id = 102
across the dozen tables comprising superclass.  So it picks a different
strategy involving sequential scans of all the superclass tables (even
though they have been ANALYZED) which is 100 times slower.

So the scalar-subqueries method is the only one I can use for the
superclass.  That's all very well as a workaround, but what I really want to
do is a further join.  Here are the queries using the subclass.

create table other_table (
  route       integer,
  leg_no     integer,
  start_id    integer,
  end_id     integer
)


-- q5 scalar subqueries
select some_table.*
  from some_table, other_table
  where
  and expensive_function(
    (select node from subclass where id = start_id),
    (select node from subclass where id = end_id),
    some_table.node)
  and other_table.route = 1;

-- q6 join
select some_table.*
  from some_table, other_table, subclass g1, subclass g2
  where expensive_function(g1.node, g2.node, some_table.node)
  and other_table.route = 1
  and other_table.start_id = g1.id
  and other_table.end_id   = g2.id;

When I test this on the subclass, as the "route" acquires more and more
"legs", the join q6 outperforms q5 by more and more.

-- q7 join
select some_table.*
  from some_table, other_table, superclass g1, superclass g2
  where expensive_function(g1.node, g2.node, some_table.node)
  and other_table.route = 1
  and other_table.start_id = g1.id
  and other_table.end_id   = g2.id;

So is there some way I can hint to the planner in q7 that superclass.id is
unique and that all it has to do is use superclass as an id -> node lookup
table?

Thanks

Julian

pgsql-performance by date:

From: "D'Arcy J.M. Cain"
Date: 12 January 2008, 02:01:04
Subject: Re: Best way to index IP data?

From: Tom Lane
Date: 12 January 2008, 16:35:54
Subject: Re: Inheritance, unique keys and performance

Inheritance, unique keys and performance - Mailing list pgsql-performance

Previous

Next