Inheritance, unique keys and performance - Mailing list pgsql-performance
From | Julian Scarfe |
---|---|
Subject | Inheritance, unique keys and performance |
Date | |
Msg-id | 01a301c85552$24ab6270$0600a8c0@Wilbur Whole thread Raw |
Responses |
Re: Inheritance, unique keys and performance
|
List | pgsql-performance |
Useing 8.1.9 I'm running into some performance issues with inheritance. I've abstracted the situation below because otherwise there's lots of spurious stuff, but I'll gladly provide the real EXPLAIN ANALYZE output if necessary. I have a superclass and a dozen subclasses, of which I've picked one as an example here. The 'point' type in the class is not, as far as I can see, relevant to the issue -- it could be any type. create table superclass ( id integer PRIMARY KEY, node point, ... ) create table subclass ( ... ) INHERITS (superclass); CREATE UNIQUE INDEX subclass_id ON subclass USING btree (id); create table some_table ( node point, ... ) I perform a query on some_table using an expensive function with two of its three point parameters looked up in the subclass (as an id -> node lookup table). The first two point parameters to expensive_function are effectively constants looked up once. I can structure it using scalar subqueries q1, or as a join q2. Both are quick, the join is a little quicker. -- q1 scalar subqueries select * from some_table where and expensive_function( (select node from subclass where id = 101), (select node from subclass where id = 102), some_table.node); -- q2 join select * from some_table, subclass g1, subclass g2 where expensive_function(g1.node, g2.node, some_table.node) and g1.id = 101 and g2.id = 102; Now what if I use the superclass? The scalar subquery strategy q3 is fine. The result of the subquery is unique because it's a scalar subquery, and the planner knows that: -- q3 scalar subqueries using superclass select * from some_table where and expensive_function( (select node from superclass where id = 101), (select node from superclass where id = 102), some_table.node); But the join q4 is a disaster. -- q4 join join using superclass select * from some_table, superclass g1, superclass g2 where expensive_function(g1.node, g2.node, some_table.node) and g1.id = 101 and g2.id = 102; And I *think* I can see why -- I hope I'm not trying to be too smart here ;) : superclass.id is not guaranteed to be unique, and the planner must cater for multiple rows where g1.id = 101, and multiple rows where g2.id = 102 across the dozen tables comprising superclass. So it picks a different strategy involving sequential scans of all the superclass tables (even though they have been ANALYZED) which is 100 times slower. So the scalar-subqueries method is the only one I can use for the superclass. That's all very well as a workaround, but what I really want to do is a further join. Here are the queries using the subclass. create table other_table ( route integer, leg_no integer, start_id integer, end_id integer ) -- q5 scalar subqueries select some_table.* from some_table, other_table where and expensive_function( (select node from subclass where id = start_id), (select node from subclass where id = end_id), some_table.node) and other_table.route = 1; -- q6 join select some_table.* from some_table, other_table, subclass g1, subclass g2 where expensive_function(g1.node, g2.node, some_table.node) and other_table.route = 1 and other_table.start_id = g1.id and other_table.end_id = g2.id; When I test this on the subclass, as the "route" acquires more and more "legs", the join q6 outperforms q5 by more and more. -- q7 join select some_table.* from some_table, other_table, superclass g1, superclass g2 where expensive_function(g1.node, g2.node, some_table.node) and other_table.route = 1 and other_table.start_id = g1.id and other_table.end_id = g2.id; So is there some way I can hint to the planner in q7 that superclass.id is unique and that all it has to do is use superclass as an id -> node lookup table? Thanks Julian
pgsql-performance by date: