Greetings!
Does anybody know how well the optimizer works when dealing with inherited
tables? I am currently using 8.0.1.
I have a table called eventlog.record_main, and a number of inherited
tables to partition the data (called
eventlog_partition._<day>__record_main). <luid> is the primary key (all
tables have this indexed via the primary key).
The reason for doing this is that a single table would be way too big
(there are on average 6-7 million rows per table) so that vacuum and
deletes would be inefficient. Inserting has been much more efficient using
this mechanism.
When I try the following query, I get sequential scans:
explain select * from eventlog.record_main order by luid limit 5;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------
Limit (cost=160800332.75..160800332.77 rows=5 width=92)
-> Sort (cost=160800332.75..161874465.60 rows=429653138 width=92)
Sort Key: eventlog.record_main.luid
-> Result (cost=0.00..11138614.37 rows=429653138 width=92)
-> Append (cost=0.00..11138614.37 rows=429653138 width=92)
-> Seq Scan on record_main (cost=0.00..0.00 rows=1 width=92)
-> Seq Scan on _20050401__record_main record_main (cost=0.00..223029.64 rows=8620164 width=92)
-> Seq Scan on _20050402__record_main record_main (cost=0.00..201536.46 rows=7789446 width=92)
-> Seq Scan on _20050403__record_main record_main (cost=0.00..211277.34 rows=8165934 width=92)
-> Seq Scan on _20050404__record_main record_main (cost=0.00..219381.70 rows=8479170 width=92)
-> Seq Scan on _20050405__record_main record_main (cost=0.00..226305.94 rows=8746794 width=92)
(and so on)
Yet, when I run the query on a single table, I get index usage:
explain select * from eventlog_partition._20050601__record_main order by luid limit 5;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.00..0.15 rows=5 width=92)
-> Index Scan using _20050601__record_main_pkey on _20050601__record_main (cost=0.00..163375.23 rows=5460021
width=92)
(2 rows)
This means that any query that limits the rows will run extremely
inefficiently. Given a limit of 5, at most only 5 rows need to be
considered in each partition sub-table, so an optimal plan would run
a sub-query in each table limited to 5 rows, and then merge the results.
Any ideas/fixes/patches?
Regards!
Ed