Hypothetical suggestions for planner, indexing improvement - Mailing list pgsql-performance

From Josh Berkus
Subject Hypothetical suggestions for planner, indexing improvement
Date
Msg-id 200305051219.58955.josh@agliodbs.com
Whole thread Raw
Responses Re: [HACKERS] Hypothetical suggestions for planner, indexing improvement
List pgsql-performance
Folks,

An area in which postgresql planner & indexing could be improved have occurred
to me over the last week.    I'd like to share this ideas with you in case it
is worthy of the todo list.

Please excuse me if this issue is already dealt with in CVS; I've been unable
to keep up completely on HACKERS lately.   Please also excuse me if this
issue has been discussed and was tabled due to some theoretical limitation,
such as x^n scaling problems

THE IDEA:  The planner should keep statistics on the correlation of foreign
keys and apply them to the expected row counts for EXISTS clause limitations,
and possibly for other query types as well.

To illustrate:
Database "calendar" has two tables, events and event_days.
Event_days has FK on column event_id to parent table Events.
There is at lease one record in event_days for each record in events, and the
average parent-child relationship is 1 event -> 1.15 event_days records.

This query:
SELECT events.* FROM events
WHERE EXISTS (SELECT event_id FROM event_days
    WHERE event_day BETWEEN '2003-04-08' AND '2003-05-18');

Currently, (in 7.2.4 and 7.3.1) the planner makes the assumption that the
above EXISTS restriction will only filter events by 50% and makes other join
and execution plans accordingly.   In fact, it filters events by 96% and the
ideal execution plan should be quite different.

It would be really keen if planner statistics could be expanded to include
correlation on foriegn keys in order to make more intelligent planner
decisions on the above type of query possible.

Thanks for your attention!


--
-Josh Berkus
 Aglio Database Solutions
 San Francisco


pgsql-performance by date:

Previous
From: Josh Berkus
Date:
Subject: Re: Suggestions wanted for 7.2.4 query
Next
From: Josh Berkus
Date:
Subject: Re: Suggestions wanted for 7.2.4 query