Optimizer improvements: to do or not to do? - Mailing list pgsql-hackers

From	Say42
Subject	Optimizer improvements: to do or not to do?
Date	September 11, 2006 10:20:37
Msg-id	1157980821.124341.269020@d34g2000cwd.googlegroups.com
Responses	Re: Optimizer improvements: to do or not to do?
List	pgsql-hackers

I intend to play with some aspects of the optimizer, just for fun. I'm
a novice in DBMS development, so I cannot promise any usable results,
but if it can be useful even as yet another failed attempt, I will
try.

Here is what I want to do:
1. Replace the not very useful indexCorrelation with indexClustering.
2. Consider caching of the inner table in a nested loop join when
estimating the total cost of the join.

More details:
1. During ANALYZE we have sample rows. For every N-th sample row we can
scan the indexes with a qual like 'value >= index_first_column' and
fetch the first N row TIDs. Estimating the number of heap pages fetched
is not hard. To get the index clustering value, just divide the page
count by the sample row count.
2. This is much harder and may be impossible for me at all. The main
ideas:
- split the page fetch cost and the CPU cost into separate variables
and do not sum them before join estimation.
- the final path cost should be estimated in the join cost estimation
and take into account the number of inner table accesses (=K). CPU cost
is directly proportional to K, but page fetches can be estimated by the
Mackert and Lohman formula using the total tuple count (K *
inner_table_selectivity * inner_table_total_tuples).
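
To make both ideas concrete, here is a rough Python sketch, not
planner code. All function and parameter names are hypothetical. The
first function assumes the TIDs are already available in index order
and counts each heap page once per probe. The second uses the
piecewise form of the Mackert and Lohman approximation as I understand
it from the comments in the planner's costsize.c (T = table pages,
Ns = tuples fetched, b = cache pages); treat that as my reading, not a
reference. The third keeps page cost and CPU cost separate, as
proposed in point 2.

```python
def index_clustering(tids_in_index_order, probe_starts, n):
    """Heap pages touched per row fetched, summed over several probes.

    tids_in_index_order: list of (heap_block, offset) pairs in index
    order (hypothetical input; a real ANALYZE would scan the index).
    Values near 1/rows_per_page mean well clustered, near 1.0 scattered.
    """
    pages = 0
    rows = 0
    for start in probe_starts:
        window = tids_in_index_order[start:start + n]
        # Assumption: a page is counted once per probe (no re-fetches).
        pages += len({block for block, _ in window})
        rows += len(window)
    return pages / rows if rows else 0.0


def pages_fetched(tuples_fetched, table_pages, cache_pages):
    """Mackert-Lohman style estimate of heap page fetches."""
    T = float(max(table_pages, 1))
    b = float(max(cache_pages, 1))
    Ns = float(tuples_fetched)
    if Ns <= 0:
        return 0.0
    if T <= b:
        # Whole table fits in cache: never more than T fetches.
        return min(2 * T * Ns / (2 * T + Ns), T)
    limit = 2 * T * b / (2 * T - b)
    if Ns <= limit:
        return 2 * T * Ns / (2 * T + Ns)
    # Cache is saturated: extra tuples miss with probability (T - b) / T.
    return b + (Ns - limit) * (T - b) / T


def nestloop_inner_cost(K, inner_selectivity, inner_total_tuples,
                        table_pages, cache_pages, cpu_cost_per_scan,
                        page_cost=1.0):
    """Return (io_cost, cpu_cost) for K scans of the inner table.

    The two components stay separate: CPU cost scales with K, while
    page fetches are estimated once over all K scans together.
    """
    total_tuples = K * inner_selectivity * inner_total_tuples
    io_cost = page_cost * pages_fetched(total_tuples, table_pages,
                                        cache_pages)
    cpu_cost = K * cpu_cost_per_scan
    return io_cost, cpu_cost
```

For example, nestloop_inner_cost(10, 0.01, 500, 1000, 100, 2.0) asks
for the cost of 10 inner scans that fetch about 50 tuples in total
from a 1000-page table with 100 cache pages. The I/O part comes from
a single pages_fetched call over all 50 tuples rather than from 10
independent per-scan estimates, which is the point of the proposal.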

Any thoughts?
