Thread: estemated number of rows and optimiser effeciency

estemated number of rows and optimiser effeciency

From
salah jubeh
Date:
Hello,

I know that the optimizer, uses the number of rows to calculate the total query  execution time.  In complex queries, it is very difficult to know the number of rows in the expected result. This certainly affects the optimizer to very great extent.  I have a view and it should return around 5.5 million rows ; at the beginning postgresql  used to execute this query in around 12 minutes. After running vacuum analyze; the query execution time dropped to 1.5 minutes. Still, I think this query time could be executed in around 40-50 second.  Before and after running vacuum Analyze,  the number of expected rows was 1600 and 6500 respectively .  By comparing 5.5 million rows (real result)  and 6500 rows,  and 1600 rows (expected results) ,  one can observe how much this could affect the optimizer plans.

 I am wondering, why the historical data (real result of the query) does not affect the execution plan. For example, If I ran the query 10 times I always get around 6500 instead of 5.5 million in the top most execution plan node.

Regards

Re: estemated number of rows and optimiser effeciency

From
"David Johnston"
Date:
>>  I am wondering, why the historical data (real result of the query) does
not affect the execution plan. For example, If I ran the query 10 times I
always get around 6500 instead of 5.5 million in the top most execution plan
node.

The likely accurate, if cheeky, answer is simply that no one has taken the
time to implement what at face value would seem to be an extremely complex
behavior.


As to the rest you should probably gather up some additional details and
post this over on performance.  There is likely a middle ground between your
idea and the status quo but that ground cannot be evaluated unless examples
of possibly misbehaving queries are provided.

David J.




Re: estemated number of rows and optimiser effeciency

From
salah jubeh
Date:
Hello David,

Thanks for the quick response, I will follow up with the performance  group after preparing some case scenarios.

Regards


From: David Johnston <polobo@yahoo.com>
To: 'salah jubeh' <s_jubeh@yahoo.com>; 'pgsql' <pgsql-general@postgresql.org>
Sent: Wednesday, September 26, 2012 4:41 PM
Subject: RE: [GENERAL] estemated number of rows and optimiser effeciency

>>  I am wondering, why the historical data (real result of the query) does
not affect the execution plan. For example, If I ran the query 10 times I
always get around 6500 instead of 5.5 million in the top most execution plan
node.

The likely accurate, if cheeky, answer is simply that no one has taken the
time to implement what at face value would seem to be an extremely complex
behavior.


As to the rest you should probably gather up some additional details and
post this over on performance.  There is likely a middle ground between your
idea and the status quo but that ground cannot be evaluated unless examples
of possibly misbehaving queries are provided.

David J.