Thread: Re: [postgis-users] Is my query planner failing me, or vice versa?

Re: [postgis-users] Is my query planner failing me, or vice versa?

From
"Gregory S. Williamson"
Date:
Well, what does the random_page_cost do internally ?

I don't think I'd expect postgres to be able to *do* anything in particular, any more than I would expect it to "do"
somethingabout slow disk I/O or having limited cache. But it might be useful to the EXPLAIN ANALYZE in estimating costs
ofretrieving such data.  

Admittedly, this is not as clear as wanting a sequential scan in preference to indexed reads when there are either very
fewrows or a huge number, but it strikes me as useful to me the DBA to have this factoid thrust in front of me when
consideringwhy a given query is slower than I might like. Perhaps an added time based on this factor and the
random_page_costvalue, since lots of TOAST data and a high access time would indicate to my (ignorant!) mind that
retrievalwould be slower, especially over large data sets. 

Forgive my ignorance ... obviously I am but a humble user. grin.

G

-----Original Message-----
From:    Tom Lane [mailto:tgl@sss.pgh.pa.us]
Sent:    Wed 12/14/2005 9:36 PM
To:    Gregory S. Williamson
Cc:    pgsql-performance@postgresql.org; PostGIS Users Discussion
Subject:    Re: [PERFORM] [postgis-users] Is my query planner failing me, or vice versa?
"Gregory S. Williamson" <gsw@globexplorer.com> writes:
> Forgive the cross-posting, but I found myself wondering if might not
> be some way future way of telling the planner that a given table
> (column ?) has a high likelyhood of being TOASTed.

What would you expect the planner to do with the information, exactly?

We could certainly cause ANALYZE to record some estimate of this, but
I'm not too clear on what happens after that...

            regards, tom lane

!DSPAM:43a100d6285261205920220!





Re: [postgis-users] Is my query planner failing me,or vice versa?

From
"Mark Cave-Ayland"
Date:
> -----Original Message-----
> From:    Tom Lane [mailto:tgl@sss.pgh.pa.us]
> Sent:    Wed 12/14/2005 9:36 PM
> To:    Gregory S. Williamson
> Cc:    pgsql-performance@postgresql.org; PostGIS Users Discussion
> Subject:    Re: [PERFORM] [postgis-users] Is my query planner failing
me,
> or vice versa?
> "Gregory S. Williamson" <gsw@globexplorer.com> writes:
> > Forgive the cross-posting, but I found myself wondering if might not
> > be some way future way of telling the planner that a given table
> > (column ?) has a high likelyhood of being TOASTed.
>
> What would you expect the planner to do with the information, exactly?
>
> We could certainly cause ANALYZE to record some estimate of this, but
> I'm not too clear on what happens after that...
>
>             regards, tom lane
>
>
> -----Original Message-----
> From: postgis-users-bounces@postgis.refractions.net [mailto:postgis-users-
> bounces@postgis.refractions.net] On Behalf Of Gregory S. Williamson
> Sent: 15 December 2005 12:03
> To: Tom Lane
> Cc: pgsql-performance@postgresql.org; PostGIS Users Discussion
> Subject: RE: [PERFORM] [postgis-users] Is my query planner failing me,or
> vice versa?
>
> Well, what does the random_page_cost do internally ?
>
> I don't think I'd expect postgres to be able to *do* anything in
> particular, any more than I would expect it to "do" something about slow
> disk I/O or having limited cache. But it might be useful to the EXPLAIN
> ANALYZE in estimating costs of retrieving such data.
>
> Admittedly, this is not as clear as wanting a sequential scan in
> preference to indexed reads when there are either very few rows or a huge
> number, but it strikes me as useful to me the DBA to have this factoid
> thrust in front of me when considering why a given query is slower than I
> might like. Perhaps an added time based on this factor and the
> random_page_cost value, since lots of TOAST data and a high access time
> would indicate to my (ignorant!) mind that retrieval would be slower,
> especially over large data sets.
>
> Forgive my ignorance ... obviously I am but a humble user. grin.
>
> G


As I understood from the original discussions with Markus/Tom, the problem
was that the optimizer didn't consider the value of the VacAttrStats
stawidth value when calculating the cost of a sequential scan. I don't know
if this is still the case though - Tom will probably have a rough idea
already whereas I would need to spend some time sifting through the source.

However, I do know that the PostGIS statistics collector does store the
average detoasted geometry size in stawidth during ANALYZE so the value is
there if it can be used.


Kind regards,

Mark.

------------------------
WebBased Ltd
17 Research Way
Plymouth
PL6 8BT

T: +44 (0)1752 797131
F: +44 (0)1752 791023

http://www.webbased.co.uk
http://www.infomapper.com
http://www.swtc.co.uk

This email and any attachments are confidential to the intended recipient
and may also be privileged. If you are not the intended recipient please
delete it from your system and notify the sender. You should not copy it or
use it for any purpose nor disclose or distribute its contents to any other
person.