Re: Operator Classes and ANALYZE - Mailing list pgsql-bugs
From | Rusty Conover |
---|---|
Subject | Re: Operator Classes and ANALYZE |
Date | |
Msg-id | 77971F3B-C24E-4155-8B8B-09F2AE0C0903@infogears.com Whole thread Raw |
In response to | Re: Operator Classes and ANALYZE ("Simon Riggs" <simon@2ndquadrant.com>) |
Responses |
Re: Operator Classes and ANALYZE
|
List | pgsql-bugs |
On Nov 6, 2006, at 1:53 PM, Simon Riggs wrote: > On Fri, 2006-11-03 at 15:16 -0700, Rusty Conover wrote: > >> It doesn't appear that ANALYZE uses the specified operator class for >> producing statistics on an index when that operator class is not the >> default for the data type. This appears to be leading to poor query >> planning. > >> For speed of indexing a few million urls I'm indexing them with a >> custom data type. The data type aptly named "urlhash" has: > > Have you read the CREATE TYPE man page, specifically with regard to > the > analyze_function clause? Basically, if you want anything different, > you > have to write an analysis function yourself. This is what PostGIS > (www.postgis.org) does, if you want to look for specific code > examples. I have read it and works great, because it says it uses the default operators = and < to build the statistics which should work great for the default operator class. I don't think I need to write my own analyze function because the default behavior works so well. I just want the ANALYZE call to use the index's opclass definitions of = and < if the index is created with a custom operator class that is not the default for the data type. > > ANALYZE collects stats for tables, not indexes, using the default > operator class for the datatype. So even though you've clearly > specified > an opclass for the index, no stats will be collected using it. > I don't think thats correct because, with indexes based on functions there are statistics that clearly can't be gathered just directly from the table. Analyze does look at indexes to build up the histogram statistics and correlation. See compute_index_stats in analyze.c. > Alternatively, perhaps you have fallen foul of this situation? > > /* > * Can't analyze if the opclass uses a storage type > * different from the expression result type. We'd get > * confused because the type shown in pg_attribute for > * the index column doesn't match what we are getting > * from the expression. Perhaps this can be fixed > * someday, but for now, punt. > */ > > It's in the analyze.c code, but not in the docs. > > Be interested in a full report of your research, once you're done. > I don't think this applies because the storage type is the same its just a separate operator class being used. I think the problem may be with the "ordering_oper" and "equality_oper" only being passed the type rather then the type and the operator class that's being used for the index. Looking more it goes back into typcache.c which only caches the default operators from the default btree or hash operator classes. Changing this looks like it would require a bit of effort if it is the right path to pursue. Thanks, Rusty -- Rusty Conover InfoGears Inc. Web: http://www.infogears.com
pgsql-bugs by date: