Thread: PostgreSQL index usage discussion.
We have had several threads about index usage, specifically when PostgreSQL has the choice of using one or not. There seem to be a few points of view:

(1) The planner and statistics need to improve, so that erroneously using an index (or not) happens less frequently or not at all.

(2) Use programmatic hints which allow coders to specify which indexes are used during a query (a la Oracle).

(3) It is pretty much OK as-is; just use enable_seqscan=false in the query.

My point of view on this subject is one from personal experience. I had a database on which PostgreSQL would always (erroneously) choose not to use an index. Are my experiences typical? Probably not, but are experiences like it very common? I don't know, but we see a number of "Why won't PostgreSQL use my index" messages, enough to at least conclude that it happens every now and then. In my experience, when it happens, it is very frustrating.

I think statement (1) is a good idea, but I think it is optimistic to expect that a statistical analysis of a table will contain enough information for all possible cases. Statement (2) would allow the flexibility needed, but as was pointed out, the hints may become wrong over time as the characteristics of the various tables change. Statement (3) is not good enough because disabling sequential scans affects the whole query, and sub-queries which would correctly not use an index would be forced to use one.

My personal preference is that some mechanism more specific than enable_seqscan be provided for the DBA to ensure an index is used. Working on the statistics and the planner is fine, but I suspect there will always be a strong argument for a manual override in the exceptional cases where it is needed.

What do you all think? What would be a good plan of attack?
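To be concrete about (3), the only per-query control today is the blunt one below (untested sketch; "fares" and "route_id" are made-up names). The setting applies to every scan in the statement, including any sub-queries, and to everything else run until it is switched back, which is exactly the problem:

    SET enable_seqscan = off;   -- the planner now avoids seq scans wherever it can
    SELECT * FROM fares WHERE route_id = 42;   -- index scan chosen, as wanted
    SET enable_seqscan = on;    -- must remember to re-enable, or later queries suffer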
* mlw (markw@mohawksoft.com) [020424 18:51]:
> (2) Use programmatic hints which allow coders to specify which indexes
> are used during a query (a la Oracle).

We would certainly use this if it were available. Hopefully not to shoot ourselves in the foot, but for the rather common case of having a small set of important predefined queries that run over data sets that neither grow significantly nor change in characteristics (for example, a table of airline routes and fares, with a few million rows). We want to squeeze every last bit of performance out of certain queries, and we're willing to spend the time to verify that the manual tuning beats the planner.

> What do you all think? What would be a good plan of attack?

I dunno. If someone comes up with one that I can reasonably contribute to, I will.

-Brad
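PS: our "verify" step is nothing fancier than comparing plans (and timings) with the override toggled, along these lines (untested sketch; table and column names invented):

    EXPLAIN SELECT * FROM routes WHERE origin = 'BOS' AND dest = 'SFO';
    SET enable_seqscan = off;
    EXPLAIN SELECT * FROM routes WHERE origin = 'BOS' AND dest = 'SFO';  -- compare estimated costs, then time both
    SET enable_seqscan = on;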
On Thu, 2002-04-25 at 00:46, mlw wrote:
> We have had several threads about index usage, specifically when
> PostgreSQL has the choice of using one or not.
>
> There seem to be a few points of view:
>
> (1) The planner and statistics need to improve, so that erroneously
> using an index (or not) happens less frequently or not at all.
>
> (2) Use programmatic hints which allow coders to specify which indexes
> are used during a query (a la Oracle).
>
> (3) It is pretty much OK as-is; just use enable_seqscan=false in the
> query.
>
> My point of view on this subject is one from personal experience. I had
> a database on which PostgreSQL would always (erroneously) choose not to
> use an index. Are my experiences typical? Probably not, but are
> experiences like it very common?

I currently have 2 databases that run with enable_seqscan=false to avoid choosing plans that take forever.

> I don't know, but we see a number of "Why won't PostgreSQL use my
> index" messages, enough to at least conclude that it happens every now
> and then. In my experience, when it happens, it is very frustrating.
>
> I think statement (1) is a good idea, but I think it is optimistic to
> expect that a statistical analysis of a table will contain enough
> information for all possible cases.

Perhaps we can come up with some special rules to avoid grossly pessimal plans.

--------------------
Hannu
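PS: "run with" just means the switch is set database-wide in postgresql.conf rather than per session, roughly:

    # postgresql.conf (sketch): every backend then starts with sequential scans
    # strongly discouraged (the planner still uses them when there is no alternative)
    enable_seqscan = false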
> > (2) Use programmatic hints which allow coders to specify which indexes
> > are used during a query (a la Oracle).

As I said before, it would be useful to have a way to improve (not force) the use of indexes on particular queries, i.e. lowering the cost of using this index on this query.

Regards
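PS: the closest thing available today seems to be nudging the global cost knobs around a single query, e.g. (untested sketch; table name invented). Note that this makes all index scans look cheaper, not one particular index, which is exactly why something finer-grained would be nice:

    SET random_page_cost = 1.5;   -- default is 4; lower values favour index scans
    SELECT * FROM fares WHERE route_id = 42;
    RESET random_page_cost;       -- back to the default for everything else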
On Thu, 2002-04-25 at 08:42, Luis Alberto Amigo Navarro wrote:
> > > (2) Use programmatic hints which allow coders to specify which
> > > indexes are used during a query (a la Oracle).
>
> As I said before, it would be useful to have a way to improve (not
> force) the use of indexes on particular queries, i.e. lowering the cost
> of using this index on this query.
> Regards

I was told that DB2 has per-table (or rather per-tablespace) knowledge of disk speeds, so keeping separate random and seqscan costs for each table and index could be a good way here (to force use of a particular index, make its use cheap).

------------
Hannu
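PS: to make that concrete, I am imagining something along these lines (purely hypothetical syntax; nothing like this exists, and the table/index names are invented):

    -- hypothetical: per-table and per-index page costs instead of one global setting
    ALTER TABLE fares SET (seq_page_cost = 1.0, random_page_cost = 4.0);
    ALTER INDEX fares_route_idx SET (random_page_cost = 0.5);  -- make this index look cheap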
> I was told that DB2 has per-table (or rather per-tablespace) knowledge
> of disk speeds, so keeping separate random and seqscan costs for each
> table and index could be a good way here (to force use of a particular
> index, make its use cheap)

I was wondering about something even easier: keeping 1 cost per index and 1 cost per seqscan, but being allowed to scale the cost for each index on each query (recommended, null or unrecommended).

Regards
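PS: in other words, something like this (again purely hypothetical syntax, just to illustrate the idea; names invented):

    -- hypothetical: keep the one global cost model, but scale one index's cost for this query only
    SET index_cost_scale.fares_route_idx = 0.1;   -- "recommended": this index looks 10x cheaper
    SELECT * FROM fares WHERE route_id = 42;
    RESET index_cost_scale.fares_route_idx;       -- "null": back to the unscaled cost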