Thread: automated index suggestor -- request for comment

automated index suggestor -- request for comment

From

johnnnnnn

Date:

12 December 2002, 22:18:04

The manual is pretty sparse on advice regarding indices. Plenty of
good feature documentation, but not much about when and where an index
is appropriate (except a suggestion that multi-column indices should
be avoided).

Of course, the ultimate arbiter of which indices are used is the
planner/optimizer. If i could somehow convince the optimizer to
consider indices that don't yet exist, it could tell me which would
give the greatest benefit should i add them.

So, i'm writing for two reasons. First, i want to gauge interest in
this tool. Is this something that people would find useful?

Second, i am looking to solicit some advice. Is this project even
feasible? If so, where would be the best place to start? My assumption
has been that i would need to hack into the current code for
determining index paths, and spoof it somehow, but is that possible
without actually creating the indices?

Any and all feedback welcome.

-johnnnnnnnnnn

Re: automated index suggestor -- request for comment

From

Hannu Krosing

Date:

13 December 2002, 05:11:48

On Fri, 2002-12-13 at 03:22, johnnnnnn wrote:
> The manual is pretty sparse on advice regarding indices. Plenty of
> good feature documentation, but not much about when and where an index
> is appropriate (except a suggestion that multi-column indices should
> be avoided).
>
> Of course, the ultimate arbiter of which indices are used is the
> planner/optimizer. If i could somehow convince the optimizer to
> consider indices that don't yet exist, it could tell me which would
> give the greatest benefit should i add them.

the generated index names should be self-explaining or else we would
have to change EXPLAIN output code as well, just to tell what the actual
index definition was.

That could become the EXPLAIN SPECULATE command ?

> So, i'm writing for two reasons. First, i want to gauge interest in
> this tool. Is this something that people would find useful?

Sure it would be helpful.

> Second, i am looking to solicit some advice. Is this project even
> feasible?

As tom recently wrote on this list, no statistics is _gathered_ base on
existence of indexes, so pretending that they are there should be
limited just to planner changes plus a way to tell the planner to do it.

> If so, where would be the best place to start? My assumption
> has been that i would need to hack into the current code for
> determining index paths, and spoof it somehow, but is that possible
> without actually creating the indices?

Either with or without real indexes, it's all just code ;)

In worst case you could generate the entries in pg_class table without
building the actual index and then drop or rollback when the explain is
ready.

Of course you could just determine all possibly useful indexes and
generate then anyhow an then drop them if they were not used ;)

--
Hannu Krosing <hannu@tm.ee>

Re: automated index suggestor -- request for comment

From

Tom Lane

Date:

13 December 2002, 09:50:37

Hannu Krosing <hannu@tm.ee> writes:
> That could become the EXPLAIN SPECULATE command ?

[ snicker... ]  Seriously, it wouldn't be hard to inject a slew of phony
index definitions into the planner to see what it comes up with.  You
just have to cons up an IndexOptInfo record, the planner will be none
the wiser.  The tricky part is deciding which indexes are even worth
expending planner cycles on.  ("Make 'em all" doesn't seem very
practical when you consider multi-column or functional indexes.)

Also, I don't see any reasonable way to automatically suggest partial
indexes; certainly not on the basis of individual queries.

The big boys approach this sort of problem with "workload analysis"
tools, which start from a whole collection of sample queries not just
one.  I don't think EXPLAIN applied to individual queries can hope to
produce similarly useful results.

            regards, tom lane

Re: automated index suggestor -- request for comment

From

Hannu Krosing

Date:

13 December 2002, 10:07:18

I cc'b back to list, hope this is ok?

On Fri, 2002-12-13 at 14:56, george young wrote:
> On 13 Dec 2002 12:05:36 +0000
> Hannu Krosing <hannu@tm.ee> wrote:
>
> > On Fri, 2002-12-13 at 03:22, johnnnnnn wrote:
> >
> > In worst case you could generate the entries in pg_class table without
> > building the actual index and then drop or rollback when the explain is
> > ready.
> >
> --> Of course you could just determine all possibly useful indexes and <--
> --> generate then anyhow an then drop them if they were not used ;)    <--
>
> Why not!  At least for selects, this seems like the ideal.  For insert
> and update, you have to deal with updating the superfluous indexes --
> does the planner include index updating in its work estimates?

Probably not -  the work should be almost the same (modulo cached status
of index pages) for any plan.

At least I think we don't optimize the plan for different index access
patterns for updating indexes.

> > For queries
> that use functions in the where clause, you'd have to parse enough to know
> to include indexes on the functions (I know-- the last time I said "all I
> have to do is parse ..." I was really sorry later...).
--
Hannu Krosing <hannu@tm.ee>

Re: automated index suggestor -- request for comment

From

johnnnnnn

Date:

13 December 2002, 10:17:01

On Fri, Dec 13, 2002 at 09:49:53AM -0500, Tom Lane wrote:
> Hannu Krosing <hannu@tm.ee> writes:
> > That could become the EXPLAIN SPECULATE command ?
>
> [ snicker... ]  Seriously, it wouldn't be hard to inject a slew of
> phony index definitions into the planner to see what it comes up
> with.  You just have to cons up an IndexOptInfo record, the planner
> will be none the wiser.

That's good news. The easier it is, the more likely i am to actually
get it working and available to people.

> The tricky part is deciding which indexes are even worth expending
> planner cycles on.  ("Make 'em all" doesn't seem very practical when
> you consider multi-column or functional indexes.)

Agreed. But for a first development iteration, "Make 'em all" could
certainly include the combinatorial explosion of all single- and
multi-column indices. It might be slow as a dog, but it would exist.

> The big boys approach this sort of problem with "workload analysis"
> tools, which start from a whole collection of sample queries not
> just one.  I don't think EXPLAIN applied to individual queries can
> hope to produce similarly useful results.

Again, agreed. My intent was to start with something simple which
could only deal with one query at a time, and then build a more robust
tool from that point.

That said, i wasn't planning on grafting onto the EXPLAIN syntax, but
rather creating a new SUGGEST command, which could take a query or
eventually a workload file. The other option was to decouple it from
pg proper and have an independent application to live in contrib/ or
gborg.

-johnnnnnnnnnnn

Re: automated index suggestor -- request for comment

From

johnnnnnn

Date:

13 December 2002, 10:34:12

On Fri, Dec 13, 2002 at 05:00:32PM +0000, Hannu Krosing wrote:
> On Fri, 2002-12-13 at 14:56, george young wrote:
> > > Of course you could just determine all possibly useful indexes
> > > and generate then anyhow an then drop them if they were not used
> > > ;)
> >
> > Why not!  At least for selects, this seems like the ideal.  For
> > insert and update, you have to deal with updating the superfluous
> > indexes -- does the planner include index updating in its work
> > estimates?

Well, i had a few reasons i didn't want to *actually* create the
indices:

1- Disk space. If it's evaluating all indices, including multi-column
indices, that ends up being a significant space drain.

2- Time. Creating indices can take a while for big tables (again,
moreso for multi-column indices).

3- Usability on running systems. If i can eliminate actual index
creation, it won't tie up disk access on systems that are already
dealing with high load.

> At least I think we don't optimize the plan for different index
> access patterns for updating indexes.

I don't think that's the case either, which makes it more difficult to
estimate negative cost of index creation. Not sure how i'll deal with
that except by (for now) ignoring it.

-johnnnnnnnnnnn

Re: automated index suggestor -- request for comment

From

Ron Johnson

Date:

14 December 2002, 14:47:27

On Thu, 2002-12-12 at 21:22, johnnnnnn wrote:
> The manual is pretty sparse on advice regarding indices. Plenty of
> good feature documentation, but not much about when and where an index
> is appropriate (except a suggestion that multi-column indices should
> be avoided).
>
> Of course, the ultimate arbiter of which indices are used is the
> planner/optimizer. If i could somehow convince the optimizer to
> consider indices that don't yet exist, it could tell me which would
> give the greatest benefit should i add them.
>
> So, i'm writing for two reasons. First, i want to gauge interest in
> this tool. Is this something that people would find useful?
>
> Second, i am looking to solicit some advice. Is this project even
> feasible? If so, where would be the best place to start? My assumption
> has been that i would need to hack into the current code for
> determining index paths, and spoof it somehow, but is that possible
> without actually creating the indices?

Isn't this what a DBA (or, heck, even a modestly bright developer)
does during transactional analysis?

You know what the INSERTs and statements-that-have-WHERE-clauses
are, and, hopefully, approximately how often per day (or week)
each should execute.

Then, *you* make the decision about which single-key and multi-key
indexes should be created, based upon
a) the cardinality of each table
b) the frequency each query (includes UPDATE & DELETE) is run
c) how often INSERT statements occur

Thus, for example, an OLTP database will have a significantly
different mix of indexes than, say, a "reporting" database...
--
+---------------------------------------------------------------+
| Ron Johnson, Jr.        mailto:ron.l.johnson@cox.net          |
| Jefferson, LA  USA      http://members.cox.net/ron.l.johnson  |
|                                                               |
| "My advice to you is to get married: If you find a good wife, |
| you will be happy; if not, you will become a philosopher."    |
|    Socrates                                                   |
+---------------------------------------------------------------+