Thread: Hypothetical Indexes
Hello, I have just finished my M.Sc. thesis at Pontifícia Universidade Católica do Rio de Janeiro (PUC-Rio), Brazil, and we have developed a research prototype for automatic index selection using a software agent in PostgreSQL. In order to make index selection possible, we have extended the PostgreSQL DBMS to allow the simulation of hypothetical indexes. We believe these server extensions may be of value for addition to the PostgreSQL code base. A description of the work we have been doing is available at: http://www.inf.puc-rio.br/~postgresql/ There you will find a link to a tutorial based description of the hypothetical indexes feature we have implemented on PostgreSQL 7.4 beta 3. I am aware that we shouldn't go on working on things that are not on the TODO list, but we had a research oriented focus. At the end of the day, it turned out that some of the things we coded might be useful to the PostgreSQL community and we would be happy if we can contribute. So, what do you people think? Should we work together to make this new feature available on newer PostgreSQL versions? Thanks in advance for your attention, Marcos Salles. P.S: An interesting consequence of having hypothetical indexes in the system is that this eases the implementation of index tuning tools (e.g. the Microsoft SQL Server's Index Tuning Wizard). We are planning to implement one such tool to the PostgreSQL database (at first as a command-line utility).
It seems to me to be a very valuable and interesting idea. (IBM has something similar) Probably, for production you do not want your optimizer puzzling about possible indexes. However, it would be a very valuabletool for the test environment. I can easily imagine a setting that allows the feature to be turned on or off andthe suggestions could (of course) be very valuable. Do you have a paper describing your work? I would like to read it. > -----Original Message----- > From: pgsql-hackers-owner@postgresql.org > [mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of > Marcos A Vaz Salles > Sent: Wednesday, October 06, 2004 7:40 PM > To: pgsql-hackers@postgresql.org > Cc: Sergio Lifschitz; Maira Ferreira de Noronha; msalles@acm.org > Subject: [HACKERS] Hypothetical Indexes > > > Hello, > > I have just finished my M.Sc. thesis at Pontifícia > Universidade Católica do Rio de Janeiro (PUC-Rio), Brazil, > and we have developed a research prototype for automatic > index selection using a software agent in PostgreSQL. In > order to make index selection possible, we have extended the > PostgreSQL DBMS to allow the simulation of hypothetical > indexes. We believe these server extensions may be of value > for addition to the PostgreSQL code base. > > A description of the work we have been doing is available at: > http://www.inf.puc-rio.br/~postgresql/ There you will find a link to a tutorial based description of the hypothetical indexes feature we have implemented on PostgreSQL7.4 beta 3. I am aware that we shouldn't go on working on things that are not on the TODO list, but we had a research oriented focus.At the end of the day, it turned out that some of the things we coded might be useful to the PostgreSQL community andwe would be happy if we can contribute. So, what do you people think? Should we work together to make this new feature available on newer PostgreSQL versions? Thanks in advance for your attention, Marcos Salles. P.S: An interesting consequence of having hypothetical indexes in the system is that this eases the implementation of indextuning tools (e.g. the Microsoft SQL Server's Index Tuning Wizard). We are planning to implement one such tool to thePostgreSQL database (at first as a command-line utility). ---------------------------(end of broadcast)--------------------------- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Marcos A Vaz Salles <msalles@gmail.com> writes: > In order to make index selection possible, we > have extended the PostgreSQL DBMS to allow the simulation of > hypothetical indexes. We believe these server extensions may be of > value for addition to the PostgreSQL code base. This would be of some value if the optimizer's cost estimates were highly reliable, but unfortunately they are far from being so :-( Without the ability to measure *actual* as opposed to estimated costs, I'm not sure you can really do much. regards, tom lane
Marcos, > http://www.inf.puc-rio.br/~postgresql/ > > There you will find a link to a tutorial based description of the > hypothetical indexes feature we have implemented on PostgreSQL 7.4 > beta 3. I would love to see this as an add-in project on pgFoundry. Particularly since your online "tutorial" only works in Internet Explorer, so I can't read it. I believe that there was/is a team exploring a set of utilities to produce database optimization "hints" for the admin. Your idea would dovetail nicely with that. -- --Josh Josh Berkus Aglio Database Solutions San Francisco
Tom, > > This would be of some value if the optimizer's cost estimates were > highly reliable, but unfortunately they are far from being so :-( > Without the ability to measure *actual* as opposed to estimated > costs, I'm not sure you can really do much. > In fact, any index selection tool that uses hypothetical indexes will not recommend indexes that the optimizer does not recognize as good enough for the query. The bright side of this is that the better the optimizer gets, the better are the recommendations made by the index selection tool. And people tend to invest resources in constructing good query optimizers. Other index selection tools, such as Microsoft's and IBM's, have the same limitation. Even though, the tools are useful for people that have to deal with databases with a big quantity of tables and queries. Finding useful indexes in this kind of setting is a difficult problem for DBAs. So, our point is that hypothetical indexes just have to be as well estimated by the optimizer as conventional, real indexes. An index not suggested by the optimizer might still be usable, but that would require rewriting the query or using hints, things that need the intervention of a more skilled DBA anyway. Best regards, Marcos. > regards, tom lane >
On 10/12/2004 4:02 PM Tom Lane could be overheard saying:: >Marcos A Vaz Salles <msalles@gmail.com> writes: > > >>In order to make index selection possible, we >>have extended the PostgreSQL DBMS to allow the simulation of >>hypothetical indexes. We believe these server extensions may be of >>value for addition to the PostgreSQL code base. >> >> > >This would be of some value if the optimizer's cost estimates were >highly reliable, but unfortunately they are far from being so :-( >Without the ability to measure *actual* as opposed to estimated >costs, I'm not sure you can really do much. > > Is it possible for the backend to store performance data and try to modify its cost estimates? I was thinking of the statistics analyzer currently in use and whether (query) performance data could piggy back on it or if it would need a different process, something that could try and correlate cost estimates with actual costs. Given sample data the calculation could result in an actual execution time estimation. Would it be worth pursuing, or would it be too invasive? I know the autotune project has similar goals from a different angle, system/memory settings. Thomas
Hello Josh, I will take a look at pgFoundry and register a new project for index selection. I will also look for other projects there that we may help somehow. About the tutorial, I will send you the presentation we used to generate the web pages so that you can see it with OpenOffice. ;) Thanks for your feedback, Marcos. On Wed, 13 Oct 2004 11:13:55 -0700, Josh Berkus <josh@agliodbs.com> wrote: > Marcos, > > > http://www.inf.puc-rio.br/~postgresql/ > > > > There you will find a link to a tutorial based description of the > > hypothetical indexes feature we have implemented on PostgreSQL 7.4 > > beta 3. > > I would love to see this as an add-in project on pgFoundry. Particularly > since your online "tutorial" only works in Internet Explorer, so I can't read > it. > > I believe that there was/is a team exploring a set of utilities to produce > database optimization "hints" for the admin. Your idea would dovetail > nicely with that. > > -- > --Josh > > Josh Berkus > Aglio Database Solutions > San Francisco >