Suggestion: Helping the optimizer - Mailing list pgsql-hackers
From | Hans-Jürgen Schönig |
---|---|
Subject | Suggestion: Helping the optimizer |
Date | |
Msg-id | 3DA67BF9.6080504@cybertec.at Whole thread Raw |
List | pgsql-hackers |
I guess we had this discussion before but I have just gone through the general list and I have encountered a problem I had a least VERY often before. Sometimes the planner does not find the best way through a query. Looking at the problem of query optimization it is pretty obvious that things like that can happen. The planner is a wonderful piece of software and I have a high esteem of people working on it. In some cases the planner fails because it is impossible to optimize every query coming along - this is a natural thing. In case of very complex SQL statements it would be wonderful to have a command which allows the user to turn an INDEX on or off temporarily. This would solve 90% of all problems people have with the planner. People say that 10% of all queries cause 90% of the load. If we could help those 10% we could gain A LOT of performance with very little effort. Improving other things help of lot as well but in some cases the planner decides whether a query can be done or not. YES/NO is a much bigger problem than 5% faster or not. Just have a look at a query like that: $database->dbi_select("SELECT a.code, b.code, t_gruppe.id, t_strukturtyp.id, t_struktur.id,t_struktur.oid FROM t_master, t_struktur, t_strukturtyp, t_gruppenelement, t_gruppe, t_text AS a, t_text AS b, t_betriebdetail, t_strukturbetrieb WHERE t_master.master_id = '$sportort' AND t_master.slave_id= t_struktur.id AND t_struktur.typid = t_strukturtyp.id AND t_strukturtyp.kommentar= 'betrieb' AND get_bezahlt(t_struktur.id) = 't' AND t_strukturtyp.id= t_gruppenelement.suchid AND t_gruppenelement.icode = 'strukturtyp' AND t_gruppenelement.gruppeid= t_gruppe.id AND a.suchid = t_gruppe.id AND a.icode = 'gruppe' AND a.sprache = $session{lang} AND a.texttyp IS NULL AND b.suchid = t_struktur.id AND b.icode = 'struktur' AND b.sprache = $session{lang} AND b.texttyp IS NULL AND t_gruppe.sortierung >= getmin('basic') AND t_gruppe.sortierung <= getmax('basic') AND t_struktur.id IN ( SELECT DISTINCT a.refid FROM t_punkte ASa,t_text AS b,t_struktur AS c WHERE a.refid=b.suchid AND a.icode='struktur' AND b.icode='struktur' ANDa.refid=c.id AND b.sprache=1 AND a.bildid='$picdata[0]' AND b.texttyp IS NULL ) AND t_betriebdetail.von< now() AND t_betriebdetail.strukturbetriebid = t_strukturbetrieb.betriebid AND t_strukturbetrieb.strukturid = t_struktur.id ORDER BY t_gruppe.sortierung,t_strukturtyp.sortierung, t_betriebdetail.leistung , b.code"); This has been taken from a real world application I have written a few weeks ago (unfortunately it is German). In this case the planner does it absolutely right. There are subqueries and functions and many other ugly things for the planner but it works. What should I do if it doesn't work? Well, I could turn seq scans off globally even if I knew that there is just one table causing high execution times. People can easily imagine that a bad execution plan can lead to really bad performance - especially when there are millions of records around. By tweaking the optimizer a little we could gain 100% percents of performance. (idx scan vs. nested loop and seq scan or something like that). I guess the patch for this tweaking stuff could be fairly easy. Currently I am abusing system tables to get the problem fixed (which is bad for other queries of course). Running VACUUM is not that funny if the data in the system tables is mistreated. Concern: People might think this is ANSI: I know that this can be a problem but is it better if people start abusing system tables or think that PostgreSQL is bad or slow? Take the time and fix the planner: I can fully understand this concern. However, there is no way to fix the optimizer to do it right in every case. The planner is really good but I am talking about 3% of all those queries out there - unfortunately they cause 90% of the problems people have. I have taken this query so that people can see that the planner is doing good work but people should also think of a situation where a query like that can cause severe head ache ... maybe this problem should be discussed from time to time. Best regards, Hans <http://kernel.cybertec.at>
pgsql-hackers by date: