Thread: Is disableing nested_loops a bad idea ?
Hi, I am confronted with a use case where my database mainly does big aggregate select (ROLAP), a bunch of batch jobs, and quite few OLTP. I come into cases where the planner under-estimates the number of rows in some relations, chooses to go for nested loops, and takes forever to complete the request. (Notice as the side note that Oracle (10g or 11g) is not any better on this workload and will sometime go crazy and choose a plan that takes hours...) I've played with statistics, vacuum and so on, but at the end the planner is not accurate enough when evaluating the number of rows in some complex queries. Disableing nested loops most of the time solves the performance issues in my tests... generally going from 30 sec. down to 1 sec. So my question is : would it be a very bad idea to disable nested loops in production ? The way I see it is that it could be a little bit less optimal to use merge join or hash join when joining on a few rows, but this is peanuts compared to how bad it is to use nested loops when the number of rows happens to be much higher than what the planner thinks. Is this stupid, ie are there cases when merge join or hash join are much slower than nested loops on a few rows ? Thanks in advance, Franck
Without knowing more about your queries and table structure, it is hard to say if there is a better solution. But one thing you should probably consider doing is just finding the queries where disabling nested loops is verifiably effective and then just disabling nested loops on that connection before running the query and then reset after the query completes. That way, you won't impact queries that legitimately use nested loops. Someone with more experience than I have in tuning the general postgres config may be able to offer a better solution for getting the query planner to make better decisions with the global config, but they'll surely need to know a lot more about your queries in order to do so.
On Thu, Sep 16, 2010 at 1:23 AM, Franck Routier <franck.routier@axege.com> wrote:
Hi,
I am confronted with a use case where my database mainly does big
aggregate select (ROLAP), a bunch of batch jobs, and quite few OLTP.
I come into cases where the planner under-estimates the number of rows
in some relations, chooses to go for nested loops, and takes forever to
complete the request. (Notice as the side note that Oracle (10g or 11g)
is not any better on this workload and will sometime go crazy and choose
a plan that takes hours...)
I've played with statistics, vacuum and so on, but at the end the
planner is not accurate enough when evaluating the number of rows in
some complex queries.
Disableing nested loops most of the time solves the performance issues
in my tests... generally going from 30 sec. down to 1 sec.
So my question is : would it be a very bad idea to disable nested loops
in production ?
The way I see it is that it could be a little bit less optimal to use
merge join or hash join when joining on a few rows, but this is peanuts
compared to how bad it is to use nested loops when the number of rows
happens to be much higher than what the planner thinks.
Is this stupid, ie are there cases when merge join or hash join are much
slower than nested loops on a few rows ?
Thanks in advance,
Franck
--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
Franck Routier <franck.routier@axege.com> wrote: > I come into cases where the planner under-estimates the number of > rows in some relations, chooses to go for nested loops, and takes > forever to complete the request. People can provide more targeted assistance if you pick one of the offenders and provide enough information for a thorough analysis. It's at least somewhat likely that some tweaks to your configuration or maintenance procedures could help all the queries, but starting with just one is likely to highlight what those changes might be. For ideas on what information to include, see this page: http://wiki.postgresql.org/wiki/SlowQueryQuestions -Kevin
Thanks Kevin and Samuel for your input. The point is we already made a lot of tweaking to try to tune postgresql to behave correctly. I work with Damien, and here is a post he did in july to explain the kind of problems we have http://comments.gmane.org/gmane.comp.db.postgresql.performance/25745 The end of the thread was Robert Hass concluding that "Disabling nestloops altogether, even for one particular query, is often going to be a sledgehammer where you need a scalpel. But then again, a sledgehammer is better than no hammer." So I wanted to better understand to what extend using a sledgehammer will impact me :-) Franck Le jeudi 16 septembre 2010 à 08:49 -0500, Kevin Grittner a écrit : > Franck Routier <franck.routier@axege.com> wrote: > > > I come into cases where the planner under-estimates the number of > > rows in some relations, chooses to go for nested loops, and takes > > forever to complete the request. > > People can provide more targeted assistance if you pick one of the > offenders and provide enough information for a thorough analysis. > It's at least somewhat likely that some tweaks to your configuration > or maintenance procedures could help all the queries, but starting > with just one is likely to highlight what those changes might be. > > For ideas on what information to include, see this page: > > http://wiki.postgresql.org/wiki/SlowQueryQuestions > > -Kevin >
Franck Routier <franck.routier@axege.com> wrote: > So I wanted to better understand to what extend using a > sledgehammer will impact me :-) Disabling it globally is likely to significantly hurt some queries. Before resorting to that, you might decrease effective_cache_size, increase random_page_cost, and (most importantly) do whatever you can to improve statistics. Where those fail, and disabling nested loops helps, I concur with the advice to only do that for specific queries, taking care to reset it afterward. In other words, use that sledgehammer with great care, don't just swing it around wildly.... ;-) -Kevin
On Thu, Sep 16, 2010 at 10:13 AM, Franck Routier <franck.routier@axege.com> wrote: > Thanks Kevin and Samuel for your input. > > The point is we already made a lot of tweaking to try to tune postgresql > to behave correctly. I work with Damien, and here is a post he did in > july to explain the kind of problems we have > http://comments.gmane.org/gmane.comp.db.postgresql.performance/25745 > > The end of the thread was Robert Hass concluding that "Disabling > nestloops altogether, even for one particular query, is > often going to be a sledgehammer where you need a scalpel. But then > again, a sledgehammer is better than no hammer." > > So I wanted to better understand to what extend using a sledgehammer > will impact me :-) One particular case where you may get a nasty surprise is: Nested Loop -> Whatever -> Index Scan This isn't necessarily terrible if the would-be index scan is on a small table, because a hash join may be not too bad. It may not be too good, either, but if the would-be index scan is on a large table the whole thing might turn into a merge join. That can get pretty ugly. Of course in some cases the planner may be able to rejigger the whole plan in some way that mitigates the damage, but not necessarily. One of the things I've noticed about our planner is that it becomes less predictable in stressful situations. As you increase the number of tables involved in join planning, for example, the query planner still delivers a lot of very good plans, but not quite as predictably. Things don't slow down uniformly across the board; instead, most of the plans remain pretty good but every once in a while (and with increasing frequency as you keep cranking up the table count) you get a bad one. Shutting off any of the enable_* constants will, I think, produce a similar effect. Many queries can be adequate handled using some other technique and you won't really notice it, but you may find that you have a few (or someone will eventually write one) which *really* needs whatever technique you turned off for decent performance. At that point you don't have a lot of options... Incidentally, it's "Haas", rather than "Hass". -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company