Thread: [PERFORM] OLAP/reporting queries fall into nested loops over seq scans or other horrible planner choices
From
Gunther
Date:
Hi, this is Gunther. I have been with PgSQL for decades, on and off this list; haven't been on for a long time, making my way just fine. But there is one thing that keeps bothering me both with Oracle and PgSQL, and that is the preference for Nested Loops.

Over the years the archives have questions about Nested Loops being chosen over Hash Joins. But the responses seem too specific to the posters' queries: they ask many questions, make people post the query plans, and often end in frustration with suggestions to change the data model or to add an index and stuff like that. One should not have to go into that personal detail. There are some clear boundaries that a smart database should just never cross, especially with OLAP queries.

Think of a database that is fine for OLTP: it has indexes, and index-based access for a few records joined with a dozen other tables, all with indexes, is no problem. If you fall into a Seq Scan scenario or an unwanted Hash Join, you usually forgot to add an index or forgot to put index columns into your join or other constraints. Those are novice questions and we should be beyond that.

But the issue is bulk searches, reports, and analytic query scenarios. In those queries Nested Loops are almost always a bad choice, even if there is an index. In over 20 years of working with RDBMSs this has been my unfailing heuristic: a report runs slow? Look at the plan. Is there a Nested Loop? Yes? Squash it! And the report runs 10x faster, instantaneously.

So, all the more troublesome is if any database system (here PgSQL) would ever fall into a Nested Loop trap with the CPU spinning at 100% for several minutes, with a Nested Loop body of anything from a Seq Scan on up, with a cardinality of anything over 10 or 100.
Nested Loops of Nested Loops, or Nested Loops of other complex query plan fragments, should be a no-no, chosen only as an absolute last resort when the system cannot find enough memory; even then a disk-based merge sort should be better, i.e., Nested Loops should never be chosen. Period.

If you can set enable_nestloop off and the Hash Join is chosen, and the performance goes from 1 hour of 100% CPU to 10 seconds completion time, then something is deadly wrong. And it doesn't matter to me if I should have re-written my query in some funny way or tweaked my data model; these are all unacceptable options when you have a complex system with hybrid OLTP/OLAP uses. Don't tell me to de-normalize. I know I can materialize joins in tables which I can then use again in joins to save time, but that is not the point here. And I don't think tweaking optimizer statistics is the solution either, because optimizer statistics quickly become worthless when your criteria get more complex.

The point is that Nested Loops should never be chosen except in index lookup situations, or maybe under memory constraints. How can I prevent it on a query-by-query scope? I cannot set enable_nestloop = off, because one query will be for a full report while another one might have indexed constraints running in the same session, and I don't want to manage side effects and remember to set the enable_nestloop parameter on and off.

There must be a way to tell the optimizer to penalize nested loops to make them the last resort. In Oracle there are those infamous hints, but they don't always work either (or it is easy to make mistakes that you get no feedback about). Is there any chance PgSQL can get something like a hint feature? Or is there a way to use postgresql.conf to penalize nested loops so that they would only ever be chosen in the most straightforward situations, as with query parameters that are indexed?
I know I need to have sufficient work_mem, but if you can set enable_nestloop = off and you get the desired Hash Join, there is obviously sufficient work_mem, so that isn't the answer either. Thanks for listening to my rant. regards, -Gunther -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
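[Editorial note: Gunther's "squash the Nested Loop" heuristic can be tried per session without touching postgresql.conf. A minimal sketch of the comparison he describes; the tables and the report query are hypothetical:]

```sql
-- Plan as the planner chooses it:
EXPLAIN (ANALYZE)
SELECT d.name, sum(f.amount)
FROM fact f JOIN dim d ON d.id = f.dim_id
GROUP BY d.name;

-- Discourage nested loops for this session only, then compare timings:
SET enable_nestloop = off;
EXPLAIN (ANALYZE)
SELECT d.name, sum(f.amount)
FROM fact f JOIN dim d ON d.id = f.dim_id
GROUP BY d.name;

RESET enable_nestloop;  -- restore the default for subsequent queries
```

Note that enable_nestloop = off does not forbid nested loops outright; it only makes the planner cost them prohibitively, so they can still appear when no other join method is applicable.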
Re: [PERFORM] OLAP/reporting queries fall into nested loops over seq scans or other horrible planner choices
From
"ldh@laurent-hasson.com"
Date:
> -----Original Message-----
> From: pgsql-performance-owner@postgresql.org [mailto:pgsql-performance-owner@postgresql.org] On Behalf Of Gunther
> Sent: Wednesday, November 01, 2017 20:29
> To: pgsql-performance@postgresql.org
> Subject: [PERFORM] OLAP/reporting queries fall into nested loops over seq scans or other horrible planner choices
>
> [...]
> There must be a way to tell the optimizer to penalize nested loops to make
> them the last resort. In Oracle there are those infamous hints, but they don't
> always work either (or it is easy to make mistakes that you get no feedback
> about).
> [...]

[Laurent Hasson]
Hello Gunther,

Just adding to your voice. I recently experienced the same issue with a complex multi-table view, including pivots, and was surprised to see all the nested loops everywhere in spite of indices being available. I spent a lot of time optimizing the query and went from about 1 h to about 3 min, but penalizing nested loops in favor of other "joining" techniques seems to make sense as a strategy. Either that, or there is something I really don't understand here either and would love to be educated :)

Laurent.
Re: [PERFORM] OLAP/reporting queries fall into nested loops over seq scans or other horrible planner choices
From
Laurenz Albe
Date:
Gunther wrote:
> But there is one thing that keeps bothering me both with Oracle and
> PgSQL. And that is the preference for Nested Loops.
[...]
> But the issue is bulk searches, reports, and any analytic query
> scenarios. In those queries Nested Loops are almost always a bad choice,
> even if there is an index. In over 20 years of working with RDBMSs this
> has been my unfailing heuristic. A report runs slow? Look at plan, is
> there a Nested Loop? Yes? Squash it! And the report runs 10x faster
> instantaneously.
[...]
> If you can set enable_nestloop off and the Hash Join is chosen and the
> performance goes from 1 hour of 100% CPU to 10 seconds completion time,
> then something is deadly wrong.
[...]
> The point is that Nested Loops should never be chosen except in index
> lookup situations or maybe memory constraints.
>
> How can I prevent it on a query by query scope? I cannot set
> enable_nestloop = off because one query will be for a full report, while
> another one might have indexed constraints running in the same session,
> and I don't want to manage side effects and remember to set
> enable_nestloop parameter on and off.
>
> There must be a way to tell the optimizer to penalize nested loops to
> make them the last resort. In Oracle there are those infamous hints, but
> they don't always work either (or it is easy to make mistakes that you
> get no feedback about).
>
> Is there any chance PgSQL can get something like a hint feature?

PostgreSQL doesn't have a way to tell if a query is an OLAP query running against a star schema or a regular OLTP query; it will treat both in the same fashion.

I also have had to deal with wrongly chosen nested loop joins, and testing a query with "enable_nestloop=off" is one of the first things to try in my experience. However, it is not true that PostgreSQL "prefers nested loops".
Sometimes a nested loop join is the only sane and efficient way to process a query, and removing that capability would be just as bad a disaster as you are experiencing with your OLAP queries.

Bad choices are almost always caused by bad estimates. Granted, there is no way that estimates can ever be perfect. So what could be done?

One pragmatic solution would be to wrap every query that you know to be an OLAP query with

BEGIN;
SET LOCAL enable_nestloop=off;
SELECT ...
COMMIT;

Looking deeper, I would say that wrongly chosen nested loop joins often come from an underestimate that is close to zero. PostgreSQL already clamps row count estimates to 1, that is, it will choose an estimate of 1 whenever it thinks fewer rows will be returned. Perhaps using a higher clamp like 2 would get rid of many of your problems, but it is a difficult gamble, as it will also prevent some nested loop joins that would have been the best solution.

Finally, even though the official line of PostgreSQL is to *not* have query hints, and for a number of good reasons, this is far from being a unanimous decision. The scales may tip at some point, though I personally hope that this point is not too close.

Yours,
Laurenz Albe
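[Editorial note: Laurenz's point that bad choices come from bad estimates can be checked directly: EXPLAIN ANALYZE prints both the estimated and the actual row counts, and the wrongly chosen nested loops discussed in this thread usually show a drastic underestimate on one side of the join. A sketch with hypothetical table names:]

```sql
EXPLAIN (ANALYZE, BUFFERS)
SELECT *
FROM fact f
JOIN dim d ON d.id = f.dim_id
WHERE f.created_at >= date '2017-01-01';

-- In the output, compare the planner's guess with reality, e.g.:
--   Nested Loop  (cost=... rows=1 ...) (actual time=... rows=524288 ...)
-- An estimate of rows=1 against hundreds of thousands of actual rows is
-- exactly the clamped near-zero estimate Laurenz describes.
```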
Re: [PERFORM] OLAP/reporting queries fall into nested loops over seq scans or other horrible planner choices
From
Gunther
Date:
Thanks for your support, Laurent. I have an idea on one thing you said:

> Just adding to your voice. I recently experienced the same issue with a
> complex multi-table view, including pivots, and was surprised to see all
> the nested loops everywhere

and here is the clue for me:

> in spite of indices being available.

I would say that sometimes indexes are detrimental. If you don't need them for other reasons, you might want to not have them, and without the index the Nested Loop strategy might not be chosen. But that is a side issue, because it can often not be avoided. Just saying in case it might help.

I also found the opposite now. In the query that made me "blow the lid" and "complain" here, my team decided to add an index, and that did not get rid of the Nested Loops, but at least it made the inner table access indexed rather than a table scan, and the performance ended up OK. But it's not always predictable, and these indexes could still trap the planner into sub-optimal solutions.

I think there is an opportunity for a PgSQL query plan extension, especially when dealing with CTEs (WITH clauses). PgSQL could make them a temporary table and add the indexes it needs on the fly, because after it has done one pass over the inner loop's sequential scan it knows perfectly well how many rows it has, and knowing how many more iterations are coming from the sub-query that's driving the Nested Loop, it could decide that it's much faster to put an index on the nested relation, temporarily materialized. Or it could even decide to change its plan midway and do the Hash Join.

This is why I had always dreamed that the PgSQL optimizer had some easy API where one could plug in experimental strategies. I personally am extremely efficient with XSLT for complex intelligent algorithms, and I dream of a PgSQL query plan structure exposed as XML which an XSLT plugin could then process to edit the plan.
People could experiment with awesome intelligent new strategies based on statistics gathered along the way of the execution.

regards,
-Gunther
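[Editorial note: the on-the-fly CTE materialization Gunther wishes for can be approximated by hand today: write the CTE's contents to a temporary table, index it, and ANALYZE it so the planner sees real row counts instead of guesses. A sketch under assumed table and column names:]

```sql
-- Materialize the would-be CTE manually:
CREATE TEMP TABLE foo AS
SELECT a, b, c, sum(x) AS total
FROM big_a
JOIN big_b USING (a, b, c)
GROUP BY a, b, c;

CREATE INDEX ON foo (a, b, c);
ANALYZE foo;  -- the planner now has actual statistics for foo

-- The rest of the query can then join against the indexed temp table:
SELECT *
FROM foo
JOIN indexed_table USING (a, b, c);
```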
Re: [PERFORM] OLAP/reporting queries fall into nested loops over seq scans or other horrible planner choices
From
Gunther
Date:
Thank you for your thoughtful reply, Laurenz (funny that the people interested in this topic are named Laurent and Laurenz :)

> PostgreSQL doesn't have a way to tell if a query is an OLAP query
> running against a star schema or a regular OLTP query, it will treat
> both in the same fashion.

Right, of course, and I would not want to go down that road. OLAP vs. OLTP are not just two cut-and-dried options, and neither is a "star schema" anything but one way in which to lay out a simple data model. The real world is always more complex than such cut-and-dried choices.

> However, it is not true that PostgreSQL "prefers nested loops".
> Sometimes a nested loop join is the only sane and efficient
> way to process a query ...

Of course; it's not preferring NLs deliberately, but it happens awfully often (and not just with PgSQL; I have had the same problems with Oracle over the years).

> Bad choices are almost always caused by bad estimates.
> Granted, there is no way that estimates can ever be perfect.
> ...
> Looking deeper, I would say that wrongly chosen nested loop joins
> often come from an underestimate that is close to zero.
> PostgreSQL already clamps row count estimates to 1, that is, it will
> choose an estimate of 1 whenever it thinks fewer rows will be returned.
>
> Perhaps using a higher clamp like 2 would get rid of many of your
> problems, but it is a difficult gamble as it will also prevent some
> nested loop joins that would have been the best solution.

Wow, that is very interesting! Are you saying that if PgSQL can't know what the cardinality is, it assumes a default of 1? That would be a very slanted guess. I would think a couple of hundred would be more appropriate, or 10% of the average of the base tables for which it does have statistics. I would wonder if changing 1 to 2 would make much difference, as a Seq Scan over 1 to 10 tuples should generally be better than any other approach, as long as the 1-10 tuples are already readily available.
> Finally, even though the official line of PostgreSQL is to *not* have
> query hints, and for a number of good reasons, this is far from being
> a unanimous decision. The scales may tip at some point, though I
> personally hope that this point is not too close.

I am glad to hear that hints are not completely ruled out by the development team. Definitely Oracle hints are painful and should not be replicated as is. But maybe I can nudge your (and others') personal tastes with the following. You suggested this:

> One pragmatic solution would be to wrap every query that you know
> to be an OLAP query with
> BEGIN;
> SET LOCAL enable_nestloop=off;
> SELECT ...
> COMMIT;

I would also like to put the set enable_nestloop = false statement into a combined statement, but when I do it in a transaction like you showed, it does not work for a normal PreparedStatement just expecting a ResultSet, or at least I haven't been able to make that work. In my Aqua Data Studio, if I put the set statement before the select statement, the combined statement doesn't return any results. Maybe I am doing something wrong. If there is a way, then I would have what I need. If not, I think it might be an easy thing to add.

We already have different scopes for these optimizer parameters like enable_nestloop:

1. a system-wide scope
2. a session-wide scope

and I see no reason why one could not just add a non-disruptive syntax form to change these parameters on a statement-wide scope. By all means, in a comment. Why not

--! set enable_nestloop = false
--! set work_mem = '20 MB'
SELECT * FROM .... ;

something like that? It would not be a big deal, no completely new obscure hint syntax. And maybe, if that is possible so far, then why not add a CTE scope as well:

WITH Foo AS (
  --! set enable_nestloop = false
  SELECT * FROM ... INNER JOIN ... INNER JOIN ... INNER JOIN ... ...
)
, Bar AS (
  SELECT * FROM Foo INNER JOIN IndexedTable USING(a, b, c)
)
SELECT * FROM Bar ...
;

This would keep nestloop off for the CTE Foo with that complex join, but allow it to be used for the CTE Bar or the ultimate query. I think these features should be relatively easy to add without causing SQL compatibility issues, and also without opening a can of worms with obscure hint features that need a lot of work to implement correctly.

But while we are at dreaming up solutions, I think materialized indexed sub-plans would also be a nice thing, especially when dealing with CTEs. This could be controlled manually to begin with:

WITH Foo AS (
  --! set enable_nestloop = false
  SELECT * FROM ... INNER JOIN ... INNER JOIN ... INNER JOIN ... ...
) MATERIALIZE INDEX ON(a, b, c)
, Bar AS (
  SELECT * FROM Foo INNER JOIN IndexedTable USING(a, b, c)
)
SELECT * FROM Bar ...
;

And of course, if we don't want to disturb SQL syntax, the "materialize index on ..." clause could be in a --! comment.

But then, to dream on, PgSQL could make sub-query plans a temporary table and add the indexes it needs on the fly, because after it has done one pass over the inner loop's sequential scan it already has a perfect guess of what the cardinality is, and knowing how many more iterations are coming from the sub-query that's driving the Nested Loop, it could decide that it's much faster to put an index on the nested relation, temporarily materialized. Or it could even decide to change its plan midway and do the Hash Join. Let's call that dynamic feedback plan optimization.

This is why I had always dreamed that the PgSQL optimizer had some easy API where one could plug in experimental strategies. I personally am extremely efficient with XSLT for complex intelligent algorithms, and I dream of a PgSQL query plan structure exposed as XML which an XSLT plugin could then process to edit the plan. People could experiment with awesome intelligent new strategies based on statistics gathered along the way of the execution.
regards,
-Gunther
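[Editorial note: one workaround for the tooling problem Gunther describes (a SET followed by a SELECT confusing clients that expect a single result set) is the built-in set_config() function: it is itself a SELECT, and with its third argument true the change is transaction-local, like SET LOCAL. A sketch with a hypothetical report query:]

```sql
BEGIN;
-- set_config(setting_name, new_value, is_local): with is_local = true the
-- setting reverts automatically at COMMIT or ROLLBACK.
SELECT set_config('enable_nestloop', 'off', true);

SELECT r.region, sum(r.amount)
FROM report_lines r
GROUP BY r.region;
COMMIT;
```

Whether this helps depends on the client; it still issues multiple statements, but each of them returns a result set.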
[PERFORM] Re: OLAP/reporting queries fall into nested loops over seq scans or other horrible planner choices
From
legrand legrand
Date:
Hello, may I suggest you look at https://github.com/ossc-db/pg_hint_plan, which mimics Oracle's hint syntax?

Regards
PAscal
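[Editorial note: for readers who have not seen it, pg_hint_plan hints go into a special comment in front of the statement. A sketch of the general shape; the aliases are hypothetical and the full hint list is in the extension's documentation:]

```sql
LOAD 'pg_hint_plan';

/*+ HashJoin(f d) Leading((f d)) */
SELECT d.name, sum(f.amount)
FROM fact f
JOIN dim d ON d.id = f.dim_id
GROUP BY d.name;
```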
Re: [PERFORM] OLAP/reporting queries fall into nested loops over seq scans or other horrible planner choices
From
Laurenz Albe
Date:
Gunther wrote:
> > Bad choices are almost always caused by bad estimates.
> > Granted, there is no way that estimates can ever be perfect.
> > ...
> > Looking deeper, I would say that wrongly chosen nested loop joins
> > often come from an underestimate that is close to zero.
> > PostgreSQL already clamps row count estimates to 1, that is, it will
> > choose an estimate of 1 whenever it thinks fewer rows will be returned.
> >
> > Perhaps using a higher clamp like 2 would get rid of many of your
> > problems, but it is a difficult gamble as it will also prevent some
> > nested loop joins that would have been the best solution.
>
> Wow, that is very interesting! Are you saying that if PgSQL can't know
> what the cardinality is, it assumes a default of 1? That would be a very
> slanted guess. I would think a couple of hundred would be more
> appropriate, or 10% of the average of the base tables for which it does
> have statistics. I would wonder if changing 1 to 2 would make much
> difference, as a Seq Scan over 1 to 10 tuples should generally be better
> than any other approach, as long as the 1-10 tuples are already readily
> available.

No, it is not like that. When PostgreSQL cannot come up with a "real" estimate, it uses default selectivity estimates. See include/utils/selfuncs.h:

/*
 * Note: the default selectivity estimates are not chosen entirely at random.
 * We want them to be small enough to ensure that indexscans will be used if
 * available, for typical table densities of ~100 tuples/page. Thus, for
 * example, 0.01 is not quite small enough, since that makes it appear that
 * nearly all pages will be hit anyway. Also, since we sometimes estimate
 * eqsel as 1/num_distinct, we probably want DEFAULT_NUM_DISTINCT to equal
 * 1/DEFAULT_EQ_SEL.
 */

/* default selectivity estimate for equalities such as "A = b" */
#define DEFAULT_EQ_SEL  0.005

/* default selectivity estimate for inequalities such as "A < b" */
#define DEFAULT_INEQ_SEL  0.3333333333333333

/* default selectivity estimate for range inequalities "A > b AND A < c" */
#define DEFAULT_RANGE_INEQ_SEL  0.005

/* default selectivity estimate for pattern-match operators such as LIKE */
#define DEFAULT_MATCH_SEL  0.005

/* default number of distinct values in a table */
#define DEFAULT_NUM_DISTINCT  200

/* default selectivity estimate for boolean and null test nodes */
#define DEFAULT_UNK_SEL  0.005
#define DEFAULT_NOT_UNK_SEL  (1.0 - DEFAULT_UNK_SEL)

Those selectivity estimates are factors, not absolute numbers. The clamp to 1 happens when, after applying all selectivity factors, the result is less than 1, precisely to keep the optimizer from choosing a plan that would become very expensive if a branch is executed *at all*.

> > Finally, even though the official line of PostgreSQL is to *not* have
> > query hints, and for a number of good reasons, this is far from being
> > a unanimous decision. The scales may tip at some point, though I
> > personally hope that this point is not too close.
>
> I am glad to hear that hints are not completely ruled out by the
> development team. Definitely Oracle hints are painful and should not be
> replicated as is. But maybe I can nudge your (and others') personal
> tastes with the following.

Didn't work for me. Your hints look just like what Oracle does. There have been better proposals that aim at fixing the selectivity estimates, e.g. "multiply your estimate for this join by three".

> In my Aqua
> Data Studio, if I put the set statement before the select statement, the
> combined statement doesn't return any results. Maybe I am doing
> something wrong. If there is a way, then I would have what I need.

Check the SQL statements that are generated by your Aqua Data Studio!
Yours,
Laurenz Albe
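[Editorial note: to make the numbers above concrete, a hedged illustration of how a default selectivity turns into a row estimate; the table and figures are hypothetical:]

```sql
-- Suppose t has 1,000,000 rows and the planner has no usable statistics
-- for the predicate, e.g. an equality between two expressions.  It then
-- falls back to DEFAULT_EQ_SEL = 0.005, estimating
--   1,000,000 * 0.005 = 5,000 rows,
-- no matter how many rows actually match.
EXPLAIN SELECT * FROM t WHERE (a + 0) = (b + 0);
```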
[PERFORM] Re: OLAP/reporting queries fall into nested loops over seq scans orother horrible planner choices
From
Thomas Kellerer
Date:
Laurenz Albe wrote on 02.11.2017 at 09:30:
> Finally, even though the official line of PostgreSQL is to *not* have
> query hints, and for a number of good reasons, this is far from being
> a unanimous decision. The scales may tip at some point, though I
> personally hope that this point is not too close.

I also think that hints are not the right way to solve problems like that.

I do like Oracle's approach with SQL profiles, where you can force the optimizer to try harder to find a good execution plan. I _think_ it even runs the statement with multiple plans and compares the expected outcome with the actual values. Once a better plan is found, that plan can be attached to that query and the planner will use it for subsequent executions. This however requires a much bigger infrastructure than simple hints.

(Unrelated, but: maybe a compromise in the never-ending "hints vs. no hints" discussion would be to think about integrating the existing "pg_hint_plan" as a standard contrib module.)

Thomas
Re: [PERFORM] Re: OLAP/reporting queries fall into nested loops overseq scans or other horrible planner choices
From
Adam Brusselback
Date:
On Fri, Nov 3, 2017 at 5:28 AM, Thomas Kellerer <spam_eater@gmx.net> wrote:
> I do like Oracle's approach with SQL profiles, where you can force the
> optimizer to try harder to find a good execution plan. I _think_ it even
> runs the statement with multiple plans and compares the expected outcome
> with the actual values. Once a better plan is found that plan can be
> attached to that query and the planner will use that plan with subsequent
> executions.

I also think that this is a really cool approach. For those specific problem queries, pretty much tell the optimizer "do your best to make this as efficient as possible". To make that more useful, though, you'd probably need a shared query cache that persists through restarts. I'd assume that if you have a problem query, this very heavy "planning / optimization" operation is not something you would want every connection to have to redo every time it connects.

I wish I was more knowledgeable about the internals so I could more clearly see how a system like that could come together, and what other groundwork would be needed building up to it.
Re: [PERFORM] Re: OLAP/reporting queries fall into nested loops overseq scans or other horrible planner choices
From
Gunther
Date:
On Fri, Nov 3, 2017 at 5:28 AM, Thomas Kellerer <spam_eater@gmx.net> wrote:
>> I do like Oracle's approach with SQL profiles, where you can force the
>> optimizer to try harder to find a good execution plan. I _think_ it even
>> runs the statement with multiple plans and compares the expected outcome
>> with the actual values. Once a better plan is found that plan can be
>> attached to that query and the planner will use that plan with subsequent
>> executions.

I have used that approach with Oracle. I didn't like it. It is too difficult, too complicated, and requires all sorts of DBA privileges. Nothing that would help a lowly user trying his ad-hoc queries.

I think a "dynamic feedback plan optimization" would be more innovative and ultimately deliver better on the original RDBMS vision. The RDBMS should exert all the intelligence it can to optimize the query execution. (I know that means: no reliance on hints.) There is so much more that could be done, such as materialized and potentially indexed partial results. (I know Oracle has materialized partial results.)

But the dynamic feedback plan would be even cooler. The outer relation should be built or sampled to estimate the selectivity, the inner relation should be built completely, and if it is too large, it should be thrown back to the optimizer to change the plan. Or maybe the planner needs some second-look pattern-matching critic: any pattern of Nested Loop I would re-check and possibly sample a few rows, and a Nested Loop with a costly inner loop should almost always be avoided. A Nested Loop over a Seq Scan is a no-no unless it can be proven that the cardinality of the inner relation to scan is less than 100. But even more: once you have the inner and outer table of a Nested Loop built or sampled, there should be no reason not to run the Hash Join.
I guess I still don't get why the optimizer even today would EVER consider a Nested Loop over a Hash Join, unless there is some clear indication that the query will be used to just get the FIRST ROWS (Oracle hint), that those first rows will actually exist (the user waits 30 minutes at 100% CPU only to be informed that the query has no results!), and that the results are likely to come out early in the Nested Loop. So many constraints to make that Nested Loop plan a successful strategy. Why ever choose it?

I guess, hints or no hints, I think Nested Loops should not be used by the optimizer unless it has positive indication that it meets all the criteria for being a good strategy, i.e., that there is a continuous path of indexed columns starting with constant query parameters. This is the usual OLTP query, and that is what Nested Loops are for. But in all other cases, and if space allows at all, always use Hash Joins. It is even cheaper to do trial and error: assume that space will allow, and quit if it doesn't, rather than being sheepish and going into a 1-hour CPU-bound operation. Because if space does not allow, the chance of the Nested Loop being a good idea is also close to nil! So if space doesn't allow, it would be Sort-Merge on disk, especially if the query has a DISTINCT or ORDER BY clause anyway! Why is that not always a better strategy?

And yes, until all this is figured out: by all means include pg_hint_plan -- pretty please!

regards,
-Gunther
Re: [PERFORM] Re: OLAP/reporting queries fall into nested loops overseq scans or other horrible planner choices
From
Gunther
Date:
Just throwing out some more innovative ideas. Materialized join tables -- I have read about those somewhere. OK, they are difficult to keep consistent with transactions; forget that. But why not collect statistics on every join that is processed, even if the query is interrupted? Then, as more and more plans are run (and interrupted for being too slow), statistics on the joins accumulate and can inform the optimizer next time not to use that approach.

It would work like magic for a user. The user writes a query. It runs 3 minutes and has no result. The user interrupts the query (thanks, PgSQL, for allowing that, unlike Oracle!). By now the statistics have already been gathered. The user reruns the query, not changing anything. Because statistics on (some of) the joins have been collected, at least from an initial sample, the planner will now likely choose a different plan. Say the results come in at 2 minutes and the user is satisfied -- and still more complete statistics were gathered. Now the user changes a few query parameters and runs the query again, or embeds it in a more complex query. This time the planner has even more statistics and chooses an even better plan. And lo and behold, the results come in at 10 seconds! At no point did the user have to analyze the explain plan or come up with hints, tricks, and nudges to the optimizer. And at no point did the user have to become a DBA to run some outlandish PL/SQL procedures for which he has neither the license key nor the special privileges.

But until that is done, please put in pg_hint_plan.c. Hints don't hurt: if you don't like them, don't use them.

regards,
-Gunther
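(For reference, pg_hint_plan -- a third-party extension, not core PostgreSQL -- takes its hints from a leading comment on the query; a sketch, assuming the extension is installed, with table names and aliases purely hypothetical:

```sql
LOAD 'pg_hint_plan';  -- per-session load; can also go into shared_preload_libraries

/*+ HashJoin(o c) Leading((o c)) */
SELECT o.order_id, c.name
FROM orders o
JOIN customers c ON c.customer_id = o.customer_id;
```

HashJoin forces the join method for the named aliases, and Leading fixes the join order.)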
[PERFORM] Re: OLAP/reporting queries fall into nested loops over seq scans orother horrible planner choices
From
legrand legrand
Date:
To limit NL usage, wouldn't a modified set of Planner Cost Constants
https://www.postgresql.org/docs/current/static/runtime-config-query.html

seq_page_cost
random_page_cost
cpu_tuple_cost
cpu_index_tuple_cost
cpu_operator_cost

be more hash-join friendly (like Oracle's optimizer_index_cost_adj)?
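One way to apply such constants without touching postgresql.conf globally is to scope them to a dedicated reporting role, so OLTP sessions keep the server defaults; a sketch, where the role name and values are only illustrative:

```sql
-- Sessions logged in as reporting_user pick these up automatically;
-- sessions under other roles keep the server defaults.
ALTER ROLE reporting_user SET random_page_cost = 1.1;  -- SSD-like storage, cheap random IO
ALTER ROLE reporting_user SET cpu_tuple_cost = 0.03;   -- make per-row work look costlier
```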
Re: [PERFORM] Re: OLAP/reporting queries fall into nested loops overseq scans or other horrible planner choices
From
Gunther
Date:
On 11/3/2017 10:55, legrand legrand wrote:
> To limit NL usage, wouldn't a modified set of Planner Cost Constants
> https://www.postgresql.org/docs/current/static/runtime-config-query.html
>
> seq_page_cost
> random_page_cost
> cpu_tuple_cost
> cpu_index_tuple_cost
> cpu_operator_cost
>
> be more hash join friendly (as Oracle's optimizer_index_cost_adj)?

I twiddled with some of these and could nudge the planner toward a Sort Merge instead of a Nested Loop. But it's hit or miss.

Maybe there should be a tool you can run periodically that tests the installation to see how IO, CPU, and memory perform. Or, again, these statistics should be collected during normal operation so that nobody needs to guess them or measure them with complex procedures. As the system runs, it should sample seq_page_cost and random_page_cost (noticing whether it has an SSD or HDD), and it should see how much of its reads come from cache and how much go out to disk. Why isn't the executor of queries the best one to ask for these cost constants?

regards,
-Gunther
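Some of that feedback is in fact already observable from the statistics collector; for instance, the cache hit ratio that bears on random_page_cost can be read from pg_stat_database:

```sql
-- How much of this database's block reads were served from shared buffers.
SELECT datname,
       blks_hit,
       blks_read,
       round(blks_hit::numeric / nullif(blks_hit + blks_read, 0), 3) AS cache_hit_ratio
FROM pg_stat_database
WHERE datname = current_database();
```

A ratio near 1.0 argues for a random_page_cost much closer to seq_page_cost than the default.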
Re: [PERFORM] Re: OLAP/reporting queries fall into nested loops overseq scans or other horrible planner choices
From
Dave Nicponski
Date:
Thank you Gunther for bringing this up. It's been bothering me quite a bit over time as well.
Forgive the naive question, but does the query planner's cost estimator only track a single estimate of cost that gets accumulated and compared across plan variants? Or does it keep a range or probabilistic distribution? I suspect the former, but I bet either of the latter would fix this rapidly.
The cases that frustrate me are where a NL is chosen over something like a HJ in a situation where, if the query planner is slightly wrong on the lower side, the NL would certainly beat the HJ (but by relatively small amounts), while a slight error on the higher side means the NL gets punished tremendously, due to the big-O penalty difference it pays over the HJ approach. Giving the planner some notion of the distribution might help it make a better assessment of the potential consequences of being slightly off in its estimates. If it notices that being off on a plan involving a NL sends the distribution off into hours instead of seconds, it could avoid that plan even if it might be slightly faster in the mean.
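The asymmetry can be sketched with toy numbers (these are not the planner's real cost formulas; the constants are made up): the NL cost grows with the product of the row counts, while the HJ cost is roughly linear in their sum, so underestimating the inner cardinality costs the HJ little but can cost the NL orders of magnitude:

```sql
-- 1000 outer rows assumed; per-row-pair cost 0.01; hash build overhead 500.
SELECT est_inner_rows,
       1000 * est_inner_rows * 0.01         AS nestloop_cost,   -- blows up with the estimate
       (1000 + est_inner_rows) * 0.01 + 500 AS hashjoin_cost    -- stays nearly flat
FROM generate_series(10, 100000, 9990) AS est_inner_rows;
```

With these made-up constants the NL wins at 10 inner rows and loses by three orders of magnitude at 100,000 -- exactly the lopsided risk profile described above.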
<fantasy> If I ever find time, maybe I'll try to play around with this idea and see how it performs... </fantasy>
-dave-
On Fri, Nov 3, 2017 at 11:13 AM, Gunther <raj@gusw.net> wrote:
> On 11/3/2017 10:55, legrand legrand wrote:
>> To limit NL usage, wouldn't a modified set of Planner Cost Constants
>> https://www.postgresql.org/docs/current/static/runtime-config-query.html
>>
>> seq_page_cost
>> random_page_cost
>> cpu_tuple_cost
>> cpu_index_tuple_cost
>> cpu_operator_cost
>>
>> be more hash join friendly (as Oracle's optimizer_index_cost_adj)?
>
> I twiddled with some of these and could nudge it toward a Sort Merge instead of a NL. But it's hit or miss.
>
> Maybe there should be a tool you can run periodically that tests the installation to see how IO, CPU, and memory perform. Or, again, these statistics should be collected during normal operation so that nobody needs to guess them or measure them with complex procedures. As the system runs, it should sample seq_page_cost and random_page_cost (noticing whether it has an SSD or HDD), and it should see how much of its reads come from cache and how much go out to disk. Why isn't the executor of queries the best one to ask for these cost constants?
>
> regards,
> -Gunther
Dave Nicponski
Chief Technology Officer
917.696.3081
|
30 Vandam Street. 2nd Floor. NYC
855.77.SEAMLESS | SeamlessGov.com