Thread: Idea for reducing planning time

Idea for reducing planning time

From
Tom Lane
Date:
I've been looking into Brian Hirt's complaint that 7.0.3 and 7.1 are
lots slower than 7.0.2 in planning big joins.  The direct cause is that
since we now deduce implied equality clauses, the system has more
potential join paths than it used to --- for example, given "WHERE
a.x = b.y AND b.y = c.z", the old system would not consider joining a to
c and then adding b, because it didn't have a joinqual relating a and c.
Now it does.  There's not a lot to be done about that, but I've been
looking to see if we can make any offsetting speedups.

While digging around, I've realized that the planner is still carrying
around a lot more paths than it really needs to.  The rule in add_path()
is that we will keep a path if it is cheaper OR differently sorted/
better sorted than any other path for the same rel.  But what is not
taken into account is whether the sort ordering of a path actually has
any potential usefulness.  Before saving a path on the grounds that it's
got an otherwise unobtainable sort ordering, we should check to see if
that sort ordering is really going to be useful for a later mergejoin
or for the final output ordering.  It turns out we already have code to
check that for basic indexscan paths --- see useful_for_mergejoin() and
useful_for_ordering() in indxpath.c.  But we fail to make the same sort
of test on paths for join relations, with the result that we carry along
a lot more paths than we could possibly need, and that costs huge
amounts of time.

An example of what's going on is that given a query withFROM a, b, ... other rels ...WHERE a.w = 1 AND a.x = 2 AND b.y
=3 AND b.z = 4 ...
 
if w,x,y,z all have indexes we will consider indexscans on all four
of those indexes.  Which is fine.  But we will then consider nestloops
and mergejoins of a with b that use these four indexscans as the outer
path, and therefore yield results that are sorted by w,x,y,z
respectively.  Those paths will be carried as possible paths for a+b
because they offer different sort orders, even if we have no further
use for those sort orderings.  And then we have a combinatorial blowup
in the number of paths considered at higher join levels.  We should
instead consider that these paths have no useful sort ordering, and
throw away all but the cheapest.

What I'm thinking of doing is truncating the recorded pathkeys of a path
at the first sortkey that's not useful for either a higher-level
mergejoin clause or the requested final output sort ordering.  Then the
logic inside add_path() wouldn't change, but it would only be considering
useful pathkeys and not useless ones.

I'm trying to resist the temptation to make this change right now :-).
It's not quite a bug fix --- well, maybe you could call it a performance
bug fix --- so I'm kind of thinking it shouldn't be done during beta.
OTOH I seem to have lost the argument that Vadim shouldn't commit VACUUM
performance improvements during beta, so maybe this should go in too.
What do you think?
        regards, tom lane


Re: Idea for reducing planning time

From
Alfred Perlstein
Date:
* Tom Lane <tgl@sss.pgh.pa.us> [001213 15:18] wrote:
> 
> I'm trying to resist the temptation to make this change right now :-).
> It's not quite a bug fix --- well, maybe you could call it a performance
> bug fix --- so I'm kind of thinking it shouldn't be done during beta.
> OTOH I seem to have lost the argument that Vadim shouldn't commit VACUUM
> performance improvements during beta, so maybe this should go in too.
> What do you think?

If you're saying that you're OK with the work Vadim has done please
let him know, I'm assuming he hasn't committed out of respect for your
still standing objection.

If you're terribly against it then say so again, I just would rather
it not happen because you objected rather than missed communication.

As far as the work you're proposing, how much of a gain is it over
the current code?  2x? 3x? 20x? :)  There's a difference between a
slight performance increase and something too good to pass up.

thanks,
-- 
-Alfred Perlstein - [bright@wintelcom.net|alfred@freebsd.org]
"I have the heart of a child; I keep it in a jar on my desk."


Re: Idea for reducing planning time

From
The Hermit Hacker
Date:
sorry, meant to respond to the original and deleted it too fast ... 

Tom, if the difference between 7.0 and 7.1 is such that there is a
performance decrease, *please* apply the fix ... with the boon that OUTER
JOINs will provide, would hate to see us with a performance hit reducing
that impact ...

One thing I would like to suggest for this stage of the beta, though, is
that a little 'peer review' before committing the code might be something
that would help 'ease' implementing stuff like this and Vadim's VACUUM
code ... read through Vadim's code and see if it looks okay to you ... get
Vadim to read through your code/patch and see if it looks okay to him
... it adds a day or two to the commit cycle, but at least you can say it
was reviewed before committed ...


On Wed, 13 Dec 2000, Alfred Perlstein wrote:

> * Tom Lane <tgl@sss.pgh.pa.us> [001213 15:18] wrote:
> > 
> > I'm trying to resist the temptation to make this change right now :-).
> > It's not quite a bug fix --- well, maybe you could call it a performance
> > bug fix --- so I'm kind of thinking it shouldn't be done during beta.
> > OTOH I seem to have lost the argument that Vadim shouldn't commit VACUUM
> > performance improvements during beta, so maybe this should go in too.
> > What do you think?
> 
> If you're saying that you're OK with the work Vadim has done please
> let him know, I'm assuming he hasn't committed out of respect for your
> still standing objection.
> 
> If you're terribly against it then say so again, I just would rather
> it not happen because you objected rather than missed communication.
> 
> As far as the work you're proposing, how much of a gain is it over
> the current code?  2x? 3x? 20x? :)  There's a difference between a
> slight performance increase and something too good to pass up.
> 
> thanks,
> -- 
> -Alfred Perlstein - [bright@wintelcom.net|alfred@freebsd.org]
> "I have the heart of a child; I keep it in a jar on my desk."
> 

Marc G. Fournier                   ICQ#7615664               IRC Nick: Scrappy
Systems Administrator @ hub.org 
primary: scrappy@hub.org           secondary: scrappy@{freebsd|postgresql}.org 



Re: Idea for reducing planning time

From
Tom Lane
Date:
Alfred Perlstein <bright@wintelcom.net> writes:
> If you're saying that you're OK with the work Vadim has done please
> let him know, I'm assuming he hasn't committed out of respect for your
> still standing objection.

Well, I'm still against committing it now, but I only have one core
vote, and I seem to be losing 3:1.  I know when to concede ;-)

> As far as the work you're proposing, how much of a gain is it over
> the current code?  2x? 3x? 20x? :)  There's a difference between a
> slight performance increase and something too good to pass up.

Hard to tell without doing the work.  But we already know that extra
paths inside the planner pose a combinatorial penalty --- think
exponential behavior, not linear speedups...
        regards, tom lane


Re: Idea for reducing planning time

From
Bruce Momjian
Date:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > Totally agree.  In the old days, we posted all our patches to the list
> > so people could see.  We used to make cvs commits only on the main
> > server, so we had the patch handy, and it made sense to post it.  Now
> > that we have remote cvs, we don't do it as much, but in this case, cvs
> > diff -c is a big help.
> 
> If you like I'll post the patch, but it strikes me as a waste of list
> bandwidth --- anyone who is likely to actually review it is perfectly
> capable of doing cvs diff for themselves ...

Posting patch is only useful if you want people to review it.  They are
more likely to if you send it to patches, already diff'ed.  In fact, how
do you pull out a patch set from CVS?  You have to use -D and specify a
date/time range, right?

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: Idea for reducing planning time

From
Tom Lane
Date:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> Totally agree.  In the old days, we posted all our patches to the list
> so people could see.  We used to make cvs commits only on the main
> server, so we had the patch handy, and it made sense to post it.  Now
> that we have remote cvs, we don't do it as much, but in this case, cvs
> diff -c is a big help.

If you like I'll post the patch, but it strikes me as a waste of list
bandwidth --- anyone who is likely to actually review it is perfectly
capable of doing cvs diff for themselves ...
        regards, tom lane


Re: Idea for reducing planning time

From
"Marc G. Fournier"
Date:
On Fri, 15 Dec 2000, Alfred Perlstein wrote:

> * Bruce Momjian <pgman@candle.pha.pa.us> [001215 10:34] wrote:
> > > 
> > > sorry, meant to respond to the original and deleted it too fast ... 
> > > 
> > > Tom, if the difference between 7.0 and 7.1 is such that there is a
> > > performance decrease, *please* apply the fix ... with the boon that OUTER
> > > JOINs will provide, would hate to see us with a performance hit reducing
> > > that impact ...
> > > 
> > > One thing I would like to suggest for this stage of the beta, though, is
> > > that a little 'peer review' before committing the code might be something
> > > that would help 'ease' implementing stuff like this and Vadim's VACUUM
> > > code ... read through Vadim's code and see if it looks okay to you ... get
> > > Vadim to read through your code/patch and see if it looks okay to him
> > > ... it adds a day or two to the commit cycle, but at least you can say it
> > > was reviewed before committed ...
> > > 
> > 
> > Totally agree.  In the old days, we posted all our patches to the list
> > so people could see.  We used to make cvs commits only on the main
> > server, so we had the patch handy, and it made sense to post it.  Now
> > that we have remote cvs, we don't do it as much, but in this case, cvs
> > diff -c is a big help.
> 
> It seems that Tom has committed his fixups but we're still waiting
> on Vadim?

We can't force Vadim to commit them ... only encourage him to :)




Re: Idea for reducing planning time

From
Bruce Momjian
Date:
> 
> sorry, meant to respond to the original and deleted it too fast ... 
> 
> Tom, if the difference between 7.0 and 7.1 is such that there is a
> performance decrease, *please* apply the fix ... with the boon that OUTER
> JOINs will provide, would hate to see us with a performance hit reducing
> that impact ...
> 
> One thing I would like to suggest for this stage of the beta, though, is
> that a little 'peer review' before committing the code might be something
> that would help 'ease' implementing stuff like this and Vadim's VACUUM
> code ... read through Vadim's code and see if it looks okay to you ... get
> Vadim to read through your code/patch and see if it looks okay to him
> ... it adds a day or two to the commit cycle, but at least you can say it
> was reviewed before committed ...
> 

Totally agree.  In the old days, we posted all our patches to the list
so people could see.  We used to make cvs commits only on the main
server, so we had the patch handy, and it made sense to post it.  Now
that we have remote cvs, we don't do it as much, but in this case, cvs
diff -c is a big help.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


RE: Idea for reducing planning time

From
"Mikheev, Vadim"
Date:
> It seems that Tom has committed his fixups but we're still waiting
> on Vadim?

Sorry guys, I'm busy with WAL issues currently.
CRC will require initdb, so it's better to implement it
first...
Also, is TOAST-table vacuuming fixed now?

Vadim


Re: Idea for reducing planning time

From
Bruce Momjian
Date:
> What I'm thinking of doing is truncating the recorded pathkeys of a path
> at the first sortkey that's not useful for either a higher-level
> mergejoin clause or the requested final output sort ordering.  Then the
> logic inside add_path() wouldn't change, but it would only be considering
> useful pathkeys and not useless ones.

[ Sorry for the delay in replying.  I was talking at Compaq.]

OK, here is my idea.  Do the patch and post it to the hackers list so
people can see the scope of the change.  Then, if no one objects, apply
it.  We can always back it out, because no one else will be playing in
the optimizer.  We can even back it out of a minor release if we find it
is a problem.  Seems we don't want reports that queries are _slower_
than 7.0.X, and I can see how it would happen.

My quick question is that if we have a1=b1 and b1=c1, isn't the join
sorted by a1, b1, and c1, and threfore we don't get more sorted plans?

> 
> I'm trying to resist the temptation to make this change right now :-).
> It's not quite a bug fix --- well, maybe you could call it a performance
> bug fix --- so I'm kind of thinking it shouldn't be done during beta.
> OTOH I seem to have lost the argument that Vadim shouldn't commit VACUUM
> performance improvements during beta, so maybe this should go in too.
> What do you think?

Did you lose the argument with Vadim?  I haven't seen his vacuum commit
yet, though I certainly would like to.  :-)

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: Idea for reducing planning time

From
Tom Lane
Date:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> My quick question is that if we have a1=b1 and b1=c1, isn't the join
> sorted by a1, b1, and c1, and threfore we don't get more sorted plans?

That's not the issue.  See, before the transitive-equality patch,
if you wrote
select * from a,b,c where a.x = b.y and b.y = c.z

then the system would not consider plans of the structure {a c} b
(ie, start with a join of a to c and then add b).  It would only
consider {a b} c and {b c} a plans, because it follows join clauses
if any are available.

Now that we deduce the additional WHERE clause a.x = c.z, we will
also consider {a c} b.  So we may get a better plan as a result ...
but whether we give back the same plan or not, it takes longer,
because more paths will be considered.  Brian Hirt was unhappy because
planning was taking significantly longer on his seven-way join.


> Did you lose the argument with Vadim?  I haven't seen his vacuum commit
> yet, though I certainly would like to.  :-)

I'm assuming he's going to commit it, though I haven't seen him say so
in so many words.
        regards, tom lane


Re: Idea for reducing planning time

From
Tom Lane
Date:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
>> If you like I'll post the patch, but it strikes me as a waste of list
>> bandwidth --- anyone who is likely to actually review it is perfectly
>> capable of doing cvs diff for themselves ...

> Posting patch is only useful if you want people to review it.  They are
> more likely to if you send it to patches, already diff'ed.  In fact, how
> do you pull out a patch set from CVS?  You have to use -D and specify a
> date/time range, right?

That's a good point --- there doesn't seem to be any real easy way of
extracting a set of changes to different files except to use -D.  And
even that doesn't do anything to separate unrelated patches applied at
about the same time.

For the record, you can get a diff of this kind with a command like
cvs diff -c -D '2000-12-14 17:00' -D '2000-12-14 18:00'

executed in the top level of the tree you want to search.
        regards, tom lane


Re: Idea for reducing planning time

From
Alfred Perlstein
Date:
* Bruce Momjian <pgman@candle.pha.pa.us> [001215 10:34] wrote:
> > 
> > sorry, meant to respond to the original and deleted it too fast ... 
> > 
> > Tom, if the difference between 7.0 and 7.1 is such that there is a
> > performance decrease, *please* apply the fix ... with the boon that OUTER
> > JOINs will provide, would hate to see us with a performance hit reducing
> > that impact ...
> > 
> > One thing I would like to suggest for this stage of the beta, though, is
> > that a little 'peer review' before committing the code might be something
> > that would help 'ease' implementing stuff like this and Vadim's VACUUM
> > code ... read through Vadim's code and see if it looks okay to you ... get
> > Vadim to read through your code/patch and see if it looks okay to him
> > ... it adds a day or two to the commit cycle, but at least you can say it
> > was reviewed before committed ...
> > 
> 
> Totally agree.  In the old days, we posted all our patches to the list
> so people could see.  We used to make cvs commits only on the main
> server, so we had the patch handy, and it made sense to post it.  Now
> that we have remote cvs, we don't do it as much, but in this case, cvs
> diff -c is a big help.

It seems that Tom has committed his fixups but we're still waiting
on Vadim?

-- 
-Alfred Perlstein - [bright@wintelcom.net|alfred@freebsd.org]


Re: Idea for reducing planning time

From
Bruce Momjian
Date:
[ Charset ISO-8859-1 unsupported, converting... ]
> > It seems that Tom has committed his fixups but we're still waiting
> > on Vadim?
> 
> Sorry guys, I'm busy with WAL issues currently.
> CRC will require initdb, so it's better to implement it
> first...
> Also, is TOAST-table vacuuming fixed now?

Can someone please remind me?  CRC allows us to know of WAL log is
corrupt?

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: Idea for reducing planning time

From
Bruce Momjian
Date:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> >> If you like I'll post the patch, but it strikes me as a waste of list
> >> bandwidth --- anyone who is likely to actually review it is perfectly
> >> capable of doing cvs diff for themselves ...
> 
> > Posting patch is only useful if you want people to review it.  They are
> > more likely to if you send it to patches, already diff'ed.  In fact, how
> > do you pull out a patch set from CVS?  You have to use -D and specify a
> > date/time range, right?
> 
> That's a good point --- there doesn't seem to be any real easy way of
> extracting a set of changes to different files except to use -D.  And
> even that doesn't do anything to separate unrelated patches applied at
> about the same time.
> 
> For the record, you can get a diff of this kind with a command like
> 
>     cvs diff -c -D '2000-12-14 17:00' -D '2000-12-14 18:00'
> 
> executed in the top level of the tree you want to search.

Shame there isn't a -u (user) option supported by diff.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


RE: Idea for reducing planning time

From
"Mikheev, Vadim"
Date:
> > Sorry guys, I'm busy with WAL issues currently.
> > CRC will require initdb, so it's better to implement it
> > first...
> > Also, is TOAST-table vacuuming fixed now?
> 
> Can someone please remind me?  CRC allows us to know of WAL log is
> corrupt?

Currently WAL has no means to know is log data consistent or not.
Eg - it may try to redo insertion of tuple which data were only
partially writtent to log disk pages (when record crosses page
boundary).

Vadim


Re: Idea for reducing planning time

From
ncm@zembu.com (Nathan Myers)
Date:
On Fri, Dec 15, 2000 at 03:20:27PM -0500, Bruce Momjian wrote:
> [ Charset ISO-8859-1 unsupported, converting... ]
> > > It seems that Tom has committed his fixups but we're still waiting
> > > on Vadim?
> > 
> > Sorry guys, I'm busy with WAL issues currently.
> > CRC will require initdb, so it's better to implement it
> > first...
> 
> Can someone please remind me?  CRC allows us to know of WAL log is
> corrupt?

There are two uses for CRC in connection with the WAL log 
("write-ahead log log").  First, it helps to reveal corruption 
in the log.  Second, it makes it possible to identify the end of 
the log when the log lives on a raw partition.

Nathan Myers
ncm@zembu.com


Re: Idea for reducing planning time

From
Tatsuo Ishii
Date:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> >> If you like I'll post the patch, but it strikes me as a waste of list
> >> bandwidth --- anyone who is likely to actually review it is perfectly
> >> capable of doing cvs diff for themselves ...
> 
> > Posting patch is only useful if you want people to review it.  They are
> > more likely to if you send it to patches, already diff'ed.  In fact, how
> > do you pull out a patch set from CVS?  You have to use -D and specify a
> > date/time range, right?
> 
> That's a good point --- there doesn't seem to be any real easy way of
> extracting a set of changes to different files except to use -D.  And
> even that doesn't do anything to separate unrelated patches applied at
> about the same time.
> 
> For the record, you can get a diff of this kind with a command like
> 
>     cvs diff -c -D '2000-12-14 17:00' -D '2000-12-14 18:00'
> 
> executed in the top level of the tree you want to search.
> 
>             regards, tom lane

Tom, do you have a plan to make a back patch for 7.0.3?  I got a bug
report from a user with a script to reproduce the problem. Seems the
backend consumes infinite memory.

I don't want to recommend her to use 7.0.2 since it has a merge join
problem...
--
Tatsuo Ishii

Re: Idea for reducing planning time

From
Tom Lane
Date:
Tatsuo Ishii <t-ishii@sra.co.jp> writes:
> Tom, do you have a plan to make a back patch for 7.0.3?

No, I don't.  No time for it now.

> I got a bug report from a user with a script to reproduce the
> problem. Seems the backend consumes infinite memory.

Not infinite, surely ;-) ... but possibly more than her kernel will
allow.  As a workaround, I'd suggest setting geqo_threshold smaller,
perhaps 8 or 9.  IIRC, the way to do that in 7.0 isset geqo='on=8';
Not sure if it's possible to set up a system-wide default for that
in 7.0.

BTW, the main reason planning this join in 7.0 is so slow is that
all the options look exactly alike and so the planner has no reason to
discard any paths.  As soon as you create some indexes, load up some
data and VACUUM, the symmetry would be broken and performance should
improve (geqo or not).  Or at least it'd usually be broken.  Is it
possible that all her tables are exactly the same size?
        regards, tom lane


Re: Idea for reducing planning time

From
Tatsuo Ishii
Date:
> Tatsuo Ishii <t-ishii@sra.co.jp> writes:
> > Tom, do you have a plan to make a back patch for 7.0.3?
> 
> No, I don't.  No time for it now.
> 
> > I got a bug report from a user with a script to reproduce the
> > problem. Seems the backend consumes infinite memory.
> 
> Not infinite, surely ;-) ... but possibly more than her kernel will
> allow.  As a workaround, I'd suggest setting geqo_threshold smaller,
> perhaps 8 or 9.  IIRC, the way to do that in 7.0 is
>     set geqo='on=8';
> Not sure if it's possible to set up a system-wide default for that
> in 7.0.

Yes, I thought about geqo too. However, a standard geqo settings
didn't help. It still took 0:49 (7.0.2, 7.1 takes only ~3 seconds).

Then I set:
    Pool_Size 128    Generations 10

and now the query takes only 5 seconds!

> BTW, the main reason planning this join in 7.0 is so slow is that
> all the options look exactly alike and so the planner has no reason to
> discard any paths.  As soon as you create some indexes, load up some
> data and VACUUM, the symmetry would be broken and performance should
> improve (geqo or not).  Or at least it'd usually be broken.  Is it
> possible that all her tables are exactly the same size?

Yes. t_cyuubunrui has four rows, and the others has only a row.
--
Tatsuo Ishii