Thread: Re: [SQL] bad select performance fixed by forbidding hash joins

Re: [SQL] bad select performance fixed by forbidding hash joins

From: George Young
[PostgreSQL 6.5.0 on i586-pc-linux-gnu, compiled by gcc egcs-2.91.66]
table opset_steps
      (name text, opset_id int2, opset_ver int2) [1400 rows]
      non-unique index is on (opset_id, opset_ver)

table run_opsets
      (status int2, opset_id int2, opset_ver int2, run_id int2, seq int2) [17000 rows]
      pkey is (opset_id, seq), second index on (status, opset_id, opset_ver, run_id)
      select count(*) from run_opsets where status=1; --> 187
      select count(*) from run_opsets where status=3; --> 10564

table runs
      (run_name text, run_id int2, status int2) [900 rows]
      pkey is run_name, second index on (run_id, status)
      select count(*) from runs where status=1; --> 68
I have vacuum analyzed all relevant tables.

explain select os.name, r.run_name, ro.status
from opset_steps os, runs r, run_opsets ro
where (ro.status=3 or ro.status=1)
  and ro.opset_id=os.opset_id and ro.run_id=r.run_id
  and ro.opset_ver=os.opset_ver and r.status=1;

Hash Join  (cost=1793.58 rows=14560 width=38)
  ->  Hash Join  (cost=1266.98 rows=14086 width=24)
        ->  Seq Scan on run_opsets ro  (cost=685.51 rows=13903 width=8)
        ->  Hash  (cost=70.84 rows=1389 width=16)
              ->  Seq Scan on opset_steps os  (cost=70.84 rows=1389 width=16)
  ->  Hash  (cost=47.43 rows=374 width=14)
        ->  Seq Scan on runs r  (cost=47.43 rows=374 width=14)

This query takes 16 seconds. [returns 3126 rows]
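
As a sanity check, the planner's row estimates can be compared with the
counts given above; taking the printed numbers at face value:

      scan estimate for run_opsets (status=1 or status=3):  13903 rows
      actual matching rows:             187 + 10564 = 10751  (~1.3x high)

      top-level join estimate:                               14560 rows
      actual rows returned:                                   3126  (~4.7x high)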

On Tue, Jul 20, 1999 at 05:42:20PM -0400, Tom Lane wrote:
> On Tue, 20 Jul 1999 14:56:46 -0400 George Young wrote:
> > ... Is this then
> > the best that postgres can do?  Is there some rephrasing/restructuring of
> > this query that would make it faster?
>
> Hard to say.  The query looks reasonable as it stands ---
> ...  You have no restriction
> clause on opset_steps so all of those entries get loaded for hashing;
> can you provide one?
No.
> The system's plan looks pretty reasonable as well.  It might be that
> a merge join would be faster than a hash join but I wouldn't assume
> so.  If you want, you can try forcing the system not to use hashes;
> start psql with environment variable
>     PGOPTIONS="-fh"
> and see what sort of plan and execution time you get.  If that does
> turn out to be a big win it would indicate that the planner is using
> poor cost estimates, which is certainly possible...

Yes!  PGOPTIONS="-fh" made the query time go from 16 seconds to 2 seconds!
Is this a safe thing to leave on permanently, or is there some way to set
PGOPTIONS for just this query?
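
(Since PGOPTIONS is read by libpq when the connection is opened, the
setting can be scoped to a single invocation rather than exported
globally; a sketch, assuming a Bourne-style shell and an illustrative
database name:

      PGOPTIONS="-fh" psql -c "explain select ..." mydb

so that other sessions keep the default planner behaviour.)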

explain select os.name, r.run_name, ro.status
from opset_steps os, runs r, run_opsets ro
where (ro.status=3 or ro.status=1)
  and ro.opset_id=os.opset_id and ro.run_id=r.run_id
  and ro.opset_ver=os.opset_ver and r.status=1;

Merge Join  (cost=9295.54 rows=14560 width=38)
  ->  Seq Scan  (cost=8676.01 rows=14371 width=22)
        ->  Sort  (cost=8676.01 rows=14371 width=22)
              ->  Merge Join  (cost=1657.30 rows=14371 width=22)
                    ->  Index Scan using run_opsets_pkey on run_opsets ro  (cost=1031.25 rows=13903 width=8)
                    ->  Seq Scan  (cost=154.91 rows=374 width=14)
                          ->  Sort  (cost=154.91 rows=374 width=14)
                                ->  Seq Scan on runs r  (cost=47.43 rows=374 width=14)
  ->  Index Scan using opset_steps_idx_ver_id on opset_steps os  (cost=99.45 rows=1389 width=16)


With PGOPTIONS=-fh, this query takes ~ 2 seconds! [returns 3126 rows]
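
Comparing the two plans' top-level cost estimates with the measured times
shows how far off the estimates are, again taking the printed costs at
face value:

      estimated: mergejoin 9295.54 vs hashjoin 1793.58  (~5.2x more costly)
      measured:  mergejoin 2 sec   vs hashjoin 16 sec   (~8x faster)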

--
George Young,  Rm. L-204        gry@ll.mit.edu
MIT Lincoln Laboratory
244 Wood St.
Lexington, Massachusetts  02420-9108    (781) 981-2756

Re: [SQL] bad select performance fixed by forbidding hash joins

From: Tom Lane
George Young <gry@ll.mit.edu> writes:
> Yes!  PGOPTIONS="-fh" made the query time go from 16 seconds to 2 seconds!
> Is this a safe thing to leave on permanently, or is there some way to set
> PGOPTIONS for just this query?

I wouldn't recommend leaving it on as a long-term solution, because
you're hobbling the system for cases where hashjoin *is* the best
method.  AFAIK there is not a SET VARIABLE method for enabling/disabling
plan types on-the-fly, though perhaps one should be added.
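
(Later PostgreSQL releases did in fact add such a switch; in modern
versions, though not in 6.5, the same experiment can be scoped to a
single session with:

      set enable_hashjoin = off;   -- plan subsequent queries without hash joins
      explain select ... ;         -- rerun the query of interest
      set enable_hashjoin = on;    -- restore the default

which avoids hobbling other sessions.)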

The right long-term solution is to figure out why the system is
misestimating the relative costs of the two plans, and fix the cost
estimates.  (The system is estimating that the mergejoin is about 4x
slower than hash; if it's really 8x faster, there is something pretty
broken about the estimate...)

I am interested in looking into this.  If your data is not proprietary,
perhaps you would be willing to send me a database dump so that I can
reproduce the problem exactly?  (If the dump is no more than a few
megabytes, emailing it should be OK.)  No big hurry, since I probably
won't be able to get to it for a week or so anyway.

            regards, tom lane

Re: [SQL] bad select performance fixed by forbidding hash joins

From: Bruce Momjian
> George Young <gry@ll.mit.edu> writes:
> > Yes!  PGOPTIONS="-fh" made the query time go from 16 seconds to 2 seconds!
> > Is this a safe thing to leave on permanently, or is there some way to set
> > PGOPTIONS for just this query?
>
> I wouldn't recommend leaving it on as a long-term solution, because
> you're hobbling the system for cases where hashjoin *is* the best
> method.  AFAIK there is not a SET VARIABLE method for enabling/disabling
> plan types on-the-fly, though perhaps one should be added.

Postgres does have options to prohibit certain join types, so you could
try forcing a certain join type and see what happens.
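
For reference, the backend's -f flag letters forbid particular plan
types; a sketch of the opposite experiment, with a made-up database name:

      # s = seq scan, i = index scan, n = nested loop,
      # m = merge join, h = hash join; e.g. forbid merge joins
      # instead of hash joins:
      PGOPTIONS="-fm" psql mydb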

--
  Bruce Momjian                        |  http://www.op.net/~candle
  maillist@candle.pha.pa.us            |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026

Re: [SQL] bad select performance fixed by forbidding hash joins

From: Leon
Tom Lane wrote:
>
> The right long-term solution is to figure out why the system is
> misestimating the relative costs of the two plans, and fix the cost
> estimates.  (The system is estimating that the mergejoin is about 4x
> slower than hash; if it's really 8x faster, there is something pretty
> broken about the estimate...)

I am pleased that someone is taking the speed-of-joins problem seriously,
so here are a word or two from me.

The main feature of SQL as a language is its generality.  It doesn't say
how to compute things; it only expresses the desired result.  That is a
strength, because the programmer doesn't have to spell out a procedure,
but it is also a weakness: the problem of choosing an execution strategy
is handed over to the server, and there is, in theory, no way to estimate
the cost of every query without actually running it.  This is easy to
see.  Suppose you have a join with the qualification 'where a = 10'.
Given statistics on how many rows have a = 10, you can pick the right
strategy.  Now suppose the qualification is 'table1.a = table2.b'.  That
is much, much harder to estimate correctly, although still possible.  And
with a qualification like
'SOME_WEIRD_FUNCTION(table1.a) = ANOTHER_WEIRD_FUNCTION(table2.b)'
it becomes impossible to tell.
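
To make the three cases concrete (table, column, and function names here
are invented for illustration):

      select * from t1 where a = 10;            -- estimable from statistics
      select * from t1, t2
          where t1.a = t2.b;                    -- harder, but still possible
      select * from t1, t2
          where weird_f(t1.a) = weird_g(t2.b);  -- effectively unknowable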

There is thus a fundamental flaw in SQL that makes it hard to implement
with high efficiency in practice.  The only general way to solve the
problem is to commit a sin against purism and start giving hints to the
server.

I am not saying there is no way to improve the actual Postgres optimizer;
there certainly is.  As far as I can see, its optimizer is currently too
pessimistic about the sizes of join results; it overestimates them badly.
If you make it expect fewer rows from a query, it will behave better.
Maybe there should even be a state variable whose setting regulates its
degree of optimism about joins.  But all of that is a quick and dirty
hack; the general answer is still hints.

--
Leon.
---------
"This may seem a bit weird, but that's okay, because it is weird." -
Perl manpage.