Thread: Hash join in 8.3

Hash join in 8.3

From
André Volpato
Date:
I´m running some compatibility and performance tests, between two
servers with 8.1 and 8.3 as follows :

[1] 8.1: postgres 8.1.9 / 1GB Ram / P4 3Ghz / Debian 2.6.18-4-686
[2] 8.3: postgres 8.3 b4 / 1GB Ram / P4 3Ghz / Debian 2.6.18-4-686

[2] is faster for every single operation, but I found something with the
planner that seems odd.

Consider this structure:

create table test (
    i bigint unique not null,
    t text
);
populated with 4 million rows with generate_series(1,4000000)

create table jtest (
    i bigint not null,
    constraint jtestfk foreign key (i) references test (i)
);
populated with 6 million rows


And the query:

# select j.i, t.t from jtest j inner join test t on t.i = j.i where
(j.i*1.5) between 3000000 and 4000000;

Planner for [1]:
 Nested Loop  (cost=0.00..270192.02 rows=20000 width=41) (actual
time=4192.514..32781.498 rows=1333334 loops=1)
   ->  Seq Scan on jtest j  (cost=0.00..179412.02 rows=30000 width=8)
(actual time=4147.813..19195.877 rows=1333334 loops=1)
         Filter: ((((i)::numeric * 1.5) >= 3000000::numeric) AND
(((i)::numeric * 1.5) <= 4000000::numeric))
   ->  Index Scan using test_i_key on test t  (cost=0.00..3.01 rows=1
width=41) (actual time=0.007..0.008 rows=1 loops=1333334)
         Index Cond: (t.i = "outer".i)
 Total runtime: 33372.300 ms

Planner for [2]:
 Hash Join  (cost=176924.02..297518.03 rows=20000 width=38) (actual
time=125715.079..239893.461 rows=1333334 loops=1)
   Hash Cond: (t.i = j.i)
   ->  Seq Scan on test t  (cost=0.00..75394.00 rows=4000000 width=38)
(actual time=0.051..4344.157 rows=4000000 loops=1)
   ->  Hash  (cost=176549.02..176549.02 rows=30000 width=8) (actual
time=11711.708..11711.708 rows=1333334 loops=1)
         ->  Seq Scan on jtest j  (cost=0.00..176549.02 rows=30000
width=8) (actual time=2228.052..10812.444 rows=1333334 loops=1)
               Filter: ((((i)::numeric * 1.5) >= 3000000::numeric) AND
(((i)::numeric * 1.5) <= 4000000::numeric))
 Total runtime: 240461.273 ms

Besides the (expected) weak guess on rows for both servers on seq scan
on jtest, there is something nasty with [2] that prevents the planner to
use the index.
For some reason, [1] uses the index first, and then seq scan to filter.
[2] seq scans filter first, and hash aggregate instead of using the index.

Now, turning off hashing:
# set enable_hashjoin=off;
# set enable_hashagg=off;

Again for [2]:
 Merge Join  (cost=178781.75..328370.60 rows=20000 width=38) (actual
time=15703.086..18799.493 rows=1333334 loops=1)
   Merge Cond: (t.i = j.i)
   ->  Index Scan using test_i_key on test t  (cost=0.00..139273.96
rows=4000018 width=38) (actual time=0.125..2895.835 rows=2666667 loops=1)
   ->  Sort  (cost=178781.75..178856.75 rows=30000 width=8) (actual
time=12423.001..13007.569 rows=1333334 loops=1)
         Sort Key: j.i
         Sort Method:  quicksort  Memory: 84852kB
         ->  Seq Scan on jtest j  (cost=0.00..176550.85 rows=30000
width=8) (actual time=2076.495..10417.157 rows=1333334 loops=1)
               Filter: ((((i)::numeric * 1.5) >= 3000000::numeric) AND
(((i)::numeric * 1.5) <= 4000000::numeric))
 Total runtime: 19340.734 ms

Works fine now. Quicksort and index scan.

Some points here:
1. The query is kinda stupid, but its a compatibility test (I´m trying
to figure out how many queries must be rewritten for 8.3)
2. Vacuum is up2date!
3. Is there any way to make [2] use the index ?

--

[]´s,

André Volpato
Ecom Tecnologia LTDA - Análise e Desenvolvimento
andre.volpato@ecomtecnologia.com.br



Re: Hash join in 8.3

From
Tom Lane
Date:
=?ISO-8859-1?Q?Andr=E9_Volpato?= <andre.volpato@ecomtecnologia.com.br> writes:
> Besides the (expected) weak guess on rows for both servers on seq scan
> on jtest, there is something nasty with [2] that prevents the planner to
> use the index.

There isn't anything "preventing" either version from choosing any of
the three plans, as you can easily prove for yourself by experimenting
with enable_nestloop/enable_mergejoin/enable_hashjoin.  The cost
estimates seem close enough that random variations in ANALYZE stats
would change which one looks cheapest.

            regards, tom lane

Re: Hash join in 8.3

From
Gregory Stark
Date:
André Volpato <andre.volpato@ecomtecnologia.com.br> writes:

> And the query:
>
> # select j.i, t.t from jtest j inner join test t on t.i = j.i where (j.i*1.5)
> between 3000000 and 4000000;
>
> Planner for [1]:
> Nested Loop  (cost=0.00..270192.02 rows=20000 width=41) (actual

> Planner for [2]:
> Hash Join  (cost=176924.02..297518.03 rows=20000 width=38) (actual

> Now, turning off hashing:
> # set enable_hashjoin=off;
> # set enable_hashagg=off;
>
> Again for [2]:
> Merge Join  (cost=178781.75..328370.60 rows=20000 width=38) (actual

I think the answer is that if you have bad statistics you'll get a bad plan
and which bad plan is going to be pretty much random.

But I'm curious if you turn off mergejoin whether you can get a Nested Loop
plan and what cost 8.3 gives it. It looks to me like 8.3 came up with a higher
cost for Nested Loop than 8.1.9 (I think 8.1.10 came out with some planner
fixes btw) and so it's deciding these other plans are better. And they might
have been better for the imaginary scenario that the planner thinks is going
on.

--
  Gregory Stark
  EnterpriseDB          http://www.enterprisedb.com
  Ask me about EnterpriseDB's Slony Replication support!

Re: Hash join in 8.3

From
André Volpato
Date:
Gregory Stark escreveu:
> André Volpato <andre.volpato@ecomtecnologia.com.br> writes:
>
>> And the query:
>>
>> # select j.i, t.t from jtest j inner join test t on t.i = j.i where (j.i*1.5)
>> between 3000000 and 4000000;
>>
>> Planner for [1]:
>> Nested Loop  (cost=0.00..270192.02 rows=20000 width=41) (actual
>>
>> Planner for [2]:
>> Hash Join  (cost=176924.02..297518.03 rows=20000 width=38) (actual
>>
>> Now, turning off hashing:
>> # set enable_hashjoin=off;
>> # set enable_hashagg=off;
>>
>> Again for [2]:
>> Merge Join  (cost=178781.75..328370.60 rows=20000 width=38) (actual
>>
>
> I think the answer is that if you have bad statistics you'll get a bad plan
> and which bad plan is going to be pretty much random.
>
I believe the statistics are ok, I´ve runned vacuum analyze before all
those tries.
> But I'm curious if you turn off mergejoin whether you can get a Nested Loop
> plan and what cost 8.3 gives it. It looks to me like 8.3 came up with a higher
> cost for Nested Loop than 8.1.9 (I think 8.1.10 came out with some planner
> fixes btw) and so it's deciding these other plans are better. And they might
> have been better for the imaginary scenario that the planner thinks is going
> on.
Not anymore :)
Nested Loop  (cost=0.00..389461.65 rows=20000 width=38)
Total runtime: 22934.656 ms

Without hash and merge, the plan is exactly the same for 8.1 and 8.3.
No inicial cost for nested loops... it seems that hash < merge < nested
in this case.

--

[]´s,

André Volpato
Ecom Tecnologia LTDA - Análise e Desenvolvimento
andre.volpato@ecomtecnologia.com.br



Re: Hash join in 8.3

From
Tom Lane
Date:
Gregory Stark <stark@enterprisedb.com> writes:
> But I'm curious if you turn off mergejoin whether you can get a Nested Loop
> plan and what cost 8.3 gives it. It looks to me like 8.3 came up with a higher
> cost for Nested Loop than 8.1.9 (I think 8.1.10 came out with some planner
> fixes btw) and so it's deciding these other plans are better. And they might
> have been better for the imaginary scenario that the planner thinks is going
> on.

Actually, now that I think about it, 8.3 should be *more* likely than
8.1 to choose a nestloop-with-inner-indexscan plan.  8.1 didn't have the
changes to allow a discount for repeated inner indexscans.

I'm wondering if

(a) the 8.1 installation being compared to had some planner cost
parameter changes that were not copied into the 8.3 installation; or

(b) the only reason 8.1 likes the nestloop plan is that it has no
statistics on the test tables, whereas 8.3 does have stats because
of autovacuum being on by default.

            regards, tom lane

Re: Hash join in 8.3

From
André Volpato
Date:
Tom Lane escreveu:
> Gregory Stark <stark@enterprisedb.com> writes:
>
>> But I'm curious if you turn off mergejoin whether you can get a Nested Loop
>> plan and what cost 8.3 gives it. It looks to me like 8.3 came up with a higher
>> cost for Nested Loop than 8.1.9 (I think 8.1.10 came out with some planner
>> fixes btw) and so it's deciding these other plans are better. And they might
>> have been better for the imaginary scenario that the planner thinks is going
>> on.
>>
Let me show this part again, to make things easier to understand =)

Nested Loop  (cost=0.00..389461.65 rows=20000 width=38)
Total runtime: 22934.656 ms

Without hash and merge, the plan is exactly the same for 8.1 and 8.3.
No inicial cost for nested loops... it seems that hash < merge < nested
in this case.


> Actually, now that I think about it, 8.3 should be *more* likely than
> 8.1 to choose a nestloop-with-inner-indexscan plan.  8.1 didn't have the
> changes to allow a discount for repeated inner indexscans.
>
> I'm wondering if
>
> (a) the 8.1 installation being compared to had some planner cost
> parameter changes that were not copied into the 8.3 installation; or
>
The parameters are at their default. The only changes made are in shared
buffers, work and maintenance mem.

> (b) the only reason 8.1 likes the nestloop plan is that it has no
> statistics on the test tables, whereas 8.3 does have stats because
> of autovacuum being on by default.
>
I dont know if I understand what test tables you are talking about.
To run this tests I disabled autovacuum, and run "vacuum analyze test /
jtest" before.

--

[]´s,

André Volpato
Ecom Tecnologia LTDA - Análise e Desenvolvimento
andre.volpato@ecomtecnologia.com.br



Re: Hash join in 8.3

From
André Volpato
Date:
Tom Lane escreveu: <blockquote cite="mid:18544.1197574622@sss.pgh.pa.us" type="cite"><pre wrap="">Gregory Stark <a
class="moz-txt-link-rfc2396E"href="mailto:stark@enterprisedb.com"><stark@enterprisedb.com></a> writes:
</pre><blockquotetype="cite"><pre wrap="">But I'm curious if you turn off mergejoin whether you can get a Nested Loop
 
plan and what cost 8.3 gives it. It looks to me like 8.3 came up with a higher
cost for Nested Loop than 8.1.9 (I think 8.1.10 came out with some planner
fixes btw) and so it's deciding these other plans are better. And they might
have been better for the imaginary scenario that the planner thinks is going
on.   </pre></blockquote><pre wrap="">
Actually, now that I think about it, 8.3 should be *more* likely than
8.1 to choose a nestloop-with-inner-indexscan plan.  8.1 didn't have the
changes to allow a discount for repeated inner indexscans.

I'm wondering if

(a) the 8.1 installation being compared to had some planner cost
parameter changes that were not copied into the 8.3 installation; or

(b) the only reason 8.1 likes the nestloop plan is that it has no
statistics on the test tables, whereas 8.3 does have stats because
of autovacuum being on by default.
        regards, tom lane </pre></blockquote><br /> I think I found the answer!<br /><br /> 8.1: likes nested loop even
aftervacuumdb on the database.<br /><br /> 8.3: likes hash at first time but:<br /> - after vacuumdb *on the database*
(Iwas running on the tables.....), it turns out to:<br />  Merge Join  (cost=178779.93..328503.44 rows=30000 width=38)
in20005.207 ms<br /> #set enable_mergejoin=off;<br />  Hash Join  (cost=156644.00..365204.03 rows=30000 width=38) in
29104.390ms<br />  * a very faster hash here, seqscanning the smaller table before the bigger one. Tricky!<br /><br />
Iwont trust table vacuums anymore...<br /><br /><pre class="moz-signature" cols="72">-- 
 

[]´s,

André Volpato
Ecom Tecnologia LTDA - Análise e Desenvolvimento
<a class="moz-txt-link-abbreviated"
href="mailto:andre.volpato@ecomtecnologia.com.br">andre.volpato@ecomtecnologia.com.br</a></pre>

Re: Hash join in 8.3

From
Gregory Stark
Date:
André Volpato <andre.volpato@ecomtecnologia.com.br> writes:

> Gregory Stark escreveu:
>> André Volpato <andre.volpato@ecomtecnologia.com.br> writes:
>>
>> I think the answer is that if you have bad statistics you'll get a bad plan
>> and which bad plan is going to be pretty much random.
>>
> I believe the statistics are ok, I´ve runned vacuum analyze before all those
> tries.

Sorry, I should have said "bad estimates". That is, because of the

  j*1.5 BETWEEN 3000000 AND 4000000

clause the optimizer isn't going to be able to come up with a good estimate of
how many rows that will match. What plan it picks when it has such a bad
estimate is going to be pretty random, dependant on just what plans would be
good in a situation entirely unrelated to the reality.

--
  Gregory Stark
  EnterpriseDB          http://www.enterprisedb.com
  Ask me about EnterpriseDB's Slony Replication support!

Re: Hash join in 8.3

From
Gregory Stark
Date:
André Volpato <andre.volpato@ecomtecnologia.com.br> writes:

> I think I found the answer!<br>
> <br>
> 8.1: likes nested loop even after vacuumdb on the database.<br>
> <br>
> 8.3: likes hash at first time but:<br>
> - after vacuumdb *on the database* (I was running on the tables.....),
> it turns out to:<br>
>  Merge Join  (cost=178779.93..328503.44 rows=30000 width=38) in
> 20005.207 ms<br>
> #set enable_mergejoin=off;<br>
>  Hash Join  (cost=156644.00..365204.03 rows=30000 width=38) in
> 29104.390 ms<br>
>  * a very faster hash here, seqscanning the smaller table before the
> bigger one. Tricky!<br>
> <br>
> I wont trust table vacuums anymore...<br>
> <br>
> <pre class="moz-signature" cols="72">--

HTML-only mail isn't looked upon too favourably here.

You keep saying "vacuum" which makes me think maybe you're not actually
analyzing your tables at all. "vacuum" doesn't analyze the tables, you have to
run "analyze" (or "vacuum analyze") for that.

--
  Gregory Stark
  EnterpriseDB          http://www.enterprisedb.com
  Ask me about EnterpriseDB's 24x7 Postgres support!

Re: Hash join in 8.3

From
André Volpato
Date:
Gregory Stark escreveu:
> André Volpato <andre.volpato@ecomtecnologia.com.br> writes:
>
>> I think I found the answer!
>>
>> 8.1: likes nested loop even after vacuumdb on the database.<br>
>>
>> 8.3: likes hash at first time but:
>> - after vacuumdb *on the database* (I was running on the tables.....), it turns out to:
>> Merge Join (cost=178779.93..328503.44 rows=30000 width=38) in 20005.207 ms
>> # set enable_mergejoin=off;
>> Hash Join(cost=156644.00..365204.03 rows=30000 width=38) in 29104.390 ms
>> * a very faster hash here, seqscanning the smaller table before the bigger one. Tricky!
>>
>> I wont trust table vacuums anymore...
>>
>>
>
> HTML-only mail isn't looked upon too favourably here.
>
My bad. Tbird for some reason isn´t auto-removing html in sent mail to
@postgresql.org.

> You keep saying "vacuum" which makes me think maybe you're not actually
> analyzing your tables at all. "vacuum" doesn't analyze the tables, you have to
> run "analyze" (or "vacuum analyze") for that.
I always run vaccumm analyze. The plan only changes in 8.3 after
"vacuumdb -v -z database".

> Sorry, I should have said "bad estimates". That is, because of the
>
>   j*1.5 BETWEEN 3000000 AND 4000000
That's supposed to be that way :)
I think all of this worth for me to have a clue on how the planner goes
in bad sql, wich causes bad estimates.

Thank you all for your support!

--

[]´s,

André Volpato
Ecom Tecnologia LTDA - Análise e Desenvolvimento
andre.volpato@ecomtecnologia.com.br