Thread: Lot'sa joins - performance tip-up, please?


From: Mario Splivalo
I have a quite large query that takes over a minute to run on my laptop.
On the db server it takes almost 20 seconds, but I have 200+ concurrent
users who will be running similar queries, and during the query the
I/O goes berserk: iostat shows 30 MB/s of reads.  So, before
going into denormalization, I wonder if I could do something to speed
things up.

The query is like this:

select
    *
from
    messages
    join services on services.id = messages.service_id
    join ticketing_messages on messages.id = ticketing_messages.message_id
    left join ticketing_winners on ticketing_winners.message_id =
ticketing_messages.message_id
    left join
    (
        select
            *
        from
            ticketing_codes_played
            join ticketing_codes on ticketing_codes.code_id =
ticketing_codes_played.code_id
    ) as codes on codes.message_id = ticketing_messages.message_id
where
    services.type_id = 10
and
    messages.receiving_time between '2006-02-12' and '2006-03-18 23:00:00';

The explain analyze of the above produces this:


QUERY PLAN

-----------------------------------------------------------------------------------------------------------------------------------------------------------------
 Merge Left Join  (cost=221692.04..222029.29 rows=3772 width=264)
(actual time=539169.163..541579.504 rows=75937 loops=1)
   Merge Cond: ("outer".message_id = "inner".message_id)
   ->  Sort  (cost=40080.17..40089.60 rows=3772 width=238) (actual
time=8839.072..9723.371 rows=75937 loops=1)
         Sort Key: messages.id
         ->  Hash Left Join  (cost=2259.40..39856.10 rows=3772
width=238) (actual time=1457.451..7870.830 rows=75937 loops=1)
               Hash Cond: ("outer".message_id = "inner".message_id)
               ->  Nested Loop  (cost=2234.64..39811.76 rows=3772
width=230) (actual time=1418.911..7063.299 rows=75937 loops=1)
                     ->  Index Scan using pk_services on services
(cost=0.00..4.46 rows=1 width=54) (actual time=28.261..28.271 rows=1
loops=1)
                           Index Cond: (1102 = id)
                           Filter: (type_id = 10)
                     ->  Hash Join  (cost=2234.64..39769.58 rows=3772
width=176) (actual time=1390.621..6297.501 rows=75937 loops=1)
                           Hash Cond: ("outer".id = "inner".message_id)
                           ->  Bitmap Heap Scan on messages
(cost=424.43..32909.53 rows=74408 width=162) (actual
time=159.796..4329.125 rows=75937 loops=1)
                                 Recheck Cond: (service_id = 1102)
                                 ->  Bitmap Index Scan on idx_service_id
(cost=0.00..424.43 rows=74408 width=0) (actual time=95.197..95.197
rows=75937 loops=1)
                                       Index Cond: (service_id = 1102)
                           ->  Hash  (cost=1212.37..1212.37 rows=75937
width=14) (actual time=940.372..940.372 rows=75937 loops=1)
                                 ->  Seq Scan on ticketing_messages
(cost=0.00..1212.37 rows=75937 width=14) (actual time=12.122..461.960
rows=75937 loops=1)
               ->  Hash  (cost=21.21..21.21 rows=1421 width=8) (actual
time=38.496..38.496 rows=1421 loops=1)
                     ->  Seq Scan on ticketing_winners
(cost=0.00..21.21 rows=1421 width=8) (actual time=24.534..31.347
rows=1421 loops=1)
   ->  Sort  (cost=181611.87..181756.68 rows=57925 width=26) (actual
time=530330.060..530647.055 rows=57925 loops=1)
         Sort Key: ticketing_codes_played.message_id
         ->  Nested Loop  (cost=0.00..176144.30 rows=57925 width=26)
(actual time=68.322..529472.026 rows=57925 loops=1)
               ->  Seq Scan on ticketing_codes_played
(cost=0.00..863.25 rows=57925 width=8) (actual time=0.042..473.881
rows=57925 loops=1)
               ->  Index Scan using ticketing_codes_pk on
ticketing_codes  (cost=0.00..3.01 rows=1 width=18) (actual
time=9.102..9.108 rows=1 loops=57925)
                     Index Cond: (ticketing_codes.code_id =
"outer".code_id)
 Total runtime: 542000.093 ms
(27 rows)


I'll be more than happy to provide any additional information that I may
be able to gather. I'd be most happy if someone would scream something
like "four joins, smells like a poor design", because the design is poor,
but the system is in production, and I have to bear with it.

    Mario
--
"I can do it quick, I can do it cheap, I can do it well. Pick any two."

Mario Splivalo
msplival@jagor.srce.hr



Re: Lot'sa joins - performance tip-up, please?

From: "Dave Dutcher"
>          ->  Nested Loop  (cost=0.00..176144.30 rows=57925 width=26)
> (actual time=68.322..529472.026 rows=57925 loops=1)
>                ->  Seq Scan on ticketing_codes_played
> (cost=0.00..863.25 rows=57925 width=8) (actual time=0.042..473.881
> rows=57925 loops=1)
>                ->  Index Scan using ticketing_codes_pk on
> ticketing_codes  (cost=0.00..3.01 rows=1 width=18) (actual
> time=9.102..9.108 rows=1 loops=57925)
>                      Index Cond: (ticketing_codes.code_id =
> "outer".code_id)
>  Total runtime: 542000.093 ms
> (27 rows)
>
>
> I'll be more than happy to provide any additional information
> that I may
> be able to gather. I'd be most happy if someone would scream something
> like "four joins, smells like a poor design" because design
> is poor, but
> the system is in production, and I have to bear with it.


It looks like that nested loop which is joining ticketing_codes_played
to ticketing_codes is the slow part.  I'm curious how many rows are in
the ticketing_codes table?

Four or five joins does not seem like a lot to me, but it can be slow if
you are joining big tables with other big tables.


Re: Lot'sa joins - performance tip-up, please?

From: Tom Lane
Mario Splivalo <msplival@jagor.srce.hr> writes:
> I have a quite large query that takes over a minute to run on my laptop.

The EXPLAIN output you provided doesn't seem to agree with the stated
query.  Where'd the "service_id = 1102" condition come from?

In general, I'd suggest playing around with the join order.  Existing
releases of PG tend to throw up their hands when faced with a mixture of
outer joins and regular joins, and just join the tables in the order
listed.  8.2 will be smarter about this, but for now you have to do it
by hand ...
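
[Editor's note: one way to act on Tom's suggestion is to rewrite the FROM
list so the selective services/messages restriction comes first and the
join against the large ticketing_codes subquery comes last. A sketch of
Mario's query reordered this way (untested against his data; whether it
helps depends on the row distribution):

```sql
-- Sketch: same query, FROM list reordered.  Pre-8.2 PostgreSQL tends to
-- join in the order written once outer joins are present, so the textual
-- order of the joins matters.
SELECT *
FROM
    services
    JOIN messages ON messages.service_id = services.id
    JOIN ticketing_messages ON ticketing_messages.message_id = messages.id
    LEFT JOIN ticketing_winners
        ON ticketing_winners.message_id = ticketing_messages.message_id
    LEFT JOIN (
        SELECT *
        FROM ticketing_codes_played
        JOIN ticketing_codes
            ON ticketing_codes.code_id = ticketing_codes_played.code_id
    ) AS codes ON codes.message_id = ticketing_messages.message_id
WHERE
    services.type_id = 10
    AND messages.receiving_time BETWEEN '2006-02-12' AND '2006-03-18 23:00:00';
```
]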

            regards, tom lane

Re: Lot'sa joins - performance tip-up, please?

From: Mario Splivalo
On Wed, 2006-05-03 at 10:20 -0500, Dave Dutcher wrote:
> >          ->  Nested Loop  (cost=0.00..176144.30 rows=57925 width=26)
> > (actual time=68.322..529472.026 rows=57925 loops=1)
> >                ->  Seq Scan on ticketing_codes_played
> > (cost=0.00..863.25 rows=57925 width=8) (actual time=0.042..473.881
> > rows=57925 loops=1)
> >                ->  Index Scan using ticketing_codes_pk on
> > ticketing_codes  (cost=0.00..3.01 rows=1 width=18) (actual
> > time=9.102..9.108 rows=1 loops=57925)
> >                      Index Cond: (ticketing_codes.code_id =
> > "outer".code_id)
> >  Total runtime: 542000.093 ms
> > (27 rows)
> >
> >
> > I'll be more than happy to provide any additional information
> > that I may
> > be able to gather. I'd be most happy if someone would scream something
> > like "four joins, smells like a poor design" because design
> > is poor, but
> > the system is in production, and I have to bear with it.
>
>
> It looks like that nested loop which is joining ticketing_codes_played
> to ticketing_codes is the slow part.  I'm curious how many rows are in
> the ticketing_codes table?
>
> Four or five joins does not seem like a lot to me, but it can be slow if
> you are joining big tables with other big tables.

The ticketing_codes table has 11,000,000 records, and it's expected to grow.

I tried playing with JOIN order as Tom suggested, but performance is the
same.

    Mario


Re: Lot'sa joins - performance tip-up, please?

From: Mario Splivalo
On Wed, 2006-05-03 at 13:58 -0400, Tom Lane wrote:
> Mario Splivalo <msplival@jagor.srce.hr> writes:
> > I have a quite large query that takes over a minute to run on my laptop.
>
> The EXPLAIN output you provided doesn't seem to agree with the stated
> query.  Where'd the "service_id = 1102" condition come from?

I guess I copy-pasted an additional WHERE into the EXPLAIN ANALYZE query.
This is the correct one, without that WHERE:

 Hash Left Join  (cost=198628.35..202770.61 rows=121 width=264) (actual
time=998008.264..999645.322 rows=5706 loops=1)
   Hash Cond: ("outer".message_id = "inner".message_id)
   ->  Merge Left Join  (cost=21943.23..21950.96 rows=121 width=238)
(actual time=4375.510..4540.772 rows=5706 loops=1)
         Merge Cond: ("outer".message_id = "inner".message_id)
         ->  Sort  (cost=21847.62..21847.92 rows=121 width=230) (actual
time=3304.787..3378.515 rows=5706 loops=1)
               Sort Key: messages.id
               ->  Hash Join  (cost=20250.16..21843.43 rows=121
width=230) (actual time=1617.370..3102.470 rows=5706 loops=1)
                     Hash Cond: ("outer".message_id = "inner".id)
                     ->  Seq Scan on ticketing_messages
(cost=0.00..1212.37 rows=75937 width=14) (actual time=10.554..609.967
rows=75937 loops=1)
                     ->  Hash  (cost=20244.19..20244.19 rows=2391
width=216) (actual time=1572.889..1572.889 rows=5706 loops=1)
                           ->  Nested Loop  (cost=1519.21..20244.19
rows=2391 width=216) (actual time=385.582..1449.207 rows=5706 loops=1)
                                 ->  Seq Scan on services
(cost=0.00..4.20 rows=3 width=54) (actual time=20.829..20.859 rows=2
loops=1)
                                       Filter: (type_id = 10)
                                 ->  Bitmap Heap Scan on messages
(cost=1519.21..6726.74 rows=1594 width=162) (actual
time=182.346..678.800 rows=2853 loops=2)
                                       Recheck Cond: (("outer".id =
messages.service_id) AND (messages.receiving_time >= '2006-02-12
00:00:00+01'::timestamp with time zone) AND (messages.receiving_time <=
'2006-03-18 23:00:00+01'::timestamp with time zone))
                                       ->  BitmapAnd
(cost=1519.21..1519.21 rows=1594 width=0) (actual time=164.311..164.311
rows=0 loops=2)
                                             ->  Bitmap Index Scan on
idx_service_id  (cost=0.00..84.10 rows=14599 width=0) (actual
time=66.809..66.809 rows=37968 loops=2)
                                                   Index Cond:
("outer".id = messages.service_id)
                                             ->  Bitmap Index Scan on
idx_messages_receiving_time  (cost=0.00..1434.87 rows=164144 width=0)
(actual time=192.633..192.633 rows=184741 loops=1)
                                                   Index Cond:
((receiving_time >= '2006-02-12 00:00:00+01'::timestamp with time zone)
AND (receiving_time <= '2006-03-18 23:00:00+01'::timestamp with time
zone))
         ->  Sort  (cost=95.62..99.17 rows=1421 width=8) (actual
time=1070.678..1072.999 rows=482 loops=1)
               Sort Key: ticketing_winners.message_id
               ->  Seq Scan on ticketing_winners  (cost=0.00..21.21
rows=1421 width=8) (actual time=424.836..1061.834 rows=1421 loops=1)
   ->  Hash  (cost=176144.30..176144.30 rows=57925 width=26) (actual
time=993592.980..993592.980 rows=57925 loops=1)
         ->  Nested Loop  (cost=0.00..176144.30 rows=57925 width=26)
(actual time=1074.984..992536.243 rows=57925 loops=1)
               ->  Seq Scan on ticketing_codes_played
(cost=0.00..863.25 rows=57925 width=8) (actual time=74.479..2047.993
rows=57925 loops=1)
               ->  Index Scan using ticketing_codes_pk on
ticketing_codes  (cost=0.00..3.01 rows=1 width=18) (actual
time=17.044..17.052 rows=1 loops=57925)
                     Index Cond: (ticketing_codes.code_id =
"outer".code_id)
 Total runtime: 999778.981 ms


> In general, I'd suggest playing around with the join order.  Existing
> releases of PG tend to throw up their hands when faced with a mixture of
> outer joins and regular joins, and just join the tables in the order
> listed.  8.2 will be smarter about this, but for now you have to do it
> by hand ...

No luck for me there. But I found that if I first join ticketing_codes
and ticketing_codes_played, put the result into a temporary table, and
then join that temporary table with the rest of the query (i.e. the
SELECT in parentheses is moved to a temporary table), the query is
almost twice as fast.

As mentioned before, ticketing_codes has 11,000,000 records.
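
[Editor's note: the rewrite Mario describes might look like the sketch
below. The names codes_tmp and codes_tmp_message_id are invented for
illustration; the index and ANALYZE steps are additions that usually help
the planner with a freshly built temp table:

```sql
-- Sketch: materialize the parenthesized subquery first, then join the
-- (much smaller) result against the rest of the query.
BEGIN;

CREATE TEMPORARY TABLE codes_tmp AS
    SELECT ticketing_codes_played.message_id, ticketing_codes.*
    FROM ticketing_codes_played
    JOIN ticketing_codes
        ON ticketing_codes.code_id = ticketing_codes_played.code_id;

-- An index on the join key and fresh statistics help the final join.
CREATE INDEX codes_tmp_message_id ON codes_tmp (message_id);
ANALYZE codes_tmp;

SELECT *
FROM
    messages
    JOIN services ON services.id = messages.service_id
    JOIN ticketing_messages ON messages.id = ticketing_messages.message_id
    LEFT JOIN ticketing_winners
        ON ticketing_winners.message_id = ticketing_messages.message_id
    LEFT JOIN codes_tmp
        ON codes_tmp.message_id = ticketing_messages.message_id
WHERE
    services.type_id = 10
    AND messages.receiving_time BETWEEN '2006-02-12' AND '2006-03-18 23:00:00';

COMMIT;
```
]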

    Mario

P.S. Is it just me, or is posting to pgsql-performance lagged quite a
bit?


Re: Lot'sa joins - performance tip-up, please?

From: "Jim C. Nasby"
On Thu, May 04, 2006 at 04:45:57PM +0200, Mario Splivalo wrote:
Well, here's the problem...

>          ->  Nested Loop  (cost=0.00..176144.30 rows=57925 width=26)
> (actual time=1074.984..992536.243 rows=57925 loops=1)
>                ->  Seq Scan on ticketing_codes_played
> (cost=0.00..863.25 rows=57925 width=8) (actual time=74.479..2047.993
> rows=57925 loops=1)
>                ->  Index Scan using ticketing_codes_pk on
> ticketing_codes  (cost=0.00..3.01 rows=1 width=18) (actual
> time=17.044..17.052 rows=1 loops=57925)
>                      Index Cond: (ticketing_codes.code_id =
> "outer".code_id)

Anyone have any idea why on earth it's doing that instead of a hash or
merge join?

In any case, try swapping the order of ticketing_codes_played and
ticketing_codes. Actually, that'd probably make it worse.

Try SET enable_nestloop = off;
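
[Editor's note: plain SET changes the setting for the whole session. If
the override should apply to just one query, SET LOCAL inside a
transaction reverts automatically; a sketch using the slow join:

```sql
BEGIN;
SET LOCAL enable_nestloop = off;  -- applies only until COMMIT/ROLLBACK

-- The join the planner was running as a 57,925-iteration nested loop:
SELECT *
FROM ticketing_codes_played
JOIN ticketing_codes
    ON ticketing_codes.code_id = ticketing_codes_played.code_id;

COMMIT;
```
]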
--
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461

Re: Lot'sa joins - performance tip-up, please?

From: Mario Splivalo
On Wed, 2006-05-10 at 17:10 -0500, Jim C. Nasby wrote:
> On Thu, May 04, 2006 at 04:45:57PM +0200, Mario Splivalo wrote:
> Well, here's the problem...
>
> >          ->  Nested Loop  (cost=0.00..176144.30 rows=57925 width=26)
> > (actual time=1074.984..992536.243 rows=57925 loops=1)
> >                ->  Seq Scan on ticketing_codes_played
> > (cost=0.00..863.25 rows=57925 width=8) (actual time=74.479..2047.993
> > rows=57925 loops=1)
> >                ->  Index Scan using ticketing_codes_pk on
> > ticketing_codes  (cost=0.00..3.01 rows=1 width=18) (actual
> > time=17.044..17.052 rows=1 loops=57925)
> >                      Index Cond: (ticketing_codes.code_id =
> > "outer".code_id)
>
> Anyone have any idea why on earth it's doing that instead of a hash or
> merge join?
>
> In any case, try swapping the order of ticketing_codes_played and
> ticketing_codes. Actually, that'd probably make it worse.

I tried that, no luck. The best performance I've achieved is with the
temporary table. And...

>
> Try SET enable_nestloop = off;

This helps also. I don't get sequential scans any more. I'd like a tip
on how to set 'enable_nestloop = off' through JDBC?

    Mario
--
"I can do it quick, I can do it cheap, I can do it well. Pick any two."

Mario Splivalo
msplival@jagor.srce.hr



Re: Lot'sa joins - performance tip-up, please?

From: Markus Schaber
Hi, Mario,

Mario Splivalo wrote:

> This helps also. I don't get sequential scans any more. I'd like a tip
> on how to set 'enable_nestloop = off' trough JDBC?

statement.execute("SET enable_nestloop TO off"); should do.

HTH,
Markus

--
Markus Schaber | Logical Tracking&Tracing International AG
Dipl. Inf.     | Software Development GIS

Fight against software patents in EU! www.ffii.org www.nosoftwarepatents.org