Thread: multiple joins + Order by + LIMIT query performance issue

multiple joins + Order by + LIMIT query performance issue

From
Antoine Baudoux
Date:
Hello,

I have a query that runs for hours when joining 4 tables, but takes
milliseconds when joining one MORE table to the query.
I have one big table, t_event (8 million rows), and 4 small tables
(t_network, t_system, t_service, t_interface, all < 1000 rows).
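
The relevant columns and indexes look roughly like this (a simplified
sketch; column types are approximate and most columns are omitted):

[code]
create table t_network   (id integer primary key, customer_id integer);
create table t_system    (id integer primary key);
create table t_service   (id integer primary key,
                          system_id integer references t_system(id));
create table t_interface (system_id  integer references t_system(id),
                          network_id integer references t_network(id));
create table t_event     (service_id integer references t_service(id),
                          c_date timestamp);

create index interface_network_id_idx on t_interface (network_id);
create index event_svc_id_idx on t_event (service_id);
create index event_date_idx on t_event (c_date);   -- the sorting column
[/code]
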
This query takes a few milliseconds:
[code]
select * from t_Event event
inner join t_Service service on event.service_id=service.id
inner join t_System system on service.system_id=system.id
inner join t_Interface interface on system.id=interface.system_id
inner join t_Network network on interface.network_id=network.id
where (network.customer_id=1) order by event.c_date desc limit 25

"Limit  (cost=23981.18..23981.18 rows=1 width=977)"
"  ->  Sort  (cost=23981.18..23981.18 rows=1 width=977)"
"        Sort Key: this_.c_date"
"        ->  Nested Loop  (cost=0.00..23981.17 rows=1 width=977)"
"              ->  Nested Loop  (cost=0.00..23974.89 rows=1 width=961)"
"                    ->  Nested Loop  (cost=0.00..191.42 rows=1
width=616)"
"                          Join Filter: (service_s3_.system_id =
service1_.system_id)"
"                          ->  Nested Loop  (cost=0.00..9.29 rows=1
width=576)"
"                                ->  Seq Scan on t_network
service_s4_  (cost=0.00..1.01 rows=1 width=18)"
"                                      Filter: (customer_id = 1)"
"                                ->  Index Scan using
interface_network_id_idx on t_interface service_s3_  (cost=0.00..8.27
rows=1 width=558)"
"                                      Index Cond:
(service_s3_.network_id = service_s4_.id)"
"                          ->  Seq Scan on t_service service1_
(cost=0.00..109.28 rows=5828 width=40)"
"                    ->  Index Scan using event_svc_id_idx on t_event
this_  (cost=0.00..23681.12 rows=8188 width=345)"
"                          Index Cond: (this_.service_id =
service1_.id)"
"              ->  Index Scan using t_system_pkey on t_system
service_s2_  (cost=0.00..6.27 rows=1 width=16)"
"                    Index Cond: (service_s2_.id = service1_.system_id)"
[/code]

This one takes HOURS, but I'm joining one table LESS:

[code]
select * from t_Event event
inner join t_Service service on event.service_id=service.id
inner join t_System system on service.system_id=system.id
inner join t_Interface interface on system.id=interface.system_id
where (interface.network_id=1) order by event.c_date desc limit 25

"Limit  (cost=147.79..2123.66 rows=10 width=959)"
"  ->  Nested Loop  (cost=147.79..2601774.46 rows=13167 width=959)"
"        Join Filter: (service1_.id = this_.service_id)"
"        ->  Index Scan Backward using event_date_idx on t_event
this_  (cost=0.00..887080.22 rows=8466896 width=345)"
"        ->  Materialize  (cost=147.79..147.88 rows=9 width=614)"
"              ->  Hash Join  (cost=16.56..147.79 rows=9 width=614)"
"                    Hash Cond: (service1_.system_id = service_s2_.id)"
"                    ->  Seq Scan on t_service service1_
(cost=0.00..109.28 rows=5828 width=40)"
"                    ->  Hash  (cost=16.55..16.55 rows=1 width=574)"
"                          ->  Nested Loop  (cost=0.00..16.55 rows=1
width=574)"
"                                ->  Index Scan using
interface_network_id_idx on t_interface service_s3_  (cost=0.00..8.27
rows=1 width=558)"
"                                      Index Cond: (network_id = 1)"
"                                ->  Index Scan using t_system_pkey on
t_system service_s2_  (cost=0.00..8.27 rows=1 width=16)"
"                                      Index Cond: (service_s2_.id =
service_s3_.system_id)"
[/code]

My understanding is that in the first case the sort is done after all
the table joins and filtering, but in the second case ALL the rows in
t_event are scanned and sorted before the join. There is an index on
the sorting column. If I remove this index, the query runs very fast.
But I still need this index for other queries. So I must force the
planner to do the sort after the join, in the second case. How can I
do that?

Thanks a lot for your help,

Antoine


Re: multiple joins + Order by + LIMIT query performance issue

From
"Guillaume Smet"
Date:
Antoine,

On Tue, May 6, 2008 at 5:03 PM, Antoine Baudoux <ab@taktik.be> wrote:
>  "Limit  (cost=23981.18..23981.18 rows=1 width=977)"
>  "  ->  Sort  (cost=23981.18..23981.18 rows=1 width=977)"
>  "        Sort Key: this_.c_date"

Can you please provide the EXPLAIN ANALYZE output instead of EXPLAIN?

Thanks.

--
Guillaume

Re: multiple joins + Order by + LIMIT query performance issue

From
Antoine Baudoux
Date:
Here is the explain analyse for the first query; the other is still
running...


explain analyse select * from t_Event event
inner join t_Service service on event.service_id=service.id
inner join t_System system on service.system_id=system.id
inner join t_Interface interface on system.id=interface.system_id
inner join t_Network network on interface.network_id=network.id
where (network.customer_id=1) order by event.c_date desc limit 25

Limit  (cost=11761.44..11761.45 rows=1 width=976) (actual time=0.047..0.047 rows=0 loops=1)
   ->  Sort  (cost=11761.44..11761.45 rows=1 width=976) (actual time=0.045..0.045 rows=0 loops=1)
         Sort Key: event.c_date
         Sort Method:  quicksort  Memory: 17kB
         ->  Nested Loop  (cost=0.00..11761.43 rows=1 width=976) (actual time=0.024..0.024 rows=0 loops=1)
               ->  Nested Loop  (cost=0.00..11755.15 rows=1 width=960) (actual time=0.024..0.024 rows=0 loops=1)
                     ->  Nested Loop  (cost=0.00..191.42 rows=1 width=616) (actual time=0.024..0.024 rows=0 loops=1)
                           Join Filter: (interface.system_id = service.system_id)
                           ->  Nested Loop  (cost=0.00..9.29 rows=1 width=576) (actual time=0.023..0.023 rows=0 loops=1)
                                 ->  Seq Scan on t_network network  (cost=0.00..1.01 rows=1 width=18) (actual time=0.009..0.009 rows=1 loops=1)
                                       Filter: (customer_id = 1)
                                 ->  Index Scan using interface_network_id_idx on t_interface interface  (cost=0.00..8.27 rows=1 width=558) (actual time=0.011..0.011 rows=0 loops=1)
                                       Index Cond: (interface.network_id = network.id)
                           ->  Seq Scan on t_service service  (cost=0.00..109.28 rows=5828 width=40) (never executed)
                     ->  Index Scan using event_svc_id_idx on t_event event  (cost=0.00..11516.48 rows=3780 width=344) (never executed)
                           Index Cond: (event.service_id = service.id)
               ->  Index Scan using t_system_pkey on t_system system  (cost=0.00..6.27 rows=1 width=16) (never executed)
                     Index Cond: (system.id = service.system_id)
Total runtime: 0.362 ms



On May 6, 2008, at 5:38 PM, Guillaume Smet wrote:

> Antoine,
>
> On Tue, May 6, 2008 at 5:03 PM, Antoine Baudoux <ab@taktik.be> wrote:
>> "Limit  (cost=23981.18..23981.18 rows=1 width=977)"
>> "  ->  Sort  (cost=23981.18..23981.18 rows=1 width=977)"
>> "        Sort Key: this_.c_date"
>
> Can you please provide the EXPLAIN ANALYZE output instead of EXPLAIN?
>
> Thanks.
>
> --
> Guillaume


Re: multiple joins + Order by + LIMIT query performance issue

From
Shaun Thomas
Date:
On Tue, 2008-05-06 at 16:03 +0100, Antoine Baudoux wrote:

> My understanding is that in the first case the sort is
> done after all the table joins and filtering, but in the
> second case ALL the rows in t_event are scanned and sorted
> before the join.

You've actually run into a problem that's bitten us in the ass a couple
of times.  The problem with your second query is that it's *too*
efficient.  You'll notice the first plan uses a bevy of nest-loops,
which is very risky if the row estimates are not really really accurate.
The planner says "Hey, customer_id=1 could be several rows in the
t_network table, but not too many... I better check them one by one."
I've turned off nest-loops sometimes to avoid queries that would run
several hours due to mis-estimation, but it looks like yours was just
fine.
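
(For reference, that's just a per-session setting -- something along these
lines, though it doesn't look like you need it here:)

[code]
set enable_nestloop = off;   -- per-session, discourages nested-loop joins
-- ... run the problem query ...
reset enable_nestloop;       -- back to the default
[/code]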

The second query says "Awesome!  Only one network... I can just search
the index of t_event backwards for this small result set!"

But here's the rub... try your query *without* the limit clause, and you
may find it's actually faster, because the planner suddenly thinks it
will have to scan the whole table, so it chooses an alternate plan
(probably back to the nest-loop).  Alternatively, take off the order-by
clause, and it'll remove the slow backwards index-scan.
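
Concretely, for your second query that would be something like these
(untested sketches):

[code]
-- (a) without the LIMIT: the planner expects to produce the whole result,
--     so it should sort after the joins instead of walking the date index
select * from t_Event event
inner join t_Service service on event.service_id=service.id
inner join t_System system on service.system_id=system.id
inner join t_Interface interface on system.id=interface.system_id
where (interface.network_id=1) order by event.c_date desc;

-- (b) without the ORDER BY: nothing left to drive the backward index scan
select * from t_Event event
inner join t_Service service on event.service_id=service.id
inner join t_System system on service.system_id=system.id
inner join t_Interface interface on system.id=interface.system_id
where (interface.network_id=1) limit 25;
[/code]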

I'm not sure what causes this, but the problem with indexes is that
they're not necessarily in the order you want unless you also cluster
them, so a backwards index scan is almost always the wrong answer.
Personally I consider this a bug, and it's been around since at least
the 8.1 tree.  The only real answer is that you have a fast version of
the query, so try and play with it until it acts the way you want.

--

Shaun Thomas
Database Administrator

Leapfrog Online
807 Greenwood Street
Evanston, IL 60201
Tel. 847-440-8253
Fax. 847-570-5750
www.leapfrogonline.com



Re: multiple joins + Order by + LIMIT query performance issue

From
Antoine Baudoux
Date:
Thanks a lot for your answer; there are some points I didn't understand.

On May 6, 2008, at 6:43 PM, Shaun Thomas wrote:

>
> The second query says "Awesome!  Only one network... I can just search
> the index of t_event backwards for this small result set!"
>

Shouldn't it be the opposite? Considering that only a few rows must be
"joined" (sorry, I'm not familiar with DBMS terms) with the
t_event table, why not simply look up the corresponding rows in the
t_event table using the service_id foreign key, then do the sort? Isn't
the planner fooled by the index on the sorting column? If I remove the
index the query runs OK.


> But here's the rub... try your query *without* the limit clause, and you
> may find it's actually faster, because the planner suddenly thinks it
> will have to scan the whole table, so it chooses an alternate plan
> (probably back to the nest-loop).  Alternatively, take off the order-by
> clause, and it'll remove the slow backwards index-scan.

You are right, if I remove the order-by clause it doesn't do the backwards
index-scan.

And if I remove the limit and keep the order-by clause, the backwards
index-scan is gone too, and the query runs in a few millisecs!!

This is crazy, so simply by adding a LIMIT to a query, the planning is
changed in a very bad way. Does the planner use the LIMIT as a sort of
hint?


Thank you for your explanations,


Antoine Baudoux

Re: multiple joins + Order by + LIMIT query performance issue

From
Tom Lane
Date:
Shaun Thomas <sthomas@leapfrogonline.com> writes:
> I'm not sure what causes this, but the problem with indexes is that
> they're not necessarily in the order you want unless you also cluster
> them, so a backwards index scan is almost always the wrong answer.

Whether the scan is forwards or backwards has nothing to do with it.
The planner is using the index ordering to avoid having to do a
full-table scan and sort.  It's essentially betting that it will find
25 (or whatever your LIMIT is) rows that satisfy the other query
conditions soon enough in the index scan to make this faster than the
full-scan approach.  If there are a lot fewer matching rows than it
expects, or if the target rows aren't uniformly scattered in the index
ordering, then this way can be a loss; but when it's a win it can be
a big win, too, so "it's a bug take it out" is an unhelpful opinion.

If a misestimate of this kind is bugging you enough that you're willing
to change the query, I think you can fix it like this:

    select ... from foo order by x limit n;
=>
    select ... from (select ... from foo order by x) ss limit n;

The subselect will be planned without awareness of the LIMIT, so you
should get a plan using a sort rather than one that bets on the LIMIT
being reached quickly.

            regards, tom lane

Re: multiple joins + Order by + LIMIT query performance issue

From
"Heikki Linnakangas"
Date:
Antoine Baudoux wrote:
> Here is the explain analyse for the first query, the other is still
> running...
>
>
> explain analyse select * from t_Event event
> inner join t_Service service on event.service_id=service.id
> inner join t_System system on service.system_id=system.id
> inner join t_Interface interface on system.id=interface.system_id
> inner join t_Network network on interface.network_id=network.id
> where (network.customer_id=1) order by event.c_date desc limit 25
>
> Limit  (cost=11761.44..11761.45 rows=1 width=976) (actual
> time=0.047..0.047 rows=0 loops=1)
>   ->  Sort  (cost=11761.44..11761.45 rows=1 width=976) (actual
> time=0.045..0.045 rows=0 loops=1)
>         Sort Key: event.c_date
>         Sort Method:  quicksort  Memory: 17kB
>         ->  Nested Loop  (cost=0.00..11761.43 rows=1 width=976) (actual
> time=0.024..0.024 rows=0 loops=1)
>               ->  Nested Loop  (cost=0.00..11755.15 rows=1 width=960)
> (actual time=0.024..0.024 rows=0 loops=1)
>                     ->  Nested Loop  (cost=0.00..191.42 rows=1
> width=616) (actual time=0.024..0.024 rows=0 loops=1)
>                           Join Filter: (interface.system_id =
> service.system_id)
>                           ->  Nested Loop  (cost=0.00..9.29 rows=1
> width=576) (actual time=0.023..0.023 rows=0 loops=1)
>                                 ->  Seq Scan on t_network network
> (cost=0.00..1.01 rows=1 width=18) (actual time=0.009..0.009 rows=1 loops=1)
>                                       Filter: (customer_id = 1)
>                                 ->  Index Scan using
> interface_network_id_idx on t_interface interface  (cost=0.00..8.27
> rows=1 width=558) (actual time=0.011..0.011 rows=0 loops=1)
>                                       Index Cond: (interface.network_id
> = network.id)
>                           ->  Seq Scan on t_service service
> (cost=0.00..109.28 rows=5828 width=40) (never executed)
>                     ->  Index Scan using event_svc_id_idx on t_event
> event  (cost=0.00..11516.48 rows=3780 width=344) (never executed)
>                           Index Cond: (event.service_id = service.id)
>               ->  Index Scan using t_system_pkey on t_system system
> (cost=0.00..6.27 rows=1 width=16) (never executed)
>                     Index Cond: (system.id = service.system_id)
> Total runtime: 0.362 ms

Are the queries even returning the same results (except for the extra
columns coming from t_network)? It looks like in this version, the
network-interface join is performed first, which returns zero rows, so
the rest of the joins don't need to be performed at all. That's why it's
fast.

Which version of PostgreSQL is this, BTW?

--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com

Re: multiple joins + Order by + LIMIT query performance issue

From
Shaun Thomas
Date:
On Tue, 2008-05-06 at 18:59 +0100, Tom Lane wrote:

> Whether the scan is forwards or backwards has nothing
> to do with it.  The planner is using the index ordering
> to avoid having to do a full-table scan and sort.

Oh, I know that.  I just noticed that when this happened to us, more
often than not, it was a reverse index scan that did it.  The thing that
annoyed me most was when it happened on an index where, even on a table
having 20M rows, the cardinality is < 10 for almost every value.  In our
case, having a "LIMIT 1" was much worse than just getting back 5 or 10
rows and throwing away everything after the first one.

> but when it's a win it can be a big win, too, so "it's
> a bug take it out" is an unhelpful opinion.

That's just it... it *can* be a big win.  But when it's a loss, you're
index-scanning a 20M+ row table for no reason.  We got around it,
obviously, but it was a definite surprise when a query that normally
runs in 0.5ms randomly and inexplicably runs for 4-120s.  This is a
disaster for a feed loader chewing through a few tens of thousands of entries.

But that's just me grousing about not having query hints or being able
to tell Postgres to never, ever, ever index-scan certain tables. :)

--

Shaun Thomas
Database Administrator

Leapfrog Online
807 Greenwood Street
Evanston, IL 60201
Tel. 847-440-8253
Fax. 847-570-5750
www.leapfrogonline.com



Re: multiple joins + Order by + LIMIT query performance issue

From
Shaun Thomas
Date:
On Tue, 2008-05-06 at 18:24 +0100, Antoine Baudoux wrote:

> Isnt the planner fooled by the index on the sorting column?
> If I remove the index the query runs OK.

In your case, for whatever reason, the stats say doing the index scan on
the sorted column will give you the results faster.  That isn't always
the case, and sometimes you can give the same query different where
clauses and that same slow-index-scan will randomly be fast.  It's all
based on the index distribution and the particular values being fetched.

This goes back to what Tom said.  If you know a "miss" can result in
terrible performance, it's best to just recode the query to avoid the
situation.

> This is crazy, so simply by adding a LIMIT to a query, the planning is
> changed in a very bad way. Does the planner use the LIMIT as a sort of
> hint?

Yes.  That's actually what tells it the index scan can be a "big win."
If it scans the index backwards on values returned from some of your
joins, it may just have to find 25 rows and then it can immediately stop
scanning and just give you the results.  In normal cases, this is a
massive performance boost when you have an order clause and are
expecting a ton of results (say you're getting the first 25 rows of
10000 or something).  But if it would be faster to generate the results
and *then* sort, and Postgres thinks otherwise, you're pretty much
screwed.

But that's the long answer.  You have like 3 ways to get around this
now, so pick one. ;)

--

Shaun Thomas
Database Administrator

Leapfrog Online
807 Greenwood Street
Evanston, IL 60201
Tel. 847-440-8253
Fax. 847-570-5750
www.leapfrogonline.com



Re: multiple joins + Order by + LIMIT query performance issue

From
Antoine Baudoux
Date:
>
> If a misestimate of this kind is bugging you enough that you're willing
> to change the query, I think you can fix it like this:
>
>     select ... from foo order by x limit n;
> =>
>     select ... from (select ... from foo order by x) ss limit n;
>
> The subselect will be planned without awareness of the LIMIT, so you
> should get a plan using a sort rather than one that bets on the LIMIT
> being reached quickly.

I tried that, using a subquery. Unfortunately this does not change
anything:

select * from (select * from t_Event event
inner join t_Service service on event.service_id=service.id
inner join t_System system on service.system_id=system.id
inner join t_Interface interface on system.id=interface.system_id
where (interface.network_id=1) order by event.c_date desc ) ss limit 25

"Limit  (cost=147.79..5563.93 rows=25 width=3672)"
"  ->  Subquery Scan ss  (cost=147.79..2896263.01 rows=13368
width=3672)"
"        ->  Nested Loop  (cost=147.79..2896129.33 rows=13368
width=958)"
"              Join Filter: (service.id = event.service_id)"
"              ->  Index Scan Backward using event_date_idx on t_event
event  (cost=0.00..1160633.69 rows=8569619 width=344)"
"              ->  Materialize  (cost=147.79..147.88 rows=9 width=614)"
"                    ->  Hash Join  (cost=16.56..147.79 rows=9
width=614)"
"                          Hash Cond: (service.system_id = system.id)"
"                          ->  Seq Scan on t_service service
(cost=0.00..109.28 rows=5828 width=40)"
"                          ->  Hash  (cost=16.55..16.55 rows=1
width=574)"
"                                ->  Nested Loop  (cost=0.00..16.55
rows=1 width=574)"
"                                      ->  Index Scan using
interface_network_id_idx on t_interface interface  (cost=0.00..8.27
rows=1 width=558)"
"                                            Index Cond: (network_id =
1)"
"                                      ->  Index Scan using
t_system_pkey on t_system system  (cost=0.00..8.27 rows=1 width=16)"
"                                            Index Cond: (system.id =
interface.system_id)"


The worst thing about all this is that there are ZERO rows to join
with the t_event table. So the planner decides to index-scan 8 million
rows, where there is no hope of finding a match!
This seems a very, very, very poor decision.

Re: multiple joins + Order by + LIMIT query performance issue

From
Matthew Wakeling
Date:
On Tue, 6 May 2008, Tom Lane wrote:
> If a misestimate of this kind is bugging you enough that you're willing
> to change the query, I think you can fix it like this:
>
>     select ... from foo order by x limit n;
> =>
>     select ... from (select ... from foo order by x) ss limit n;
>
> The subselect will be planned without awareness of the LIMIT, so you
> should get a plan using a sort rather than one that bets on the LIMIT
> being reached quickly.

Surely if that's the case, that in itself is a bug? Apart from being
"useful", I mean.

Matthew

--
"Television is a medium because it is neither rare nor well done."
  -- Fred Friendly

Re: multiple joins + Order by + LIMIT query performance issue

From
Antoine Baudoux
Date:
Ok, I've tried everything, and the planner keeps choosing index scans
when it shouldn't.

Is there a way to disable index scans?


Antoine

Re: multiple joins + Order by + LIMIT query performance issue

From
"Scott Marlowe"
Date:
On Fri, May 9, 2008 at 1:18 AM, Antoine Baudoux <ab@taktik.be> wrote:
> Ok, I've tried everything, and the planner keeps choosing index scans when
> it shouldn't.
>
>  Is there a way to disable index scans?

You can use "set enable_indexscan off;" as the first command  I've had
one or two reporting queries in the past that it was a necessity to do
that before running certain queries on very large datasets where a seq
scan would kill performance.
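
If you don't want the setting to leak into everything else on that
connection, one option is to scope it to a transaction with SET LOCAL,
roughly like this (a sketch, reusing the query from earlier in the thread):

[code]
begin;
set local enable_indexscan = off;   -- only lasts until commit/rollback

select * from t_Event event
inner join t_Service service on event.service_id=service.id
inner join t_System system on service.system_id=system.id
inner join t_Interface interface on system.id=interface.system_id
where (interface.network_id=1) order by event.c_date desc limit 25;

commit;   -- enable_indexscan reverts to its previous value here
[/code]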