Thread: Limit + group + join

From:
Tobias Brox
Date:

Consider this setup - which is a gross simplification of parts of our
production system ;-)

  create table c (id integer primary key);
  create table b (id integer primary key, c_id integer);
  create index b_on_c on b(c_id);

  insert into c (select ... lots of IDs ...);
  insert into b (select id, id from c); /* keep it simple :-) */
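
(A hypothetical way to fill in the elided insert, e.g. with generate_series
on 8.0; the real row count isn't shown, so the 80000 below is only a guess
based on the row estimates that appear later in the thread:

  insert into c (select generate_series(1, 80000));
  insert into b (select id, id from c);
  analyze c;
  analyze b;
)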

Now, I'm just interested in a few rows.

All of those give good plans:

explain select c.id from c order by c.id limit 1;
explain select c.id from c group by c.id order by c.id limit 1;
explain select c.id from c join b on c_id=c.id order by c.id limit 1;

... BUT ... combining join, group and limit wreaks havoc:

explain select c.id from c join b on c_id=c.id  group by c.id order by c.id
desc limit 5;
                                     QUERY PLAN
-------------------------------------------------------------------------------------
 Limit  (cost=3809.65..3809.67 rows=5 width=4)
   ->  Group  (cost=3809.65..3940.59 rows=26187 width=4)
         ->  Sort  (cost=3809.65..3875.12 rows=26188 width=4)
               Sort Key: c.id
               ->  Hash Join  (cost=559.34..1887.89 rows=26188 width=4)
                     Hash Cond: ("outer".id = "inner".c_id)
                     ->  Seq Scan on c  (cost=0.00..403.87 rows=26187 width=4)
                     ->  Hash  (cost=403.87..403.87 rows=26187 width=4)
                           ->  Seq Scan on b  (cost=0.00..403.87 rows=26187 width=4)
(9 rows)

I get the same behaviour on pg 7.4.7 and pg 8.0.2.  Of course, I can
probably use subqueries instead of a join, though I would have hoped the
planner could do better ;-)

--
Notice of Confidentiality: This information may be confidential, and
blah-blah-blah - so please keep your eyes closed.  Please delete and destroy
this email.  Failure to comply will cause my lawyer to yawn.

From:
"Jeffrey W. Baker"
Date:

On Fri, 2005-08-26 at 02:27 +0200, Tobias Brox wrote:
> Consider this setup - which is a gross simplification of parts of our
> production system ;-)
>
>   create table c (id integer primary key);
>   create table b (id integer primary key, c_id integer);
>   create index b_on_c on b(c_id);
>
>   insert into c (select ... lots of IDs ...);
>   insert into b (select id, id from c); /* keep it simple :-) */
>
> Now, I'm just interested in a few rows.
>
> All of those give good plans:
>
> explain select c.id from c order by c.id limit 1;
> explain select c.id from c group by c.id order by c.id limit 1;
> explain select c.id from c join b on c_id=c.id order by c.id limit 1;
>
> ... BUT ... combining join, group and limit wreaks havoc:
>
> explain select c.id from c join b on c_id=c.id  group by c.id order by c.id
> desc limit 5;

Where's b in this join clause?  It looks like a cartesian product to me.

-jwb


From:
Tobias Brox
Date:

[Jeffrey W. Baker - Thu at 06:56:59PM -0700]
> > explain select c.id from c join b on c_id=c.id  group by c.id order by c.id
> > desc limit 5;
>
> Where's b in this join clause?

"join b on c_id=c.id"

It's just a funny way of writing:

select c.id from c,b where c_id=c.id  group by c.id order by c.id desc limit 5;

> It looks like a cartesian product to me.

No.  The query will return exactly the same rows as the simplest query:

  select c.id from c order by c.id  desc limit 5;

As said, this is a gross oversimplification of the production environment.
In the production environment I really need to use join, group and limit
together.  I tested a bit with subqueries; it was not a good solution
(selecting really a lot of rows and aggregates from many of the tables).

The next idea is to hack around it by manually finding out where the "limit"
will cut, and placing a restriction in the where-part of the query.
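
Something like this, perhaps (just a sketch, and it assumes the cheap
subselect can find the cutoff via the index on c.id, i.e. that the top
ids can be located without scanning everything):

  select c.id from c join b on c_id=c.id
  where c.id >= (select id from c order by id desc limit 1 offset 4)
  group by c.id order by c.id desc;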

From:
"Jeffrey W. Baker"
Date:

On Thu, 2005-08-25 at 18:56 -0700, Jeffrey W. Baker wrote:
> On Fri, 2005-08-26 at 02:27 +0200, Tobias Brox wrote:
> > Consider this setup - which is a gross simplification of parts of our
> > production system ;-)
> >
> >   create table c (id integer primary key);
> >   create table b (id integer primary key, c_id integer);
> >   create index b_on_c on b(c_id);
> >
> >   insert into c (select ... lots of IDs ...);
> >   insert into b (select id, id from c); /* keep it simple :-) */
> >
> > Now, I'm just interested in a few rows.
> >
> > All of those give good plans:
> >
> > explain select c.id from c order by c.id limit 1;
> > explain select c.id from c group by c.id order by c.id limit 1;
> > explain select c.id from c join b on c_id=c.id order by c.id limit 1;
> >
> > ... BUT ... combining join, group and limit wreaks havoc:
> >
> > explain select c.id from c join b on c_id=c.id  group by c.id order by c.id
> > desc limit 5;
>
> Where's b in this join clause?  It looks like a cartesian product to me.

Never mind.  I read c_id as c.id.

-jwb


From:
Mark Kirkwood
Date:

Tobias,
Interesting example:

The 'desc' seems to be the guy triggering the sort, e.g.:

explain select c.id from c join b on c_id=c.id  group by c.id order by
c.id limit 5;
                                        QUERY PLAN

-----------------------------------------------------------------------------------------
  Limit  (cost=0.00..0.28 rows=5 width=4)
    ->  Group  (cost=0.00..4476.00 rows=80000 width=4)
          ->  Merge Join  (cost=0.00..4276.00 rows=80000 width=4)
                Merge Cond: ("outer".id = "inner".c_id)
                ->  Index Scan using c_pkey on c  (cost=0.00..1518.00 rows=80000 width=4)
                ->  Index Scan using b_on_c on b  (cost=0.00..1558.00 rows=80000 width=4)
(6 rows)

Whereas with it back in again:

explain select c.id from c join b on c_id=c.id  group by c.id order by
c.id desc limit 5;
                                       QUERY PLAN

--------------------------------------------------------------------------------------
  Limit  (cost=10741.08..10741.11 rows=5 width=4)
    ->  Group  (cost=10741.08..11141.08 rows=80000 width=4)
          ->  Sort  (cost=10741.08..10941.08 rows=80000 width=4)
                Sort Key: c.id
                ->  Hash Join  (cost=1393.00..4226.00 rows=80000 width=4)
                      Hash Cond: ("outer".c_id = "inner".id)
                      ->  Seq Scan on b  (cost=0.00..1233.00 rows=80000 width=4)
                      ->  Hash  (cost=1193.00..1193.00 rows=80000 width=4)
                            ->  Seq Scan on c  (cost=0.00..1193.00 rows=80000 width=4)
(9 rows)


However being a bit brutal:

set enable_mergejoin=false;
set enable_hashjoin=false;

explain select c.id from c join b on c_id=c.id  group by c.id order by
c.id desc limit 5;
                                             QUERY PLAN

--------------------------------------------------------------------------------------------------
  Limit  (cost=0.00..15.24 rows=5 width=4)
    ->  Group  (cost=0.00..243798.00 rows=80000 width=4)
          ->  Nested Loop  (cost=0.00..243598.00 rows=80000 width=4)
                ->  Index Scan Backward using c_pkey on c  (cost=0.00..1518.00 rows=80000 width=4)
                ->  Index Scan using b_on_c on b  (cost=0.00..3.01 rows=1 width=4)
                      Index Cond: (b.c_id = "outer".id)
(6 rows)

What is interesting is why this plan is being rejected...

Cheers

Mark

Tobias Brox wrote:
> Consider this setup - which is a gross simplification of parts of our
> production system ;-)
>
>   create table c (id integer primary key);
>   create table b (id integer primary key, c_id integer);
>   create index b_on_c on b(c_id)
>
>   insert into c (select ... lots of IDs ...);
>   insert into b (select id, id from c); /* keep it simple :-) */
>
> Now, I'm just interessted in some few rows.
>
> All those gives good plans:
>
> explain select c.id from c order by c.id limit 1;
> explain select c.id from c group by c.id order by c.id limit 1;
> explain select c.id from c join b on c_id=c.id order by c.id limit 1;
>
> ... BUT ... combining join, group and limit makes havoc:
>
> explain select c.id from c join b on c_id=c.id  group by c.id order by c.id
> desc limit 5;
>

From:
Tobias Brox
Date:

[Mark Kirkwood - Fri at 03:01:01PM +1200]
> Tobias,
> Interesting example:
>
> The 'desc' seems to be the guy triggering the sort, e.g.:

Oh; that's really an accident that I didn't notice myself.  I was actually
going to remove all instances of "desc" in my simplification, but it seems
I forgot.

> However being a bit brutal:
>
> set enable_mergejoin=false;
> set enable_hashjoin=false;

:-) maybe I can use that in production.  I'll check.

From:
Stephan Szabo
Date:

On Fri, 26 Aug 2005, Mark Kirkwood wrote:

> However being a bit brutal:
>
> set enable_mergejoin=false;
> set enable_hashjoin=false;
>
> explain select c.id from c join b on c_id=c.id  group by c.id order by
> c.id desc limit 5;
>                                              QUERY PLAN
>
> --------------------------------------------------------------------------------------------------
>   Limit  (cost=0.00..15.24 rows=5 width=4)
>     ->  Group  (cost=0.00..243798.00 rows=80000 width=4)
>           ->  Nested Loop  (cost=0.00..243598.00 rows=80000 width=4)
>                 ->  Index Scan Backward using c_pkey on c  (cost=0.00..1518.00 rows=80000 width=4)
>                 ->  Index Scan using b_on_c on b  (cost=0.00..3.01 rows=1 width=4)
>                       Index Cond: (b.c_id = "outer".id)
> (6 rows)
>
> What is interesting is why this plan is being rejected...

Well, it expects 80000 probes into b_on_c to be more expensive than the
hash join and sort.  I wonder what explain analyze shows for the original
and the version with the enables changed.
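
Something along these lines, i.e. timing both plans in one session (just a
sketch; the actual times will of course vary by machine):

  explain analyze select c.id from c join b on c_id=c.id
    group by c.id order by c.id desc limit 5;

  set enable_mergejoin=false;
  set enable_hashjoin=false;

  explain analyze select c.id from c join b on c_id=c.id
    group by c.id order by c.id desc limit 5;

  reset enable_mergejoin;
  reset enable_hashjoin;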

From:
"Merlin Moncure"
Date:

Mark Kirkwood
> > The 'desc' seems to be the guy triggering the sort, e.g:
>
> Oh; really an accident that I didn't notice myself, I was actually going
> to remove all instances of "desc" in my simplification, but seems like I
> forgot.

If desc is the problem, you can push the query into a subquery without
sorting and sort the result.  This is called an inline view.  Sometimes
you can pull a couple of tricks to force the view to materialize before
it is sorted.

aka
select q.*
from
(
   some_complex_query
) q order by ...;
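
One such trick (a long-standing planner idiom rather than documented,
guaranteed behaviour) is tacking OFFSET 0 onto the subquery, which keeps
the planner from flattening the inline view into the outer query:

select q.*
from
(
   some_complex_query offset 0
) q order by ...;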

Merlin

From:
Tom Lane
Date:

Mark Kirkwood <> writes:
> What is interesting is why this plan is being rejected...

Which PG version are you using exactly?  That mistake looks like an
artifact of the 8.0 "fuzzy plan cost" patch, which we fixed recently:
http://archives.postgresql.org/pgsql-committers/2005-07/msg00474.php

But Tobias wasn't happy with 7.4 either, so I'm not sure that the fuzzy
cost issue explains his results.

As far as the "desc" point goes, the problem is that mergejoins aren't
capable of dealing with backward sort order, so a merge plan isn't
considered for that case (or at least, it would have to have a sort
after it, which pretty much defeats the point for a fast-start plan).
I have some ideas about fixing this but it won't happen before 8.2.

            regards, tom lane

From:
Mark Kirkwood
Date:

Tom Lane wrote:
> Mark Kirkwood <> writes:
>
>>What is interesting is why this plan is being rejected...
>
>
> Which PG version are you using exactly?  That mistake looks like an
> artifact of the 8.0 "fuzzy plan cost" patch, which we fixed recently:
> http://archives.postgresql.org/pgsql-committers/2005-07/msg00474.php
>

Right on - 8.0.3 (I might look at how CVS tip handles this, could be
interesting).

> But Tobias wasn't happy with 7.4 either, so I'm not sure that the fuzzy
> cost issue explains his results.
>
> As far as the "desc" point goes, the problem is that mergejoins aren't
> capable of dealing with backward sort order, so a merge plan isn't
> considered for that case (or at least, it would have to have a sort
> after it, which pretty much defeats the point for a fast-start plan).
> I have some ideas about fixing this but it won't happen before 8.2.

That doesn't explain why the nested loop is being kicked tho', or have I
missed something obvious? - it's been known to happen :-)...

Cheers

Mark


From:
Tom Lane
Date:

Mark Kirkwood <> writes:
> That doesn't explain why the nested loop is being kicked tho',

No, but I think the fuzzy-cost bug does.  There are two different issues
here.

            regards, tom lane

From:
Mark Kirkwood
Date:

Interestingly enough, 7.4.8 and 8.1devel-2005-08-23 all behave the same
as 8.0.3 for me (tables freshly ANALYZEd):

joinlimit=# SELECT version();
                                              version

-------------------------------------------------------------------------------------------------
  PostgreSQL 7.4.8 on i386-unknown-freebsd5.4, compiled by GCC gcc (GCC) 3.4.2 [FreeBSD] 20040728
(1 row)

joinlimit=# EXPLAIN SELECT c.id FROM c JOIN b ON c_id=c.id  GROUP BY
c.id ORDER BY c.id DESC LIMIT 5;
                                           QUERY PLAN

-----------------------------------------------------------------------------------------------
  Limit  (cost=10591.36..10591.39 rows=5 width=4)
    ->  Group  (cost=10591.36..10992.02 rows=80131 width=4)
          ->  Sort  (cost=10591.36..10791.69 rows=80131 width=4)
                Sort Key: c.id
                ->  Merge Join  (cost=0.00..4064.66 rows=80131 width=4)
                      Merge Cond: ("outer".id = "inner".c_id)
                      ->  Index Scan using c_pkey on c  (cost=0.00..1411.31 rows=80131 width=4)
                      ->  Index Scan using b_on_c on b  (cost=0.00..1451.72 rows=80172 width=4)
(8 rows)

joinlimit=# EXPLAIN SELECT c.id FROM c JOIN b ON c_id=c.id  GROUP BY
c.id ORDER BY c.id  LIMIT 5;
                                        QUERY PLAN

-----------------------------------------------------------------------------------------
  Limit  (cost=0.00..0.27 rows=5 width=4)
    ->  Group  (cost=0.00..4264.99 rows=80131 width=4)
          ->  Merge Join  (cost=0.00..4064.66 rows=80131 width=4)
                Merge Cond: ("outer".id = "inner".c_id)
                ->  Index Scan using c_pkey on c  (cost=0.00..1411.31 rows=80131 width=4)
                ->  Index Scan using b_on_c on b  (cost=0.00..1451.72 rows=80172 width=4)
(6 rows)


joinlimit=# SELECT version();
                                               version

----------------------------------------------------------------------------------------------------
  PostgreSQL 8.1devel on i386-unknown-freebsd5.4, compiled by GCC gcc (GCC) 3.4.2 [FreeBSD] 20040728
(1 row)

joinlimit=# EXPLAIN SELECT c.id FROM c JOIN b ON c_id=c.id  GROUP BY
c.id ORDER BY c.id DESC LIMIT 5;
                                           QUERY PLAN

-----------------------------------------------------------------------------------------------
  Limit  (cost=10654.53..10654.55 rows=5 width=4)
    ->  Group  (cost=10654.53..11054.53 rows=80000 width=4)
          ->  Sort  (cost=10654.53..10854.53 rows=80000 width=4)
                Sort Key: c.id
                ->  Merge Join  (cost=0.00..4139.44 rows=80000 width=4)
                      Merge Cond: ("outer".id = "inner".c_id)
                      ->  Index Scan using c_pkey on c  (cost=0.00..1450.00 rows=80000 width=4)
                      ->  Index Scan using b_on_c on b  (cost=0.00..1490.00 rows=80000 width=4)
(8 rows)

joinlimit=# EXPLAIN SELECT c.id FROM c JOIN b ON c_id=c.id  GROUP BY
c.id ORDER BY c.id LIMIT 5;
                                        QUERY PLAN

-----------------------------------------------------------------------------------------
  Limit  (cost=0.00..0.27 rows=5 width=4)
    ->  Group  (cost=0.00..4339.44 rows=80000 width=4)
          ->  Merge Join  (cost=0.00..4139.44 rows=80000 width=4)
                Merge Cond: ("outer".id = "inner".c_id)
                ->  Index Scan using c_pkey on c  (cost=0.00..1450.00 rows=80000 width=4)
                ->  Index Scan using b_on_c on b  (cost=0.00..1490.00 rows=80000 width=4)
(6 rows)

The non default server params of relevance are:

shared_buffers = 12000
effective_cache_size = 100000
work_mem/sort_mem = 20480

I did wonder if the highish sort_mem might be a factor, but no: with it
set to 1024 I get the same behaviour (just higher sort cost estimates).

Cheers

Mark

Tom Lane wrote:
>
>
> Which PG version are you using exactly?  That mistake looks like an
> artifact of the 8.0 "fuzzy plan cost" patch, which we fixed recently:
> http://archives.postgresql.org/pgsql-committers/2005-07/msg00474.php
>
> But Tobias wasn't happy with 7.4 either, so I'm not sure that the fuzzy
> cost issue explains his results.
>
> As far as the "desc" point goes, the problem is that mergejoins aren't
> capable of dealing with backward sort order, so a merge plan isn't
> considered for that case (or at least, it would have to have a sort
> after it, which pretty much defeats the point for a fast-start plan).
> I have some ideas about fixing this but it won't happen before 8.2.
>

From:
Greg Stark
Date:

Tom Lane <> writes:

> As far as the "desc" point goes, the problem is that mergejoins aren't
> capable of dealing with backward sort order, so a merge plan isn't
> considered for that case (or at least, it would have to have a sort
> after it, which pretty much defeats the point for a fast-start plan).
> I have some ideas about fixing this but it won't happen before 8.2.

Of course, in this case, assuming "id" is an integer column, you can just
sort by -id instead.
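
i.e. something like this (untested):

  select c.id from c join b on c_id=c.id
  group by c.id order by -c.id limit 5;

(though whether the planner can actually do anything clever with that
presumably depends on having an index over the negated expression)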

--
greg

From:
Tom Lane
Date:

Mark Kirkwood <> writes:
> joinlimit=# EXPLAIN SELECT c.id FROM c JOIN b ON c_id=c.id  GROUP BY
> c.id ORDER BY c.id DESC LIMIT 5;
> [ fails to pick an available index-scan-backward plan ]

I looked into this and found that indeed the desirable join plan was
getting generated, but it wasn't picked because query_planner didn't
have an accurate idea of how much of the join needed to be scanned to
satisfy the GROUP BY step.  I've committed some changes that hopefully
will let 8.1 be smarter about GROUP BY ... LIMIT queries.

            regards, tom lane

From:
Mark Kirkwood
Date:

Tom Lane wrote:
>
> I looked into this and found that indeed the desirable join plan was
> getting generated, but it wasn't picked because query_planner didn't
> have an accurate idea of how much of the join needed to be scanned to
> satisfy the GROUP BY step.  I've committed some changes that hopefully
> will let 8.1 be smarter about GROUP BY ... LIMIT queries.
>

Very nice :-)

joinlimit=# EXPLAIN SELECT c.id FROM c JOIN b ON c_id=c.id  GROUP BY
c.id ORDER BY c.id DESC LIMIT 5;
                                             QUERY PLAN

--------------------------------------------------------------------------------------------------
  Limit  (cost=0.00..15.23 rows=5 width=4)
    ->  Group  (cost=0.00..243730.00 rows=80000 width=4)
          ->  Nested Loop  (cost=0.00..243530.00 rows=80000 width=4)
                ->  Index Scan Backward using c_pkey on c  (cost=0.00..1450.00 rows=80000 width=4)
                ->  Index Scan using b_on_c on b  (cost=0.00..3.01 rows=1 width=4)
                      Index Cond: (b.c_id = "outer".id)
(6 rows)

This is 8.1devel from today.

regards

Mark

From:
Tobias Brox
Date:

[Tom Lane]
> I looked into this and (...) I've committed some changes that hopefully will
> let 8.1 be smarter about GROUP BY ... LIMIT queries.

[Mark Kirkwood]
> Very nice :-)
(...)
> This is 8.1devel from today.

Splendid :-) Unfortunately we will not be upgrading for some months yet,
but anyway I'm happy.  This provides yet another good argument for upgrading
sooner.  I'm also happy to see such a perfect match:

 - A problem that can be reduced from being complex and
   production-specific to simple and easily reproducible.

 - Enthusiastic people testing it and pinpointing even more precisely what
   conditions trigger it.

 - Programmers actually fixing the issue.

 - Testers verifying that it was fixed.

Long live postgresql! :-)

--
Notice of Confidentiality: This email is sent unencrypted over the network,
and may be stored on several email servers; it can be read by third parties
as easily as a postcard.  Do not rely on email for confidential information.

From:
"Merlin Moncure"
Date:

Tobias wrote:
> Splendid :-) Unfortunately we will not be upgrading for some months yet,
> but anyway I'm happy.  This provides yet another good argument for
> upgrading sooner.  I'm also happy to see such a perfect match:
>
>  - A problem that can be reduced from being complex and
>    production-specific to simple and easily reproducible.
>
>  - Enthusiastic people testing it and pinpointing even more precisely
>    what conditions trigger it.
>
>  - Programmers actually fixing the issue
>
>  - Testers verifying that it was fixed
>
> Long live postgresql! :-)

In the last three or so years that I've been really active with
PostgreSQL, I've found two or three issues/bugs which I was able to
reproduce and reduce to a test case.  In all instances the fix was in
CVS literally within minutes.

Merlin