Thread: Query planner unaware of possibly best plan

Query planner unaware of possibly best plan

From
Denes Daniel
Date:
Hi,
I think the query planner is unaware of the possibly best plan in some
situations. See the test case below:

-- --------------------------------------------------- --

CREATE TABLE tparent (
  id integer NOT NULL,
  ord integer NOT NULL,
  CONSTRAINT par_pkey_id PRIMARY KEY (id),
  CONSTRAINT par_uniq_ord UNIQUE (ord)
);

CREATE TABLE tchild (
  par_id integer NOT NULL,
  ord integer NOT NULL,
  CONSTRAINT chi_pkey_parid_ord PRIMARY KEY (par_id, ord),
  CONSTRAINT chi_fkey FOREIGN KEY (par_id) REFERENCES tparent(id)
);

INSERT INTO tparent VALUES (1, 3);
INSERT INTO tparent VALUES (2, 1);
INSERT INTO tparent VALUES (3, 4);
INSERT INTO tparent VALUES (4, 5);
INSERT INTO tparent VALUES (5, 2);

INSERT INTO tchild VALUES (1, 2);
INSERT INTO tchild VALUES (1, 1);
INSERT INTO tchild VALUES (2, 1);
INSERT INTO tchild VALUES (2, 3);
INSERT INTO tchild VALUES (2, 2);
INSERT INTO tchild VALUES (3, 1);
INSERT INTO tchild VALUES (3, 2);
INSERT INTO tchild VALUES (4, 1);
INSERT INTO tchild VALUES (5, 2);
INSERT INTO tchild VALUES (5, 1);

ANALYZE tparent;
ANALYZE tchild;

SET enable_seqscan TO false;
SET enable_bitmapscan TO false;
SET enable_hashjoin TO false;
SET enable_mergejoin TO false;
SET enable_sort TO false;

EXPLAIN ANALYZE
SELECT *
FROM tparent JOIN tchild ON tchild.par_id = tparent.id
WHERE tparent.ord BETWEEN 1 AND 4
ORDER BY tparent.ord, tchild.ord;

-- --------------------------------------------------- --

Sort
(cost=100000132.10..100000140.10 rows=8 width=16)
(actual time=0.440..0.456 rows=9 loops=1)
Sort Key: tparent.ord, tchild.ord

-> Nested Loop
   (cost=0.00..84.10 rows=8 width=16)
   (actual time=0.179..0.270 rows=9 loops=1)

   -> Index Scan using par_uniq_ord on tparent
      (cost=0.00..20.40 rows=4 width=8)
      (actual time=0.089..0.098 rows=4 loops=1)
      Index Cond: ((ord >= 1) AND (ord <= 4))

   -> Index Scan using chi_pkey_parid_ord on tchild
      (cost=0.00..9.93 rows=2 width=8)
      (actual time=0.023..0.028 rows=2 loops=4)
      Index Cond: (tchild.par_id = "outer".id)

-- --------------------------------------------------- --

Even though I forced the nested loop plan using both indexes (that
returns the rows in the correct order), there is a needless sort step on
the top, consuming half of the time even on such small tables.
Now it's clear why the planner did not choose this plan, why I had to
force it: because it isn't the best if the sort is still there.

The first time I posted this
( http://archives.postgresql.org/pgsql-general/2007-05/msg01306.php )
and read Tom's answer I was convinced that this is rarely a problem,
but now I don't think so, since I ran into it for the third time.

Can that sort step somehow be eliminated if the NestLoop's outer
table is being scanned via a unique index? If not, how can I rewrite my
indexes/query in such a way that it's still safe (the rows always come in
the order I want), but I don't have to wait for that needless sort?

I'm using PostgreSQL 8.1.8.


Thanks in advance,
Denes Daniel
____________________________________________________




Olvasd az [origo]-t a mobilodon: mini magazinok a Mobizin-en
___________________________________________________
www.t-mobile.hu/mobizin


Re: Query planner unaware of possibly best plan

From
Simon Riggs
Date:
On Fri, 2007-09-21 at 17:36 +0200, Denes Daniel wrote:

> Even though I forced the nested loop plan using both indexes (that
> returns the rows in the correct order), there is a needless sort step on
> the top, consuming half of the time even on such small tables.
> Now it's clear why the planner did not choose this plan, why I had to
> force it: because it isn't the best if the sort is still there.

Ordering by parent, child is fairly common but the variation you've got
here isn't that common. You'd need to make a case considering all the
alternatives; nobody will agree without a balanced case that includes
what is best for everyone.

Your EXPLAIN looks edited. Have you also edited the sort costs? They
look slightly higher than we might expect. Please provide the full
normal EXPLAIN output.

--
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com


Re: Query planner unaware of possibly best plan

From
Denes Daniel
Date:
Simon Riggs <simon@2ndquadrant.com> írta:

> Ordering by parent, child is fairly common but the variation you've got
> here isn't that common. You'd need to make a case considering all the
> alternatives; nobody will agree without a balanced case that includes
> what is best for everyone.
>
> Your EXPLAIN looks edited. Have you also edited the sort costs? They
> look slightly higher than we might expect. Please provide the full
> normal EXPLAIN output.
>
> --
>   Simon Riggs
>   2ndQuadrant  http://www.2ndQuadrant.com



I've just inserted some newlines, so it's better to read than when my
email-client wraps the lines automatically. Did not touch the information
itself. But here is the normal output of EXPLAIN ANALYZE:

EXPLAIN ANALYZE SELECT * FROM tparent JOIN tchild ON tchild.par_id =
tparent.id WHERE tparent.ord BETWEEN 1 AND 4 ORDER BY tparent.ord,
tchild.ord;

                                                              QUERY PLAN
-------------------------------------------------------------------------------------------
--------------------------------------------
 Sort  (cost=100000132.10..100000140.10 rows=8 width=16) (actual
time=0.302..0.319 rows=9 loops=1)
   Sort Key: tparent.ord, tchild.ord
   ->  Nested Loop  (cost=0.00..84.10 rows=8 width=16) (actual
time=0.181..0.267 rows=9 loops=1)
         ->  Index Scan using par_uniq_ord on tparent  (cost=0.00..20.40
rows=4 width=8) (actual time=0.100..0.109 rows=4 loops=1)
               Index Cond: ((ord >= 1) AND (ord <= 4))
         ->  Index Scan using chi_pkey_parid_ord on tchild
(cost=0.00..9.93 rows=2 width=8) (actual time=0.020..0.026 rows=2
loops=4)
               Index Cond: (tchild.par_id = "outer".id)
 Total runtime: 0.412 ms
(8 rows)

The costs may be different because I've tuned the query planner's
parameters.

> Ordering by parent, child is fairly common but the variation you've got
> here isn't that common.
How else can you order by parent, child other than first ordering by a
unique key of parent, then something in child? (Except for
child.parent_id, child.something because this has all the information in
child and can rely on a single multicolumn index.)


Denes Daniel
------------------------------------------------------------------------





Olvasd az [origo]-t a mobilodon: mini magazinok a Mobizin-en
___________________________________________________
www.t-mobile.hu/mobizin


Re: Query planner unaware of possibly best plan

From
Alvaro Herrera
Date:
Denes Daniel wrote:

> I've just inserted some newlines, so it's better to read than when my
> email-client wraps the lines automatically. Did not touch the information
> itself. But here is the normal output of EXPLAIN ANALYZE:

The best thing to do is paste them in a text file and send it as an
attachment.

>                                                               QUERY PLAN
> -------------------------------------------------------------------------------------------
> --------------------------------------------
>  Sort  (cost=100000132.10..100000140.10 rows=8 width=16) (actual
> time=0.302..0.319 rows=9 loops=1)
>    Sort Key: tparent.ord, tchild.ord
>    ->  Nested Loop  (cost=0.00..84.10 rows=8 width=16) (actual
> time=0.181..0.267 rows=9 loops=1)
>          ->  Index Scan using par_uniq_ord on tparent  (cost=0.00..20.40
> rows=4 width=8) (actual time=0.100..0.109 rows=4 loops=1)
>                Index Cond: ((ord >= 1) AND (ord <= 4))
>          ->  Index Scan using chi_pkey_parid_ord on tchild
> (cost=0.00..9.93 rows=2 width=8) (actual time=0.020..0.026 rows=2
> loops=4)
>                Index Cond: (tchild.par_id = "outer".id)
>  Total runtime: 0.412 ms
> (8 rows)
>
> The costs may be different because I've tuned the query planner's
> parameters.

Why did you set enable_sort=off?  It's not like sorting 9 rows is going
to take any noticeable amount of time anyway.

--
Alvaro Herrera                        http://www.advogato.org/person/alvherre
"No hay hombre que no aspire a la plenitud, es decir,
la suma de experiencias de que un hombre es capaz"

Re: Query planner unaware of possibly best plan

From
Denes Daniel
Date:
In reply to Alvaro Herrera:

> The best thing to do is paste them in a text file and send it as an
> attachment.

Okay, it's attached.

> Why did you set enable_sort=off?  It's not like sorting 9 rows is going
> to take any noticeable amount of time anyway.

Of course it's no problem for 9 rows, but this is only a test case. In
production there will be much more. I just wanted to show that the
planner doesn't even consider a plan without a sort step, using purely
index scans.


Denes Daniel
------------------------------------------------------------




Olvasd az [origo]-t a mobilodon: mini magazinok a Mobizin-en
___________________________________________________
www.t-mobile.hu/mobizin

Attachment

Re: Query planner unaware of possibly best plan

From
Denes Daniel
Date:
Simon Riggs <simon@2ndquadrant.com> wrote:

> On Fri, 2007-09-21 at 21:20 +0200, Dániel Dénes wrote:
>
> > The costs may be different because I've tuned the query planner's
> > parameters.
>
> OK, understood.
>
> > > Ordering by parent, child is fairly common but the variation you've
> > > got here isn't that common.
> > How else can you order by parent, child other than first ordering by
> > a unique key of parent, then something in child? (Except for
> > child.parent_id, child.something because this has all the
> > information in child and can rely on a single multicolumn index.)
>
> Why "except"? Whats wrong with ordering that way?
>
> Make the case. **I** want it is not sufficient...
>
> --
>   Simon Riggs
>   2ndQuadrant  http://www.2ndQuadrant.com




In reply to Simon Riggs <simon@2ndquadrant.com>:

> > How else can you order by parent, child other than first ordering by
> > a unique key of parent, then something in child? (Except for
> > child.parent_id, child.something because this has all the
> > information in child and can rely on a single multicolumn index.)
>
> Why "except"? Whats wrong with ordering that way?

Well, nothing, but what if I have to order by some other unique key? Of
course I could do that by redundantly storing the parent's data in child
and then creating a multicolumn index, but...

Just to see clear: when I found this, I was trying to make a slightly
different query. It was like:

SELECT *
FROM tparent JOIN tchild ON tchild.par_id = tparent.id
WHERE tparent.uniqcol1 = 123
ORDER BY tparent.uniqcol2, tchild.ord;

where there was a unique index on (tparent.uniqcol1, tparent.uniqcol2)
and the columns are marked NOT NULL.
I expected a plan like doing an index scan on parent.uniqcol2 where
uniqcol1 = 123, and (using a nestloop and child's pkey) joining in the
children in the correct order (without a sort). But I got something else,
so I tried everything to get what I wanted -- just to see the costs why
the planner chose something else. After some time I found out that
there is no such plan, so no matter what I do it will sort...
So that's how I got here. But since the original problem isn't that clean
& simple, I thought I'd make a test case, that's easy to follow, and
illustrates the problem: that the planner doesn't even consider my
plan. If it did, I think that'd be the one that gets executed. But tell me if
I'm wrong somewhere.



> Make the case. **I** want it is not sufficient...

Sorry, I can't understand that... I'm far from perfect in english. Please
clarify so I can do what you ask me to.


Denes Daniel
-----------------------------------------------------




Olvasd az [origo]-t a mobilodon: mini magazinok a Mobizin-en
___________________________________________________
www.t-mobile.hu/mobizin


Re: Query planner unaware of possibly best plan

From
Tom Lane
Date:
Denes Daniel <panther-d@freemail.hu> writes:
> Simon Riggs <simon@2ndquadrant.com> wrote:
>> Make the case. **I** want it is not sufficient...

> Sorry, I can't understand that... I'm far from perfect in english.

The point here is that you've repeated the same example N times without
actually making a case that it's interesting to support.  We have to
think about the intellectual complexity that would be added to the
planner to support this case, and the cycles that would be expended
on every query (and wasted, for most queries) on trying to detect
whether the case applies.  If it were simple and cheap to do, these
arguments wouldn't hold much weight, but it doesn't look to me like
either is the case.

Another problem is that it's not clear there's much to be gained.
Avoiding the sort step is only interesting if the query produces so many
rows that a sort would be expensive ... but if that's the case, it seems
unlikely that a nestloop indexscan plan would be the best choice anyway.

So basically this looks like a lot of work for a narrow and questionable
gain.  If you want it to happen you need to convince people that it's
easier and more useful than it looks.

            regards, tom lane

Re: Query planner unaware of possibly best plan

From
Denes Daniel
Date:
Tom Lane <tgl@sss.pgh.pa.us> wrote:

> The point here is that you've repeated the same example N times
> without actually making a case that it's interesting to support.  We
> have to think about the intellectual complexity that would be added
> to the planner to support this case, and the cycles that would be
> expended on every query (and wasted, for most queries) on trying to
> detect whether the case applies.  If it were simple and cheap to do,
> these arguments wouldn't hold much weight, but it doesn't look to
> me like either is the case.
>
> Another problem is that it's not clear there's much to be gained.
> Avoiding the sort step is only interesting if the query produces so
> many rows that a sort would be expensive ... but if that's the case, it
> seems unlikely that a nestloop indexscan plan would be the best
> choice anyway.
>
> So basically this looks like a lot of work for a narrow and questionable
> gain.  If you want it to happen you need to convince people that it's
> easier and more useful than it looks.
>
>             regards, tom lane




Well, I probably won't tell you that it's easy to do, because you know
the planner far far better than I do -- I'm just a user who knows almost
nothing about what's happening inside it. I just thought that if the
planner is examining a plan, where the outer scan is using a unique
index with all it's columns AND doing a nestloop join with the inner scan
returning the rows in some order, then the planner could say "this is
already ordered by (outer.unique, inner.some_order)" and wouldn't
make a sort at the end. Of course this is far from perfect, because what
about three/four/more table joins... maybe that can be derived from
this, I don't really know now.
Another situation that could be optimized this way is if I write "ORDER
BY outer.idxcol1, outer.idxcol2, outer.id, inner.something" where
 + there is a non-unique index on (outer.idxcol1, outer.idxcol2),
 + outer.id is a PRI KEY,
 + there is an index on (inner.outer_id, inner.something) too
but that's getting really complicated. Of course if the simpler case
would be working, then I could create a unique index on (outer.idxcol1,
outer.idxcol2, outer.id) and this would run optimized too.

I think what's common is:
 + outer scan produces uniquely sorted rows
 + nested loop
 + inner scan produces sorted rows
 + ORDER BY says outer.unique_sort, inner.sort
If this is the situation, the sort could be done separately. Even like:

->  Nested Loop
  -> Sort by outer.unique
     ->  Scan producing the required rows from outer
  -> Sort by inner.something
     ->  Scan producing the required rows from inner
         Filter or Index Cond for the join

And this is good for any query that needs data from two tables, but
wants to get the joined rows so that all rows for a single outer-row
are "packed together" (I mean they do not mix).


Now about if it's worth it. I tried a query on real data, with lots of rows.
Tables involved:
- threads: ~200K rows
- msgs: ~8M rows
I wanted to see the last msgs from the last threads, but did not want
to mix them. It's like PgSQL's mailing archive viewed by threads. I
wanted them paged, not all 8M msgs at once (LIMIT/OFFSET). Query is:

SELECT *
FROM threads AS thr JOIN msgs AS msg ON msg.thrid = thr.id
WHERE thr.forid = 1
ORDER BY thr.id DESC, msg.msgid DESC
LIMIT 100

thr.forid is a FKEY to forums. Say forum 1 is the pgsql-performance list.
Table msgs has only a PRI KEY: (thrid, msgid).
Table threads has a PRI KEY: (id) and a unique index: (forid, id).
First time I ran the query with seqscan, bitmapscan, hashjoin,
mergejoin disabled (just to get the nested loop plan with the needless
sort). Second time I ran it without disabling anything, but I modified
ORDER BY to only "thr.id DESC" (deleted "msg.msgid DESC"), to give me
the same plan as before, but without the sort.
PSQL input and output attached.

I think the 5000x speed would be worth it.


Regards,
Denes Daniel
---------------------------------




Játssz a megújult Kvízparton! Válassz több mint 400 kvíz közül minden témában!
________________________________________________________
http://cthandler.adverticum.net/?cturl=http%3A%2F%2Fkvizpart.hu%2F?fmalja


Attachment