Thread: Query planner unaware of possibly best plan
Hi, I think the query planner is unaware of the possibly best plan in some situations. See the test case below: -- --------------------------------------------------- -- CREATE TABLE tparent ( id integer NOT NULL, ord integer NOT NULL, CONSTRAINT par_pkey_id PRIMARY KEY (id), CONSTRAINT par_uniq_ord UNIQUE (ord) ); CREATE TABLE tchild ( par_id integer NOT NULL, ord integer NOT NULL, CONSTRAINT chi_pkey_parid_ord PRIMARY KEY (par_id, ord), CONSTRAINT chi_fkey FOREIGN KEY (par_id) REFERENCES tparent(id) ); INSERT INTO tparent VALUES (1, 3); INSERT INTO tparent VALUES (2, 1); INSERT INTO tparent VALUES (3, 4); INSERT INTO tparent VALUES (4, 5); INSERT INTO tparent VALUES (5, 2); INSERT INTO tchild VALUES (1, 2); INSERT INTO tchild VALUES (1, 1); INSERT INTO tchild VALUES (2, 1); INSERT INTO tchild VALUES (2, 3); INSERT INTO tchild VALUES (2, 2); INSERT INTO tchild VALUES (3, 1); INSERT INTO tchild VALUES (3, 2); INSERT INTO tchild VALUES (4, 1); INSERT INTO tchild VALUES (5, 2); INSERT INTO tchild VALUES (5, 1); ANALYZE tparent; ANALYZE tchild; SET enable_seqscan TO false; SET enable_bitmapscan TO false; SET enable_hashjoin TO false; SET enable_mergejoin TO false; SET enable_sort TO false; EXPLAIN ANALYZE SELECT * FROM tparent JOIN tchild ON tchild.par_id = tparent.id WHERE tparent.ord BETWEEN 1 AND 4 ORDER BY tparent.ord, tchild.ord; -- --------------------------------------------------- -- Sort (cost=100000132.10..100000140.10 rows=8 width=16) (actual time=0.440..0.456 rows=9 loops=1) Sort Key: tparent.ord, tchild.ord -> Nested Loop (cost=0.00..84.10 rows=8 width=16) (actual time=0.179..0.270 rows=9 loops=1) -> Index Scan using par_uniq_ord on tparent (cost=0.00..20.40 rows=4 width=8) (actual time=0.089..0.098 rows=4 loops=1) Index Cond: ((ord >= 1) AND (ord <= 4)) -> Index Scan using chi_pkey_parid_ord on tchild (cost=0.00..9.93 rows=2 width=8) (actual time=0.023..0.028 rows=2 loops=4) Index Cond: (tchild.par_id = "outer".id) -- --------------------------------------------------- -- Even though I forced the nested loop plan using both indexes (that returns the rows in the correct order), there is a needless sort step on the top, consuming half of the time even on such small tables. Now it's clear why the planner did not choose this plan, why I had to force it: because it isn't the best if the sort is still there. The first time I posted this ( http://archives.postgresql.org/pgsql-general/2007-05/msg01306.php ) and read Tom's answer I was convinced that this is rarely a problem, but now I don't think so, since I ran into it for the third time. Can that sort step somehow be eliminated if the NestLoop's outer table is being scanned via a unique index? If not, how can I rewrite my indexes/query in such a way that it's still safe (the rows always come in the order I want), but I don't have to wait for that needless sort? I'm using PostgreSQL 8.1.8. Thanks in advance, Denes Daniel ____________________________________________________ Olvasd az [origo]-t a mobilodon: mini magazinok a Mobizin-en ___________________________________________________ www.t-mobile.hu/mobizin
On Fri, 2007-09-21 at 17:36 +0200, Denes Daniel wrote: > Even though I forced the nested loop plan using both indexes (that > returns the rows in the correct order), there is a needless sort step on > the top, consuming half of the time even on such small tables. > Now it's clear why the planner did not choose this plan, why I had to > force it: because it isn't the best if the sort is still there. Ordering by parent, child is fairly common but the variation you've got here isn't that common. You'd need to make a case considering all the alternatives; nobody will agree without a balanced case that includes what is best for everyone. Your EXPLAIN looks edited. Have you also edited the sort costs? They look slightly higher than we might expect. Please provide the full normal EXPLAIN output. -- Simon Riggs 2ndQuadrant http://www.2ndQuadrant.com
Simon Riggs <simon@2ndquadrant.com> írta: > Ordering by parent, child is fairly common but the variation you've got > here isn't that common. You'd need to make a case considering all the > alternatives; nobody will agree without a balanced case that includes > what is best for everyone. > > Your EXPLAIN looks edited. Have you also edited the sort costs? They > look slightly higher than we might expect. Please provide the full > normal EXPLAIN output. > > -- > Simon Riggs > 2ndQuadrant http://www.2ndQuadrant.com I've just inserted some newlines, so it's better to read than when my email-client wraps the lines automatically. Did not touch the information itself. But here is the normal output of EXPLAIN ANALYZE: EXPLAIN ANALYZE SELECT * FROM tparent JOIN tchild ON tchild.par_id = tparent.id WHERE tparent.ord BETWEEN 1 AND 4 ORDER BY tparent.ord, tchild.ord; QUERY PLAN ------------------------------------------------------------------------------------------- -------------------------------------------- Sort (cost=100000132.10..100000140.10 rows=8 width=16) (actual time=0.302..0.319 rows=9 loops=1) Sort Key: tparent.ord, tchild.ord -> Nested Loop (cost=0.00..84.10 rows=8 width=16) (actual time=0.181..0.267 rows=9 loops=1) -> Index Scan using par_uniq_ord on tparent (cost=0.00..20.40 rows=4 width=8) (actual time=0.100..0.109 rows=4 loops=1) Index Cond: ((ord >= 1) AND (ord <= 4)) -> Index Scan using chi_pkey_parid_ord on tchild (cost=0.00..9.93 rows=2 width=8) (actual time=0.020..0.026 rows=2 loops=4) Index Cond: (tchild.par_id = "outer".id) Total runtime: 0.412 ms (8 rows) The costs may be different because I've tuned the query planner's parameters. > Ordering by parent, child is fairly common but the variation you've got > here isn't that common. How else can you order by parent, child other than first ordering by a unique key of parent, then something in child? (Except for child.parent_id, child.something because this has all the information in child and can rely on a single multicolumn index.) Denes Daniel ------------------------------------------------------------------------ Olvasd az [origo]-t a mobilodon: mini magazinok a Mobizin-en ___________________________________________________ www.t-mobile.hu/mobizin
Denes Daniel wrote: > I've just inserted some newlines, so it's better to read than when my > email-client wraps the lines automatically. Did not touch the information > itself. But here is the normal output of EXPLAIN ANALYZE: The best thing to do is paste them in a text file and send it as an attachment. > QUERY PLAN > ------------------------------------------------------------------------------------------- > -------------------------------------------- > Sort (cost=100000132.10..100000140.10 rows=8 width=16) (actual > time=0.302..0.319 rows=9 loops=1) > Sort Key: tparent.ord, tchild.ord > -> Nested Loop (cost=0.00..84.10 rows=8 width=16) (actual > time=0.181..0.267 rows=9 loops=1) > -> Index Scan using par_uniq_ord on tparent (cost=0.00..20.40 > rows=4 width=8) (actual time=0.100..0.109 rows=4 loops=1) > Index Cond: ((ord >= 1) AND (ord <= 4)) > -> Index Scan using chi_pkey_parid_ord on tchild > (cost=0.00..9.93 rows=2 width=8) (actual time=0.020..0.026 rows=2 > loops=4) > Index Cond: (tchild.par_id = "outer".id) > Total runtime: 0.412 ms > (8 rows) > > The costs may be different because I've tuned the query planner's > parameters. Why did you set enable_sort=off? It's not like sorting 9 rows is going to take any noticeable amount of time anyway. -- Alvaro Herrera http://www.advogato.org/person/alvherre "No hay hombre que no aspire a la plenitud, es decir, la suma de experiencias de que un hombre es capaz"
In reply to Alvaro Herrera: > The best thing to do is paste them in a text file and send it as an > attachment. Okay, it's attached. > Why did you set enable_sort=off? It's not like sorting 9 rows is going > to take any noticeable amount of time anyway. Of course it's no problem for 9 rows, but this is only a test case. In production there will be much more. I just wanted to show that the planner doesn't even consider a plan without a sort step, using purely index scans. Denes Daniel ------------------------------------------------------------ Olvasd az [origo]-t a mobilodon: mini magazinok a Mobizin-en ___________________________________________________ www.t-mobile.hu/mobizin
Attachment
Simon Riggs <simon@2ndquadrant.com> wrote: > On Fri, 2007-09-21 at 21:20 +0200, Dániel Dénes wrote: > > > The costs may be different because I've tuned the query planner's > > parameters. > > OK, understood. > > > > Ordering by parent, child is fairly common but the variation you've > > > got here isn't that common. > > How else can you order by parent, child other than first ordering by > > a unique key of parent, then something in child? (Except for > > child.parent_id, child.something because this has all the > > information in child and can rely on a single multicolumn index.) > > Why "except"? Whats wrong with ordering that way? > > Make the case. **I** want it is not sufficient... > > -- > Simon Riggs > 2ndQuadrant http://www.2ndQuadrant.com In reply to Simon Riggs <simon@2ndquadrant.com>: > > How else can you order by parent, child other than first ordering by > > a unique key of parent, then something in child? (Except for > > child.parent_id, child.something because this has all the > > information in child and can rely on a single multicolumn index.) > > Why "except"? Whats wrong with ordering that way? Well, nothing, but what if I have to order by some other unique key? Of course I could do that by redundantly storing the parent's data in child and then creating a multicolumn index, but... Just to see clear: when I found this, I was trying to make a slightly different query. It was like: SELECT * FROM tparent JOIN tchild ON tchild.par_id = tparent.id WHERE tparent.uniqcol1 = 123 ORDER BY tparent.uniqcol2, tchild.ord; where there was a unique index on (tparent.uniqcol1, tparent.uniqcol2) and the columns are marked NOT NULL. I expected a plan like doing an index scan on parent.uniqcol2 where uniqcol1 = 123, and (using a nestloop and child's pkey) joining in the children in the correct order (without a sort). But I got something else, so I tried everything to get what I wanted -- just to see the costs why the planner chose something else. After some time I found out that there is no such plan, so no matter what I do it will sort... So that's how I got here. But since the original problem isn't that clean & simple, I thought I'd make a test case, that's easy to follow, and illustrates the problem: that the planner doesn't even consider my plan. If it did, I think that'd be the one that gets executed. But tell me if I'm wrong somewhere. > Make the case. **I** want it is not sufficient... Sorry, I can't understand that... I'm far from perfect in english. Please clarify so I can do what you ask me to. Denes Daniel ----------------------------------------------------- Olvasd az [origo]-t a mobilodon: mini magazinok a Mobizin-en ___________________________________________________ www.t-mobile.hu/mobizin
Denes Daniel <panther-d@freemail.hu> writes: > Simon Riggs <simon@2ndquadrant.com> wrote: >> Make the case. **I** want it is not sufficient... > Sorry, I can't understand that... I'm far from perfect in english. The point here is that you've repeated the same example N times without actually making a case that it's interesting to support. We have to think about the intellectual complexity that would be added to the planner to support this case, and the cycles that would be expended on every query (and wasted, for most queries) on trying to detect whether the case applies. If it were simple and cheap to do, these arguments wouldn't hold much weight, but it doesn't look to me like either is the case. Another problem is that it's not clear there's much to be gained. Avoiding the sort step is only interesting if the query produces so many rows that a sort would be expensive ... but if that's the case, it seems unlikely that a nestloop indexscan plan would be the best choice anyway. So basically this looks like a lot of work for a narrow and questionable gain. If you want it to happen you need to convince people that it's easier and more useful than it looks. regards, tom lane
Tom Lane <tgl@sss.pgh.pa.us> wrote: > The point here is that you've repeated the same example N times > without actually making a case that it's interesting to support. We > have to think about the intellectual complexity that would be added > to the planner to support this case, and the cycles that would be > expended on every query (and wasted, for most queries) on trying to > detect whether the case applies. If it were simple and cheap to do, > these arguments wouldn't hold much weight, but it doesn't look to > me like either is the case. > > Another problem is that it's not clear there's much to be gained. > Avoiding the sort step is only interesting if the query produces so > many rows that a sort would be expensive ... but if that's the case, it > seems unlikely that a nestloop indexscan plan would be the best > choice anyway. > > So basically this looks like a lot of work for a narrow and questionable > gain. If you want it to happen you need to convince people that it's > easier and more useful than it looks. > > regards, tom lane Well, I probably won't tell you that it's easy to do, because you know the planner far far better than I do -- I'm just a user who knows almost nothing about what's happening inside it. I just thought that if the planner is examining a plan, where the outer scan is using a unique index with all it's columns AND doing a nestloop join with the inner scan returning the rows in some order, then the planner could say "this is already ordered by (outer.unique, inner.some_order)" and wouldn't make a sort at the end. Of course this is far from perfect, because what about three/four/more table joins... maybe that can be derived from this, I don't really know now. Another situation that could be optimized this way is if I write "ORDER BY outer.idxcol1, outer.idxcol2, outer.id, inner.something" where + there is a non-unique index on (outer.idxcol1, outer.idxcol2), + outer.id is a PRI KEY, + there is an index on (inner.outer_id, inner.something) too but that's getting really complicated. Of course if the simpler case would be working, then I could create a unique index on (outer.idxcol1, outer.idxcol2, outer.id) and this would run optimized too. I think what's common is: + outer scan produces uniquely sorted rows + nested loop + inner scan produces sorted rows + ORDER BY says outer.unique_sort, inner.sort If this is the situation, the sort could be done separately. Even like: -> Nested Loop -> Sort by outer.unique -> Scan producing the required rows from outer -> Sort by inner.something -> Scan producing the required rows from inner Filter or Index Cond for the join And this is good for any query that needs data from two tables, but wants to get the joined rows so that all rows for a single outer-row are "packed together" (I mean they do not mix). Now about if it's worth it. I tried a query on real data, with lots of rows. Tables involved: - threads: ~200K rows - msgs: ~8M rows I wanted to see the last msgs from the last threads, but did not want to mix them. It's like PgSQL's mailing archive viewed by threads. I wanted them paged, not all 8M msgs at once (LIMIT/OFFSET). Query is: SELECT * FROM threads AS thr JOIN msgs AS msg ON msg.thrid = thr.id WHERE thr.forid = 1 ORDER BY thr.id DESC, msg.msgid DESC LIMIT 100 thr.forid is a FKEY to forums. Say forum 1 is the pgsql-performance list. Table msgs has only a PRI KEY: (thrid, msgid). Table threads has a PRI KEY: (id) and a unique index: (forid, id). First time I ran the query with seqscan, bitmapscan, hashjoin, mergejoin disabled (just to get the nested loop plan with the needless sort). Second time I ran it without disabling anything, but I modified ORDER BY to only "thr.id DESC" (deleted "msg.msgid DESC"), to give me the same plan as before, but without the sort. PSQL input and output attached. I think the 5000x speed would be worth it. Regards, Denes Daniel --------------------------------- Játssz a megújult Kvízparton! Válassz több mint 400 kvíz közül minden témában! ________________________________________________________ http://cthandler.adverticum.net/?cturl=http%3A%2F%2Fkvizpart.hu%2F?fmalja