Thread: Planner issue on sorting joining of two tables with limit
Hello, everybody!
I'm using PostgreSQL 8.4.3, compiled by Visual C++ build 1400, 32-bit on Windows XP SP3.
I use following data model for issue reproducing.
CREATE TABLE test1
(
id integer NOT NULL,
"value" double precision,
CONSTRAINT test1_pkey PRIMARY KEY (id)
);
CREATE INDEX test1_value_idx ON test1(value);
CREATE TABLE test2
(
id integer NOT NULL,
id1 integer NOT NULL REFERENCES test2 (id),
"value" double precision,
CONSTRAINT test2_pkey PRIMARY KEY (id)
);
CREATE INDEX test2_id1_value_idx ON test2(id1, value);
Following statements generate 200 rows of test data into test1 table and 1000000 rows of test data into test2 table.
INSERT INTO test1 (id, value) (SELECT x, random() from generate_series(1,200) x);
INSERT INTO test2 (id, id1, value) (SELECT x, (random()*200)::int + 1, random() from generate_series(1,1000000) x);
The following statements return top 10 rows from joining of two tables ordered by table1.value and table2.value. The
SELECT t1.value AS value1, t2.value AS value2 FROM test1 t1 JOIN test2 t2 ON t2.id1 = t1.id ORDER BY t1.value, t2.value
LIMIT 10
value1 | value2
---------------------+---------------------
0.00562104489654303 | 0.00039379671216011
0.00562104489654303 | 0.000658359378576279
0.00562104489654303 | 0.000668979249894619
0.00562104489654303 | 0.000768951140344143
0.00562104489654303 | 0.00121330376714468
0.00562104489654303 | 0.00122168939560652
0.00562104489654303 | 0.00124016962945461
0.00562104489654303 | 0.00134057039394975
0.00562104489654303 | 0.00169069319963455
0.00562104489654303 | 0.00171623658388853
(10 rows)
The statement plan doesn't use indexes. So the statement it is slow.
Limit (cost=50614.88..50614.91 rows=10 width=16) (actual time=8388.331..8388.382 rows=10 loops=1)
-> Sort (cost=50614.88..53102.45 rows=995025 width=16) (actual time=8388.324..8388.340 rows=10 loops=1)
Sort Key: t1.value, t2.value
Sort Method: top-N heapsort Memory: 17kB
-> Hash Join (cost=6.50..29112.75 rows=995025 width=16) (actual time=0.982..6290.516 rows=997461 loops=1)
Hash Cond: (t2.id1 = t1.id)
-> Seq Scan on test2 t2 (cost=0.00..15406.00 rows=1000000 width=12) (actualtime=0.088..2047.910 rows=1000000 loops=1)
-> Hash (cost=4.00..4.00 rows=200 width=12) (actual time=0.870..0.870 rows=200 loops=1)
-> Seq Scan on test1 t1 (cost=0.00..4.00 rows=200 width=12) (actual time=0.010..0.428 rows=200 loops=1)
Total runtime: 8388.473 ms
I can remove ordering by test2.value.
SELECT t1.value AS value1, t2.value AS value2 FROM test1 t1 JOIN test2 t2 ON t2.id1 = t1.id ORDER BY t1.value LIMIT 10
Then the result is the same.
value1 | value2
---------------------+---------------------
0.00562104489654303 | 0.00039379671216011
0.00562104489654303 | 0.000658359378576279
0.00562104489654303 | 0.000668979249894619
0.00562104489654303 | 0.000768951140344143
0.00562104489654303 | 0.00121330376714468
0.00562104489654303 | 0.00122168939560652
0.00562104489654303 | 0.00124016962945461
0.00562104489654303 | 0.00134057039394975
0.00562104489654303 | 0.00169069319963455
0.00562104489654303 | 0.00171623658388853
(10 rows)
The statement plan uses indexes and statement runs fast. This plan is exactly what I need.
Limit (cost=0.00..0.62 rows=10 width=16) (actual time=0.049..0.148 rows=10 loops=1)
-> Nested Loop (cost=0.00..62109.86 rows=995025 width=16) (actual time=0.044..0.107 rows=10 loops=1)
-> Index Scan using test1_value_idx on test1 t1 (cost=0.00..19.19 rows=200 width=12) (actual time=0.017..0.017 rows=1 loops=1)
-> Index Scan using test2_id1_value_idx on test2 t2 (cost=0.00..248.27 rows=4975 width=12) (actual time=0.013..0.042 rows=10 loops=1)
Index Cond: (t2.id1 = t1.id)
Total runtime: 0.224 ms
So PostgreSQL planner can produce the plan I need but it doesn't produce this plan when I specify particular second ordering column. So is there any way to make planner produce desired plan when particular second ordering column is specified?
With best regards,
Korotkov Alexander.
I'm using PostgreSQL 8.4.3, compiled by Visual C++ build 1400, 32-bit on Windows XP SP3.
I use following data model for issue reproducing.
CREATE TABLE test1
(
id integer NOT NULL,
"value" double precision,
CONSTRAINT test1_pkey PRIMARY KEY (id)
);
CREATE INDEX test1_value_idx ON test1(value);
CREATE TABLE test2
(
id integer NOT NULL,
id1 integer NOT NULL REFERENCES test2 (id),
"value" double precision,
CONSTRAINT test2_pkey PRIMARY KEY (id)
);
CREATE INDEX test2_id1_value_idx ON test2(id1, value);
Following statements generate 200 rows of test data into test1 table and 1000000 rows of test data into test2 table.
INSERT INTO test1 (id, value) (SELECT x, random() from generate_series(1,200) x);
INSERT INTO test2 (id, id1, value) (SELECT x, (random()*200)::int + 1, random() from generate_series(1,1000000) x);
The following statements return top 10 rows from joining of two tables ordered by table1.value and table2.value. The
SELECT t1.value AS value1, t2.value AS value2 FROM test1 t1 JOIN test2 t2 ON t2.id1 = t1.id ORDER BY t1.value, t2.value
LIMIT 10
value1 | value2
---------------------+---------------------
0.00562104489654303 | 0.00039379671216011
0.00562104489654303 | 0.000658359378576279
0.00562104489654303 | 0.000668979249894619
0.00562104489654303 | 0.000768951140344143
0.00562104489654303 | 0.00121330376714468
0.00562104489654303 | 0.00122168939560652
0.00562104489654303 | 0.00124016962945461
0.00562104489654303 | 0.00134057039394975
0.00562104489654303 | 0.00169069319963455
0.00562104489654303 | 0.00171623658388853
(10 rows)
The statement plan doesn't use indexes. So the statement it is slow.
Limit (cost=50614.88..50614.91 rows=10 width=16) (actual time=8388.331..8388.382 rows=10 loops=1)
-> Sort (cost=50614.88..53102.45 rows=995025 width=16) (actual time=8388.324..8388.340 rows=10 loops=1)
Sort Key: t1.value, t2.value
Sort Method: top-N heapsort Memory: 17kB
-> Hash Join (cost=6.50..29112.75 rows=995025 width=16) (actual time=0.982..6290.516 rows=997461 loops=1)
Hash Cond: (t2.id1 = t1.id)
-> Seq Scan on test2 t2 (cost=0.00..15406.00 rows=1000000 width=12) (actualtime=0.088..2047.910 rows=1000000 loops=1)
-> Hash (cost=4.00..4.00 rows=200 width=12) (actual time=0.870..0.870 rows=200 loops=1)
-> Seq Scan on test1 t1 (cost=0.00..4.00 rows=200 width=12) (actual time=0.010..0.428 rows=200 loops=1)
Total runtime: 8388.473 ms
I can remove ordering by test2.value.
SELECT t1.value AS value1, t2.value AS value2 FROM test1 t1 JOIN test2 t2 ON t2.id1 = t1.id ORDER BY t1.value LIMIT 10
Then the result is the same.
value1 | value2
---------------------+---------------------
0.00562104489654303 | 0.00039379671216011
0.00562104489654303 | 0.000658359378576279
0.00562104489654303 | 0.000668979249894619
0.00562104489654303 | 0.000768951140344143
0.00562104489654303 | 0.00121330376714468
0.00562104489654303 | 0.00122168939560652
0.00562104489654303 | 0.00124016962945461
0.00562104489654303 | 0.00134057039394975
0.00562104489654303 | 0.00169069319963455
0.00562104489654303 | 0.00171623658388853
(10 rows)
The statement plan uses indexes and statement runs fast. This plan is exactly what I need.
Limit (cost=0.00..0.62 rows=10 width=16) (actual time=0.049..0.148 rows=10 loops=1)
-> Nested Loop (cost=0.00..62109.86 rows=995025 width=16) (actual time=0.044..0.107 rows=10 loops=1)
-> Index Scan using test1_value_idx on test1 t1 (cost=0.00..19.19 rows=200 width=12) (actual time=0.017..0.017 rows=1 loops=1)
-> Index Scan using test2_id1_value_idx on test2 t2 (cost=0.00..248.27 rows=4975 width=12) (actual time=0.013..0.042 rows=10 loops=1)
Index Cond: (t2.id1 = t1.id)
Total runtime: 0.224 ms
So PostgreSQL planner can produce the plan I need but it doesn't produce this plan when I specify particular second ordering column. So is there any way to make planner produce desired plan when particular second ordering column is specified?
With best regards,
Korotkov Alexander.
=?KOI8-R?B?68/Sz9TLz9cg4czFy9PBzsTS?= <aekorotkov@gmail.com> writes: > So PostgreSQL planner can produce the plan I need but it doesn't produce > this plan when I specify particular second ordering column. Well, no, because that plan wouldn't produce the specified ordering; or at least it would be a lucky coincidence if it did. It's only sorting on t1.value. > So is there any > way to make planner produce desired plan when particular second ordering > column is specified? Not when the ordering columns come from two different tables. (If they were in the same table then scanning a two-column index could produce the demanded sort order.) I don't see any way to satisfy this query without an explicit sort step, which means it has to look at the whole join output. If you're willing to make assumptions like "the required 10 rows will be within the first 100 t1.value rows" then you could nest an ORDER BY t1.value LIMIT 100 query inside something that did an ORDER BY with both columns. But this fails if you have too many duplicate t1.value values, and your test case suggests that you might have a lot of them. In any case it would stop being fast if you make the inner LIMIT very large. regards, tom lane
Well, no, because that plan wouldn't produce the specified ordering;
or at least it would be a lucky coincidence if it did. It's only
sorting on t1.value.
I just don't find why it is coincidence. I think that such plan will always produce result ordered by two columns, because such nested index scan always produce this result.
Let's consider some simple example in order to illustrate how this plan works.
t1
id | value
---+------
1 | 0.1
2 | 0.3
3 | 0.2
t2
id | id1 | value
---+-----+------
1 | 2 | 0.2
2 | 1 | 0.9
3 | 2 | 0.6
4 | 1 | 0.7
5 | 1 | 0.4
6 | 3 | 0.2
1) The outer index scan will find the row of t1 with least value using test1_value_idx. It will be row (1, 0.1)
2) The inner index scan will find all the rows in t2 where id1 = 1 using test2_id1_value_idx. But index test2_id1_value_idx have second order by value column and the index scan result will be ordered by value. That's why inner index scan will find rows (5, 1, 0.4), (4, 1, 0.7) and (2, 1, 0.9). And following query output will be produced:
value1 | value2
--------+-------
0.1 | 0.4
0.1 | 0.7
0.1 | 0.9
3) The outer index scan will find the row of t1 with the second value using test1_value_idx. It will be row (3, 0.2)
4) The inner index scan will find all the rows in t2 where id1 = 3 using test2_id1_value_idx. This row is (6, 3, 0.2). The following query output will be produced:
value1 | value2
--------+-------
0.2 | 0.2
5) The outer index scan will find the row of t1 with the third value using test1_value_idx. It will be row (2, 0.3)
6) The inner index scan will find all the rows in t2 where id1 = 2 using test2_id1_value_idx. These rows are (1, 2, 0.2) and (3, 2, 0.6). The following query output will be produced:
value1 | value2
--------+-------
0.3 | 0.2
0.3 | 0.6
And the whole query result is:
value1 | value2
--------+-------
0.1 | 0.4
0.1 | 0.7
0.1 | 0.9
0.2 | 0.2
0.3 | 0.2
0.3 | 0.6
And this result is really ordered by t1.value, t2.value.
I can't find error in my reasoning :)
The query without limit produce similar plan.
Let's consider some simple example in order to illustrate how this plan works.
t1
id | value
---+------
1 | 0.1
2 | 0.3
3 | 0.2
t2
id | id1 | value
---+-----+------
1 | 2 | 0.2
2 | 1 | 0.9
3 | 2 | 0.6
4 | 1 | 0.7
5 | 1 | 0.4
6 | 3 | 0.2
1) The outer index scan will find the row of t1 with least value using test1_value_idx. It will be row (1, 0.1)
2) The inner index scan will find all the rows in t2 where id1 = 1 using test2_id1_value_idx. But index test2_id1_value_idx have second order by value column and the index scan result will be ordered by value. That's why inner index scan will find rows (5, 1, 0.4), (4, 1, 0.7) and (2, 1, 0.9). And following query output will be produced:
value1 | value2
--------+-------
0.1 | 0.4
0.1 | 0.7
0.1 | 0.9
3) The outer index scan will find the row of t1 with the second value using test1_value_idx. It will be row (3, 0.2)
4) The inner index scan will find all the rows in t2 where id1 = 3 using test2_id1_value_idx. This row is (6, 3, 0.2). The following query output will be produced:
value1 | value2
--------+-------
0.2 | 0.2
5) The outer index scan will find the row of t1 with the third value using test1_value_idx. It will be row (2, 0.3)
6) The inner index scan will find all the rows in t2 where id1 = 2 using test2_id1_value_idx. These rows are (1, 2, 0.2) and (3, 2, 0.6). The following query output will be produced:
value1 | value2
--------+-------
0.3 | 0.2
0.3 | 0.6
And the whole query result is:
value1 | value2
--------+-------
0.1 | 0.4
0.1 | 0.7
0.1 | 0.9
0.2 | 0.2
0.3 | 0.2
0.3 | 0.6
And this result is really ordered by t1.value, t2.value.
I can't find error in my reasoning :)
The query without limit produce similar plan.
Nested Loop (cost=0.00..62109.86 rows=995025 width=16)
-> Index Scan using test1_value_idx on test1 t1 (cost=0.00..19.19 rows=200 width=12)
-> Index Scan using test2_id1_value_idx on test2 t2 (cost=0.00..248.27 rows=4975 width=12)
And I checked that the result is ordered by t1.value and t2.value.
2010/4/26 Tom Lane <tgl@sss.pgh.pa.us>
Коротков Александр <aekorotkov@gmail.com> writes:Well, no, because that plan wouldn't produce the specified ordering;
> So PostgreSQL planner can produce the plan I need but it doesn't produce
> this plan when I specify particular second ordering column.
or at least it would be a lucky coincidence if it did. It's only
sorting on t1.value.Not when the ordering columns come from two different tables. (If they
> So is there any
> way to make planner produce desired plan when particular second ordering
> column is specified?
were in the same table then scanning a two-column index could produce
the demanded sort order.) I don't see any way to satisfy this query
without an explicit sort step, which means it has to look at the whole
join output.
If you're willing to make assumptions like "the required 10 rows will be
within the first 100 t1.value rows" then you could nest an ORDER BY
t1.value LIMIT 100 query inside something that did an ORDER BY with both
columns. But this fails if you have too many duplicate t1.value values,
and your test case suggests that you might have a lot of them. In any
case it would stop being fast if you make the inner LIMIT very large.
regards, tom lane
I found my mistake. My supposition is working only if value column in t1 table is unique. But if I replace the index by unique one then plan is the same.
On Mon, May 3, 2010 at 5:57 PM, Alexander Korotkov <aekorotkov@gmail.com> wrote:
Well, no, because that plan wouldn't produce the specified ordering;
or at least it would be a lucky coincidence if it did. It's only
sorting on t1.value.EXPLAIN SELECT t1.value AS value1, t2.value AS value2 FROM test1 t1 JOIN test2 t2 ON t2.id1 = t1.id ORDER BY t1.valueI just don't find why it is coincidence. I think that such plan will always produce result ordered by two columns, because such nested index scan always produce this result.2) The inner index scan will find all the rows in t2 where id1 = 1 using test2_id1_value_idx. But index test2_id1_value_idx have second order by value column and the index scan result will be ordered by value. That's why inner index scan will find rows (5, 1, 0.4), (4, 1, 0.7) and (2, 1, 0.9). And following query output will be produced:
Let's consider some simple example in order to illustrate how this plan works.
t1
id | value
---+------
1 | 0.1
2 | 0.3
3 | 0.2
t2
id | id1 | value
---+-----+------
1 | 2 | 0.2
2 | 1 | 0.9
3 | 2 | 0.6
4 | 1 | 0.7
5 | 1 | 0.4
6 | 3 | 0.2
1) The outer index scan will find the row of t1 with least value using test1_value_idx. It will be row (1, 0.1)4) The inner index scan will find all the rows in t2 where id1 = 3 using test2_id1_value_idx. This row is (6, 3, 0.2). The following query output will be produced:
value1 | value2
--------+-------
0.1 | 0.4
0.1 | 0.7
0.1 | 0.9
3) The outer index scan will find the row of t1 with the second value using test1_value_idx. It will be row (3, 0.2)6) The inner index scan will find all the rows in t2 where id1 = 2 using test2_id1_value_idx. These rows are (1, 2, 0.2) and (3, 2, 0.6). The following query output will be produced:
value1 | value2
--------+-------
0.2 | 0.2
5) The outer index scan will find the row of t1 with the third value using test1_value_idx. It will be row (2, 0.3)
value1 | value2
--------+-------
0.3 | 0.2
0.3 | 0.6
And the whole query result is:
value1 | value2
--------+-------
0.1 | 0.4
0.1 | 0.7
0.1 | 0.9
0.2 | 0.2
0.3 | 0.2
0.3 | 0.6
And this result is really ordered by t1.value, t2.value.
I can't find error in my reasoning :)
The query without limit produce similar plan.
Nested Loop (cost=0.00..62109.86 rows=995025 width=16)-> Index Scan using test1_value_idx on test1 t1 (cost=0.00..19.19 rows=200 width=12)-> Index Scan using test2_id1_value_idx on test2 t2 (cost=0.00..248.27 rows=4975 width=12)And I checked that the result is ordered by t1.value and t2.value.2010/4/26 Tom Lane <tgl@sss.pgh.pa.us>Коротков Александр <aekorotkov@gmail.com> writes:
> So PostgreSQL planner can produce the plan I need but it doesn't produce
> this plan when I specify particular second ordering column.Well, no, because that plan wouldn't produce the specified ordering;
or at least it would be a lucky coincidence if it did. It's only
sorting on t1.value.Not when the ordering columns come from two different tables. (If they
> So is there any
> way to make planner produce desired plan when particular second ordering
> column is specified?
were in the same table then scanning a two-column index could produce
the demanded sort order.) I don't see any way to satisfy this query
without an explicit sort step, which means it has to look at the whole
join output.
If you're willing to make assumptions like "the required 10 rows will be
within the first 100 t1.value rows" then you could nest an ORDER BY
t1.value LIMIT 100 query inside something that did an ORDER BY with both
columns. But this fails if you have too many duplicate t1.value values,
and your test case suggests that you might have a lot of them. In any
case it would stop being fast if you make the inner LIMIT very large.
regards, tom lane
Alexander Korotkov <aekorotkov@gmail.com> wrote: > Alexander Korotkov <aekorotkov@gmail.com> wrote: >>> Well, no, because that plan wouldn't produce the specified >>> ordering; or at least it would be a lucky coincidence if it did. >>> It's only sorting on t1.value. >>> >> I just don't find why it is coincidence. I think that such plan >> will always produce result ordered by two columns, because such >> nested index scan always produce this result. Assuming a nested index scan, or any particular plan, is unwise. New data or just the "luck of the draw" on your next ANALYZE could result in a totally different plan which wouldn't produce the same ordering unless specified. > I found my mistake. My supposition is working only if value column > in t1 table is unique. But if I replace the index by unique one > then plan is the same. Yeah, maybe, for the moment. When you have ten times the quantity of data, a completely different plan may be chosen. If you want a particular order, ask for it. The planner will even take the requested ordering into account when choosing a plan, so the cutoff for switching to an in-memory hash table or a bitmap index scan might shift a bit based on the calculated cost of sorting data. -Kevin
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes: > Alexander Korotkov <aekorotkov@gmail.com> wrote: >>> I just don't find why it is coincidence. I think that such plan >>> will always produce result ordered by two columns, because such >>> nested index scan always produce this result. > Assuming a nested index scan, or any particular plan, is unwise. I think he's proposing that the planner should recognize that a plan of this type produces a result sorted by the additional index columns. I'm not convinced either that the sortedness property really holds, or that it would be worth the extra planning effort to check for; but it's not a fundamentally misguided proposal. regards, tom lane
On Fri, May 7, 2010 at 11:35 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > "Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes: >> Alexander Korotkov <aekorotkov@gmail.com> wrote: >>>> I just don't find why it is coincidence. I think that such plan >>>> will always produce result ordered by two columns, because such >>>> nested index scan always produce this result. > >> Assuming a nested index scan, or any particular plan, is unwise. > > I think he's proposing that the planner should recognize that a plan > of this type produces a result sorted by the additional index columns. > I'm not convinced either that the sortedness property really holds, > or that it would be worth the extra planning effort to check for; > but it's not a fundamentally misguided proposal. I took a slightly different point - a nested loop will be ordered by the ordering of the outer side and then, within that, the ordering of the inner side, presuming (not quite sure how to phrase this) that the outer side is "unique enough" with respect to the ordering. I'm not too sure whether there's anything useful we can do with this information in a reasonable number of CPU cycles, but it is something I've noticed before while reading the code. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company