Thread: BUG #15717: Index not used when ordering by left joined table column
BUG #15717: Index not used when ordering by left joined table column
From
PG Bug reporting form
Date:
The following bug has been logged on the website: Bug reference: 15717 Logged by: Alex Email address: cdalxndr@yahoo.com PostgreSQL version: 10.0 Operating system: Windows 10 Description: The following query runs slow, as it seq scans tables, without using any index: select wp.id as id from web_page wp left join web_url_path url on wp.id=url.page order by url.priority asc limit 1 Explain analyze: "Limit (cost=180140.55..180140.55 rows=1 width=8) (actual time=2402.310..2402.310 rows=1 loops=1)" " -> Sort (cost=180140.55..185578.31 rows=2175106 width=8) (actual time=2402.309..2402.309 rows=1 loops=1)" " Sort Key: url.priority" " Sort Method: top-N heapsort Memory: 25kB" " -> Hash Right Join (cost=68981.25..169265.02 rows=2175106 width=8) (actual time=588.010..2232.158 rows=2159417 loops=1)" " Hash Cond: (url.page = wp.id)" " -> Seq Scan on web_url_path url (cost=0.00..47376.06 rows=2175106 width=8) (actual time=0.044..403.351 rows=2175106 loops=1)" " -> Hash (cost=33480.89..33480.89 rows=2163789 width=4) (actual time=581.473..581.473 rows=2158888 loops=1)" " Buckets: 131072 Batches: 32 Memory Usage: 3403kB" " -> Seq Scan on web_page wp (cost=0.00..33480.89 rows=2163789 width=4) (actual time=0.076..251.148 rows=2158888 loops=1)" "Planning time: 0.295 ms" "Execution time: 2402.956 ms" Note that I have indexes (btree) on columns "web_url_path.priority" and "web_url_path.page". I was expecting the "priority" column index would be used to retrieve first url then do a reverse join to get the corresponding page, and continue if no page is found. Note that the following query uses the "priority" index: select u.id from web_url_path u order by u.priority limit 1 Explain analyze: "Limit (cost=0.43..0.50 rows=1 width=8) (actual time=0.014..0.014 rows=1 loops=1)" " -> Index Scan using web_url_path_priority_idx on web_url_path u (cost=0.43..144920.91 rows=2175106 width=8) (actual time=0.014..0.014 rows=1 loops=1)" "Planning time: 0.114 ms" "Execution time: 0.026 ms"
PG Bug reporting form <noreply@postgresql.org> writes: > The following query runs slow, as it seq scans tables, without using any > index: > select > wp.id as id > from > web_page wp > left join web_url_path url on wp.id=url.page > order by > url.priority asc > limit 1 > I was expecting the "priority" column index would be used to retrieve first > url then do a reverse join to get the corresponding page, and continue if no > page is found. It would probably do that if you used a plain join, or a right join. But a nestloop plan cannot be driven from the inside of an outer join, and that's the case this query presents. regards, tom lane
Indeed, with an inner join the index is used.
Thanks for the assistance.
On Thursday, March 28, 2019, 3:19:30 AM GMT+2, Tom Lane <tgl@sss.pgh.pa.us> wrote:
PG Bug reporting form <noreply@postgresql.org> writes:
> The following query runs slow, as it seq scans tables, without using any
> index:
> select
> wp.id as id
> from
> web_page wp
> left join web_url_path url on wp.id=url.page
> order by
> url.priority asc
> limit 1
> I was expecting the "priority" column index would be used to retrieve first
> url then do a reverse join to get the corresponding page, and continue if no
> page is found.
It would probably do that if you used a plain join, or a right join.
But a nestloop plan cannot be driven from the inside of an outer join,
and that's the case this query presents.
regards, tom lane