Thread: BUG #7629: Suboptimal query plan when index search is possible and an additional search operator is given.
BUG #7629: Suboptimal query plan when index search is possible and an additional search operator is given.
From
dmigowski@ikoffice.de
Date:
The following bug has been logged on the website: Bug reference: 7629 Logged by: Daniel Migowski Email address: dmigowski@ikoffice.de PostgreSQL version: 9.1.2 Operating system: Debian Linux 5.0 Description: = Hi! PostgreSQL chooses to seq scan a table when I give an additional sort operator, and I don't know why. This is the small example: I have a table "delivery". There is a functional, partial index on the table I'd like to use for querying: CREATE INDEX idx_delivery_naturalnumber ON delivery(prep_natural_sort(number)) = WHERE recordstatus < 'X'; The field "number" is not null, btw. Then I do a query = select d.id, d.number from Delivery d = where d.recordStatus<'X' = and (d.tenant_id=3D1 or d.tenant_id=3D2) = order by prep_natural_sort(d.number) ASC = limit 1 The where-clause matches the partial index, and the order by clause matches the index fields, so no problem here. = "Limit (cost=3D0.00..0.34 rows=3D1 width=3D12)" " Output: id, number, (prep_natural_sort((number)::text))" " -> Index Scan using idx_delivery_naturalnumber on public.delivery d = (cost=3D0.00..46655.30 rows=3D135680 width=3D12)" " Output: id, number, prep_natural_sort((number)::text)" " Filter: ((d.tenant_id =3D 1) OR (d.tenant_id =3D 2))" However, when I also want to order by id: select d.id, d.number from Delivery d = where d.recordStatus<'X' = and (d.tenant_id=3D1 or d.tenant_id=3D2) = order by prep_natural_sort(d.number) ASC, = d.id ASC = limit 1 it does a sequential scan. "Limit (cost=3D43443.23..43443.24 rows=3D1 width=3D12)" " Output: id, number, (prep_natural_sort((number)::text))" " -> Sort (cost=3D43443.23..43782.43 rows=3D135680 width=3D12)" " Output: id, number, (prep_natural_sort((number)::text))" " Sort Key: (prep_natural_sort((d.number)::text)), d.id" " -> Seq Scan on public.delivery d (cost=3D0.00..42764.83 rows=3D135680 width=3D12)" " Output: id, number, prep_natural_sort((number)::text)" " Filter: ((d.recordstatus < 'X'::recordstatus) AND ((d.tenant_id =3D 1) OR (d.tenant_id =3D 2)))" This is a bit stupid, also because the relevant data could be fetched very fast by the first order-by expression, and then the results could be ordered again, which is then much faster than doing a full sequential scan on the data. Do you think its worth to optimize for this case? What are the problems? If the index scan is considered fast in the first case, why not do it also in the second case? The only difference would be that one had to fetch a few more rows in the index scan potentially, because the secondary sort would need to impose an order on the first rows returned from the index. = select VERSION(); "PostgreSQL 9.1.2 on i686-pc-linux-gnu, compiled by gcc (Debian 4.3.2-1.1) 4.3.2, 32-bit"
Re: BUG #7629: Suboptimal query plan when index search is possible and an additional search operator is given.
From
Tom Lane
Date:
dmigowski@ikoffice.de writes: > However, when I also want to order by id: > order by prep_natural_sort(d.number) ASC, > d.id ASC > it does a sequential scan. Sure. That index doesn't satisfy this sort order. (It could have gotten chosen anyway, if the partial-index predicate were selective enough, but evidently it isn't.) > This is a bit stupid, also because the relevant data could be fetched very > fast by the first order-by expression, and then the results could be ordered > again, which is then much faster than doing a full sequential scan on the > data. That's an unsupported assertion, which we'd have to write a great deal of code before we could even test. There are a lot of more useful places to spend hacking time, with greater assurance of the work not being wasted. regards, tom lane