Thread: BUG #17651: Possible optimization for sort on not nullable field (indexed)
BUG #17651: Possible optimization for sort on not nullable field (indexed)
From
PG Bug reporting form
Date:
The following bug has been logged on the website: Bug reference: 17651 Logged by: Dimar van Rietschoten Email address: d_van_rietschoten@hotmail.com PostgreSQL version: 14.4 Operating system: Using google cloudsql Description: It is not a bug, but might be an opportunity for the optimizer. I have a table with: CREATE TABLE test ( period text NOT NULL, retailer_id int NOT NULL, order_lines int NOT NULL ); CREATE INDEX ix_test ON test(period, retailer_id, order_lines); The table contains 170.632.407 records. When performing the query: explain analyze select * from test where retailer_id = :retailerId and period = :period order by order_lines desc; I see a query plan: Index Scan Backward using ix_test on test (cost=0.57..694223.20 rows=2287319 width=28) (actual time=0.030..726.666 rows=2189972 loops=1) Index Cond: ((period = 'LAST_WEEK'::text) AND (retailer_id = 0)) However when using the nulls last in the order by (on a not nullable column), I get this query plan: Sort (cost=998367.86..1004086.15 rows=2287319 width=28) (actual time=1520.288..1795.242 rows=2189972 loops=1) Sort Key: order_lines DESC NULLS LAST Sort Method: external merge Disk: 81504kB -> Index Scan using ix_test on test (cost=0.57..694223.20 rows=2287319 width=28) (actual time=0.041..688.701 rows=2189972 loops=1) Index Cond: ((period = 'LAST_WEEK'::text) AND (retailer_id = 0)) (Query is: explain analyze select * from test where retailer_id = :retailerId and period = :period order by order_lines desc; ) Expected: as the column is not nullable, the index can still be used and there is no need for the separate sort. Some background: this query is dynamically generated. Some columns are nullable, others are not. Making this distinction in the code that generates the query makes the code harder to maintain. It would help if we can leave this to the optimizer instead.
Re: BUG #17651: Possible optimization for sort on not nullable field (indexed)
From
Julien Rouhaud
Date:
Hi, On Wed, Oct 19, 2022 at 10:16:42AM +0000, PG Bug reporting form wrote: > > It is not a bug, but might be an opportunity for the optimizer. > > I have a table with: > CREATE TABLE test > ( > period text NOT NULL, > retailer_id int NOT NULL, > order_lines int NOT NULL > ); > > CREATE INDEX ix_test ON test(period, retailer_id, order_lines); > > The table contains 170.632.407 records. > > When performing the query: > explain analyze > select * from test > where retailer_id = :retailerId and period = :period > order by order_lines desc; > > I see a query plan: > Index Scan Backward using ix_test on test (cost=0.57..694223.20 > rows=2287319 width=28) (actual time=0.030..726.666 rows=2189972 loops=1) > Index Cond: ((period = 'LAST_WEEK'::text) AND (retailer_id = 0)) > > However when using the nulls last in the order by (on a not nullable > column), I get this query plan: > Sort (cost=998367.86..1004086.15 rows=2287319 width=28) (actual > time=1520.288..1795.242 rows=2189972 loops=1) > Sort Key: order_lines DESC NULLS LAST > Sort Method: external merge Disk: 81504kB > -> Index Scan using ix_test on test (cost=0.57..694223.20 rows=2287319 > width=28) (actual time=0.041..688.701 rows=2189972 loops=1) > Index Cond: ((period = 'LAST_WEEK'::text) AND (retailer_id = 0)) > > Expected: as the column is not nullable, the index can still be used and > there is no need for the separate sort. That could work is you're only referencing a plain table, but if there are joins involved we won't have this information anymore, so it would add extra work in the planner for very few opportunities to optimize things. There's a patch proposed recently (https://commitfest.postgresql.org/40/3755/) to track whether a column reference could be NULL or not when joins are involved, so maybe such an optimization would make more sense once this infrastructure is available.