Thread: BUG #18477: A specific SQL query with "ORDER BY ... NULLS FIRST" is performing poorly if an ordering column is n
BUG #18477: A specific SQL query with "ORDER BY ... NULLS FIRST" is performing poorly if an ordering column is n
From
PG Bug reporting form
Date:
The following bug has been logged on the website: Bug reference: 18477 Logged by: Alexander777 Email address: alexander.berezin3000@gmail.com PostgreSQL version: 14.6 Operating system: CentOS 7.6.1810 Description: Summary: A specific SQL query with "ORDER BY ... NULLS FIRST" is performing poorly if an ordering column is not nullable. Description: Recently, I encountered an intriguing performance issue with a SQL query over a table with 30+ millions of rows. This query takes significantly longer to execute than I anticipated. After a thorough investigation, I extracted a minimal example that clearly illustrates the problem, particularly when the ordering column is not NULL-able. Steps to Reproduce: 1. Database Setup: - PostgreSQL version: 14.6 - Client: psql 14.6 - Operating system: CentOS 7 - Architecture: x86-64 2. Create simple table and index CREATE TABLE T1 ( pk BIGSERIAL NOT NULL PRIMARY KEY, updated_at BIGINT NOT NULL DEFAULT 0 ); CREATE INDEX idx_t1_updated_at_pk ON T1 (updated_at, pk); 3. Add big number of rows, in my environment there are ~35M rows SELECT COUNT(*) FROM T1; count ---------- 35493666 (1 row) 4. Problematic Query: SELECT updated_at FROM T1 ORDER BY updated_at NULLS FIRST LIMIT 130000 5. Actual Result: The query takes tens of seconds to execute, which is significantly slower than expected. 5.1 EXPLAIN (ANALYZE,BUFFERS) output from problematic query: QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ Limit (cost=1774902.21..1790069.93 rows=130000 width=8) (actual time=12107.287..12225.239 rows=130000 loops=1) Buffers: shared hit=894055 read=162956 dirtied=3807, temp read=153949 written=231607 I/O Timings: read=6287.278 -> Gather Merge (cost=1774902.21..5093645.38 rows=28444384 width=8) (actual time=12107.285..12216.140 rows=130000 loops=1) Workers Planned: 2 Workers Launched: 2 Buffers: shared hit=894055 read=162956 dirtied=3807, temp read=153949 written=231607 I/O Timings: read=6287.278 -> Sort (cost=1773902.18..1809457.66 rows=14222192 width=8) (actual time=12043.738..12048.354 rows=44698 loops=3) Sort Key: updated_at NULLS FIRST Sort Method: external merge Disk: 204184kB Buffers: shared hit=894055 read=162956 dirtied=3807, temp read=153949 written=231607 I/O Timings: read=6287.278 Worker 0: Sort Method: external merge Disk: 210344kB Worker 1: Sort Method: external merge Disk: 210576kB -> Parallel Index Only Scan using idx_t1_updated_at_pk on t1 (cost=0.56..494747.42 rows=14222192 width=8) (actual time=0.088..7815.929 rows=11827789 loops=3) Heap Fetches: 858732 Buffers: shared hit=893979 read=162956 dirtied=3807 I/O Timings: read=6287.278 Planning Time: 0.099 ms Execution Time: 12278.073 ms (21 rows) 5.2 Same query but w/o NULLS FIRST SELECT updated_at FROM T1 ORDER BY updated_at LIMIT 130000 5.2.1 EXPLAIN (ANALYZE,BUFFERS) output: QUERY PLAN --------------------------------------------------------------------------------------------------------------------------------------------------------------------------- Limit (cost=0.56..2643.19 rows=130000 width=8) (actual time=0.063..30.874 rows=130000 loops=1) Buffers: shared hit=12 read=501 I/O Timings: read=9.383 -> Index Only Scan using idx_t1_updated_at_pk on t1 (cost=0.56..693858.10 rows=34133260 width=8) (actual time=0.062..21.435 rows=130000 loops=1) Heap Fetches: 0 Buffers: shared hit=12 read=501 I/O Timings: read=9.383 Planning Time: 0.083 ms Execution Time: 35.562 ms (9 rows) 5.3 Same query but with NULLS LAST SELECT updated_at FROM T1 ORDER BY updated_at NULLS LAST LIMIT 130000 5.3.1 EXPLAIN (ANALYZE,BUFFERS) output: QUERY PLAN --------------------------------------------------------------------------------------------------------------------------------------------------------------------------- Limit (cost=0.56..2643.19 rows=130000 width=8) (actual time=1.768..35.431 rows=130000 loops=1) Buffers: shared hit=11 read=502 I/O Timings: read=13.143 -> Index Only Scan using idx_t1_updated_at_pk on t1 (cost=0.56..693858.10 rows=34133260 width=8) (actual time=1.767..26.107 rows=130000 loops=1) Heap Fetches: 0 Buffers: shared hit=11 read=502 I/O Timings: read=13.143 Planning Time: 0.135 ms Execution Time: 40.427 ms (9 rows) 6. Expected Result: The query in 5. should have performance like in 5.2 or 5.3 for non-nullable columns. 7. Workaround: Workaround is to not specify NULLS FIRST explicitly for non-nullable columns. 8. Severity: Medium - This issue causes significant performance issues when querying on big tables (like ERROR: temporary file size exceeds temp_file_limit (5242880kB)).
Re: BUG #18477: A specific SQL query with "ORDER BY ... NULLS FIRST" is performing poorly if an ordering column is n
From
Tom Lane
Date:
PG Bug reporting form <noreply@postgresql.org> writes: > A specific SQL query with "ORDER BY ... NULLS FIRST" is performing poorly if > an ordering column is not nullable. The reason it's performing poorly is that ORDER BY updated_at NULLS FIRST is not compatible with the sort order of your index (which is, by default, NULLS LAST). So the query has to be done with an explicit sort, which requires reading the whole table. I know you are going to say that it shouldn't matter as long as the column is marked NOT NULL, but too bad: it does. This is not a bug, and it's not something we're likely to expend a great deal of sweat on improving. If you know the column is null-free, why are you writing NULLS FIRST? If you have a good reason to write NULLS FIRST, why not declare the index to match? regards, tom lane
Re: BUG #18477: A specific SQL query with "ORDER BY ... NULLS FIRST" is performing poorly if an ordering column is n
From
Alexander Alexander
Date:
I fixed it by removing NULLS FIRST from the query (it was introduced by our query builder). However, it seems the query optimizer could account for such scenarios.
Additionally, as you mentioned, the default index is created with NULLS LAST, but in this case, the column is non-nullable, making NULLS LAST unnecessary as well.
Additionally, as you mentioned, the default index is created with NULLS LAST, but in this case, the column is non-nullable, making NULLS LAST unnecessary as well.
Thank you,
Alexander.
чт, 23 мая 2024 г. в 20:26, Tom Lane <tgl@sss.pgh.pa.us>:
PG Bug reporting form <noreply@postgresql.org> writes:
> A specific SQL query with "ORDER BY ... NULLS FIRST" is performing poorly if
> an ordering column is not nullable.
The reason it's performing poorly is that
ORDER BY updated_at NULLS FIRST
is not compatible with the sort order of your index (which is,
by default, NULLS LAST). So the query has to be done with an
explicit sort, which requires reading the whole table.
I know you are going to say that it shouldn't matter as long as the
column is marked NOT NULL, but too bad: it does. This is not a bug,
and it's not something we're likely to expend a great deal of sweat
on improving. If you know the column is null-free, why are you
writing NULLS FIRST? If you have a good reason to write NULLS FIRST,
why not declare the index to match?
regards, tom lane
Re: BUG #18477: A specific SQL query with "ORDER BY ... NULLS FIRST" is performing poorly if an ordering column is n
From
Alvaro Herrera
Date:
On 2024-May-24, Alexander Alexander wrote: > Additionally, as you mentioned, the default index is created with NULLS > LAST, but in this case, the column is non-nullable, making NULLS LAST > unnecessary as well. But the NOT NULL constraint could be dropped at any minute, so the system needs to know where NULLs would go if that were to happen. -- Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/ "Los dioses no protegen a los insensatos. Éstos reciben protección de otros insensatos mejor dotados" (Luis Wu, Mundo Anillo)
Re: BUG #18477: A specific SQL query with "ORDER BY ... NULLS FIRST" is performing poorly if an ordering column is n
From
David Rowley
Date:
On Fri, 24 May 2024 at 22:03, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote: > > On 2024-May-24, Alexander Alexander wrote: > > > Additionally, as you mentioned, the default index is created with NULLS > > LAST, but in this case, the column is non-nullable, making NULLS LAST > > unnecessary as well. > > But the NOT NULL constraint could be dropped at any minute, so the > system needs to know where NULLs would go if that were to happen. In my understanding, the planner will hold a lock that will prevent a concurrent session from doing ALTER TABLE ... DROP NOT NULL, so if the planner were to do an optimisation such as this, I think it should be safe. Can you explain where the hazard is? In the EXECUTE of a prepared statement case, DROP NOT NULL should cause a relcache invalidation that should be noticed during AcquireExecutorLocks() which should result in a re-plan. In the re-plan, the optimisation will not be enabled. Isn't the argument you're making here just the same as in [1] which Tom explained was safe in [2]? I think this concern may have come from our inability to allow to allow functional dependency detection of columns that are dependant on a UNIQUE + NOT NULL constraint. e.g. CREATE TABLE t (a INT NOT NULL UNIQUE, b INT NOT NULL); CREATE VIEW v_t AS SELECT a,b FROM t GROUP BY a; The same works ok if you swap "UNIQUE" for "PRIMARY KEY" as PKs make the columns non-nullable. For UNIQUE constraints, we cannot allow the view to be created because we have no way to add a dependency to block the NOT NULL from being dropped which would invalidate the view. (Thanks for your work on getting us closer to allowing that. I hope you get more time to work on that for v18) David [1] https://postgr.es/m/202401231915.uwk6zrqbdvsu@alvherre.pgsql [2] https://postgr.es/m/4071562.1706038734@sss.pgh.pa.us
Re: BUG #18477: A specific SQL query with "ORDER BY ... NULLS FIRST" is performing poorly if an ordering column is n
From
Tom Lane
Date:
David Rowley <dgrowleyml@gmail.com> writes: > On Fri, 24 May 2024 at 22:03, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote: >> But the NOT NULL constraint could be dropped at any minute, so the >> system needs to know where NULLs would go if that were to happen. > In my understanding, the planner will hold a lock that will prevent a > concurrent session from doing ALTER TABLE ... DROP NOT NULL, so if the > planner were to do an optimisation such as this, I think it should be > safe. Can you explain where the hazard is? This has nothing to do with the planner. The question Alvaro responded to is whether the DDL definition of the index establishes which end nulls should go at. Which it does, independently of whether the table actually contains any nulls. Yes, we could possibly set things up so that an index with nulls last is considered to match a query that specifies NULLS FIRST if we know that the column is not-nullable. But I refuse to believe that this would be a good use of either development effort or planner cycles. AFAICS the problem is purely self-inflicted damage: why is the user specifying NULLS FIRST if he knows the column is not-null? regards, tom lane
Re: BUG #18477: A specific SQL query with "ORDER BY ... NULLS FIRST" is performing poorly if an ordering column is n
From
David Rowley
Date:
On Thu, 6 Jun 2024 at 15:48, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Yes, we could possibly set things up so that an index with nulls last > is considered to match a query that specifies NULLS FIRST if we know > that the column is not-nullable. But I refuse to believe that this > would be a good use of either development effort or planner cycles. > AFAICS the problem is purely self-inflicted damage: why is the user > specifying NULLS FIRST if he knows the column is not-null? I do agree with you when I think about this for the limited scope that you've described here. However, I think you've constrained your thinking of the usefulness of such an optimisation far more narrowly than it could be applied. While I've not given this much thought, it did occur to me that if you had two nullable columns which are in the same EquivalanceClass, which was generated by a strict equality clause. If these two columns are not in the same table, what's the reason, assuming the join condition is strict that we couldn't perform a Merge Join using presorted results from a NULLS FIRST index on one side of the join and a NULLS LAST on the other side? If the user has some genuine reason for creating a NULLS FIRST index for some other query, then it might be nice if we were able to use that index for Merge Joins instead of them having to create another index to speed up the join query. I've no plans to go and do anything to improve this situation, I just didn't want this thread to be left in a state that would put off anyone from making improvements in this area. Having the ability to better reuse existing indexes is a worthy goal, at least in my book. If an EquivalenceClass had some ability to track non-nullness of its members based on strict OpExprs, then we probably could optimise these sorts of queries better. I don't know how that would work with canonical PathKeys, but the EquivalenceClass must at least exist before a PathKey can, so maybe PathKey.pk_nulls_first can be left false when the EquivalenceClass states that no nulls can exist after evaluation of the OpExprs that allowed the EquivalenceClass to exist. There may be some hazards there I've not thought about, but it seems like considering those would be part of the work of anyone working on a patch to improve this area. David
Re: BUG #18477: A specific SQL query with "ORDER BY ... NULLS FIRST" is performing poorly if an ordering column is n
From
Laurenz Albe
Date:
On Thu, 2024-06-06 at 17:04 +1200, David Rowley wrote: > On Thu, 6 Jun 2024 at 15:48, Tom Lane <tgl@sss.pgh.pa.us> wrote: > > AFAICS the problem is purely self-inflicted damage: why is the user > > specifying NULLS FIRST if he knows the column is not-null? > > If the user has some genuine reason for creating a NULLS FIRST index > for some other query, then it might be nice if we were able to use > that index for Merge Joins instead of them having to create another > index to speed up the join query. I cannot imagine a genuine reason for creating a NULLS FIRST index on a NOT NULL column, but perhaps I am too narrow-minded. Yours, Laurenz Albe
Re: BUG #18477: A specific SQL query with "ORDER BY ... NULLS FIRST" is performing poorly if an ordering column is n
From
David Rowley
Date:
On Thu, 6 Jun 2024 at 20:10, Laurenz Albe <laurenz.albe@cybertec.at> wrote: > > On Thu, 2024-06-06 at 17:04 +1200, David Rowley wrote: > > If the user has some genuine reason for creating a NULLS FIRST index > > for some other query, then it might be nice if we were able to use > > that index for Merge Joins instead of them having to create another > > index to speed up the join query. > > I cannot imagine a genuine reason for creating a NULLS FIRST index > on a NOT NULL column, but perhaps I am too narrow-minded. You didn't quote the part where I said "if you had two nullable columns". It might be worth rereading what I wrote and taking the time to understand it. David