Thread: A performance issue in ROW_NUMBER() OVER(ORDER BY NULL) [27 times slow than OVER()] V14.5

Okay,
  I have some converted code that uses this syntax.
  For 20 Million rows it was taking 15-20 minutes!  (versus 3 minutes) on live data.
  See here: https://explain.depesz.com/s/VQFJ  [There are 2 optimizations, removing the ORDER BY NULL, and just using a sequence]
  (The above is a live data run)  The IO with the ORDER BY NULL is crazy high.  10x the READ in GB!

  The solution is to remove the ORDER BY NULL.  [since that is not sortable, should it be ignored?]

  This does NOT SHOW UP with 1 million rows.

simple example (slow):
explain (verbose, buffers, analyze) select row_number() over(ORDER BY NULL) as rn, x, random() from generate_series(1,20000000) x order by x desc, random();
                                                                           QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------
 Sort  (cost=3365975.67..3415975.67 rows=20000000 width=52) (actual time=816050.580..817984.447 rows=20000000 loops=1)
   Output: (row_number() OVER (?)), x, (random()), NULL::text
   Sort Key: x.x DESC, (random())
   Sort Method: external merge  Disk: 665376kB
   Buffers: temp read=20136158 written=161301
   ->  WindowAgg  (cost=0.00..550000.00 rows=20000000 width=52) (actual time=12585.463..791440.272 rows=20000000 loops=1)
         Output: row_number() OVER (?), x, random(), (NULL::text)
         Buffers: temp read=20052986 written=78126
         ->  Function Scan on pg_catalog.generate_series x  (cost=0.00..200000.00 rows=20000000 width=36) (actual time=3282.882..6343.470 rows=20000000 loops=1)
               Output: NULL::text, x
               Function Call: generate_series(1, 20000000)
               Buffers: temp read=34180 written=34180
 Query Identifier: 7415410443092007025
 Planning Time: 0.036 ms
 JIT:
   Functions: 7
   Options: Inlining true, Optimization true, Expressions true, Deforming true
   Timing: Generation 0.231 ms, Inlining 5.021 ms, Optimization 18.850 ms, Emission 12.934 ms, Total 37.036 ms
 Execution Time: 819003.415 ms
(19 rows)

Time: 819004.041 ms (13:39.004)



the fixed version:
explain (verbose, buffers, analyze) select row_number() over() as rn, x, random() from generate_series(1,20000000) x order by x desc, random();
                                                                           QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------
 Sort  (cost=3159725.67..3209725.67 rows=20000000 width=20) (actual time=23361.306..25466.728 rows=20000000 loops=1)
   Output: (row_number() OVER (?)), x, (random())
   Sort Key: x.x DESC, (random())
   Sort Method: external merge  Disk: 665376kB
   Buffers: temp read=117352 written=117355
   ->  WindowAgg  (cost=0.00..500000.00 rows=20000000 width=20) (actual time=3409.882..9789.217 rows=20000000 loops=1)
         Output: row_number() OVER (?), x, random()
         Buffers: temp read=34180 written=34180
         ->  Function Scan on pg_catalog.generate_series x  (cost=0.00..200000.00 rows=20000000 width=4) (actual time=3409.853..6768.737 rows=20000000 loops=1)
               Output: x
               Function Call: generate_series(1, 20000000)
               Buffers: temp read=34180 written=34180
 Query Identifier: -8739706385719272689
 Planning Time: 0.067 ms
 JIT:
   Functions: 4
   Options: Inlining true, Optimization true, Expressions true, Deforming true
   Timing: Generation 0.180 ms, Inlining 32.829 ms, Optimization 5.890 ms, Emission 3.899 ms, Total 42.798 ms
 Execution Time: 26070.514 ms

Time: 26071.112 ms (00:26.071)



Kirk Wolak <wolakk@gmail.com> writes:
>   I have some converted code that uses this syntax.

Seems kinda dumb, but ...

>   The solution is to remove the ORDER BY NULL.  [since that is not
> sortable, should it be ignored?]
>   This does NOT SHOW UP with 1 million rows.

I don't see it at all.  Comparing your two test queries on released
branches, I see maybe 2x penalty for the ORDER BY NULL, not 30x.
(In HEAD there's only about 13% penalty.)  I wonder what PG version
you are testing.

            regards, tom lane



On Mon, 20 Feb 2023 at 10:18, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I don't see it at all.  Comparing your two test queries on released
> branches, I see maybe 2x penalty for the ORDER BY NULL, not 30x.
> (In HEAD there's only about 13% penalty.)  I wonder what PG version
> you are testing.

I suspect ed1a88dda would be what made this faster in master. We'll
check for peer rows to check "NULL IS NOT DISTINCT FROM NULL" prior to
that change with the ORDER BY NULL query.

David



David Rowley <dgrowleyml@gmail.com> writes:
> I suspect ed1a88dda would be what made this faster in master. We'll
> check for peer rows to check "NULL IS NOT DISTINCT FROM NULL" prior to
> that change with the ORDER BY NULL query.

Mmm, yeah, probably so: "order by null rows between unbounded preceding
and current row" has about the same performance in v15 as HEAD has
with just "order by null".

I suspect most of the remaining performance discrepancy is just triggered
by having to pass the extra always-NULL column forward through the various
plan steps.  We could teach createplan.c to generate a WindowAgg plan node
that omits the useless column from ordNumCols/ordColIdx/etc, but I'm not
sure if that'd save much in itself.  (The comment in create_windowagg_plan
shows we already thought about that and decided it wasn't worth the
trouble.)  Getting rid of the useless targetlist column altogether would
be way more invasive, and I'm not inclined to try.

            regards, tom lane





po 20. 2. 2023 v 0:26 odesílatel David Rowley <dgrowleyml@gmail.com> napsal:
On Mon, 20 Feb 2023 at 10:18, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I don't see it at all.  Comparing your two test queries on released
> branches, I see maybe 2x penalty for the ORDER BY NULL, not 30x.
> (In HEAD there's only about 13% penalty.)  I wonder what PG version
> you are testing.

I suspect ed1a88dda would be what made this faster in master. We'll
check for peer rows to check "NULL IS NOT DISTINCT FROM NULL" prior to
that change with the ORDER BY NULL query.

yes, I tested it on master, and the query is little bit slower still, but not too much (27 sec x 24 sec)

Regards

Pavel
 

David


On Mon, 20 Feb 2023 at 13:17, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I suspect most of the remaining performance discrepancy is just triggered
> by having to pass the extra always-NULL column forward through the various
> plan steps.  We could teach createplan.c to generate a WindowAgg plan node
> that omits the useless column from ordNumCols/ordColIdx/etc, but I'm not
> sure if that'd save much in itself.  (The comment in create_windowagg_plan
> shows we already thought about that and decided it wasn't worth the
> trouble.)

I wonder what the comment had in mind when it said it doesn't seem
worth it.  Doing an if (IsA(tle->expr, Const)) continue; seems pretty
simple and low-cost.  I've not looked at what the comment mentions
about RANGE OFFSET.  Assuming we'd need to not remove any ORDER BY
clauses when the WindowClause is doing that.

> Getting rid of the useless targetlist column altogether would
> be way more invasive, and I'm not inclined to try.

Yeah, that would likely add more complexity than it would be worth.

David



David Rowley <dgrowleyml@gmail.com> writes:
> On Mon, 20 Feb 2023 at 13:17, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> We could teach createplan.c to generate a WindowAgg plan node
>> that omits the useless column from ordNumCols/ordColIdx/etc, but I'm not
>> sure if that'd save much in itself.

> I wonder what the comment had in mind when it said it doesn't seem
> worth it.  Doing an if (IsA(tle->expr, Const)) continue; seems pretty
> simple and low-cost.

The right way to do this would be to pass forward information
about which window sort columns had been deemed redundant during
make_pathkeys_for_sortclauses: there are a lot more ways for that
to happen than "it's a constant".

The recent addition of make_pathkeys_for_sortclauses_extended would
make it pretty easy for make_pathkeys_for_window to obtain a reduced list
of SortGroupClauses, and then in theory you could match that against the
original lists to identify which sort columns are redundant (though
repeated keys might make that tricky; might be better to provide another
more-direct output representation).  Pass the information forward in
WindowAggPath, et voila.

I remain doubtful that it's worth the trouble though.

> I've not looked at what the comment mentions
> about RANGE OFFSET.  Assuming we'd need to not remove any ORDER BY
> clauses when the WindowClause is doing that.

There's some Asserts in nodeWindowAgg.c about ordNumCols being
positive, probably related.

            regards, tom lane



On Sun, Feb 19, 2023 at 4:18 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Kirk Wolak <wolakk@gmail.com> writes:
>   I have some converted code that uses this syntax.

Seems kinda dumb, but ...

>   The solution is to remove the ORDER BY NULL.  [since that is not
> sortable, should it be ignored?]
>   This does NOT SHOW UP with 1 million rows.

I don't see it at all.  Comparing your two test queries on released
branches, I see maybe 2x penalty for the ORDER BY NULL, not 30x.
(In HEAD there's only about 13% penalty.)  I wonder what PG version
you are testing.

                        regards, tom lane
Tom,
  I put V14.5 in the subject line (I could have made it more clear).
  It appears in new versions, as confirmed by Pavel, it is already addressed in some newer versions.

  Also, would it make sense to have EXPLAIN output the version of PG?  I think that might be useful,
because it becomes a COMMON next question?

Thanks,

Kirk Out!