Re: range_agg extremely slow compared to naive implementation in obscure circumstances - Mailing list pgsql-bugs

From Pavel Stehule
Subject Re: range_agg extremely slow compared to naive implementation in obscure circumstances
Date
Msg-id CAFj8pRA8udegzsc0BhY3tU7YgStuJ9N8V-wyNKnzi71fY=M1wg@mail.gmail.com
Whole thread Raw
In response to range_agg extremely slow compared to naive implementation in obscure circumstances  (Duncan Sands <duncan.sands@deepbluecap.com>)
Responses Re: range_agg extremely slow compared to naive implementation in obscure circumstances  (David Rowley <dgrowleyml@gmail.com>)
List pgsql-bugs


po 30. 1. 2023 v 11:24 odesílatel Duncan Sands <duncan.sands@deepbluecap.com> napsal:
In general range_agg is faster than the naive version

CREATE AGGREGATE public.naive_range_agg(anymultirange) (
     SFUNC = multirange_union,
     STYPE = anymultirange
);

however here is an example where using it is over 6000 times slower.  I'm not
sure exactly what feature of the example triggers this - I failed to create a
synthetic testcase using generate_series, thus the attached table data.

How to reproduce (Ubuntu 22.10, x86_64, postgresql 15.1-1.pgdg22.10+1):

$ cp data.txt.gz /tmp/
$ gunzip /tmp/data.txt.gz
$ psql
Pager usage is off.
psql (15.1 (Ubuntu 15.1-1.pgdg22.10+1))
Type "help" for help.

duncan=> CREATE TEMP TABLE wacky(priority bigint, valid tstzrange);
CREATE TABLE
duncan=> \COPY wacky FROM /tmp/data.txt
COPY 98094
duncan=> CREATE AGGREGATE public.naive_range_agg(anymultirange) (
     SFUNC = multirange_union,
     STYPE = anymultirange
);
CREATE AGGREGATE
duncan=> \timing on
Timing is on.
duncan=> EXPLAIN (ANALYZE) SELECT FROM (SELECT valid,
naive_range_agg(valid::tstzmultirange) OVER (ORDER BY priority DESC GROUPS
BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING) AS more_recent FROM wacky) foo
WHERE valid <@ more_recent;
                                                          QUERY PLAN

-----------------------------------------------------------------------------------------------------------------------------
  Subquery Scan on foo  (cost=11063.57..13879.37 rows=433 width=0) (actual
time=88.086..88.087 rows=0 loops=1)
    Filter: (foo.valid <@ foo.more_recent)
    Rows Removed by Filter: 98094
    ->  WindowAgg  (cost=11063.57..12796.37 rows=86640 width=72) (actual
time=16.102..84.242 rows=98094 loops=1)
          ->  Sort  (cost=11063.57..11280.17 rows=86640 width=40) (actual
time=16.096..20.205 rows=98094 loops=1)
                Sort Key: wacky.priority DESC
                Sort Method: external merge  Disk: 3848kB
                ->  Seq Scan on wacky  (cost=0.00..1588.40 rows=86640 width=40)
(actual time=0.021..5.479 rows=98094 loops=1)
  Planning Time: 0.277 ms
  Execution Time: 88.631 ms
(10 rows)

Time: 98.261 ms
duncan=> EXPLAIN (ANALYZE) SELECT FROM (SELECT valid,
range_agg(valid::tstzmultirange) OVER (ORDER BY priority DESC GROUPS BETWEEN
UNBOUNDED PRECEDING AND 1 PRECEDING) AS more_recent FROM wacky) foo WHERE valid
<@ more_recent;
                                                          QUERY PLAN

-----------------------------------------------------------------------------------------------------------------------------
  Subquery Scan on foo  (cost=11063.57..13879.37 rows=433 width=0) (actual
time=566009.972..566009.973 rows=0 loops=1)
    Filter: (foo.valid <@ foo.more_recent)
    Rows Removed by Filter: 98094
    ->  WindowAgg  (cost=11063.57..12796.37 rows=86640 width=72) (actual
time=21.996..565998.800 rows=98094 loops=1)
          ->  Sort  (cost=11063.57..11280.17 rows=86640 width=40) (actual
time=21.988..26.154 rows=98094 loops=1)
                Sort Key: wacky.priority DESC
                Sort Method: external merge  Disk: 3848kB
                ->  Seq Scan on wacky  (cost=0.00..1588.40 rows=86640 width=40)
(actual time=0.014..6.868 rows=98094 loops=1)
  Planning Time: 0.178 ms
  Execution Time: 566010.770 ms
(10 rows)


Maybe there is some problem in range_deserialize function

  40,53%  postgres                                        [.] range_deserialize                                                      ◆
   8,28%  postgres                                        [.] FunctionCall2Coll                                                      ▒
   7,14%  postgres                                        [.] range_compare                                                          ▒
   4,95%  postgres                                        [.] qsort_arg                                                              ▒
   4,51%  postgres                                        [.] range_cmp_bounds                                                       ▒
   2,49%  postgres                                        [.] timestamp_cmp                                                          ▒
   1,73%  postgres                                        [.] range_serialize                                                        ▒
   0,91%  postgres                                        [.] AllocSetAlloc                    
 
Regards

Pavel

Time: 566018.613 ms (09:26.019)

pgsql-bugs by date:

Previous
From: Duncan Sands
Date:
Subject: range_agg extremely slow compared to naive implementation in obscure circumstances
Next
From: David Rowley
Date:
Subject: Re: range_agg extremely slow compared to naive implementation in obscure circumstances