Thread: slow array(subselect)

slow array(subselect)

From: Pavel Stehule
Hi

Just curious

postgres=# explain analyze select array_upper(array_agg(i),1) from generate_series(1,100000) g(i);
                                                           QUERY PLAN                                                           
---------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=12.50..12.52 rows=1 width=4) (actual time=43.164..43.164 rows=1 loops=1)
   ->  Function Scan on generate_series g  (cost=0.00..10.00 rows=1000 width=4) (actual time=19.150..31.083 rows=100000 loops=1)
 Planning time: 0.100 ms
 Execution time: 44.392 ms
(4 rows)

postgres=# explain analyze select array_upper(array(select i from generate_series(1,1000000) g(i)),1);
                                                              QUERY PLAN                                                             
--------------------------------------------------------------------------------------------------------------------------------------
 Result  (cost=10.00..10.01 rows=1 width=4) (actual time=334.567..334.568 rows=1 loops=1)
   InitPlan 1 (returns $0)
     ->  Function Scan on generate_series g  (cost=0.00..10.00 rows=1000 width=4) (actual time=141.503..254.483 rows=1000000 loops=1)
 Planning time: 0.171 ms
 Execution time: 339.057 ms
(5 rows)

This code is about 8 times slower

Regards

Pavel

Re: slow array(subselect)

From: Tomas Vondra

On 03/04/2018 09:19 PM, Pavel Stehule wrote:
> Hi
> 
> Just curious
> 
> postgres=# explain analyze select array_upper(array_agg(i),1) from generate_series(1,100000) g(i);
>                                                            QUERY PLAN
> ---------------------------------------------------------------------------------------------------------------------------------
>  Aggregate  (cost=12.50..12.52 rows=1 width=4) (actual time=43.164..43.164 rows=1 loops=1)
>    ->  Function Scan on generate_series g  (cost=0.00..10.00 rows=1000 width=4) (actual time=19.150..31.083 rows=100000 loops=1)
>  Planning time: 0.100 ms
>  Execution time: 44.392 ms
> (4 rows)
> 
> postgres=# explain analyze select array_upper(array(select i from generate_series(1,1000000) g(i)),1);
>                                                               QUERY PLAN
> --------------------------------------------------------------------------------------------------------------------------------------
>  Result  (cost=10.00..10.01 rows=1 width=4) (actual time=334.567..334.568 rows=1 loops=1)
>    InitPlan 1 (returns $0)
>      ->  Function Scan on generate_series g  (cost=0.00..10.00 rows=1000 width=4) (actual time=141.503..254.483 rows=1000000 loops=1)
>  Planning time: 0.171 ms
>  Execution time: 339.057 ms
> (5 rows)
>
> This code is about 8 times slower
>

Because you're passing 100000 in the first query and 1000000 in the
second one. So it should be about 10x slower, more or less.
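For an apples-to-apples comparison, both forms can be run over the same number of rows; the timings then measure the aggregation method rather than the input size (this is a sketch of the corrected benchmark, exact numbers will of course vary by machine):

```sql
-- Same input size (100000 rows) for both variants, so any remaining
-- difference reflects array_agg() vs ARRAY(subselect), not row count.
explain analyze
select array_upper(array_agg(i), 1)
from generate_series(1, 100000) g(i);

explain analyze
select array_upper(array(select i from generate_series(1, 100000) g(i)), 1);
```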

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: slow array(subselect)

From: Pavel Stehule


2018-03-04 21:36 GMT+01:00 Tomas Vondra <tomas.vondra@2ndquadrant.com>:


On 03/04/2018 09:19 PM, Pavel Stehule wrote:
> [EXPLAIN ANALYZE output quoted in the previous message trimmed]
>
> This code is about 8 times slower
>

Because you're passing 100000 in the first query and 1000000 in the
second one. So it should be about 10x slower, more or less.

True. No miracle. I am a dunce :/

Thank you

Pavel
 
