Re: Query run in 27s with 15.2 vs 37ms with 14.6 - Mailing list pgsql-bugs

From Charles
Subject Re: Query run in 27s with 15.2 vs 37ms with 14.6
Date
Msg-id CABthHP-maAwqVEUfvMK0sWp3oaPQuOg=+fRj5hJGFBXNBDJX7A@mail.gmail.com
Whole thread Raw
In response to Re: Query run in 27s with 15.2 vs 37ms with 14.6  (Stephen Frost <sfrost@snowman.net>)
Responses Re: Query run in 27s with 15.2 vs 37ms with 14.6  (Stephen Frost <sfrost@snowman.net>)
List pgsql-bugs


On Tue, Feb 21, 2023 at 3:58 AM Stephen Frost <sfrost@snowman.net> wrote:
Greetings,

* Charles (peacech@gmail.com) wrote:
> Wrapping the query with a select * from (...) t where length(code) = 4 puts
> the execution time back to 27 seconds.
>
> This is a bit unexpected since I expect that the result from the inner
> query to be executed first and then filtered.

It's really not- PG will (correctly) attempt to pull in such subselects
into the overall optimization, which is generally better for everyone.
If you want to force it, you can use a WITH MATERIALIZED CTE, or throw
in an 'OFFSET 0' as a hack into your sub-select, but really it's a much
better idea to generate extended stats on what you're filtering as has
been suggested, or come up with a better data representation where
you're not doing a search on a 'length()' as you are.



Generating extended statistics on the expression (length(code)) causes the planner to generate has join which runs in 183ms (still slower than 37ms on 14.6). Using materialized cte (no extended stats) results in nested loop that runs in 229ms. I think I'll revert back to postgresql 14 for now. Thank you for the suggestions.

with extended stats:
Hash Join  (cost=65865.64..69478.05 rows=244 width=89) (actual time=171.915..183.674 rows=779 loops=1)
  Hash Cond: (s.code = va.code)
  ->  Bitmap Heap Scan on stock_price s  (cost=16.12..3624.83 rows=244 width=29) (actual time=0.079..0.238 rows=779 loops=1)
        Recheck Cond: (date = '2023-02-20'::date)
        Filter: ((open > 0) AND (value > 0) AND (length(code) = 4))
        Rows Removed by Filter: 210
        Heap Blocks: exact=34
        ->  Bitmap Index Scan on idx_stock_price_date  (cost=0.00..16.06 rows=1017 width=0) (actual time=0.042..0.042 rows=1827 loops=1)
              Index Cond: (date = '2023-02-20'::date)
  ->  Hash  (cost=65836.90..65836.90 rows=1010 width=37) (actual time=171.827..183.109 rows=907 loops=1)
        Buckets: 1024  Batches: 1  Memory Usage: 49kB
        ->  Subquery Scan on va  (cost=65563.34..65836.90 rows=1010 width=37) (actual time=169.957..182.987 rows=907 loops=1)
              ->  Finalize GroupAggregate  (cost=65563.34..65826.80 rows=1010 width=37) (actual time=169.957..182.947 rows=907 loops=1)
                    Group Key: stock_price.code
                    InitPlan 1 (returns $0)
                      ->  Limit  (cost=0.53..0.58 rows=1 width=4) (actual time=0.012..0.012 rows=1 loops=1)
                            ->  Index Only Scan Backward using stock_date_pkey on stock_date  (cost=0.28..363.63 rows=7359 width=4) (actual time=0.010..0.011 rows=6 loops=1)
                                  Heap Fetches: 6
                    ->  Gather Merge  (cost=65562.76..65798.45 rows=2020 width=37) (actual time=169.947..181.671 rows=2616 loops=1)
                          Workers Planned: 2
                          Params Evaluated: $0
                          Workers Launched: 2
                          ->  Sort  (cost=64562.74..64565.26 rows=1010 width=37) (actual time=156.536..156.575 rows=872 loops=3)
                                Sort Key: stock_price.code
                                Sort Method: quicksort  Memory: 94kB
                                Worker 0:  Sort Method: quicksort  Memory: 93kB
                                Worker 1:  Sort Method: quicksort  Memory: 90kB
                                ->  Partial HashAggregate  (cost=64499.71..64512.34 rows=1010 width=37) (actual time=156.002..156.246 rows=872 loops=3)
                                      Group Key: stock_price.code
                                      Batches: 1  Memory Usage: 577kB
                                      Worker 0:  Batches: 1  Memory Usage: 577kB
                                      Worker 1:  Batches: 1  Memory Usage: 577kB
                                      ->  Parallel Seq Scan on stock_price  (cost=0.00..64038.54 rows=92234 width=13) (actual time=68.301..155.471 rows=1427 loops=3)
                                            Filter: ((value > 0) AND (date > $0))
                                            Rows Removed by Filter: 906975
Planning Time: 0.340 ms
Execution Time: 183.833 ms

materialized cte:
Nested Loop  (cost=65839.48..69480.90 rows=1 width=89) (actual time=156.363..229.143 rows=779 loops=1)
  Join Filter: (s.code = va.code)
  Rows Removed by Join Filter: 705774
  CTE vol_avg
    ->  Finalize GroupAggregate  (cost=65561.00..65823.42 rows=1006 width=37) (actual time=154.575..156.254 rows=907 loops=1)
          Group Key: stock_price.code
          InitPlan 1 (returns $0)
            ->  Limit  (cost=0.53..0.58 rows=1 width=4) (actual time=0.013..0.013 rows=1 loops=1)
                  ->  Index Only Scan Backward using stock_date_pkey on stock_date  (cost=0.28..363.63 rows=7359 width=4) (actual time=0.011..0.012 rows=6 loops=1)
                        Heap Fetches: 6
          ->  Gather Merge  (cost=65560.42..65795.17 rows=2012 width=37) (actual time=154.564..155.143 rows=2625 loops=1)
                Workers Planned: 2
                Params Evaluated: $0
                Workers Launched: 2
                ->  Sort  (cost=64560.40..64562.92 rows=1006 width=37) (actual time=138.777..138.822 rows=875 loops=3)
                      Sort Key: stock_price.code
                      Sort Method: quicksort  Memory: 92kB
                      Worker 0:  Sort Method: quicksort  Memory: 93kB
                      Worker 1:  Sort Method: quicksort  Memory: 93kB
                      ->  Partial HashAggregate  (cost=64497.65..64510.23 rows=1006 width=37) (actual time=138.193..138.455 rows=875 loops=3)
                            Group Key: stock_price.code
                            Batches: 1  Memory Usage: 577kB
                            Worker 0:  Batches: 1  Memory Usage: 577kB
                            Worker 1:  Batches: 1  Memory Usage: 577kB
                            ->  Parallel Seq Scan on stock_price  (cost=0.00..64038.54 rows=91822 width=13) (actual time=133.465..137.738 rows=1427 loops=3)
                                  Filter: ((value > 0) AND (date > $0))
                                  Rows Removed by Filter: 906975
  ->  Bitmap Heap Scan on stock_price s  (cost=16.06..3624.77 rows=1 width=29) (actual time=0.071..0.570 rows=779 loops=1)
        Recheck Cond: (date = '2023-02-20'::date)
        Filter: ((open > 0) AND (value > 0) AND (length(code) = 4))
        Rows Removed by Filter: 210
        Heap Blocks: exact=34
        ->  Bitmap Index Scan on idx_stock_price_date  (cost=0.00..16.06 rows=1017 width=0) (actual time=0.036..0.036 rows=1827 loops=1)
              Index Cond: (date = '2023-02-20'::date)
  ->  CTE Scan on vol_avg va  (cost=0.00..20.12 rows=1006 width=64) (actual time=0.198..0.244 rows=907 loops=779)
Planning Time: 0.499 ms
Execution Time: 229.381 ms

pgsql-bugs by date:

Previous
From: Charles
Date:
Subject: Re: Query run in 27s with 15.2 vs 37ms with 14.6
Next
From: Stephen Frost
Date:
Subject: Re: Query run in 27s with 15.2 vs 37ms with 14.6