Re: Odd Choice of seq scan - Mailing list pgsql-performance

From Chris Hoover
Subject Re: Odd Choice of seq scan
Date
Msg-id F42718BF-15F1-4FFC-9702-A319D893A912@aweber.com
Whole thread Raw
In response to Odd Choice of seq scan  (Paul McGarry <paul@paulmcgarry.com>)
List pgsql-performance
I don’t have a database running the versions you are, but what I’ve had to do to get around thing like is it to write
thequery something like this: 

WITH orderids AS (
SELECT ‘546111’ AS orderid
UNION
SELECT orderid
  FROM orderstotrans
 WHERE transid IN ('546111')
)
select orders.orderid
  FROM orderids
  JOIN orders USING (orderid);

Hope this helps your situation.

Thanks,


Chris Hoover
Senior DBA
AWeber.com
Cell: (803) 528-2269
Email: chrish@aweber.com



> On Dec 1, 2022, at 7:52 PM, Paul McGarry <paul@paulmcgarry.com> wrote:
>
> Hi there,
>
> I'm wondering if anyone has any insight into what might make the database choose a sequential scan for a query (table
defsand plan below) like : 
>
> SELECT orders.orderid FROM orders
> WHERE (
> orders.orderid IN ('546111')
>   OR
> orders.orderid IN  (select orderid FROM orderstotrans WHERE (transid IN ('546111')))
> );
>
> I have a couple of environments, all on Postgresql 13.7 and:
> - on one the query executes with an sequential scan on the orders table
> - on the other sequential scan on an index (ie walks index and filters, rather than looking up ids on the index as an
indexcondition.) 
>
> Plan and tables are below, but it seems to me that the planner knows the subplan is going to return 1 row (max) and
should"know" that there is a max of 2 IDs to look up an indexes would be faster than a sequential scan (of either table
orindex) and filter. I've tried re analyzing to make sure stats are good and it hasn't helped 
>
> I can get a good plan that does use the index efficiently by using a union, eg:
>
> select orders.orderid FROM orders
> WHERE (
> orders.orderid IN (
>   SELECT '546111'
>   UNION
>   SELECT orderid FROM orderstotrans WHERE (transid IN ('546111'))
> )
> );
>
> but I want to understand what warning signs I should be aware of with the original query that put it on the path of a
badplan, so I don't do it again. 
>
>
> Plan - seq scan of table:
> =====
> > explain
> select orders.orderid FROM orders WHERE (orders.orderid IN ('546111') OR orders.orderid IN  (select orderid FROM
orderstotransWHERE (transid IN ('546111')))); 
>
>                                               QUERY PLAN
> -------------------------------------------------------------------------------------------------------
>  Seq Scan on orders  (cost=8.45..486270.87 rows=4302781 width=8)
>    Filter: ((orderid = '546111'::bigint) OR (hashed SubPlan 1))
>    SubPlan 1
>      ->  Index Scan using orderstotrans_transid_key on orderstotrans  (cost=0.43..8.45 rows=1 width=8)
>            Index Cond: (transid = '546111'::bigint)
> (5 rows)
> =====
>
> Plan - Seq scan and filter of index:
> =====
> > explain select orders.orderid FROM orders WHERE (orders.orderid IN ('546111') OR orders.orderid IN  (select orderid
FROMorderstotrans WHERE (transid IN ('546111')))); 
>                                               QUERY PLAN
> -------------------------------------------------------------------------------------------------------
>  Index Only Scan using orders_pkey on orders  (cost=9.16..4067888.60 rows=64760840 width=8)
>    Filter: ((orderid = '546111'::bigint) OR (hashed SubPlan 1))
>    SubPlan 1
>      ->  Index Scan using orderstotrans_transid_key on orderstotrans  (cost=0.57..8.59 rows=1 width=8)
>            Index Cond: (transid = '546111'::bigint)
> (5 rows)
> =====
>
>
> Tables:
> =====
>                                    Table "test.orders"
>         Column        |            Type             | Collation | Nullable |   Default
> ----------------------+-----------------------------+-----------+----------+--------------
>  orderid              | bigint                      |           | not null |
>  istest               | smallint                    |           | not null | 0
>  orderstatusid        | integer                     |           |          |
>  customername         | text                        |           |          |
>  customeraddress      | text                        |           |          |
>  customercountry      | text                        |           |          |
>  customercity         | text                        |           |          |
>  customerstate        | text                        |           |          |
>  customerzip          | text                        |           |          |
>     "orders_pkey" PRIMARY KEY, btree (orderid)
>
>               Table "test.orderstotrans"
>    Column    |  Type   | Collation | Nullable | Default
> -------------+---------+-----------+----------+---------
>  orderid     | bigint  |           |          |
>  transid     | bigint  |           |          |
>  orderitemid | integer |           |          |
> Indexes:
>     "orderstotrans_orderid_idx" btree (orderid)
>     "orderstotrans_orderitemid_idx" btree (orderitemid)
>     "orderstotrans_transid_key" UNIQUE, btree (transid)
>
>
> Happier plan for the union version:
> ====
> explain select orders.orderid FROM orders
> WHERE (
> orders.orderid IN (
>   SELECT '3131275553'
>   UNION
>   select orderid FROM orderstotrans WHERE (transid IN ('3131275553'))
> )
> );
>                                                                            QUERY PLAN
                                           
>
-----------------------------------------------------------------------------------------------------------------------------------------------------------------
>  Nested Loop  (cost=9.21..21.84 rows=2 width=8) (actual time=0.034..0.043 rows=1 loops=1)
>    ->  Unique  (cost=8.64..8.65 rows=2 width=8) (actual time=0.024..0.026 rows=2 loops=1)
>          ->  Sort  (cost=8.64..8.64 rows=2 width=8) (actual time=0.023..0.024 rows=2 loops=1)
>                Sort Key: ('3131275553'::bigint)
>                Sort Method: quicksort  Memory: 25kB
>                ->  Append  (cost=0.00..8.63 rows=2 width=8) (actual time=0.001..0.019 rows=2 loops=1)
>                      ->  Result  (cost=0.00..0.01 rows=1 width=8) (actual time=0.001..0.001 rows=1 loops=1)
>                      ->  Index Scan using orderstotrans_transid_key on orderstotrans  (cost=0.57..8.59 rows=1
width=8)(actual time=0.015..0.016 rows=1 loops=1) 
>                            Index Cond: (transid = '3131275553'::bigint)
>    ->  Index Only Scan using orders_pkey on orders  (cost=0.57..6.58 rows=1 width=8) (actual time=0.007..0.007 rows=0
loops=2)
>          Index Cond: (orderid = ('3131275553'::bigint))
>          Heap Fetches: 0
>  Planning Time: 0.165 ms
>  Execution Time: 0.065 ms
> (14 rows)
> ====
> (though that plan is a bit misleading, as that index condition isn't exactly what is used, ie with:
>
> select orders.orderid FROM orders
> WHERE (
> orders.orderid IN (
>   SELECT '3131275553'
>   UNION
>   select orderid FROM orderstotrans WHERE (transid IN ('3131275553'))
> )
> );
>   orderid
> -----------
>  439155713
> (1 row)
>
> the orderid it matches, isn't the one the planner showed, but it works)
>




pgsql-performance by date:

Previous
From: Paul McGarry
Date:
Subject: Re: Odd Choice of seq scan
Next
From: Jeff Janes
Date:
Subject: Re: Catching up with performance & PostgreSQL 15