Re: WHERE col = ANY($1) extended to 2 or more columns? - Mailing list pgsql-general

From Peter J. Holzer
Subject Re: WHERE col = ANY($1) extended to 2 or more columns?
Date
Msg-id 20230211152149.ci75aun2m25sb645@hjp.at
Whole thread Raw
In response to Re: WHERE col = ANY($1) extended to 2 or more columns?  (Dominique Devienne <ddevienne@gmail.com>)
Responses Re: WHERE col = ANY($1) extended to 2 or more columns?  ("Peter J. Holzer" <hjp-pgsql@hjp.at>)
List pgsql-general
On 2023-02-09 18:35:42 +0100, Dominique Devienne wrote:
> On Thu, Feb 9, 2023 at 5:37 PM David G. Johnston <david.g.johnston@gmail.com>
> wrote:
>
>     On Thu, Feb 9, 2023 at 9:28 AM Alban Hertroys <haramrae@gmail.com> wrote:
>
>         > On 9 Feb 2023, at 16:41, Dominique Devienne <ddevienne@gmail.com>
>         wrote:
>         > Now we'd like to do the same for composite keys, and I don't know how
>         to do that.
>
>         This works:
>         => select (1, 'one'::text) in ((1, 'two'::text), (2, 'one'::text), (1,
>         'one'::text), (2, 'two'::text));
>
>     But you cannot write the right-side of the IN as a single parameter which
>     seems to be the primary constraint trying to be conformed to.
>
>
> Right. The goal is to (re)use a prepared statement (i.e. plan once), and bind
> the RHS (binary) array
> and do a single exec (single round-trip) to get the matching rows. AFAIK, this
> is the fastest way.

Planning time is often much less than execution time, so minimizing it
may not give you the fastest results.

For example I tried to fetch two data points from a medium sized (3 mio
rows) table ((macrobondtimeseries, date) is the primary key here):

wds=> explain (analyze) select macrobondtimeseries, date, value
from facttable_oecd_naq_2018
where (macrobondtimeseries, date) = any(
    array [
        ( 'naq_mex_b1_gi_cqr_a'     , '2013-01-01' ),
        ( 'naq_lux_p3s13_gpsa_a'    , '1961-01-01' )
    ]::tsd[]
);

╔════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════╗
║                                                                   QUERY PLAN
                        ║ 

╟────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╢
║ Gather  (cost=1000.00..334945.37 rows=35242 width=34) (actual time=6.194..1618.968 rows=2 loops=1)
                        ║ 
║   Workers Planned: 2
                        ║ 
║   Workers Launched: 2
                        ║ 
║   ->  Parallel Seq Scan on facttable_oecd_naq_2018  (cost=0.00..330421.17 rows=14684 width=34) (actual
time=1054.739..1589.818rows=1 loops=3) ║ 
║         Filter: (ROW(macrobondtimeseries, date) = ANY
('{"(naq_mex_b1_gi_cqr_a,2013-01-01)","(naq_lux_p3s13_gpsa_a,1961-01-01)"}'::tsd[]))    ║ 
║         Rows Removed by Filter: 1178191
                        ║ 
║ Planning Time: 1.833 ms
                        ║ 
║ JIT:
                        ║ 
║   Functions: 12
                        ║ 
║   Options: Inlining false, Optimization false, Expressions true, Deforming true
                        ║ 
║   Timing: Generation 1.026 ms, Inlining 0.000 ms, Optimization 0.948 ms, Emission 12.613 ms, Total 14.586 ms
                        ║ 
║ Execution Time: 1619.729 ms
                        ║ 

╚════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════╝
(12 rows)

wds=> explain (analyze) select macrobondtimeseries, date, value
from facttable_oecd_naq_2018
where (macrobondtimeseries, date) in
(
    ( 'naq_mex_b1_gi_cqr_a'     , '2013-01-01' ),
    ( 'naq_lux_p3s13_gpsa_a'    , '1961-01-01' )
);

╔═══════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════╗
║                                                                                                    QUERY PLAN
                                                                                           ║ 

╟───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╢
║ Bitmap Heap Scan on facttable_oecd_naq_2018  (cost=4.13..7.17 rows=2 width=34) (actual time=0.259..0.263 rows=2
loops=1)                                                                                         ║ 
║   Recheck Cond: ((((macrobondtimeseries)::text = 'naq_mex_b1_gi_cqr_a'::text) AND (date = '2013-01-01'::date)) OR
(((macrobondtimeseries)::text= 'naq_lux_p3s13_gpsa_a'::text) AND (date = '1961-01-01'::date))) ║ 
║   Heap Blocks: exact=1
                                                                                           ║ 
║   ->  BitmapOr  (cost=4.13..4.13 rows=2 width=0) (actual time=0.184..0.185 rows=0 loops=1)
                                                                                           ║ 
║         ->  Bitmap Index Scan on facttable_oecd_naq_2018_pkey  (cost=0.00..2.06 rows=1 width=0) (actual
time=0.124..0.125rows=1 loops=1)                                                                         ║ 
║               Index Cond: (((macrobondtimeseries)::text = 'naq_mex_b1_gi_cqr_a'::text) AND (date =
'2013-01-01'::date))                                                                                          ║ 
║         ->  Bitmap Index Scan on facttable_oecd_naq_2018_pkey  (cost=0.00..2.06 rows=1 width=0) (actual
time=0.058..0.058rows=1 loops=1)                                                                         ║ 
║               Index Cond: (((macrobondtimeseries)::text = 'naq_lux_p3s13_gpsa_a'::text) AND (date =
'1961-01-01'::date))                                                                                         ║ 
║ Planning Time: 1.414 ms
                                                                                           ║ 
║ Execution Time: 0.330 ms
                                                                                           ║ 

╚═══════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════╝
(10 rows)

The latter is almost 1000 times faster. Saving 1.8 ms on planning time
doesn't help you if you then waste 1.6 s on execution.

So sometimes it pays off to give the planner a little bit of extra
information to work on.

        hp


--
   _  | Peter J. Holzer    | Story must make more sense than reality.
|_|_) |                    |
| |   | hjp@hjp.at         |    -- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |       challenge!"

Attachment

pgsql-general by date:

Previous
From: jian he
Date:
Subject: Re: How to pass table column values to function
Next
From: "Peter J. Holzer"
Date:
Subject: Re: WHERE col = ANY($1) extended to 2 or more columns?