Re: WHERE col = ANY($1) extended to 2 or more columns? - Mailing list pgsql-general
| From | Peter J. Holzer |
|---|---|
| Subject | Re: WHERE col = ANY($1) extended to 2 or more columns? |
| Date | |
| Msg-id | 20230211220031.cdrrb5dnkiu6hden@hjp.at Whole thread Raw |
| In response to | Re: WHERE col = ANY($1) extended to 2 or more columns? ("Peter J. Holzer" <hjp-pgsql@hjp.at>) |
| List | pgsql-general |
On 2023-02-11 16:21:49 +0100, Peter J. Holzer wrote:
> On 2023-02-09 18:35:42 +0100, Dominique Devienne wrote:
> > Right. The goal is to (re)use a prepared statement (i.e. plan once), and bind
> > the RHS (binary) array
> > and do a single exec (single round-trip) to get the matching rows. AFAIK, this
> > is the fastest way.
>
> Planning time is often much less than execution time, so minimizing it
> may not give you the fastest results.
>
> For example I tried to fetch two data points from a medium sized (3 mio
> rows) table ((macrobondtimeseries, date) is the primary key here):
>
> wds=> explain (analyze) select macrobondtimeseries, date, value
> from facttable_oecd_naq_2018
> where (macrobondtimeseries, date) = any(
> array [
> ( 'naq_mex_b1_gi_cqr_a' , '2013-01-01' ),
> ( 'naq_lux_p3s13_gpsa_a' , '1961-01-01' )
> ]::tsd[]
> );
[Sequential scan]
> ║ Planning Time: 1.833 ms
> ║ Execution Time: 1619.729 ms
> ╚════════════════════════════════════════════════════════════════════
> (12 rows)
>
> wds=> explain (analyze) select macrobondtimeseries, date, value
> from facttable_oecd_naq_2018
> where (macrobondtimeseries, date) in
> (
> ( 'naq_mex_b1_gi_cqr_a' , '2013-01-01' ),
> ( 'naq_lux_p3s13_gpsa_a' , '1961-01-01' )
> );
[Bitmap index scan]
> ║ Planning Time: 1.414 ms
> ║ Execution Time: 0.330 ms
> ╚════════════════════════════════════════════════════════════════════
> (10 rows)
>
> The latter is almost 1000 times faster. Saving 1.8 ms on planning time
> doesn't help you if you then waste 1.6 s on execution.
>
> So sometimes it pays off to give the planner a little bit of extra
> information to work on.
And here is a variant which does achieve a good execution
plan with an array:
wds=> explain(analyze) with a as (
select * from unnest (
array [
( 'naq_mex_b1_gi_cqr_a' , '2013-01-01' ),
( 'naq_lux_p3s13_gpsa_a' , '1961-01-01' )
]::tsd[]
)
)
select macrobondtimeseries, date, value
from a natural join facttable_oecd_naq_2018;
╔═════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════╗
║ QUERY PLAN
║
╟─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╢
║ Nested Loop (cost=0.56..7.17 rows=1 width=44) (actual time=0.075..0.115 rows=2 loops=1)
║
║ -> Function Scan on unnest (cost=0.00..0.02 rows=2 width=36) (actual time=0.011..0.012 rows=2 loops=1)
║
║ -> Index Scan using facttable_oecd_naq_2018_pkey on facttable_oecd_naq_2018 (cost=0.56..3.57 rows=1 width=34)
(actualtime=0.045..0.045 rows=1 loops=2) ║
║ Index Cond: (((macrobondtimeseries)::text = (unnest.macrobondtimeseries)::text) AND (date = unnest.date))
║
║ Planning Time: 1.061 ms
║
║ Execution Time: 0.172 ms
║
╚═════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════╝
(6 rows)
(At least with these parameters on this table with this version of
PostgreSQL. Your mileage may vary, of course.)
hp
--
_ | Peter J. Holzer | Story must make more sense than reality.
|_|_) | |
| | | hjp@hjp.at | -- Charles Stross, "Creative writing
__/ | http://www.hjp.at/ | challenge!"
Attachment
pgsql-general by date: