Re: Select in subselect vs select = any array - Mailing list pgsql-performance

From Pavel Stehule
Subject Re: Select in subselect vs select = any array
Date
Msg-id AANLkTikDbBT69mmGW=2uq9Jzbbeki44ZsXPnsMXqXApt@mail.gmail.com
Whole thread Raw
In response to Select in subselect vs select = any array  (Adam Tistler <atistler@gmail.com>)
Responses Re: Select in subselect vs select = any array
List pgsql-performance
Hello

2011/3/20 Adam Tistler <atistler@gmail.com>:
> I have noticed that SELECT ... = ANY(ARRAY(...))  is about twice as fast as SELECT IN ( ... ).
> Can anyone explain a reason for this?  Results are the bottom and are reproducible.  I can test with other versions
ifthat is necessary. 
>

send a result of EXPLAIN ANALYZE SELECT ..., please

The reasons can be different - less seq scans, indexes

Regards

Pavel Stehule



> ./configure --prefix=/usr/local/pgsql84 --with-openssl --with-perl
> CentOS release 5.4 (Final)
> psql (PostgreSQL) 8.4.1
>
> prompt2=# select count(*) from nodes;
>  count
> --------
>  754734
> (1 row)
>
>
> prompt2=# \d nodes
>                                        Table "public.nodes"
>    Column    |           Type           |                         Modifiers
> --------------+--------------------------+-----------------------------------------------------------
>  node_id      | integer                  | not null default nextval(('node_id_seq'::text)::regclass)
>  node_type_id | integer                  | not null
>  template_id  | integer                  | not null
>  timestamp    | timestamp with time zone | default ('now'::text)::timestamp(6) with time zone
> Indexes:
>    "nodes_pkey" PRIMARY KEY, btree (node_id)
>    "n_node_id_index" btree (node_id)
>    "n_node_type_id_index" btree (node_type_id)
>    "n_template_id_index" btree (template_id)
>
> prompt2=# select count(*) from nodes where node_id = any(  Array(select node_id from nodes limit 100000) );
>  count
> --------
>  100000
> (1 row)
>
> Time: 404.530 ms
> prompt2=# select count(*) from nodes where node_id = any(  Array(select node_id from nodes limit 100000) );
>  count
> --------
>  100000
> (1 row)
>
> Time: 407.316 ms
> prompt2=# select count(*) from nodes where node_id = any(  Array(select node_id from nodes limit 100000) );
>  count
> --------
>  100000
> (1 row)
>
> Time: 408.728 ms
> prompt2=# select count(*) from nodes where node_id in (select node_id from nodes limit 100000 );
>  count
> --------
>  100000
> (1 row)
>
> Time: 793.840 ms
> prompt2=# select count(*) from nodes where node_id in (select node_id from nodes limit 100000 );
>  count
> --------
>  100000
> (1 row)
>
> Time: 779.137 ms
> prompt2=# select count(*) from nodes where node_id in (select node_id from nodes limit 100000 );
>  count
> --------
>  100000
> (1 row)
>
> Time: 781.820 ms
>
>
> --
> Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-performance
>

pgsql-performance by date:

Previous
From: Adam Tistler
Date:
Subject: Select in subselect vs select = any array
Next
From: Phoenix Kiula
Date:
Subject: Re: REINDEX takes half a day (and still not complete!)