subquery returning array - Mailing list pgsql-hackers

From Zeljko Vrba
Subject subquery returning array
Date
Msg-id 42917453.1000706@ifi.uio.no
Whole thread Raw
Responses Re: subquery returning array
Re: subquery returning array
List pgsql-hackers
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi! I have posted this as a user comment in section 9.17:

- ----
= ANY(array expression) doesn't work in the obvious way when the array
expression is a subselect. For example:

select * from stat3 where stat3.id = any ('{4,5,6,7}');

works (and returns the expected tuples). However,

select * from stat3 where stat3.id = any (select stat3 from
helix_request where id=11);

DOESN'T work and complains with an error: operator does not exist:
integer = integer[]. The inner select returns EXACTLY ONE value, namely
the same array as the literal in the first example.

The solution is:

select * from stat3 where (select stat3.id = any (stat3) from
helix_request where id=11);

I thank to Chris Kings-Lynne ("KL") for helping me out with this over IRC.
- ----

KL suggested to mail this question to the hackers list. The problem with
this solution is that postgresql uses sequential scan for the proposed
solution:
Seq Scan on stat3  (cost=0.00..40018.94 rows=3321 width=32) (actual
time=0.112..75.911 rows=4 loops=1)  Filter: (subplan)  SubPlan    ->  Index Scan using helix_request_pkey on
helix_request
(cost=0.00..6.01 rows=1 width=32) (actual time=0.007..0.008 rows=1
loops=6756)          Index Cond: (id = 11)Total runtime: 76.040 ms
(6 rows)

If I write select * from stat3 where id in (4,5,6,7) and index scan is used:Index Scan using stat3_pkey, stat3_pkey,
stat3_pkey,stat3_pkey on
 
stat3  (cost=0.00..23.94 rows=4 width=32) (actual time=0.066..0.091
rows=4 loops=1)  Index Cond: ((id = 4) OR (id = 5) OR (id = 6) OR (id = 7))Total runtime: 0.164 ms

I'm going to have MANY queries of this kind and having sequential scan
for each such query will lead to quadratic performance - unacceptable
for the amount of data I will have (millions of rows).

Is there yet another way of making WHERE field = ANY (subselect
returning an array) work? Or make postgres to use index?

OK, I know the suggestion from the manual: usually it's bad database
design searching through arrays and a separate table is better. however,
to convert {4,5,6,7} into a table would require a table with two
columns, e.g. (1,4), (1,5), (1,6), (1,7) where the first column would be
used to identify elements in the single array). this additional column
is a waste of space.

KL has mentioned a package for converting an array into a tuple. Where
can I find it? Also, what is the limit on the number of elements in the
IN (...) condition before the database resorts to sequential scan?

PS: Please reply also via e-mail as I'm not a regular subscriber of this
list.

Thanks.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.7 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iQCVAwUBQpF0U2nN2B8CwPRmAQKN8gP+JCzLiX5b48kMYmHRwTSFZWN5Jydfw0iH
MABuYj2mKCY9Dgmd4pLK8Xlxhf/tEYzd3N2lcPFYf1vIXCSpbFasRrO3hJ4WjRLr
MZ6MLXCn59Y8wtd8Iz8uug47XuYVGSreZESKA9tRfl+u8t16nPC9nEMyDNDFIRI9
ZOnBMCnQzrY=
=5qmB
-----END PGP SIGNATURE-----


pgsql-hackers by date:

Previous
From: José Orlando Pereira
Date:
Subject: Re: Two-phase commit issues
Next
From: Richard Huxton
Date:
Subject: Re: subquery returning array