Re: Performance Problem with sub-select using array - Mailing list pgsql-sql

From	Aaron Bono
Subject	Re: Performance Problem with sub-select using array
Date	August 29, 2006 14:34:46
Msg-id	bf05e51c0608290734w7795d260q160abc9b91e8a58e@mail.gmail.com
In response to	Performance Problem with sub-select using array ("Travis Whitton" <tinymountain@gmail.com>)
List	pgsql-sql

On 8/28/06, Travis Whitton <tinymountain@gmail.com> wrote:

I'm pretty sure you're right, which leads me to my next question. Is it possible to pass a column from an outer query to a subquery? For example, is there a way to do something like:

SELECT owners.id AS owner_id, array(SELECT dogs.name WHERE owners.id = owner_id) ...

I would just do a normal inner-join, but then I get a row for each item that would otherwise come back nicely packaged in the array. The overhead of rearranging the data takes even more time than the subquery approach.

I don't think you can do that, but I may be wrong. I usually try to stay away from correlated subqueries because of performance concerns and query complexity. I find simple subqueries with well-formed inner/outer joins work much better.
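
For reference, the original query in this thread already refers to customers.id from inside its array(select ...), and that is a column of the outer query. Below is a minimal sketch of the same pattern for the owners/dogs example. The table layout (owners.id, dogs.name, dogs.owner_id) and the dog_names alias are assumptions, since the thread does not show the schema.

```sql
-- Assumed schema (not shown in the thread):
--   owners(id), dogs(name, owner_id)
SELECT owners.id AS owner_id,
       ARRAY(SELECT dogs.name
             FROM dogs                                  -- the subquery needs its own FROM clause
             WHERE dogs.owner_id = owners.id) AS dog_names  -- owners.id comes from the outer row
FROM owners;
```

Qualifying both columns in the WHERE clause makes it explicit which side of the comparison belongs to the outer query.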

Does anyone know where documentation about the array function can be found? I did a search but cannot find it on the PostgreSQL web site.

On 8/28/06, Aaron Bono <postgresql@aranya.com> wrote:
On 8/24/06, Travis Whitton <tinymountain@gmail.com> wrote:
Hello all, I'm running the following query on about 6,000 records worth of data, and it takes about 8 seconds to complete. Can anyone provide any suggestions to improve performance? I have an index on two columns in the transacts table (program_id, customer_id). If I specify a number for customer.id in the sub-select, query time is reduced to about 2 seconds, which still seems like a million years for only 6,000 records. I'm guessing that the sub-select can't resolve the id since it's done before the outer query, so it scans the entire recordset for every row? Transacts is a many-to-many table for customers and programs. I know this query doesn't even reference any columns from programs; however, I dynamically insert where clauses to constrain the result set.

SELECT DISTINCT customers.id, first_name, last_name, address1,
       contact_city, contact_state, primary_phone, email,
       ARRAY(SELECT programs.program_name
             FROM transacts, programs
             WHERE customer_id = customers.id
               AND programs.id = transacts.program_id
               AND submit_status = 'success') AS partners
FROM customers, transacts, programs
WHERE transacts.customer_id = customers.id
  AND transacts.program_id = programs.id

My guess is that you may be getting 6000 rows back, but the array(select ....) has to run once for each record returned (so it is running 6000 times).
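
One way to test this guess is to take transacts and programs out of the outer FROM list, so the subquery runs once per customer rather than once per joined row. This is only a sketch. The aliases, the EXISTS filter, the assumption that submit_status is a column of transacts and that the name and address columns belong to customers, and the index name are not from the original post. Any dynamically added conditions on programs would have to move into the subqueries.

```sql
-- Sketch: one output row per customer, so no DISTINCT is needed
SELECT c.id, c.first_name, c.last_name, c.address1,
       c.contact_city, c.contact_state, c.primary_phone, c.email,
       ARRAY(SELECT p.program_name
             FROM transacts t
             JOIN programs p ON p.id = t.program_id
             WHERE t.customer_id = c.id
               AND t.submit_status = 'success') AS partners  -- assumes submit_status is in transacts
FROM customers c
-- keep only customers that have at least one transaction, as the original join did
WHERE EXISTS (SELECT 1 FROM transacts t2 WHERE t2.customer_id = c.id);

-- Hypothetical index name; the existing index leads with program_id,
-- so a lookup by customer_id alone may not be able to use it well.
CREATE INDEX transacts_customer_id_idx ON transacts (customer_id);
```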

Try an EXPLAIN ANALYZE: http://www.postgresql.org/docs/7.4/interactive/sql-explain.html - that will reveal more of where the performance problem is.
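
As a rough illustration, EXPLAIN ANALYZE is simply prefixed to the statement. The query below is a shortened form of the one from the original post (fewer output columns), not a replacement for it. Note that EXPLAIN ANALYZE actually executes the query, so it takes as long as the query itself.

```sql
EXPLAIN ANALYZE
SELECT DISTINCT customers.id,
       ARRAY(SELECT programs.program_name
             FROM transacts, programs
             WHERE customer_id = customers.id
               AND programs.id = transacts.program_id
               AND submit_status = 'success') AS partners
FROM customers, transacts, programs
WHERE transacts.customer_id = customers.id
  AND transacts.program_id = programs.id;
```

In the output, the loops value reported for the SubPlan node shows how many times the array subquery was run.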

==================================================================
   Aaron Bono
   Aranya Software Technologies, Inc.
   http://www.aranya.com
   http://codeelixir.com
==================================================================
