On 2/3/2010 11:17 AM, Matt White wrote:
> On Feb 2, 1:11 pm, a...@squeakycode.net (Andy Colson) wrote:
>> On 2/2/2010 1:03 PM, Matt White wrote:
>>
>>
>>
>>
>>
>>> On Feb 2, 6:06 am, Edgardo Portal<egportal2...@yahoo.com> wrote:
>>>> On 2010-02-02, Matt White<mattw...@gmail.com> wrote:
>>
>>>>> I have a relatively straightforward query that by itself isn't that
>>>>> slow, but we have to run it up to 40 times on one webpage load, so it
>>>>> needs to run much faster than it does. Here it is:
>>
>>>>> SELECT COUNT(*) FROM users, user_groups
>>>>> WHERE users.user_group_id = user_groups.id AND NOT users.deleted AND
>>>>> user_groups.partner_id IN
>>>>> (partner_id_1, partner_id_2);
>>
>>>>> The structure is partners have user groups which have users. In the
>>>>> test data there are over 200,000 user groups and users but only ~3000
>>>>> partners. Anyone have any bright ideas on how to speed this query up?
>>
>>>> Can you avoid running it 40 times, maybe by restructuring the
>>>> query (or making a view) along the lines of the following and
>>>> adding some logic to your page?
>>
>>>> SELECT p.partner_id, ug.user_group_id, u.id, count(*)
>>>> FROM partners p
>>>> LEFT JOIN user_groups ug
>>>> ON ug.partner_id=p.partner_id
>>>> LEFT JOIN users u
>>>> ON u.user_group_id=ug.id
>>>> WHERE NOT u.deleted
>>>> GROUP BY 1,2,3
>>>> ;
>>
>>> Thanks for the suggestion. The view didn't seem to speed things up.
>>> Perhaps we can reduce the number of times it's called, we'll see. Any
>>> additional ideas would be helpful. Thanks.
>>
>> I agree with Edgardo, I think the biggest time saver will be reducing
>> trips to the database.
>>
>> But... do you have an index on users.user_group_id?
>>
>> Does rewriting it change the plan any?
>>
>> SELECT COUNT(*) FROM users
>> inner join user_groups on (users.user_group_id = user_groups.id)
>> where NOT users.deleted
>> AND user_groups.partner_id IN (partner_id_1, partner_id_2);
>>
>> And... it looks like the row guestimate is off a litte:
>>
>> Index Scan using user_groups_partner_id_idx
>> on user_groups
>> (cost=0.00..133.86 rows=3346 width=8)
>> (actual time=0.049..96.992 rows=100001 loops=2)
>>
>> It guessed 3,346 rows, but actually got 100,001. Have you run an
>> analyze on it? If so, maybe bumping up the stats might help?
>>
>> -Andy
>>
>> --
>> Sent via pgsql-performance mailing list (pgsql-performa...@postgresql.org)
>> To make changes to your subscription:http://www.postgresql.org/mailpref/pgsql-performance
>
> Andy,
>
> I have run analyze, see my query plan in my original post. You'll have
> to forgive me for being a bit of a Postgres noob but what do you mean
> by "bumping up the stats"?
Thats not what I mean. "explain analyze select..." is what you did, and
correct. What I meant was "analyze user_groups".
see:
http://www.postgresql.org/docs/8.4/interactive/sql-analyze.html
an analyze will make PG look at a table, and calc stats on it, so it can
make better guesses. By default analyze only looks at a few rows (well
a small percent of rows) and makes guesses about the entire table based
on those rows. If it guesses wrong, sometimes you need to tell it to
analyze more rows (ie. a bigger percentage of the table).
By "bumping the stats" I was referring to this:
http://wiki.postgresql.org/wiki/Planner_Statistics
I have never had to do it, so dont know much about it. It may or may
not help. Just thought it was something you could try.
-Andy