Re: queries with lots of UNIONed relations - Mailing list pgsql-performance

From Robert Haas
Subject Re: queries with lots of UNIONed relations
Date
Msg-id AANLkTikg+DmHbRrxdQbeX3B5PyXD0rRziT0QowQ-_LnM@mail.gmail.com
In response to Re: queries with lots of UNIONed relations  (Andy Colson <andy@squeakycode.net>)
Responses Re: queries with lots of UNIONed relations  (Andy Colson <andy@squeakycode.net>)
Re: queries with lots of UNIONed relations  (Jon Nelson <jnelson+pgsql@jamponi.net>)
List pgsql-performance
On Thu, Jan 13, 2011 at 5:47 PM, Andy Colson <andy@squeakycode.net> wrote:
>>>> I don't believe there is any case where hashing each individual relation
>>>> is a win compared to hashing them all together.  If the optimizer were
>>>> smart enough to be considering the situation as a whole, it would always
>>>> do the latter.
>>>
>>> You might be right, but I'm not sure.  Suppose that there are 100
>>> inheritance children, and each has 10,000 distinct values, but none of
>>> them are common between the tables.  In that situation, de-duplicating
>>> each individual table requires a hash table that can hold 10,000
>>> entries.  But deduplicating everything at once requires a hash table
>>> that can hold 1,000,000 entries.
>>>
>>> Or am I all wet?
>>
>> Yeah, I'm all wet, because you'd still have to re-de-duplicate at the
>> end.  But then why did the OP get a speedup?  *scratches head*
>
> Because it all fit in memory and didn't swap to disk?

Doesn't make sense.  The re-de-duplication at the end should use the
same amount of memory regardless of whether the individual relations
have already been de-duplicated.
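
To make the entry-count argument concrete, here is a rough Python sketch. It is only a toy model: it counts how many distinct entries a hash-based de-duplication step would have to hold, and says nothing about how PostgreSQL's executor or work_mem actually behaves. The helper names (make_children, dedup, compare) and the assumption that each value appears twice per child are made up for illustration.

```python
def make_children(n_children, distinct_per_child, copies=2):
    """Build hypothetical child tables whose value ranges do not overlap.

    Each child holds `distinct_per_child` distinct values, and every value
    is repeated `copies` times (an assumption, so there is something to
    de-duplicate inside each child).
    """
    children = []
    for i in range(n_children):
        base = i * distinct_per_child
        rows = [base + v for v in range(distinct_per_child)] * copies
        children.append(rows)
    return children


def dedup(rows):
    """Hash-based de-duplication; len() of the result is the entry count."""
    seen = set()
    for row in rows:
        seen.add(row)
    return seen


def compare(children):
    """Count hash entries needed under the two strategies discussed above."""
    # Strategy A: append all children, then de-duplicate once.
    all_at_once = dedup(row for rows in children for row in rows)

    # Strategy B: de-duplicate each child, then de-duplicate the combined result.
    per_child = [dedup(rows) for rows in children]
    final = dedup(row for child in per_child for row in child)

    return {
        "all_at_once_entries": len(all_at_once),
        "largest_per_child_entries": max(len(child) for child in per_child),
        "final_entries_after_pre_dedup": len(final),
    }


if __name__ == "__main__":
    # The numbers from the example above: 100 children, 10,000 distinct values each.
    print(compare(make_children(100, 10_000)))
```

With 100 children of 10,000 distinct values each, the per-child step never holds more than 10,000 entries, but the final step holds 1,000,000 entries whether or not the children were de-duplicated first. Pre-de-duplication reduces the number of rows fed into the final step, not the number of entries it has to keep.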

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
