Re: queries with lots of UNIONed relations - Mailing list pgsql-performance

From Mladen Gogala
Subject Re: queries with lots of UNIONed relations
Msg-id 4D2FC0B7.30705@vmsinfo.com
In response to Re: queries with lots of UNIONed relations  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-performance
On 1/13/2011 5:41 PM, Robert Haas wrote:
> You might be right, but I'm not sure.  Suppose that there are 100
> inheritance children, and each has 10,000 distinct values, but none of
> them are common between the tables.  In that situation, de-duplicating
> each individual table requires a hash table that can hold 10,000
> entries.  But deduplicating everything at once requires a hash table
> that can hold 1,000,000 entries.
>
> Or am I all wet?
>
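
To make the arithmetic in the quoted example concrete, here is a minimal sketch in Python. The table generator and both constants are hypothetical stand-ins for the 100 inheritance children; a Python set plays the role of the hash table, and nothing here reflects how the PostgreSQL executor is actually implemented.

```python
# Hypothetical stand-ins for the quoted scenario: 100 child tables,
# 10,000 distinct values each, and no value shared between tables.
NUM_TABLES = 100
VALUES_PER_TABLE = 10_000

def table_values(table_id):
    """Return the distinct values of one child table (disjoint across tables)."""
    start = table_id * VALUES_PER_TABLE
    return range(start, start + VALUES_PER_TABLE)

# Strategy A: de-duplicate each table on its own.
# Only one hash table (a set here) is live at any moment.
largest_per_table = 0
for t in range(NUM_TABLES):
    seen = set(table_values(t))
    largest_per_table = max(largest_per_table, len(seen))

# Strategy B: de-duplicate everything at once in a single hash table.
seen_all = set()
for t in range(NUM_TABLES):
    seen_all.update(table_values(t))

print(largest_per_table)  # 10000
print(len(seen_all))      # 1000000
```

Note that strategy A by itself does not produce the UNION result: values shared between tables would still need a final pass across the per-table outputs. The sketch only compares the peak hash table sizes.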

Have you considered using Google's MapReduce framework for things like
that? Union and group by operations look like ideal candidates for such
a thing.  I am not sure whether MapReduce can be married to a relational
database, but I must say that I was impressed with the speed of MongoDB.
I am not suggesting that PostgreSQL should sacrifice its ACID compliance
for speed, but Mongo sure does look like a speeding bullet.
On the other hand, the algorithms that have been parallelized for a long
time are precisely the sort/merge and hash algorithms used for union and
group by operations. This is what I have in mind:
http://labs.google.com/papers/mapreduce.html
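
As a rough illustration of the idea, here is a minimal sketch of a MapReduce-style UNION in Python. All names (map_phase, reduce_phase, mapreduce_union, NUM_REDUCERS) are made up for this example, and the thread pool only mimics the structure of the computation; it is not Google's framework and not anything PostgreSQL does today. Each mapper de-duplicates one table and buckets its values by hash, and each reducer merges one bucket, so no single hash table has to hold every distinct value.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

NUM_REDUCERS = 4  # assumed number of reduce partitions

def map_phase(rows):
    """De-duplicate one table locally and bucket its values by hash."""
    buckets = defaultdict(set)
    for value in rows:
        buckets[hash(value) % NUM_REDUCERS].add(value)
    return buckets

def reduce_phase(partial_sets):
    """Merge the sets that all mappers produced for one partition."""
    merged = set()
    for s in partial_sets:
        merged |= s
    return merged

def mapreduce_union(tables):
    """Return one de-duplicated set per reduce partition."""
    with ThreadPoolExecutor() as pool:
        mapped = list(pool.map(map_phase, tables))
        # "Shuffle": collect every mapper's bucket r for reducer r.
        shuffled = [[m[r] for m in mapped if r in m]
                    for r in range(NUM_REDUCERS)]
        return list(pool.map(reduce_phase, shuffled))

tables = [[1, 2, 2, 3], [3, 4, 4], [5, 1, 6]]
partitions = mapreduce_union(tables)
result = sorted(v for part in partitions for v in part)
print(result)  # [1, 2, 3, 4, 5, 6]
```

Because equal values always hash to the same partition, the partitions are disjoint and their concatenation is the UNION. In CPython the threads will not give real CPU parallelism; separate processes or machines would be needed for that.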

--
Mladen Gogala
Sr. Oracle DBA
1500 Broadway
New York, NY 10036
(212) 329-5251
www.vmsinfo.com

