Re: TB-sized databases - Mailing list pgsql-performance

From Ron Mayer
Subject Re: TB-sized databases
Date
Msg-id 47DF57B3.9090206@cheapcomplexdevices.com
Whole thread Raw
In response to Re: TB-sized databases  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: TB-sized databases
List pgsql-performance
Tom Lane wrote:
> Bruce Momjian <bruce@momjian.us> writes:
>> I have _not_ added a TODO for this item.  Let me know if one is needed.
>
> Please do, I think it's an open issue.
>
> * Consider Cartesian joins when both relations are needed to form an
>   indexscan qualification for a third relation
>


Would another possible condition for considering
Cartesian joins be be:

   * Consider Cartesian joins when a unique constraint can prove
     that at most one row will be pulled from one of the tables
     that would be part of this join?

In the couple cases where this happened to me it was
in queries on a textbook star schema like this:

  select * from fact
           join dim1 using (dim1_id)
           join dim2 using (dim2_id)
          where dim1.value = 'something'
            and dim2.valuex = 'somethingelse'
            and dim2.valuey = 'more';

and looking up all the IDs before hitting the huge
fact table.  Often in these cases the where clause
on the dimension tables are on values with a unique
constraint.

If I understand right - if the constraint can prove
it'll return at most 1 row - that means the cartesian
join is provably safe from blowing up.

Not sure if that's redundant with the condition you
mentioned, or if it's yet a separate condition where
we might also want to consider cartesian joins.

Ron M


pgsql-performance by date:

Previous
From: "Peter Koczan"
Date:
Subject: Re: What is the best way to storage music files in Postgresql
Next
From: James Mansion
Date:
Subject: Re: Benchmark: Dell/Perc 6, 8 disk RAID 10