Re: Performance With Joins on Large Tables - Mailing list pgsql-performance

From Tom Lane
Subject Re: Performance With Joins on Large Tables
Msg-id 18008.1158185322@sss.pgh.pa.us
In response to Re: Performance With Joins on Large Tables  ("Joshua Marsh" <icub3d@gmail.com>)
Responses Re: Performance With Joins on Large Tables  ("Joshua Marsh" <icub3d@gmail.com>)
List pgsql-performance
"Joshua Marsh" <icub3d@gmail.com> writes:
>>> On 9/13/06, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>>> Are the tables perhaps nearly in order by the dsiacctno fields?
>>
>>> My assumption would be they are in exact order.  The text file I used
>>> in the COPY statement had them in order, so if COPY preserves that in
>>> the database, then it is in order.
>>
>> Ah.  So the question is why the planner isn't noticing that.  What do
>> you see in the pg_stats view for the two dsiacctno fields --- the
>> correlation field in particular?

> Here are the results:
> data=# select tablename, attname, n_distinct, avg_width, correlation
> from pg_stats where tablename in ('view_505', 'r3s169') and attname =
> 'dsiacctno';
>  tablename |  attname  | n_distinct | avg_width | correlation
> -----------+-----------+------------+-----------+-------------
>  view_505  | dsiacctno |         -1 |        13 |    -0.13912
>  r3s169    | dsiacctno |      44156 |        13 |   -0.126824
> (2 rows)

Wow, that correlation value is *way* away from order.  If they were
really in exact order by dsiacctno then I'd expect to see 1.0 in
that column.  Can you take another look at the tables and confirm
the ordering?  Does the correlation change if you do an ANALYZE on the
tables?  (Some small change is to be expected due to random sampling,
but this is way off.)

            regards, tom lane
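
For readers unfamiliar with the correlation column discussed above: it measures how closely the physical row order of a table follows the sorted order of the column's values. The sketch below is a simplified model of that idea, not PostgreSQL's actual ANALYZE code (which works from a random sample of rows). The function name physical_correlation and the made-up integer "account numbers" are assumptions for illustration only.

```python
import random


def physical_correlation(values):
    """Pearson correlation between each row's physical position and the
    rank of its value in sorted order (a simplified stand-in for the
    pg_stats correlation figure). Requires at least two rows."""
    n = len(values)
    # Rank of every row's value; ties are broken by physical position.
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0] * n
    for rank, idx in enumerate(order):
        ranks[idx] = rank
    mean = (n - 1) / 2.0  # mean of 0..n-1, shared by positions and ranks
    cov = sum((pos - mean) * (rank - mean) for pos, rank in enumerate(ranks))
    var = sum((pos - mean) ** 2 for pos in range(n))
    # Positions and ranks are both permutations of 0..n-1, so they have
    # the same variance and cov / var is the Pearson coefficient.
    return cov / var


# Hypothetical data: 1000 distinct integer keys.
in_order = list(range(1000))
reverse_order = in_order[::-1]
shuffled = in_order[:]
random.Random(0).shuffle(shuffled)

print(physical_correlation(in_order))       # 1.0: rows stored in key order
print(physical_correlation(reverse_order))  # -1.0: rows stored in reverse key order
print(physical_correlation(shuffled))       # close to 0: no relationship
```

Under this model, a table loaded in exact key order would score 1.0, so values near -0.13 look much more like the shuffled case than like an ordered load.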
