Re: Planner reluctant to start from subquery - Mailing list pgsql-performance

From	Tom Lane
Subject	Re: Planner reluctant to start from subquery
Date	February 1, 2006 16:36:16
Msg-id	4359.1138826175@sss.pgh.pa.us
In response to	Re: Planner reluctant to start from subquery ("Kevin Grittner" <Kevin.Grittner@wicourts.gov>)
Responses	Re: Planner reluctant to start from subquery
List	pgsql-performance

"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
> Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> I'm interested to poke at this ... are you in a position to provide a
>> test case?

> I can't supply the original data, since many of the tables have
> millions of rows, with some of the data (related to juvenile, paternity,
> sealed, and expunged cases) protected by law.  I could try to put
> together a self-contained example, but I'm not sure the best way to do
> that, since the table sizes and value distributions may be significant
> here.  Any thoughts on that?

I think that the only aspect of the data that really matters here is the
number of distinct values, which would affect decisions about whether
HashAggregate is appropriate or not.  And you could probably get the
same thing to happen with at most a few tens of thousands of rows.
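As a rough illustration of the point above, a test table can be filled with synthetic rows whose join column has a chosen number of distinct values. This is only a sketch: the function name `make_rows`, the row count, and the uniform value distribution are assumptions made for illustration and are not taken from the thread; the real data may well be skewed.

```python
import random


def make_rows(n_rows, n_distinct, seed=0):
    """Return n_rows (id, key) tuples whose key column has exactly n_distinct values.

    Hypothetical helper: the uniform distribution of keys is an assumption.
    """
    if not 1 <= n_distinct <= n_rows:
        raise ValueError("n_distinct must be between 1 and n_rows")
    rng = random.Random(seed)
    # Use every key once so the distinct count is exact, then fill the rest at random.
    keys = list(range(n_distinct))
    keys += [rng.randrange(n_distinct) for _ in range(n_rows - n_distinct)]
    rng.shuffle(keys)
    return [(row_id, key) for row_id, key in enumerate(keys)]


rows = make_rows(20000, 500)
```

The resulting tuples could be written out as CSV and loaded into a scratch table; varying `n_distinct` while holding `n_rows` fixed is one way to see whether the plan choice changes.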

Also, all we need to worry about is the columns used in the WHERE/JOIN
conditions, which look to be mostly case numbers, dates, and county
identification ... how much confidential info is there in that?  At
worst you could translate the case numbers to some randomly generated
identifiers.
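One possible way to do that translation is sketched below. The function name `pseudonymize` and the sample case-number format are hypothetical, not taken from the thread. Each distinct case number is mapped to one random identifier, so the number of distinct values and the equality joins between tables are preserved as long as the same mapping is applied everywhere.

```python
import secrets


def pseudonymize(case_numbers):
    """Replace each case number with a random identifier, consistently.

    Returns the translated list and the mapping that was used.
    Hypothetical helper; the 16-hex-digit identifier length is an arbitrary choice.
    """
    mapping = {}
    used = set()
    translated = []
    for case_number in case_numbers:
        if case_number not in mapping:
            token = secrets.token_hex(8)
            # Retry on the (very unlikely) chance of a collision so the
            # distinct count stays the same as in the original data.
            while token in used:
                token = secrets.token_hex(8)
            used.add(token)
            mapping[case_number] = token
        translated.append(mapping[case_number])
    return translated, mapping


# Made-up sample values, for illustration only.
translated, mapping = pseudonymize(["2005CF000123", "2005CV000456", "2005CF000123"])
```

Note that random identifiers do not keep the original sort order, so this would only be suitable if the queries compare case numbers for equality rather than by range.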

            regards, tom lane
