Home > mailing lists

Re: Optimizing a huge_table/tiny_table join - Mailing list pgsql-performance

From	Jim Nasby
Subject	Re: Optimizing a huge_table/tiny_table join
Date	May 25, 2006 19:13:33
Msg-id	C7B931D6-F9C7-4DC4-8F48-B571C6A3AEF1@decibel.org Whole thread Raw
In response to	Re: Optimizing a huge_table/tiny_table join ("Dawid Kuroczko" <qnex42@gmail.com>)
List	pgsql-performance

Tree view

On May 25, 2006, at 12:07 PM, Dawid Kuroczko wrote:
> On 5/25/06, kynn@panix.com <kynn@panix.com> wrote:
>> Well, they're not my statistics; they're explain's.  You mean there's
>> a bug in explain?  I agree that it makes no sense that the costs
>> don't
>> differ as much as one would expect, but you can see right there the
>> numbers of rows for the two tables.  At any rate, how would one go
>> about finding an explanation for these strange stats?
>
> Well, the query planner uses statistics to deduce the best plan
> possible.  Explain includes this statistical data in its output.
> See:
> http://www.postgresql.org/docs/8.1/interactive/planner-stats.html
> ...for information about what it is all about.
>
> The idea is that your statistics are probably not detailed enough
> to help the planner.  See ALTER TABLE SET STATISTICS to change
> that.

http://www.pervasive-postgres.com/lp/newsletters/2006/
Insights_postgres_Mar.asp#4 might also be worth your time to read.

> Hmm, there is a probability (though statistics are more probable
> go) that you're using some older version of PostgreSQL, and you're
> hitting same problem as I did:
>
> http://archives.postgresql.org/pgsql-performance/2005-07/msg00345.php
>
> Tom has provided back then a patch, which fixed it:
>
> http://archives.postgresql.org/pgsql-performance/2005-07/msg00352.php
>
> ...but I don't remember when it made into release.

According to cvs, it's been in since 8.1 and 8.0.4:

Revision 1.111.4.2: download - view: text, markup, annotated - select
for diffs
Fri Jul 22 19:12:33 2005 UTC (10 months ago) by tgl
Branches: REL8_0_STABLE
CVS tags: REL8_0_8, REL8_0_7, REL8_0_6, REL8_0_5, REL8_0_4
Diff to: previous 1.111.4.1: preferred, colored; branchpoint 1.111:
preferred, colored; next MAIN 1.112: preferred, colored
Changes since revision 1.111.4.1: +18 -37 lines

Fix compare_fuzzy_path_costs() to behave a bit more sanely.  The
original
coding would ignore startup cost differences of less than 1% of the
estimated total cost; which was OK for normal planning but highly not OK
if a very small LIMIT was applied afterwards, so that startup cost
becomes
the name of the game.  Instead, compare startup and total costs fuzzily
but independently.  This changes the plan selected for two queries in
the
regression tests; adjust expected-output files for resulting changes in
row order.  Per reports from Dawid Kuroczko and Sam Mason.

Revision 1.124: download - view: text, markup, annotated - select for
diffs
Fri Jul 22 19:12:01 2005 UTC (10 months ago) by tgl
Branches: MAIN
CVS tags: REL8_1_0BETA3, REL8_1_0BETA2, REL8_1_0BETA1
Diff to: previous 1.123: preferred, colored
Changes since revision 1.123: +18 -37 lines

Fix compare_fuzzy_path_costs() to behave a bit more sanely.  The
original
coding would ignore startup cost differences of less than 1% of the
estimated total cost; which was OK for normal planning but highly not OK
if a very small LIMIT was applied afterwards, so that startup cost
becomes
the name of the game.  Instead, compare startup and total costs fuzzily
but independently.  This changes the plan selected for two queries in
the
regression tests; adjust expected-output files for resulting changes in
row order.  Per reports from Dawid Kuroczko and Sam Mason.

--
Jim C. Nasby, Database Architect                decibel@decibel.org
Give your computer some brain candy! www.distributed.net Team #1828

Windows: "Where do you want to go today?"
Linux: "Where do you want to go tomorrow?"
FreeBSD: "Are you guys coming, or what?"

pgsql-performance by date:

From: Mark Lewis
Date: 25 May 2006, 18:26:38
Subject: Re: is it possible to make this faster?

From: Jim Nasby
Date: 25 May 2006, 19:17:53
Subject: Re: is it possible to make this faster?

Re: Optimizing a huge_table/tiny_table join - Mailing list pgsql-performance

Previous

Next