Re: Searching for Duplicates and Hosed the System - Mailing list pgsql-general

From Tom Lane
Subject Re: Searching for Duplicates and Hosed the System
Date
Msg-id 2663.1187580624@sss.pgh.pa.us
Whole thread Raw
In response to Re: Searching for Duplicates and Hosed the System  (Bill Thoen <bthoen@gisnet.com>)
Responses Re: Searching for Duplicates and Hosed the System  (Bill Thoen <bthoen@gisnet.com>)
List pgsql-general
Bill Thoen <bthoen@gisnet.com> writes:
> Tom, here's the "explain" results: Does this help explain what went wrong?
> (And yes, I think there will be a *lot* of groups.)

> explain select count(*) as count,fips_st_cd, fips_cnty_cd, farm_nbr,
> tract_nbr, field_nbr,crop_cd,crop_status_cd,practice_cd,seq_nbr from
> compliance_2006 group by fips_st_cd, fips_cnty_cd, farm_nbr, tract_nbr,
> field_nbr,crop_cd,crop_status_cd,practice_cd,seq_nbr order by 1 desc;

>                              QUERY PLAN
> --------------------------------------------------------
>  Sort  (cost=15119390.46..15123902.54 rows=1804832 width=160)
>    Sort Key: count(*)
>    ->  GroupAggregate  (cost=13782933.29..14301822.43 rows=1804832
> width=160)
>          ->  Sort  (cost=13782933.29..13828054.08 rows=18048318 width=160)
>                Sort Key: fips_st_cd, fips_cnty_cd, farm_nbr, tract_nbr,
> field_nbr, crop_cd, crop_status_cd, practice_cd, seq_nbr
>                ->  Seq Scan on compliance_2006  (cost=0.00..1039927.18
> rows=18048318 width=160)
> (6 rows)

Hmm ... no, actually, that shows the planner doing the right thing for
lotsa groups: picking GroupAggregate instead of HashAggregate.  The
estimated number of groups is 1804832, which might or might not have
much to do with reality but in any case seems enough to keep it away
from HashAggregate.

Do you have autovacuum running, or a scheduled cronjob that runs ANALYZE
or VACUUM ANALYZE?  The only theory I can think of at this point is that
your database statistics are more correct now than they were when you
had the problem.

If you try the query again, does it behave more sanely?

            regards, tom lane

pgsql-general by date:

Previous
From: "Mike Rylander"
Date:
Subject: Re: tsearch2: plainto_tsquery() with OR?
Next
From: "Robin Helgelin"
Date:
Subject: Re: entry log