Re: estimating # of distinct values - Mailing list pgsql-hackers

From Alvaro Herrera
Subject Re: estimating # of distinct values
Msg-id 1293740412-sup-9219@alvh.no-ip.org
In response to Re: estimating # of distinct values  (Tomas Vondra <tv@fuzzy.cz>)
Responses Re: estimating # of distinct values
List pgsql-hackers
Excerpts from Tomas Vondra's message of Thu Dec 30 16:38:03 -0300 2010:

> > Since the need to regularly VACUUM tables hit by updates or deletes
> > won't go away any time soon, we could piggy-back the bit field
> > rebuilding onto VACUUM to avoid a second scan.
> 
> Well, I guess it's a bit more complicated. First of all, there's a local
> VACUUM when doing HOT updates. Second, you need to handle inserts too
> (what if the table just grows?).
> 
> But I'm not a VACUUM expert, so maybe I'm wrong and this is the right
> place to handle rebuilds of distinct stats.

I was thinking that we could have two different ANALYZE modes, one
"full" and one "incremental"; autovacuum could be modified to use one or
the other depending on how many changes there have been (the user could
of course request either one explicitly; I'm not sure which should be
the default).  The incremental mode wouldn't worry about deletes, only
inserts, and could be run very frequently.  The full mode would
trigger a full table scan (or nearly so) to produce a better estimate in
the face of many deletions.
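
To make the idea a bit more concrete, here is a rough sketch of how the
choice between the two modes might be made.  Everything in it is
hypothetical: the function name, the counters and both thresholds are
made up for illustration and do not correspond to any existing
autovacuum setting or code.

```python
from typing import Optional


def choose_analyze_mode(inserted: int, deleted: int, live_tuples: int,
                        min_changes: int = 50,
                        delete_fraction: float = 0.10) -> Optional[str]:
    """Pick an ANALYZE mode from change counters (hypothetical sketch).

    inserted / deleted: rows added / removed since the last ANALYZE
    (an UPDATE would count as one of each).
    live_tuples: estimated number of live rows in the table.
    min_changes, delete_fraction: assumed thresholds, not real settings.
    Returns "full", "incremental", or None when there is nothing to do.
    """
    # Too few changes to bother re-analyzing at all.
    if inserted + deleted < min_changes:
        return None

    # Many deletions relative to the table size: an incremental pass
    # only sees new rows, so fall back to a full (or nearly full) scan.
    if deleted > 0 and deleted >= delete_fraction * live_tuples:
        return "full"

    # Mostly inserts: the cheap incremental pass is enough.
    return "incremental"
```

With these assumed defaults, a table of 1000 live rows that saw 500
inserts and no deletes would get the incremental mode, while the same
table with 200 deletes would get the full one.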

I haven't followed this discussion closely so I'm not sure that this
would be workable.

-- 
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

