Re: estimating # of distinct values - Mailing list pgsql-hackers

From Alvaro Herrera
Subject Re: estimating # of distinct values
Date
Msg-id 1293802260-sup-5579@alvh.no-ip.org
In response to Re: estimating # of distinct values  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: estimating # of distinct values  (Jim Nasby <jim@nasby.net>)
List pgsql-hackers
Excerpts from Tom Lane's message of Thu Dec 30 23:02:04 -0300 2010:
> Alvaro Herrera <alvherre@commandprompt.com> writes:
> > I was thinking that we could have two different ANALYZE modes, one
> > "full" and one "incremental"; autovacuum could be modified to use one or
> > the other depending on how many changes there are (of course, the user
> > could request one or the other, too; not sure what should be the default
> > behavior).
> 
> How is an incremental ANALYZE going to work at all?  It has no way to
> find out the recent changes in the table, for *either* inserts or
> deletes.  Unless you want to seqscan the whole table looking for tuples
> with xmin later than something-or-other ... which more or less defeats
> the purpose.

Yeah, I was thinking that this incremental ANALYZE would be the stream
in the "stream-based estimator" but evidently it doesn't work that way.
The stream that needs to be passed to the estimator consists of new
tuples as they are being inserted into the table, so this would need to
be done by the inserter process ... or it'd need to transmit the CTIDs
for someone else to stream them ... not an easy thing, in itself.
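
To make the "stream" idea concrete, here is a minimal sketch of an estimator
that is fed one value per inserted tuple. It assumes a K-minimum-values (KMV)
sketch, which is swapped in purely for illustration and is not necessarily the
estimator discussed in this thread. The class name KMVEstimator, its add() and
estimate() methods, and the default k=256 are all hypothetical; nothing here is
PostgreSQL code.

```python
import hashlib
import heapq


class KMVEstimator:
    """K-minimum-values distinct-count sketch (illustrative only)."""

    def __init__(self, k=256):
        self.k = k
        self._heap = []        # negated hashes: a max-heap of the k smallest seen
        self._members = set()  # the same hashes, for cheap duplicate checks

    @staticmethod
    def _hash(value):
        # Map the value to a float in [0, 1] using a stable 64-bit hash prefix.
        digest = hashlib.sha1(repr(value).encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") / 2.0**64

    def add(self, value):
        """Feed one value; would be called once per inserted tuple."""
        h = self._hash(value)
        if h in self._members:
            return  # this hash is already among the k smallest
        if len(self._heap) < self.k:
            heapq.heappush(self._heap, -h)
            self._members.add(h)
        elif h < -self._heap[0]:
            # New hash beats the current k-th smallest: swap it in.
            evicted = -heapq.heappushpop(self._heap, -h)
            self._members.discard(evicted)
            self._members.add(h)

    def estimate(self):
        """Return the estimated number of distinct values seen so far."""
        if len(self._heap) < self.k:
            # Fewer than k distinct hashes seen: the count is exact.
            return float(len(self._heap))
        # Standard KMV estimate: (k - 1) / (k-th smallest hash).
        return (self.k - 1) / -self._heap[0]


# Usage: whoever sees the inserts pushes each new value into the sketch.
est = KMVEstimator(k=256)
for value in range(10000):
    est.add(value)
approx_ndistinct = est.estimate()
```

Note that this sketch only consumes inserts. It has no way to account for
deleted tuples, and it still needs something to feed it every new value, which
is exactly the part that is not easy.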

-- 
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

