Re: PATCH: adaptive ndistinct estimator v4 - Mailing list pgsql-hackers

From Jeff Janes
Subject Re: PATCH: adaptive ndistinct estimator v4
Date
Msg-id CAMkU=1ySyCY1=8ZEeaEEPWD-9wn7ccXbQ6o=UJHU=3ZqA3-kxw@mail.gmail.com
In response to Re: PATCH: adaptive ndistinct estimator v4  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Responses Re: PATCH: adaptive ndistinct estimator v4
List pgsql-hackers
On Tue, Mar 31, 2015 at 12:02 PM, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
> Hi all,
>
> attached is v4 of the patch implementing adaptive ndistinct estimator.

Hi Tomas,

I have a case here where the adaptive algorithm underestimates ndistinct by a factor of about 7, while the current default estimator is pretty close.

5MB file:


# create table foo2 (x text);
# \copy foo2 from program 'bzcat ~/temp/foo1.txt.bz2'
# analyze verbose foo2;
INFO:  analyzing "public.foo2"
INFO:  "foo2": scanned 6021 of 6021 pages, containing 1113772 live rows and 0 dead rows; 30000 rows in sample, 1113772 estimated total rows
WARNING: ndistinct estimate current=998951.78 adaptive=135819.00
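
For reference, the "current" figure comes, as far as I know, from the Haas-Stokes Duj1 formula that ANALYZE uses, n*d / (n - f1 + f1*n/N). Below is a minimal Python sketch of that formula, not the code from analyze.c or from the patch. The function name and arguments are hypothetical, and since the sample's d and f1 are not shown in the output above, the only sample-dependent call uses the illustrative all-distinct case (d = f1 = n).

```python
def haas_stokes_duj1(n, total_rows, d, f1):
    """Sketch of the Duj1 estimator: n*d / (n - f1 + f1*n/N).

    n          -- rows in the sample
    total_rows -- estimated rows in the table (N)
    d          -- distinct values seen in the sample
    f1         -- values seen exactly once in the sample
    (All names are illustrative, not PostgreSQL identifiers.)
    """
    if n <= 0 or total_rows <= 0:
        raise ValueError("n and total_rows must be positive")
    if not (0 <= f1 <= d <= n):
        raise ValueError("expected 0 <= f1 <= d <= n")
    estimate = (n * d) / ((n - f1) + f1 * n / total_rows)
    # Clamp to the sane range: at least d, at most the table size.
    return min(max(estimate, d), total_rows)


# Figures taken from the ANALYZE output above.
SAMPLE_ROWS = 30000
TOTAL_ROWS = 1113772
CURRENT_ESTIMATE = 998951.78
ADAPTIVE_ESTIMATE = 135819.00

# How far apart the two estimators are on this table.
ratio = CURRENT_ESTIMATE / ADAPTIVE_ESTIMATE

# Illustrative boundary case (an assumption, not the real sample):
# if every sampled row were distinct, Duj1 would return the table size.
all_distinct = haas_stokes_duj1(SAMPLE_ROWS, TOTAL_ROWS, SAMPLE_ROWS, SAMPLE_ROWS)

if __name__ == "__main__":
    print(f"current / adaptive = {ratio:.2f}")
    print(f"Duj1 with an all-distinct sample = {all_distinct:.0f}")
```

The reported current estimate is below the table size, so the real sample presumably had some repeated values. Plugging the actual d and f1 from the sample into the function would show how close the formula gets.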

Cheers,

Jeff
