From: Tomas Vondra
Subject: Re: pretty bad n_distinct estimate, causing HashAgg OOM on TPC-H
Date:
Msg-id: 5588D644.1000909@2ndquadrant.com
In response to: Re: pretty bad n_distinct estimate, causing HashAgg OOM on TPC-H (Jeff Janes <jeff.janes@gmail.com>)
List: pgsql-hackers
Hi,

On 06/22/2015 07:47 AM, Jeff Janes wrote:
> On Sat, Jun 20, 2015 at 8:28 AM, Tomas Vondra
> <tomas.vondra@2ndquadrant.com> wrote:
>
> Hi Tomas,
>
>
>     I've lobotomized the sampling a bit to really produce a random set
>     of blocks first, and that produces way better estimates:
>
>         statistics target    current sampler       random blocks
>         ----------------------------------------------------------
>         100                    429491 (10000x)     334430766 (14x)
>         1000                  4240418 (1000x)      439010499 (10x)
>
>     Also, the number of sampled blocks is not that different. With
>     target 100, the current sampler reads ~2900 blocks, while a
>     completely random sampler uses 3000 blocks. So, where's the benefit?
>
>
> I don't know what you did.  The block sampling is already random;
> unless there is some overlooked bug, it can't be made more random.
> Could you post the patch?

Attached is an early experimental version of the sampling with density
correction. It breaks SYSTEM sampling for TABLESAMPLE, because it
changes the signature of one of the BlockSampler methods, but that can
be fixed later. I also suspect it fails on some small tables that would
get sampled completely with the current patch.

The density-correction idea is quite simple, and the sampling works like
this (a short Python sketch follows the list):

   (1) randomly sample the blocks

       Generate a random sequence of block numbers, and track how many
       times each block was sampled - so if we get block number K
       twice, we know we should sample 2 tuples from that block.

   (2) sample the tuples

       For each of the blocks, sample as many tuples as needed. Also
       note the number of tuples on the block.

   (3) determine maximum density

       Find the maximum number of tuples per block across the sampled
       blocks.

   (4) resample the tuples

       For each block, resample the tuples using the ratio of the
       block's density to the maximum density. So if a block has 50%
       density, each sampled tuple is evicted with ~50% probability.

   (5) replace the evicted tuples (not implemented)

       In step (4) some of the tuples get evicted from the sample, so
       we should replace them with other tuples. This is not done yet,
       so the sample may end up smaller than requested.
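
Here's the promised sketch - a minimal Python model of the five steps,
not the actual patch; the list-of-lists block representation and all
names are invented for illustration:

import random
from collections import Counter

def sample_with_density_correction(blocks, sample_size, rng=random):
    """blocks[k] is the list of tuples stored on block k."""

    # (1) sample block numbers at random *with replacement*, tracking
    #     how many tuples each block owes us
    hits = Counter(rng.randrange(len(blocks)) for _ in range(sample_size))

    # (2) sample that many tuples from each hit block, remembering the
    #     block's tuple count (its density)
    sampled = []                                # (tuple, tuples_on_block)
    for blkno, ntimes in hits.items():
        tuples = blocks[blkno]
        for t in rng.sample(tuples, min(ntimes, len(tuples))):
            sampled.append((t, len(tuples)))

    # (3) determine the maximum density seen in the sample
    max_density = max(ntuples for _, ntuples in sampled)

    # (4) resample: keep each tuple with probability density/max_density,
    #     so a tuple from a 50%-density block is evicted with ~50% chance
    sample = [t for t, ntuples in sampled
              if rng.random() < ntuples / max_density]

    # (5) evicted tuples are not replaced here either, so the result
    #     may be smaller than sample_size
    return sample

The eviction in step (4) cancels the over-representation of tuples from
sparse blocks in step (2), so every tuple ends up with roughly the same
inclusion probability.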

And it seems to work quite well. I've done various tests with tables
specifically crafted to use different tuple widths for different
values, for example:

create table test as
select
     (case when i < 500000 then 1 else 2 end) AS id,
     (case when i < 500000 then 1 else null end)::bigint as val1,
     (case when i < 500000 then 1 else null end)::bigint as val2,
     (case when i < 500000 then 1 else null end)::bigint as val3
from generate_series(1,1000000) s(i);

which creates a table where 50% of the tuples have id=1 and 50% have
id=2, the id=1 tuples being much wider (the id=2 tuples have NULLs at
the end). And it works fine - the estimates are pretty much the same as
with the current code.
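
As a back-of-the-envelope check of why step (4) matters for such a
table - the tuples-per-page numbers here are invented, not measured:

# Made-up densities: wide id=1 tuples vs narrow (NULL-padded) id=2 tuples.
wide_per_page, narrow_per_page = 60, 220
wide_pages = 500_000 / wide_per_page       # pages holding id=1 tuples
narrow_pages = 500_000 / narrow_per_page   # pages holding id=2 tuples

# Picking blocks uniformly and a fixed number of tuples per block gives
# a tuple on a block with T tuples a selection probability proportional
# to 1/T, so without correction the expected id=1 share of the sample is:
share_wide = wide_pages / (wide_pages + narrow_pages)
print(f"id=1 share without density correction: {share_wide:.0%}")  # ~79%

# Keeping each tuple with probability T/T_max (step 4) cancels the 1/T
# bias and restores the true 50% share.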

It also significantly improves the ndistinct estimates - for example,
on the TPC-H data set I get ndistinct=-1 (i.e. the estimate scales with
the row count) for the l_orderkey column even with statistics target
100, which is way better than the 10000x under-estimate produced by the
current algorithm.
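
To see why a skewed sample produces such under-estimates: ANALYZE feeds
the sample to the Duj1 estimator of Haas & Stokes, n*d / (n - f1 +
f1*n/N), where n is the sample size, N the table row count, d the
number of distinct values in the sample, and f1 the number of values
seen exactly once. A toy Python simulation - the row counts and the
clustering of the l_orderkey-like column are invented for illustration,
not the real TPC-H data:

import random
from collections import Counter

def duj1(total_rows, sample):
    """Haas & Stokes Duj1 estimator: n*d / (n - f1 + f1*n/N)."""
    counts = Counter(sample)
    n = len(sample)                                  # sample size
    d = len(counts)                                  # distinct values seen
    f1 = sum(1 for c in counts.values() if c == 1)   # values seen once
    return n * d / (n - f1 + f1 * n / total_rows)

N = 6_000_000                     # hypothetical row count
orderkey = lambda row: row // 4   # 4 consecutive rows share a key

# A truly random row sample: nearly all values are singletons, f1 is large.
rand_rows = [orderkey(random.randrange(N)) for _ in range(30_000)]

# A skewed sample: runs of 4 consecutive rows, mimicking a sampler that
# takes too many tuples from too few blocks of a clustered table.
starts = [random.randrange(N - 4) for _ in range(7_500)]
skew_rows = [orderkey(s + i) for s in starts for i in range(4)]

print(round(duj1(N, rand_rows)))  # ~1.2M, close to the true 1.5M keys
print(round(duj1(N, skew_rows)))  # ~15k, two orders of magnitude too low

With the clustered sample most values repeat within a run, f1 collapses,
and the estimate collapses with it - the same sample-skew failure mode,
just reproduced in miniature.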

I take this as a demonstration that our ndistinct estimator is not such
crap as some claim it to be - we're merely feeding it skewed samples. I
still think the adaptive estimator is a neat idea, but the estimates
can't be fixed while keeping the same skewed sampling. That has to be
addressed first ...

regards

--
Tomas Vondra                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

