Thread: PoC: Using Count-Min Sketch for join cardinality estimation
Hi, During the recent "CMU vaccination" talk given by Robert [1], a couple of the attendees (some of which were engineers working on various other database systems) asked whether PostgreSQL optimizer uses sketches. Which it does not, as far as I'm aware. Perhaps some of our statistics could be considered sketches, but we've not using data structures like hyperloglog, count-min sketch, etc. But it reminded me that I thought about using one of the common sketches in the past, namely the Count-Min sketch [2], which is often mentioned as useful to estimating join cardinalities. There's a couple papers explaining how it works [3], [4], [5], but the general idea is that it approximates frequency table, i.e. a table tracking frequencies for all values. Our MCV list is one way to do that, but that only keeps a limited number of common values - for the rest we approximate the frequencies as uniform distribution. When the MCV covers only a tiny fraction of the data, or missing entirely, this may be an issue. We can't possibly store exact frequencies all values for tables with many distinct values. The Count-Min sketch works around this by tracking frequencies in a limited number of counters - imagine you have 128 counters. To add a value to the sketch, we hash it and the hash says which counter to increment. To estimate a join size, we simply calculate "dot product" of the two sketches (which need to use the same number of counters): S = sum(s1(i) * s2(i) for i in 1 .. 128) The actual sketches have multiple of those arrays (e.g. 8) using different hash functions, and we use the minimum of the sums. That limits the error, but I'll ignore it here for simplicity. The attached patch is a very simple (and perhaps naive) implementation adding count-min sketch to pg_statistic for all attributes with a hash function (as a new statistics slot kind), and considering it in equijoinsel_inner. There's a GUC use_count_min_sketch to make it easier to see how it works. A simple example create table t1 (a int, b int); create table t2 (a int, b int); insert into t1 select pow(random(), 2) * 1000, i from generate_series(1,30000) s(i); insert into t2 select pow(random(), 2) * 1000, i from generate_series(1,30000) s(i); analyze t1, t2; explain analyze select * from t1 join t2 using (a); QUERY PLAN ------------------------------------------------------------------ Hash Join (cost=808.00..115470.35 rows=8936685 width=12) (actual time=31.231..1083.330 rows=2177649 loops=1) So it's about 4x over-estimated, while without the count-min sketch it's about 2x under-estimated: set use_count_min_sketch = false; QUERY PLAN ------------------------------------------------------------------ Merge Join (cost=5327.96..18964.16 rows=899101 width=12) (actual time=60.780..2896.829 rows=2177649 loops=1) More about this a bit later. The nice thing on count-min sketch is that there are pretty clear boundaries for error: size(t1,t2) <= dot_product(s1,2) <= epsilon * size(t1) * size(t2) where s1/s2 are sketches on t1/t2, and epsilon is relative error. User may pick epsilon, and that determines size of the necessary sketch as 2/epsilon. So with 128 buckets, the relative error is ~1.6%. The trouble here is that this is relative to cartesian product of the two relations. So with two relations, each 30k rows, the error is up to ~14.5M. Which is not great. We can pick lower epsilon value, but that increases the sketch size. Where does the error come from? Each counter combines frequencies for multiple distinct values. So for example with 128 counters and 1024 distinct values, each counter is representing ~4 values on average. But the dot product ignores this - it treats as if all the frequency was for a single value. It calculates the worst case for the bucket, because if you split the frequency e.g. in half, the estimate is always lower (f/2)^2 + (f/2)^2 < f^2 So maybe this could calculate the average number of items per counter and correct for this, somehow. We'd lose some of the sketch guarantees, but maybe it's the right thing to do. There's a bunch of commented-out code doing this in different ways, and with the geometric mean variant the result looks like this: QUERY PLAN ------------------------------------------------------------------ Merge Join (cost=5328.34..53412.58 rows=3195688 width=12) (actual time=64.037..2937.818 rows=2177649 loops=1) which is much closer, but of course that depends on how exactly is the data set skewed. There's a bunch of other open questions: 1) The papers about count-min sketch seem to be written for streaming use cases, which implies all the inserted data pass through the sketch. This patch only builds the sketch on analyze sample, which makes it less reliable. I doubt we want to do something different (e.g. because it'd require handling deletes, etc.). 2) The patch considers the sketch before MCVs, simply because it makes it much simpler to enable/disable the sketch, and compare it to MCVs. That's probably not what should be done - if we have MCVs, we should prefer using that, simply because it determines the frequencies more accurately than the sketch. And only use the sketch as a fallback, when we don't have MCVs on both sides of the join, instead of just assuming uniform distribution and relying on ndistinct. We may have histograms, but AFAIK we don't use those when estimating joins (at least not equijoins). That's another thing we might maybe look into, comparing the histograms to verify how much they overlap. But that's irrelevant here. Anyway, count-min sketches would be a better way to estimate the part not covered by MCVs - we might even assume the uniform distribution for individual counters, because that's what we do without MCVs anyway. 3) It's not clear to me how to extend this for multiple columns, so that it can be used to estimate joins on multiple correlated columns. For MCVs it was pretty simple, but let's say we add this as a new extended statistics kind, and user does CREATE STATISTICS s (cmsketch) ON a, b, c FROM t; Should that build sketch on (a,b,c) or something else? The trouble is a sketch on (a,b,c) is useless for joins on (a,b). We might do something like for ndistinct coefficients, and build a sketch for each combination of the columns. The sketches are much larger than ndistinct coefficients, though. But maybe that's fine - with 8 columns we'd need ~56 sketches, each ~8kB. So that's not extreme. regards [1] https://db.cs.cmu.edu/events/vaccination-2021-postgresql-optimizer-methodology-robert-haas/ [2] https://en.wikipedia.org/wiki/Count%E2%80%93min_sketch [3] https://dsf.berkeley.edu/cs286/papers/countmin-latin2004.pdf [4] http://dimacs.rutgers.edu/~graham/pubs/papers/cmsoft.pdf [5] http://dimacs.rutgers.edu/~graham/pubs/papers/cmz-sdm.pdf -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Attachment
On Wed, Jun 16, 2021 at 12:23 PM Tomas Vondra <tomas.vondra@enterprisedb.com> wrote:
> The attached patch is a very simple (and perhaps naive) implementation
> adding count-min sketch to pg_statistic for all attributes with a hash
> function (as a new statistics slot kind), and considering it in
> equijoinsel_inner. There's a GUC use_count_min_sketch to make it easier
> to see how it works.
Cool! I have some high level questions below.
> So it's about 4x over-estimated, while without the count-min sketch it's
> about 2x under-estimated:
> The nice thing on count-min sketch is that there are pretty clear
> boundaries for error:
>
> size(t1,t2) <= dot_product(s1,2) <= epsilon * size(t1) * size(t2)
>
> where s1/s2 are sketches on t1/t2, and epsilon is relative error. User
> may pick epsilon, and that determines size of the necessary sketch as
> 2/epsilon. So with 128 buckets, the relative error is ~1.6%.
>
> The trouble here is that this is relative to cartesian product of the
> two relations. So with two relations, each 30k rows, the error is up to
> ~14.5M. Which is not great. We can pick lower epsilon value, but that
> increases the sketch size.
+ * depth 8 and width 128 is sufficient for relative error ~1.5% with a
+ * probability of approximately 99.6%
Okay, so in the example above, we have a 99.6% probability of having less than 14.5M, but the actual error is much smaller. Do we know how tight the error bounds are with some lower probability?
> There's a bunch of other open questions:
>
> 1) The papers about count-min sketch seem to be written for streaming
> use cases, which implies all the inserted data pass through the sketch.
> This patch only builds the sketch on analyze sample, which makes it less
> reliable. I doubt we want to do something different (e.g. because it'd
> require handling deletes, etc.).
We currently determine the sample size from the number of histogram buckets requested, which is from the guc we expose. If these sketches are more designed for the whole stream, do we have any idea how big a sample we need to be reasonably accurate with them?
> 2) The patch considers the sketch before MCVs, simply because it makes
> it much simpler to enable/disable the sketch, and compare it to MCVs.
> That's probably not what should be done - if we have MCVs, we should
> prefer using that, simply because it determines the frequencies more
> accurately than the sketch. And only use the sketch as a fallback, when
> we don't have MCVs on both sides of the join, instead of just assuming
> uniform distribution and relying on ndistinct.
> Anyway, count-min sketches would be a better way to estimate the part
> not covered by MCVs - we might even assume the uniform distribution for
> individual counters, because that's what we do without MCVs anyway.
When we calculate the sketch, would it make sense to exclude the MCVs that we found? And use both sources for the estimate?
--
John Naylor
EDB: http://www.enterprisedb.com
> The attached patch is a very simple (and perhaps naive) implementation
> adding count-min sketch to pg_statistic for all attributes with a hash
> function (as a new statistics slot kind), and considering it in
> equijoinsel_inner. There's a GUC use_count_min_sketch to make it easier
> to see how it works.
Cool! I have some high level questions below.
> So it's about 4x over-estimated, while without the count-min sketch it's
> about 2x under-estimated:
> The nice thing on count-min sketch is that there are pretty clear
> boundaries for error:
>
> size(t1,t2) <= dot_product(s1,2) <= epsilon * size(t1) * size(t2)
>
> where s1/s2 are sketches on t1/t2, and epsilon is relative error. User
> may pick epsilon, and that determines size of the necessary sketch as
> 2/epsilon. So with 128 buckets, the relative error is ~1.6%.
>
> The trouble here is that this is relative to cartesian product of the
> two relations. So with two relations, each 30k rows, the error is up to
> ~14.5M. Which is not great. We can pick lower epsilon value, but that
> increases the sketch size.
+ * depth 8 and width 128 is sufficient for relative error ~1.5% with a
+ * probability of approximately 99.6%
Okay, so in the example above, we have a 99.6% probability of having less than 14.5M, but the actual error is much smaller. Do we know how tight the error bounds are with some lower probability?
> There's a bunch of other open questions:
>
> 1) The papers about count-min sketch seem to be written for streaming
> use cases, which implies all the inserted data pass through the sketch.
> This patch only builds the sketch on analyze sample, which makes it less
> reliable. I doubt we want to do something different (e.g. because it'd
> require handling deletes, etc.).
We currently determine the sample size from the number of histogram buckets requested, which is from the guc we expose. If these sketches are more designed for the whole stream, do we have any idea how big a sample we need to be reasonably accurate with them?
> 2) The patch considers the sketch before MCVs, simply because it makes
> it much simpler to enable/disable the sketch, and compare it to MCVs.
> That's probably not what should be done - if we have MCVs, we should
> prefer using that, simply because it determines the frequencies more
> accurately than the sketch. And only use the sketch as a fallback, when
> we don't have MCVs on both sides of the join, instead of just assuming
> uniform distribution and relying on ndistinct.
> Anyway, count-min sketches would be a better way to estimate the part
> not covered by MCVs - we might even assume the uniform distribution for
> individual counters, because that's what we do without MCVs anyway.
When we calculate the sketch, would it make sense to exclude the MCVs that we found? And use both sources for the estimate?
--
John Naylor
EDB: http://www.enterprisedb.com
On 6/17/21 1:31 AM, John Naylor wrote: > On Wed, Jun 16, 2021 at 12:23 PM Tomas Vondra > <tomas.vondra@enterprisedb.com <mailto:tomas.vondra@enterprisedb.com>> > wrote: > > > The attached patch is a very simple (and perhaps naive) implementation > > adding count-min sketch to pg_statistic for all attributes with a hash > > function (as a new statistics slot kind), and considering it in > > equijoinsel_inner. There's a GUC use_count_min_sketch to make it easier > > to see how it works. > > Cool! I have some high level questions below. > > > So it's about 4x over-estimated, while without the count-min sketch it's > > about 2x under-estimated: > > > The nice thing on count-min sketch is that there are pretty clear > > boundaries for error: > > > > size(t1,t2) <= dot_product(s1,2) <= epsilon * size(t1) * size(t2) > > > > where s1/s2 are sketches on t1/t2, and epsilon is relative error. User > > may pick epsilon, and that determines size of the necessary sketch as > > 2/epsilon. So with 128 buckets, the relative error is ~1.6%. > > > > The trouble here is that this is relative to cartesian product of the > > two relations. So with two relations, each 30k rows, the error is up to > > ~14.5M. Which is not great. We can pick lower epsilon value, but that > > increases the sketch size. > > + * depth 8 and width 128 is sufficient for relative error ~1.5% with a > + * probability of approximately 99.6% > > Okay, so in the example above, we have a 99.6% probability of having > less than 14.5M, but the actual error is much smaller. Do we know how > tight the error bounds are with some lower probability? > I don't recall such formula mentioned in any of the papers. The [3] paper has a proof in section 4.2, deriving the formula using Markov's inequality, but it's not obvious how to relax that (it's been ages since I last did things like this). > > There's a bunch of other open questions: > > > > 1) The papers about count-min sketch seem to be written for streaming > > use cases, which implies all the inserted data pass through the sketch. > > This patch only builds the sketch on analyze sample, which makes it less > > reliable. I doubt we want to do something different (e.g. because it'd > > require handling deletes, etc.). > > We currently determine the sample size from the number of histogram > buckets requested, which is from the guc we expose. If these sketches > are more designed for the whole stream, do we have any idea how big a > sample we need to be reasonably accurate with them? > Not really, but to be fair even for the histograms it's only really supported by "seems to work in practice" :-( My feeling is it's more about the number of distinct values rather than the size of the table. If there are only a couple distinct values, small sample is good enough. With many distinct values, we may need a larger sample, but maybe not - we'll have to try, I guess. FWIW there's a lot of various assumptions in the join estimates. For example we assume the domains match (i.e. domain of the smaller table is subset of the larger table) etc. > > 2) The patch considers the sketch before MCVs, simply because it makes > > it much simpler to enable/disable the sketch, and compare it to MCVs. > > That's probably not what should be done - if we have MCVs, we should > > prefer using that, simply because it determines the frequencies more > > accurately than the sketch. And only use the sketch as a fallback, when > > we don't have MCVs on both sides of the join, instead of just assuming > > uniform distribution and relying on ndistinct. > > > Anyway, count-min sketches would be a better way to estimate the part > > not covered by MCVs - we might even assume the uniform distribution for > > individual counters, because that's what we do without MCVs anyway. > > When we calculate the sketch, would it make sense to exclude the MCVs > that we found? And use both sources for the estimate? > Not sure. I've thought about this a bit, and excluding the MCV values from the sketch would make it more like a MCV+histogram. So we'd have MCV and then (sketch, histogram) on the non-MCV values. I think the partial sketch is mostly useless, at least for join estimates. Imagine we have MCV and sketch on both sides of the join, so we have (MCV1, sketch1) and (MCV2, sketch2). Naively, we could do estimate using (MCV1, MCV2) and then (sketch1,sketch2). But that's too simplistic - there may be "overlap" between MCV1 and sketch2, for example? So it seems more likely we'll just do MCV estimation if both sides have it, and switch to sketch-only estimation otherwise. There's also the fact that we exclude values wider than (1kB), so that the stats are not too big, and there's no reason to do that for the sketch (which is fixed-size thanks to hashing). It's a bit simpler to build the full sketch during the initial scan of the data. But it's not a very important detail - it's trivial to both add and remove values from the sketch, if needed. So we can either exclude the MCV values and "add them" to the partial sketch later, or we can build a full sketch and then subtract them later. regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 6/17/21 2:23 AM, Tomas Vondra wrote: > On 6/17/21 1:31 AM, John Naylor wrote: >> On Wed, Jun 16, 2021 at 12:23 PM Tomas Vondra >> <tomas.vondra@enterprisedb.com <mailto:tomas.vondra@enterprisedb.com>> >> wrote: >> >> ... >> >> + * depth 8 and width 128 is sufficient for relative error ~1.5% with a >> + * probability of approximately 99.6% >> >> Okay, so in the example above, we have a 99.6% probability of having >> less than 14.5M, but the actual error is much smaller. Do we know how >> tight the error bounds are with some lower probability? >> > > I don't recall such formula mentioned in any of the papers. The [3] > paper has a proof in section 4.2, deriving the formula using Markov's > inequality, but it's not obvious how to relax that (it's been ages since > I last did things like this). > I've been thinking about this a bit more, and while I still don't know about a nice formula, I think I have a fairly good illustration that may provide some intuition about the "typical" error. I'll talk about self joins, because it makes some of the formulas simpler. But in principle the same thing works for a join of two relations too. Imagine you have a relation with N rows and D distinct values, and let's build a count-min sketch on it, with W counters. So assuming d=1 for simplicity, we have one set of counters with frequencies: [f(1), f(2), ..., f(W)] Now, the dot product effectively calculates S = sum[ f(i)^2 for i in 1 ... W ] which treats each counter as if it was just a single distinct value. But we know that this is the upper boundary of the join size estimate, because if we "split" a grou in any way, the join will always be lower: (f(i) - X)^2 + X^2 <= f(i)^2 It's as if you have a rectangle - if you split a side in some way and calculate the area of those smaller rectangles, it'll be smaller than the are of the whole rectangle. To minimize the area, the parts need to be of equal size, and for K parts it's K * (f(i) / K) ^ 2 = f(i)^2 / K This is the "minimum relative error" case assuming uniform distribution of the data, I think. If there are D distinct values in the data set, then for uniform distribution we can assume each counter represents about D / W = K distinct values, and we can assume f(i) = N / W, so then S = W * (N/W)^2 / (D/W) = N^2 / D Of course, this is the exact cardinality of the join - the count-min sketch simply multiplies the f(i) values, ignoring D entirely. But I think this shows that the fewer distinct values are there and/or the more skewed the data set is, the closer the estimate is to the actual value. More uniform data sets with more distinct values will end up closer to the (N^2 / D) size, and the sketch will significantly over-estimate this. So the question is whether to attempt to do any "custom" correction based on number of distinct values (which I think the count-min sketch does not do, because the papers assumes it's unknown). I still don't know about an analytical solution, giving us smaller confidence interval (with lower probability). But we could perform some experiments, generating data sets with various data distribution and then measure how accurate the adjusted estimate is. But I think the fact that for "more skewed" data sets the estimate is closer to reality is very interesting, and pretty much what we want. It's probably better than just assuming uniformity on both sides, which is what we do when we only have MCV on one side (that's a fairly common case, I think). The other interesting feature is that it *always* overestimates (at least the default version, not the variant adjusted by distinct values). That's probably good, because under-estimates are generally much more dangerous than over-estimates (the execution often degrades pretty quickly, not gracefully). regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Wed, Jun 16, 2021 at 8:23 PM Tomas Vondra <tomas.vondra@enterprisedb.com> wrote:
> Not really, but to be fair even for the histograms it's only really
> supported by "seems to work in practice" :-(
Hmm, we cite a theoretical result in analyze.c, but I don't know if there is something better out there:
* The following choice of minrows is based on the paper
* "Random sampling for histogram construction: how much is enough?"
* by Surajit Chaudhuri, Rajeev Motwani and Vivek Narasayya, in
What is more troubling to me is that we set the number of MCVs to the number of histograms. Since b5db1d93d2a6 we have a pretty good method of finding the MCVs that are justified. When that first went in, I experimented with removing the MCV limit and found it easy to create value distributions that lead to thousands of MCVs. I guess the best justification now for the limit is plan time, but if we have a sketch also, we can choose one or the other based on a plan-time speed vs accuracy tradeoff (another use for a planner effort guc). In that scenario, for tables with many MCVs we would only use them for restriction clauses.
--
John Naylor
EDB: http://www.enterprisedb.com
> Not really, but to be fair even for the histograms it's only really
> supported by "seems to work in practice" :-(
Hmm, we cite a theoretical result in analyze.c, but I don't know if there is something better out there:
* The following choice of minrows is based on the paper
* "Random sampling for histogram construction: how much is enough?"
* by Surajit Chaudhuri, Rajeev Motwani and Vivek Narasayya, in
What is more troubling to me is that we set the number of MCVs to the number of histograms. Since b5db1d93d2a6 we have a pretty good method of finding the MCVs that are justified. When that first went in, I experimented with removing the MCV limit and found it easy to create value distributions that lead to thousands of MCVs. I guess the best justification now for the limit is plan time, but if we have a sketch also, we can choose one or the other based on a plan-time speed vs accuracy tradeoff (another use for a planner effort guc). In that scenario, for tables with many MCVs we would only use them for restriction clauses.
--
John Naylor
EDB: http://www.enterprisedb.com
On 6/18/21 7:03 PM, John Naylor wrote: > On Wed, Jun 16, 2021 at 8:23 PM Tomas Vondra > <tomas.vondra@enterprisedb.com <mailto:tomas.vondra@enterprisedb.com>> > wrote: > > > Not really, but to be fair even for the histograms it's only really > > supported by "seems to work in practice" :-( > > Hmm, we cite a theoretical result in analyze.c, but I don't know if > there is something better out there: > > * The following choice of minrows is based on the paper > * "Random sampling for histogram construction: how much is enough?" > * by Surajit Chaudhuri, Rajeev Motwani and Vivek Narasayya, in > True. I read that paper (long time ago), and it certainly gives some very interesting guidance and guarantees regarding relative error. And now that I look at it, the theorems 5 & 6, and the corollary 1 do provide a way to calculate probability of a lower error (essentially vary the f, get the probability). I still think there's a lot of reliance on experience from practice, because even with such strong limits delta=0.5 of a histogram with 100 buckets, representing 1e9 rows, is still plenty of space for errors. > What is more troubling to me is that we set the number of MCVs to the > number of histograms. Since b5db1d93d2a6 we have a pretty good method of > finding the MCVs that are justified. When that first went in, I > experimented with removing the MCV limit and found it easy to create > value distributions that lead to thousands of MCVs. I guess the best > justification now for the limit is plan time, but if we have a sketch > also, we can choose one or the other based on a plan-time speed vs > accuracy tradeoff (another use for a planner effort guc). In that > scenario, for tables with many MCVs we would only use them for > restriction clauses. > Sorry, I'm not sure what you mean by "we set the number of MCVs to the number of histograms" :-( When you say "MCV limit" you mean that we limit the number of items to statistics target, right? I agree plan time is one concern - but it's also about analyze, as we need larger sample to build a larger MCV or histogram (as the paper you referenced shows). I think the sketch is quite interesting for skewed data sets where the MCV can represent only small fraction of the data, exactly because of the limit. For (close to) uniform data distributions we can just use ndistinct estimates to get estimates that are better than those from a sketch, I think. So I think we should try using MCV first, and then use sketches for the rest of the data (or possibly all data, if one side does not have MCV). FWIW I think the sketch may be useful even for restriction clauses, which is what the paper calls "point queries". Again, maybe this should use the same correction depending on ndistinct estimate. regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Fri, Jun 18, 2021 at 3:43 PM Tomas Vondra <tomas.vondra@enterprisedb.com> wrote:
> Sorry, I'm not sure what you mean by "we set the number of MCVs to the
> number of histograms" :-(
>
> When you say "MCV limit" you mean that we limit the number of items to
> statistics target, right? I agree plan time is one concern - but it's
> also about analyze, as we need larger sample to build a larger MCV or
> histogram (as the paper you referenced shows).
Ah, I didn't realize the theoretical limit applied to the MCVs too, but that makes sense since they're basically singleton histogram buckets.
--
John Naylor
EDB: http://www.enterprisedb.com
> number of histograms" :-(
>
> When you say "MCV limit" you mean that we limit the number of items to
> statistics target, right? I agree plan time is one concern - but it's
> also about analyze, as we need larger sample to build a larger MCV or
> histogram (as the paper you referenced shows).
Ah, I didn't realize the theoretical limit applied to the MCVs too, but that makes sense since they're basically singleton histogram buckets.
--
John Naylor
EDB: http://www.enterprisedb.com
On 6/18/21 9:54 PM, John Naylor wrote: > > On Fri, Jun 18, 2021 at 3:43 PM Tomas Vondra > <tomas.vondra@enterprisedb.com <mailto:tomas.vondra@enterprisedb.com>> > wrote: > > > Sorry, I'm not sure what you mean by "we set the number of MCVs to the > > number of histograms" :-( > > > > When you say "MCV limit" you mean that we limit the number of items to > > statistics target, right? I agree plan time is one concern - but it's > > also about analyze, as we need larger sample to build a larger MCV or > > histogram (as the paper you referenced shows). > > Ah, I didn't realize the theoretical limit applied to the MCVs too, but > that makes sense since they're basically singleton histogram buckets. > Something like that, yes. Looking at MCV items as singleton histogram buckets is interesting, although I'm not sure that was the reasoning when calculating the MCV size. AFAIK it was kinda the other way around, i.e. the sample size is derived from the histogram paper, and when building the MCV we ask what's sufficiently different from the average frequency, based on the sample size etc. regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company