Re: [RFC] Improving multi-column filter cardinality estimation using MCVs and HyperLogLog - Mailing list pgsql-hackers

From Bruce Momjian
Subject Re: [RFC] Improving multi-column filter cardinality estimation using MCVs and HyperLogLog
Msg-id Yo1ZS3ut2jDzmD/y@momjian.us
In response to Re: [RFC] Improving multi-column filter cardinality estimation using MCVs and HyperLogLog  (Tomas Vondra <tomas.vondra@enterprisedb.com>)
Responses Re: [RFC] Improving multi-column filter cardinality estimation using MCVs and HyperLogLog
List pgsql-hackers
On Mon, May 16, 2022 at 12:09:41AM +0200, Tomas Vondra wrote:
> I think it's an interesting idea. In principle it allows deducing the
> multi-column MCV for arbitrary combination of columns, not determined in
> advance. We'd have the MCV with HLL instead of frequencies for columns
> A, B and C:
> 
> (a1, hll(a1))
> (a2, hll(a2))
> (...)
> (aK, hll(aK))
> 
> (b1, hll(b1))
> (b2, hll(b2))
> (...)
> (bL, hll(bL))
> 
> (c1, hll(c1))
> (c2, hll(c2))
> (...)
> (cM, hll(cM))
> 
> and from this we'd be able to build MCV for any combination of those
> three columns.

Sorry, but I am lost here.  I read about HLL here:

    https://towardsdatascience.com/hyperloglog-a-simple-but-powerful-algorithm-for-data-scientists-aed50fe47869

However, I don't see how they can be combined for multiple columns.
Above, I know A, B, and C are columns, but what are a1, a2, etc.?
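
One possible reading of the quoted scheme, offered only as a sketch: a1..aK
would be the most common values of column A, and hll(a1) would summarize the
set of row identifiers whose A value is a1. Neither point is confirmed in the
quoted text; both are assumptions. Under them, the row count for a pair such
as (a1, b1) could be estimated by inclusion-exclusion over the two sketches.
In the Python below, exact sets stand in for HLL sketches, and the names
`build_sketches` and `estimate_pair_count` are hypothetical.

```python
# Assumption: each "sketch" summarizes the row ids that hold one value.
# Exact Python sets stand in for HLL here. The estimate only uses a union
# (merge) and a count, which are the operations an HLL sketch offers.

rows = [
    ("x", "p"), ("x", "p"), ("x", "q"),
    ("y", "p"), ("y", "q"), ("y", "q"),
]

def build_sketches(rows, col):
    """Map each value of column `col` to the set of row ids holding it."""
    sketches = {}
    for row_id, row in enumerate(rows):
        sketches.setdefault(row[col], set()).add(row_id)
    return sketches

def estimate_pair_count(sk_a, sk_b):
    """Inclusion-exclusion: |A and B| = |A| + |B| - |A or B|."""
    return len(sk_a) + len(sk_b) - len(sk_a | sk_b)

a = build_sketches(rows, 0)   # per-value sketches for the first column
b = build_sketches(rows, 1)   # per-value sketches for the second column

# Estimated number of rows matching both A = 'x' and B = 'p'.
pair_xp = estimate_pair_count(a["x"], b["p"])
```

With real HLL sketches the three counts would be approximate, so the
subtraction can amplify error when the intersection is small relative to
the union.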

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Indecision is a decision.  Inaction is an action.  Mark Batterson
