Re: [HACKERS] PATCH: multivariate histograms and MCV lists - Mailing list pgsql-hackers

From Dean Rasheed
Subject Re: [HACKERS] PATCH: multivariate histograms and MCV lists
Date
Msg-id CAEZATCVG8qEbC_49Qm4xhg=qy2A1EmbMyBSnkhiyXXZf1mBZnQ@mail.gmail.com
Whole thread Raw
In response to Re: [HACKERS] PATCH: multivariate histograms and MCV lists  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Responses Re: [HACKERS] PATCH: multivariate histograms and MCV lists  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
List pgsql-hackers
On 18 March 2018 at 23:57, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
> Attached is an updated version of the patch series, addressing issues
> pointed out by Alvaro.

I'm just starting to look at this now, and I think I'll post
individual comments/questions as I get to them rather than trying to
review the whole thing, because it's quite a large patch. Apologies if
some of this has already been discussed.

Looking at the changes to UpdateStatisticsForTypeChange():

+   memset(nulls, 1, Natts_pg_statistic_ext * sizeof(bool));

why the "1" there -- is it just a typo?

A wider concern I have is that I think this function is trying to be
too clever by only resetting selected stats. IMO it should just reset
all stats unconditionally when the column type changes, which would be
consistent with what we do for regular stats.

Consider, for example, what would happen if a column was changed from
real to int -- all the data values will be coerced to integers, losing
precision, and any ndistinct and dependency stats would likely be
completely wrong afterwards. IMO that's a bug, and should be
back-patched independently of these new types of extended stats.

Thoughts?

Regards,
Dean


pgsql-hackers by date:

Previous
From: Alexander Korotkov
Date:
Subject: Re: WIP: Covering + unique indexes.
Next
From: Damir Simunic
Date:
Subject: Re: Proposal: http2 wire format