Re: Choosing values for multivariate MCV lists - Mailing list pgsql-hackers

From Dean Rasheed
Subject Re: Choosing values for multivariate MCV lists
Date
Msg-id CAEZATCWBXQ=P0oTEOauLYKWW_s2di0ZvTZGwZ_6_KaHwOrJgQA@mail.gmail.com
Whole thread Raw
In response to Re: Choosing values for multivariate MCV lists  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Responses Re: Choosing values for multivariate MCV lists  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
List pgsql-hackers
On Sat, 29 Jun 2019 at 14:01, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
>
> >>>>>However, it looks like the problem is with mcv_list_items()'s use
> >>>>>of %f to convert to text, which is pretty ugly.
> >>>>
> >>>There's one issue with the signature, though - currently the function
> >>>returns null flags as bool array, but values are returned as simple
> >>>text value (formatted in array-like way, but still just a text).
> >>>
> >>IMO fixing this to return a text array is worth doing, even though it
> >>means a catversion bump.
> >>
> Attached is a cleaned-up version of that patch. The main difference is
> that instead of using construct_md_array() this uses ArrayBuildState to
> construct the arrays, which is much easier. The docs don't need any
> update because those were already using text[] for the parameter, the
> code was inconsistent with it.
>

Cool, this looks a lot neater and fixes the issues discussed with both
floating point values no longer being converted to text, and proper
text arrays for values.

One minor nitpick -- it doesn't seem to be necessary to build the
arrays *outfuncs and *fmgrinfo. You may as well just fetch the
individual column output function information at the point where it's
used (in the "if (!item->isnull[i])" block) rather than building those
arrays.


> This does require catversion bump, but as annoying as it is, I think
> it's worth it (and there's also the thread discussing the serialization
> issues). Barring objections, I'll get it committed later next week, once
> I get back from PostgresLondon.
>
> As I mentioned before, if we don't want any additional catversion bumps,
> it's possible to pass the arrays through output functions - that would
> allow us keeping the text output (but correct, unlike what we have now).
>

I think this is a clear bug fix, so I'd vote for fixing it properly
now, with a catversion bump.

Regards,
Dean



pgsql-hackers by date:

Previous
From: "Jamison, Kirk"
Date:
Subject: RE: [PATCH] Speedup truncates of relation forks
Next
From: Thomas Munro
Date:
Subject: Re: How to estimate the shared memory size required for parallel scan?