Re: ANALYZE patch for review - Mailing list pgsql-patches
From | Tom Lane |
---|---|
Subject | Re: ANALYZE patch for review |
Date | |
Msg-id | 22605.1075390271@sss.pgh.pa.us Whole thread Raw |
In response to | Re: ANALYZE patch for review ("Mark Cave-Ayland" <m.cave-ayland@webbased.co.uk>) |
List | pgsql-patches |
"Mark Cave-Ayland" <m.cave-ayland@webbased.co.uk> writes: > Is what you're saying that a custom function would do something like > this: > typedef struct > { > VacAttrStats *v; > int mydata1; > int mydata2; > int mydata3; > } mystruct; > myanalyzeinfo = palloc0(sizeof(mystruct)); > .. > .. > return myanalyzeinfo; Exactly, except that the return would look like return (VacAttrStruct *) myanalyzeinfo; (or at least it would if you weren't converting to Datum anyway). The main analyze code would think it had a plain pointer to VacAttrStats, but the data analysis subroutine associated with this setup routine would know it could cast the pointer back to (mystruct *). > VacAttrStats *vacattrptr; > vacattrptr = (VacAttrStats *)myanalyzeinfo; > vacattrptr->minrows = 2; Seems like the hard way; you can just access myanalyzeinfo->v.minrows when you're working with a pointer declared as mystruct *. > is there a way of arranging > the structures to ensure that the compiler throws a warning if > VacAttrStats is not the first element of the structure? Not that I know of, but we use techniques like this in very many places in the PG code and have not had troubles of that sort. I'm not too concerned about that. > I'm beginning to think that perhaps we're looking at this in the wrong > way, and that a more elegant version of what you're suggesting could be > implemented using a major/minor method of identifying a statistics type. If you suppose that the "major" field is the upper bits of the statistics ID value, then this is just a slightly different way of thinking about the range-based allocation method I suggested before. However, the range-based method can adapt to allocating different amounts of identifier space to different owners, whereas a major/minor approach can't easily do that since you've defined it to be 2^N minor IDs for each major code. > I suppose this also means that there might need to be a mechanism where > a custom analysis function would want to add more than one row to the > pg_statistic table (ie it collects more than one set of statistics per > column); As long as the custom analysis function is tied to a datatype, I don't see the need for that. It already has the ability to define up to four different "kinds" of stats for its column, and if someone makes a case for wanting more, I'd be inclined to widen pg_statistic rather than get into multiple rows per column. (To point out just one problem, such an arrangement would break the unique index scheme presently used.) I'm not sure yet about whether we need special hooks for keeping stats on functional indexes, but if we do they'd surely not be looked up by datatype, or at least not datatype alone. regards, tom lane
pgsql-patches by date: