Thread: Re: STATISTICS retained in CREATE TABLE ... LIKE (INCLUDING ALL)?
On 21 January 2018 at 19:21, David Rowley <david.rowley@2ndquadrant.com> wrote: > On 20 January 2018 at 18:50, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Stephen Froehlich <s.froehlich@cablelabs.com> writes: >>> Are custom statistics in PG10 retained in LIKE (INCLUDING ALL) or do I need to recreate the statistics by hand each timeI create a new table? >> >> LIKE doesn't know about those, no. Perhaps it should. > > Agreed. ALL does not mean MOST. (Moving to -hackers) The attached implements this. Looking at "LIKE ALL" in more detail in the docs it claims to be equivalent to "INCLUDING DEFAULTS INCLUDING IDENTITY INCLUDING CONSTRAINTS INCLUDING INDEXES INCLUDING STORAGE INCLUDING COMMENTS", so it didn't seem right to magically have LIKE ALL include something that there's no option to include individually, so I added INCLUDING STATISTICS. This also required writing code allow statistics to be created without a name. The code I added to choose the default name is in the form <tablename>_<column1>_<column2>_stat. I'm sure someone will want something else, but that's just what I've personally standardised on. I'd also be fine with "stx" or "sta". Making this work required a bit of shuffling of error checking in CreateStatistics() so that we generate a name before complaining about any duplicate name. I'm unsure if this should be categorised as a bug fix or a new feature. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Attachment
Moin, On Fri, January 26, 2018 2:30 am, David Rowley wrote: > On 21 January 2018 at 19:21, David Rowley <david.rowley@2ndquadrant.com> > wrote: >> On 20 January 2018 at 18:50, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> Stephen Froehlich <s.froehlich@cablelabs.com> writes: >>>> Are custom statistics in PG10 retained in LIKE (INCLUDING ALL) or do I >>>> need to recreate the statistics by hand each time I create a new >>>> table? >>> >>> LIKE doesn't know about those, no. Perhaps it should. >> >> Agreed. ALL does not mean MOST. > > (Moving to -hackers) > > The attached implements this. > > Looking at "LIKE ALL" in more detail in the docs it claims to be > equivalent to "INCLUDING DEFAULTS INCLUDING IDENTITY INCLUDING > CONSTRAINTS INCLUDING INDEXES INCLUDING STORAGE INCLUDING COMMENTS", > so it didn't seem right to magically have LIKE ALL include something > that there's no option to include individually, so I added INCLUDING > STATISTICS. Note sure if someone want's to exclude statistics, but it is good that one can. So +1 from me. Looking at the patch, at first I thought the order was sorted and you swapped STORAGE and STATISTICS by accident. But then, it seems the order is semi-random. Should that list be sorted or is it already sorted by some criteria that I don't see? - <literal>INCLUDING DEFAULTS INCLUDING IDENTITY INCLUDING CONSTRAINTS INCLUDING INDEXES INCLUDING STORAGE INCLUDING COMMENTS</literal>. + <literal>INCLUDING DEFAULTS INCLUDING IDENTITY INCLUDING CONSTRAINTS INCLUDING INDEXES INCLUDING STORAGE INCLUDING STATISTICS INCLUDING COMMENTS</literal>. Best wishes, Tels
On 27 January 2018 at 00:03, Tels <nospam-pg-abuse@bloodgate.com> wrote: > Looking at the patch, at first I thought the order was sorted and you > swapped STORAGE and STATISTICS by accident. But then, it seems the order > is semi-random. Should that list be sorted or is it already sorted by some > criteria that I don't see? > > - <literal>INCLUDING DEFAULTS INCLUDING IDENTITY INCLUDING > CONSTRAINTS INCLUDING INDEXES INCLUDING STORAGE INCLUDING > COMMENTS</literal>. > + <literal>INCLUDING DEFAULTS INCLUDING IDENTITY INCLUDING > CONSTRAINTS INCLUDING INDEXES INCLUDING STORAGE INCLUDING STATISTICS > INCLUDING COMMENTS</literal>. It looks like they were in order of how they're defined in enum TableLikeOption up until [1], then I'm not so sure what the new order is based on after that. I'd offer to put it back to the order of the enum, but I want to minimise the invasiveness of the patch. I'm not sure yet if it should be classed as a bug fix or a new feature. On looking at this I realised I missed changing the syntax synopsis. The attached adds this. [1] https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=3217327053638085d24dd4d276e7c1f7ac2c4c6b -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Attachment
David Rowley <david.rowley@2ndquadrant.com> writes: > I'd offer to put it back to the order of the enum, but I want to > minimise the invasiveness of the patch. I'm not sure yet if it should > be classed as a bug fix or a new feature. FWIW, I'd call it a new feature. I think the ordering of these items suffers greatly from "add new stuff at the end no matter what" syndrome. Feel free to design an ordering that makes more sense, but then please apply it consistently. regards, tom lane
Hi, On 01/27/2018 10:09 PM, David Rowley wrote: > On 27 January 2018 at 00:03, Tels <nospam-pg-abuse@bloodgate.com> wrote: >> Looking at the patch, at first I thought the order was sorted and you >> swapped STORAGE and STATISTICS by accident. But then, it seems the order >> is semi-random. Should that list be sorted or is it already sorted by some >> criteria that I don't see? >> >> - <literal>INCLUDING DEFAULTS INCLUDING IDENTITY INCLUDING >> CONSTRAINTS INCLUDING INDEXES INCLUDING STORAGE INCLUDING >> COMMENTS</literal>. >> + <literal>INCLUDING DEFAULTS INCLUDING IDENTITY INCLUDING >> CONSTRAINTS INCLUDING INDEXES INCLUDING STORAGE INCLUDING STATISTICS >> INCLUDING COMMENTS</literal>. > > It looks like they were in order of how they're defined in enum > TableLikeOption up until [1], then I'm not so sure what the new order > is based on after that. > > I'd offer to put it back to the order of the enum, but I want to > minimise the invasiveness of the patch. I'm not sure yet if it should > be classed as a bug fix or a new feature. > > On looking at this I realised I missed changing the syntax synopsis. > The attached adds this. > Thanks for working on a patch. This should have been in the statistics patch, no doubt about that. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 01/27/2018 10:45 PM, Tom Lane wrote: > David Rowley <david.rowley@2ndquadrant.com> writes: >> I'd offer to put it back to the order of the enum, but I want to >> minimise the invasiveness of the patch. I'm not sure yet if it should >> be classed as a bug fix or a new feature. > > FWIW, I'd call it a new feature. > I'm not sure what exactly the feature would be? I mean "keep statistics even if you only ask for indexes" does not seem particularly helpful to me. So I see it more like a bug - I certainly think it should have been handled differently in 10. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 28 January 2018 at 12:00, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote: > On 01/27/2018 10:45 PM, Tom Lane wrote: >> David Rowley <david.rowley@2ndquadrant.com> writes: >>> I'd offer to put it back to the order of the enum, but I want to >>> minimise the invasiveness of the patch. I'm not sure yet if it should >>> be classed as a bug fix or a new feature. >> >> FWIW, I'd call it a new feature. >> > > I'm not sure what exactly the feature would be? I mean "keep statistics > even if you only ask for indexes" does not seem particularly helpful to > me. So I see it more like a bug - I certainly think it should have been > handled differently in 10. FWIW I'm leaning more towards this being a bug fix too. The only thing that put doubt in my mind was after digging into the code I realised that "ALL" means all of TableLikeOption's bits rather than all things copyable about the table. However, you'd only have to think slightly beyond that to then think the bug is that the TableLikeOption enum is just missing an option for including statistics in the first place. If I'd realised this was missing during my review I'd have pushed back and said to Tomas that this needs to be added before commit. Now I'll ask; On me doing so, would anyone have pushed back on that request and said that what I'm asking is a separate feature? If the answer to that is "no", then this is a bug that should be fixed and backpacked to v10. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
On 28 January 2018 at 10:45, Tom Lane <tgl@sss.pgh.pa.us> wrote: > David Rowley <david.rowley@2ndquadrant.com> writes: >> I'd offer to put it back to the order of the enum, but I want to >> minimise the invasiveness of the patch. > > I think the ordering of these items suffers greatly from "add new stuff > at the end no matter what" syndrome. Feel free to design an ordering > that makes more sense, but then please apply it consistently. I've attached a patch which puts them in alphabetical order. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Attachment
On 28 January 2018 at 12:00, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
> On 01/27/2018 10:45 PM, Tom Lane wrote:
>> David Rowley <david.rowley@2ndquadrant.com> writes:
>>> I'd offer to put it back to the order of the enum, but I want to
>>> minimise the invasiveness of the patch. I'm not sure yet if it should
>>> be classed as a bug fix or a new feature.
>>
>> FWIW, I'd call it a new feature.
>>
>
> I'm not sure what exactly the feature would be? I mean "keep statistics
> even if you only ask for indexes" does not seem particularly helpful to
> me. So I see it more like a bug - I certainly think it should have been
> handled differently in 10.
Now I'll ask; On me doing so, would anyone have pushed back on that
request and said that what I'm asking is a separate feature?
If the answer to that is "no", then this is a bug that should be fixed
and backpacked to v10.
Its a bug of omission (I'm going with no one saying no to your proposition) - though that still doesn't automatically allow us to back-patch it.
This bug has an obvious if annoying work-around and fixing the bug will likely cause people's code, that uses said work-around, to fail. Breaking people's working code in stable release branches is generally a no-no.
However, given that this was discovered 4 months after the feature was released suggests to me that we are justified, and community-serving, to back-patch this. Put more bluntly, we can ask for more leeway in the first few patch releases of a new feature since more people will benefit from 5 years of a fully-baked feature than may be harmed by said change. We shouldn't abuse that but an obvious new feature bug/oversight like this seems reasonable.
David J.
On 30 January 2018 at 14:14, David G. Johnston <david.g.johnston@gmail.com> wrote: > This bug has an obvious if annoying work-around and fixing the bug will > likely cause people's code, that uses said work-around, to fail. Breaking > people's working code in stable release branches is generally a no-no. > > However, given that this was discovered 4 months after the feature was > released suggests to me that we are justified, and community-serving, to > back-patch this. Put more bluntly, we can ask for more leeway in the first > few patch releases of a new feature since more people will benefit from 5 > years of a fully-baked feature than may be harmed by said change. We > shouldn't abuse that but an obvious new feature bug/oversight like this > seems reasonable. That seems quite rational. To prevent this getting lost I've added it to the March commitfest [1]. In the commitfest application I've classed it (for now) as a bug fix. If that changes then we can alter it in the commitfest app. [1] https://commitfest.postgresql.org/17/1501/ -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
On 02/01/2018 07:26 AM, David Rowley wrote: > On 30 January 2018 at 14:14, David G. Johnston > <david.g.johnston@gmail.com> wrote: >> This bug has an obvious if annoying work-around and fixing the bug will >> likely cause people's code, that uses said work-around, to fail. Breaking >> people's working code in stable release branches is generally a no-no. >> >> However, given that this was discovered 4 months after the feature was >> released suggests to me that we are justified, and community-serving, to >> back-patch this. Put more bluntly, we can ask for more leeway in the first >> few patch releases of a new feature since more people will benefit from 5 >> years of a fully-baked feature than may be harmed by said change. We >> shouldn't abuse that but an obvious new feature bug/oversight like this >> seems reasonable. > > That seems quite rational. > > To prevent this getting lost I've added it to the March commitfest [1]. > > In the commitfest application I've classed it (for now) as a bug fix. > If that changes then we can alter it in the commitfest app. > > [1] https://commitfest.postgresql.org/17/1501/ > Thank you for taking care of this. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
I've attached a rebased patch which fixes up the conflicts caused by 8b08f7d. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Attachment
David G. Johnston wrote: > This bug has an obvious if annoying work-around and fixing the bug will > likely cause people's code, that uses said work-around, to fail. Breaking > people's working code in stable release branches is generally a no-no. > > However, given that this was discovered 4 months after the feature was > released suggests to me that we are justified, and community-serving, to > back-patch this. Put more bluntly, we can ask for more leeway in the first > few patch releases of a new feature since more people will benefit from 5 > years of a fully-baked feature than may be harmed by said change. We > shouldn't abuse that but an obvious new feature bug/oversight like this > seems reasonable. I agree with this argumentation and in my opinion we should back-patch this as a bugfix. Certainly if I had remembered about CREATE TABLE LIKE I would have stalled the ext.stats patch. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Hi, On 03/03/2018 11:20 AM, David Rowley wrote: > I've attached a rebased patch which fixes up the conflicts caused by 8b08f7d. > The patch seems fine to me. A couple of minor points 1) I wonder if we should make the list in create_table.sgml to also be alphabetically sorted. 2) I think the code in parse_utilcmd.c was missing a bunch of comments. Nothing particularly serious, but the surrounding code has them ... 3) The transformExtendedStatistics call in transformCreateStmt was a bit too high, interleaving with the transforms of various constraints. I think we should move it a couple of lines down, to keep those calls together (transformAlterTableStmt also does the call after handling all the constraint-related stuff). 4) I see you've added generateClonedExtStatsStmt to parse_utilcmd.h, but it was only really called in parse_utilcmd.c, so I've made it static. I don't think we need to expose stuff unnecessarily. The attached patch does those changes - feel free to pick only those you like / agree with. BTW the last point made me thinking, because parse_utilcmd.h also exposes generateClonedIndexStmt. That is however necessary, because it's called from DefineRelation when copying indexes from partitioned table to partitions. I'm wondering - shouldn't we do the same thing for extended statistics? regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachment
Tomas Vondra wrote: > 4) I see you've added generateClonedExtStatsStmt to parse_utilcmd.h, but > it was only really called in parse_utilcmd.c, so I've made it static. I > don't think we need to expose stuff unnecessarily. > BTW the last point made me thinking, because parse_utilcmd.h also > exposes generateClonedIndexStmt. That is however necessary, because it's > called from DefineRelation when copying indexes from partitioned table > to partitions. I'm wondering - shouldn't we do the same thing for > extended statistics? Maybe, but that would not be a bugfix anymore. So if we do want that, that is definitely a new feature, so it should be its own patch; the copying of indexes to partitions is a new feature in pg11. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 03/05/2018 08:57 PM, Alvaro Herrera wrote: > Tomas Vondra wrote: > >> 4) I see you've added generateClonedExtStatsStmt to parse_utilcmd.h, but >> it was only really called in parse_utilcmd.c, so I've made it static. I >> don't think we need to expose stuff unnecessarily. > >> BTW the last point made me thinking, because parse_utilcmd.h also >> exposes generateClonedIndexStmt. That is however necessary, because it's >> called from DefineRelation when copying indexes from partitioned table >> to partitions. I'm wondering - shouldn't we do the same thing for >> extended statistics? > > Maybe, but that would not be a bugfix anymore. So if we do want that, > that is definitely a new feature, so it should be its own patch; the > copying of indexes to partitions is a new feature in pg11. > Sure, I wasn't really suggesting squashing that into this bugfix. -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
I admit to much head-scratching, erasing my entire ccache cache, the autoconf cache and doing two complete rebuilds from scratch, because I was seeing 40 errors in regression tests. But it turned out to be about this hunk, which was identical to the idea I had while skimming David's original, "hey why don't we just copy the list": > +/* > + * transformExtendedStatistics > + * handle extended statistics > + * > + * Right now, there's nothing to do here, so we just copy the list. > + */ > static void > transformExtendedStatistics(CreateStmtContext *cxt) > { > - ListCell *lc; > - > - foreach(lc, cxt->extstats) > - cxt->alist = lappend(cxt->alist, lfirst(lc)); > + cxt->alist = list_copy(cxt->extstats); > } > > /* But as it turns out, it's wrong! list_concat() is what is needed here, not list_copy. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Pushed now, to branches master and pg10, with Tomas changes. I made a few changes of my own 1. you forgot to update various src/backend/nodes/ files 2. I got rid of "NameData stxname" variable in CreateStatistics, which seems pointless now. We can work with a cstring only. Not sure why we had that one ... 3. I chose not to backpatch the node->stxcomment thing. It makes me nervous to modify a parse node. So cloning the comments is a PG11 thing. Hopefully it's not *too* bad ... 4. See elsewhere in the thread about list_copy vs. list_concat :-) Thanks, -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 6 March 2018 at 11:43, Alvaro Herrera <alvherre@2ndquadrant.com> wrote: > Pushed now, to branches master and pg10, with Tomas changes. I made a > few changes of my own Great! Many thanks to both of you for making those changes and thanks Alvaro for pushing. > 3. I chose not to backpatch the node->stxcomment thing. It makes me > nervous to modify a parse node. So cloning the comments is a PG11 > thing. Hopefully it's not *too* bad ... Makes sense. We've had other objects previously that the comments were lost sometimes, and I don't think they were noticed too quickly, so perhaps this is an unlikely case that will bother too many people. > 4. See elsewhere in the thread about list_copy vs. list_concat :-) I saw that. Thanks for fixing. The only weird thing I see in the changes is that the comment here claims it makes a copy, but it does not. + * Right now, there's nothing to do here, so we just copy the list. + */ +static void +transformExtendedStatistics(CreateStmtContext *cxt) +{ + cxt->alist = list_concat(cxt->alist, cxt->extstats); -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
David Rowley wrote: > On 6 March 2018 at 11:43, Alvaro Herrera <alvherre@2ndquadrant.com> wrote: > > 4. See elsewhere in the thread about list_copy vs. list_concat :-) > > I saw that. Thanks for fixing. The only weird thing I see in the > changes is that the comment here claims it makes a copy, but it does > not. > > + * Right now, there's nothing to do here, so we just copy the list. Agreed, changed. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services