Thread: Re: STATISTICS retained in CREATE TABLE ... LIKE (INCLUDING ALL)?

Re: STATISTICS retained in CREATE TABLE ... LIKE (INCLUDING ALL)?

From
David Rowley
Date:
On 21 January 2018 at 19:21, David Rowley <david.rowley@2ndquadrant.com> wrote:
> On 20 January 2018 at 18:50, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Stephen Froehlich <s.froehlich@cablelabs.com> writes:
>>> Are custom statistics in PG10 retained in LIKE (INCLUDING ALL) or do I need to recreate the statistics by hand each
timeI create a new table?
 
>>
>> LIKE doesn't know about those, no.  Perhaps it should.
>
> Agreed. ALL does not mean MOST.

(Moving to -hackers)

The attached implements this.

Looking at "LIKE ALL" in more detail in the docs it claims to be
equivalent to "INCLUDING DEFAULTS INCLUDING IDENTITY INCLUDING
CONSTRAINTS INCLUDING INDEXES INCLUDING STORAGE INCLUDING COMMENTS",
so it didn't seem right to magically have LIKE ALL include something
that there's no option to include individually, so I added INCLUDING
STATISTICS.

This also required writing code allow statistics to be created without
a name. The code I added to choose the default name is in the form
<tablename>_<column1>_<column2>_stat. I'm sure someone will want
something else, but that's just what I've personally standardised on.
I'd also be fine with "stx" or "sta". Making this work required a bit
of shuffling of error checking in CreateStatistics() so that we
generate a name before complaining about any duplicate name.

I'm unsure if this should be categorised as a bug fix or a new feature.

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Attachment

Re: STATISTICS retained in CREATE TABLE ... LIKE (INCLUDING ALL)?

From
"Tels"
Date:
Moin,

On Fri, January 26, 2018 2:30 am, David Rowley wrote:
> On 21 January 2018 at 19:21, David Rowley <david.rowley@2ndquadrant.com>
> wrote:
>> On 20 January 2018 at 18:50, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> Stephen Froehlich <s.froehlich@cablelabs.com> writes:
>>>> Are custom statistics in PG10 retained in LIKE (INCLUDING ALL) or do I
>>>> need to recreate the statistics by hand each time I create a new
>>>> table?
>>>
>>> LIKE doesn't know about those, no.  Perhaps it should.
>>
>> Agreed. ALL does not mean MOST.
>
> (Moving to -hackers)
>
> The attached implements this.
>
> Looking at "LIKE ALL" in more detail in the docs it claims to be
> equivalent to "INCLUDING DEFAULTS INCLUDING IDENTITY INCLUDING
> CONSTRAINTS INCLUDING INDEXES INCLUDING STORAGE INCLUDING COMMENTS",
> so it didn't seem right to magically have LIKE ALL include something
> that there's no option to include individually, so I added INCLUDING
> STATISTICS.

Note sure if someone want's to exclude statistics, but it is good that one
can. So +1 from me.

Looking at the patch, at first I thought the order was sorted and you
swapped STORAGE and STATISTICS by accident. But then, it seems the order
is semi-random. Should that list be sorted or is it already sorted by some
criteria that I don't see?

-      <literal>INCLUDING DEFAULTS INCLUDING IDENTITY INCLUDING
CONSTRAINTS INCLUDING INDEXES INCLUDING STORAGE INCLUDING
COMMENTS</literal>.
+      <literal>INCLUDING DEFAULTS INCLUDING IDENTITY INCLUDING
CONSTRAINTS INCLUDING INDEXES INCLUDING STORAGE INCLUDING STATISTICS
INCLUDING COMMENTS</literal>.

Best wishes,

Tels


Re: STATISTICS retained in CREATE TABLE ... LIKE (INCLUDING ALL)?

From
David Rowley
Date:
On 27 January 2018 at 00:03, Tels <nospam-pg-abuse@bloodgate.com> wrote:
> Looking at the patch, at first I thought the order was sorted and you
> swapped STORAGE and STATISTICS by accident. But then, it seems the order
> is semi-random. Should that list be sorted or is it already sorted by some
> criteria that I don't see?
>
> -      <literal>INCLUDING DEFAULTS INCLUDING IDENTITY INCLUDING
> CONSTRAINTS INCLUDING INDEXES INCLUDING STORAGE INCLUDING
> COMMENTS</literal>.
> +      <literal>INCLUDING DEFAULTS INCLUDING IDENTITY INCLUDING
> CONSTRAINTS INCLUDING INDEXES INCLUDING STORAGE INCLUDING STATISTICS
> INCLUDING COMMENTS</literal>.

It looks like they were in order of how they're defined in enum
TableLikeOption up until [1], then I'm not so sure what the new order
is based on after that.

I'd offer to put it back to the order of the enum, but I want to
minimise the invasiveness of the patch. I'm not sure yet if it should
be classed as a bug fix or a new feature.

On looking at this I realised I missed changing the syntax synopsis.
The attached adds this.

[1] https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=3217327053638085d24dd4d276e7c1f7ac2c4c6b

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Attachment

Re: STATISTICS retained in CREATE TABLE ... LIKE (INCLUDING ALL)?

From
Tom Lane
Date:
David Rowley <david.rowley@2ndquadrant.com> writes:
> I'd offer to put it back to the order of the enum, but I want to
> minimise the invasiveness of the patch. I'm not sure yet if it should
> be classed as a bug fix or a new feature.

FWIW, I'd call it a new feature.

I think the ordering of these items suffers greatly from "add new stuff
at the end no matter what" syndrome.  Feel free to design an ordering
that makes more sense, but then please apply it consistently.

            regards, tom lane


Re: STATISTICS retained in CREATE TABLE ... LIKE (INCLUDING ALL)?

From
Tomas Vondra
Date:
Hi,

On 01/27/2018 10:09 PM, David Rowley wrote:
> On 27 January 2018 at 00:03, Tels <nospam-pg-abuse@bloodgate.com> wrote:
>> Looking at the patch, at first I thought the order was sorted and you
>> swapped STORAGE and STATISTICS by accident. But then, it seems the order
>> is semi-random. Should that list be sorted or is it already sorted by some
>> criteria that I don't see?
>>
>> -      <literal>INCLUDING DEFAULTS INCLUDING IDENTITY INCLUDING
>> CONSTRAINTS INCLUDING INDEXES INCLUDING STORAGE INCLUDING
>> COMMENTS</literal>.
>> +      <literal>INCLUDING DEFAULTS INCLUDING IDENTITY INCLUDING
>> CONSTRAINTS INCLUDING INDEXES INCLUDING STORAGE INCLUDING STATISTICS
>> INCLUDING COMMENTS</literal>.
> 
> It looks like they were in order of how they're defined in enum
> TableLikeOption up until [1], then I'm not so sure what the new order
> is based on after that.
> 
> I'd offer to put it back to the order of the enum, but I want to
> minimise the invasiveness of the patch. I'm not sure yet if it should
> be classed as a bug fix or a new feature.
> 
> On looking at this I realised I missed changing the syntax synopsis.
> The attached adds this.
> 

Thanks for working on a patch. This should have been in the statistics
patch, no doubt about that.


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: STATISTICS retained in CREATE TABLE ... LIKE (INCLUDING ALL)?

From
Tomas Vondra
Date:

On 01/27/2018 10:45 PM, Tom Lane wrote:
> David Rowley <david.rowley@2ndquadrant.com> writes:
>> I'd offer to put it back to the order of the enum, but I want to
>> minimise the invasiveness of the patch. I'm not sure yet if it should
>> be classed as a bug fix or a new feature.
> 
> FWIW, I'd call it a new feature.
> 

I'm not sure what exactly the feature would be? I mean "keep statistics
even if you only ask for indexes" does not seem particularly helpful to
me. So I see it more like a bug - I certainly think it should have been
handled differently in 10.


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: STATISTICS retained in CREATE TABLE ... LIKE (INCLUDING ALL)?

From
David Rowley
Date:
On 28 January 2018 at 12:00, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
> On 01/27/2018 10:45 PM, Tom Lane wrote:
>> David Rowley <david.rowley@2ndquadrant.com> writes:
>>> I'd offer to put it back to the order of the enum, but I want to
>>> minimise the invasiveness of the patch. I'm not sure yet if it should
>>> be classed as a bug fix or a new feature.
>>
>> FWIW, I'd call it a new feature.
>>
>
> I'm not sure what exactly the feature would be? I mean "keep statistics
> even if you only ask for indexes" does not seem particularly helpful to
> me. So I see it more like a bug - I certainly think it should have been
> handled differently in 10.

FWIW I'm leaning more towards this being a bug fix too. The only thing
that put doubt in my mind was after digging into the code I realised
that "ALL" means all of TableLikeOption's bits rather than all things
copyable about the table. However, you'd only have to think slightly
beyond that to then think the bug is that the TableLikeOption enum is
just missing an option for including statistics in the first place.

If I'd realised this was missing during my review I'd have pushed back
and said to Tomas that this needs to be added before commit.

Now I'll ask; On me doing so, would anyone have pushed back on that
request and said that what I'm asking is a separate feature?

If the answer to that is "no", then this is a bug that should be fixed
and backpacked to v10.

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: STATISTICS retained in CREATE TABLE ... LIKE (INCLUDING ALL)?

From
David Rowley
Date:
On 28 January 2018 at 10:45, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> David Rowley <david.rowley@2ndquadrant.com> writes:
>> I'd offer to put it back to the order of the enum, but I want to
>> minimise the invasiveness of the patch.
>
> I think the ordering of these items suffers greatly from "add new stuff
> at the end no matter what" syndrome.  Feel free to design an ordering
> that makes more sense, but then please apply it consistently.

I've attached a patch which puts them in alphabetical order.

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Attachment

Re: STATISTICS retained in CREATE TABLE ... LIKE (INCLUDING ALL)?

From
"David G. Johnston"
Date:
On Mon, Jan 29, 2018 at 2:55 AM, David Rowley <david.rowley@2ndquadrant.com> wrote:
On 28 January 2018 at 12:00, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
> On 01/27/2018 10:45 PM, Tom Lane wrote:
>> David Rowley <david.rowley@2ndquadrant.com> writes:
>>> I'd offer to put it back to the order of the enum, but I want to
>>> minimise the invasiveness of the patch. I'm not sure yet if it should
>>> be classed as a bug fix or a new feature.
>>
>> FWIW, I'd call it a new feature.
>>
>
> I'm not sure what exactly the feature would be? I mean "keep statistics
> even if you only ask for indexes" does not seem particularly helpful to
> me. So I see it more like a bug - I certainly think it should have been
> handled differently in 10.

Now I'll ask; On me doing so, would anyone have pushed back on that
request and said that what I'm asking is a separate feature?

If the answer to that is "no", then this is a bug that should be fixed
and backpacked to v10.

​Its a bug of omission (I'm going with no one saying no to your proposition) - though that still doesn't automatically allow us to back-patch it.

This bug has an obvious if annoying work-around and fixing the bug will likely cause people's code, that uses said work-around, to fail.  Breaking people's working code in stable release branches is generally a no-no.

However, given that this was discovered 4 months after the feature was released suggests to me that we are justified, and community-serving, to back-patch this.  Put more bluntly, we can ask for more leeway in the first few patch releases of a new feature since more people will benefit from 5 years of a fully-baked feature than may be harmed by said change.  We shouldn't abuse that but an obvious new feature bug/oversight like this seems reasonable.

David J.

Re: STATISTICS retained in CREATE TABLE ... LIKE (INCLUDING ALL)?

From
David Rowley
Date:
On 30 January 2018 at 14:14, David G. Johnston
<david.g.johnston@gmail.com> wrote:
> This bug has an obvious if annoying work-around and fixing the bug will
> likely cause people's code, that uses said work-around, to fail.  Breaking
> people's working code in stable release branches is generally a no-no.
>
> However, given that this was discovered 4 months after the feature was
> released suggests to me that we are justified, and community-serving, to
> back-patch this.  Put more bluntly, we can ask for more leeway in the first
> few patch releases of a new feature since more people will benefit from 5
> years of a fully-baked feature than may be harmed by said change.  We
> shouldn't abuse that but an obvious new feature bug/oversight like this
> seems reasonable.

That seems quite rational.

To prevent this getting lost I've added it to the March commitfest [1].

In the commitfest application I've classed it (for now) as a bug fix.
If that changes then we can alter it in the commitfest app.

[1] https://commitfest.postgresql.org/17/1501/


-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: STATISTICS retained in CREATE TABLE ... LIKE (INCLUDING ALL)?

From
Tomas Vondra
Date:

On 02/01/2018 07:26 AM, David Rowley wrote:
> On 30 January 2018 at 14:14, David G. Johnston
> <david.g.johnston@gmail.com> wrote:
>> This bug has an obvious if annoying work-around and fixing the bug will
>> likely cause people's code, that uses said work-around, to fail.  Breaking
>> people's working code in stable release branches is generally a no-no.
>>
>> However, given that this was discovered 4 months after the feature was
>> released suggests to me that we are justified, and community-serving, to
>> back-patch this.  Put more bluntly, we can ask for more leeway in the first
>> few patch releases of a new feature since more people will benefit from 5
>> years of a fully-baked feature than may be harmed by said change.  We
>> shouldn't abuse that but an obvious new feature bug/oversight like this
>> seems reasonable.
> 
> That seems quite rational.
> 
> To prevent this getting lost I've added it to the March commitfest [1].
> 
> In the commitfest application I've classed it (for now) as a bug fix.
> If that changes then we can alter it in the commitfest app.
> 
> [1] https://commitfest.postgresql.org/17/1501/
> 

Thank you for taking care of this.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: STATISTICS retained in CREATE TABLE ... LIKE (INCLUDING ALL)?

From
David Rowley
Date:
I've attached a rebased patch which fixes up the conflicts caused by 8b08f7d.

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Attachment

Re: STATISTICS retained in CREATE TABLE ... LIKE (INCLUDING ALL)?

From
Alvaro Herrera
Date:
David G. Johnston wrote:

> This bug has an obvious if annoying work-around and fixing the bug will
> likely cause people's code, that uses said work-around, to fail.  Breaking
> people's working code in stable release branches is generally a no-no.
> 
> However, given that this was discovered 4 months after the feature was
> released suggests to me that we are justified, and community-serving, to
> back-patch this.  Put more bluntly, we can ask for more leeway in the first
> few patch releases of a new feature since more people will benefit from 5
> years of a fully-baked feature than may be harmed by said change.  We
> shouldn't abuse that but an obvious new feature bug/oversight like this
> seems reasonable.

I agree with this argumentation and in my opinion we should back-patch
this as a bugfix.  Certainly if I had remembered about CREATE TABLE LIKE
I would have stalled the ext.stats patch.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: STATISTICS retained in CREATE TABLE ... LIKE (INCLUDING ALL)?

From
Tomas Vondra
Date:
Hi,

On 03/03/2018 11:20 AM, David Rowley wrote:
> I've attached a rebased patch which fixes up the conflicts caused by 8b08f7d.
> 

The patch seems fine to me. A couple of minor points

1) I wonder if we should make the list in create_table.sgml to also be
alphabetically sorted.

2) I think the code in parse_utilcmd.c was missing a bunch of comments.
Nothing particularly serious, but the surrounding code has them ...

3) The transformExtendedStatistics call in transformCreateStmt was a bit
too high, interleaving with the transforms of various constraints. I
think we should move it a couple of lines down, to keep those calls
together (transformAlterTableStmt also does the call after handling all
the constraint-related stuff).

4) I see you've added generateClonedExtStatsStmt to parse_utilcmd.h, but
it was only really called in parse_utilcmd.c, so I've made it static. I
don't think we need to expose stuff unnecessarily.

The attached patch does those changes - feel free to pick only those you
like / agree with.


BTW the last point made me thinking, because parse_utilcmd.h also
exposes generateClonedIndexStmt. That is however necessary, because it's
called from DefineRelation when copying indexes from partitioned table
to partitions. I'm wondering - shouldn't we do the same thing for
extended statistics?


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment

Re: STATISTICS retained in CREATE TABLE ... LIKE (INCLUDING ALL)?

From
Alvaro Herrera
Date:
Tomas Vondra wrote:

> 4) I see you've added generateClonedExtStatsStmt to parse_utilcmd.h, but
> it was only really called in parse_utilcmd.c, so I've made it static. I
> don't think we need to expose stuff unnecessarily.

> BTW the last point made me thinking, because parse_utilcmd.h also
> exposes generateClonedIndexStmt. That is however necessary, because it's
> called from DefineRelation when copying indexes from partitioned table
> to partitions. I'm wondering - shouldn't we do the same thing for
> extended statistics?

Maybe, but that would not be a bugfix anymore.  So if we do want that,
that is definitely a new feature, so it should be its own patch; the
copying of indexes to partitions is a new feature in pg11.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: STATISTICS retained in CREATE TABLE ... LIKE (INCLUDING ALL)?

From
Tomas Vondra
Date:

On 03/05/2018 08:57 PM, Alvaro Herrera wrote:
> Tomas Vondra wrote:
> 
>> 4) I see you've added generateClonedExtStatsStmt to parse_utilcmd.h, but
>> it was only really called in parse_utilcmd.c, so I've made it static. I
>> don't think we need to expose stuff unnecessarily.
> 
>> BTW the last point made me thinking, because parse_utilcmd.h also
>> exposes generateClonedIndexStmt. That is however necessary, because it's
>> called from DefineRelation when copying indexes from partitioned table
>> to partitions. I'm wondering - shouldn't we do the same thing for
>> extended statistics?
> 
> Maybe, but that would not be a bugfix anymore.  So if we do want that,
> that is definitely a new feature, so it should be its own patch; the
> copying of indexes to partitions is a new feature in pg11.
> 

Sure, I wasn't really suggesting squashing that into this bugfix.
-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: STATISTICS retained in CREATE TABLE ... LIKE (INCLUDING ALL)?

From
Alvaro Herrera
Date:
I admit to much head-scratching, erasing my entire ccache cache, the
autoconf cache and doing two complete rebuilds from scratch, because 
I was seeing 40 errors in regression tests.  But it
turned out to be about this hunk, which was identical to the idea I had
while skimming David's original, "hey why don't we just copy the list":

> +/*
> + * transformExtendedStatistics
> + *        handle extended statistics
> + *
> + * Right now, there's nothing to do here, so we just copy the list.
> + */
>  static void
>  transformExtendedStatistics(CreateStmtContext *cxt)
>  {
> -    ListCell *lc;
> -
> -    foreach(lc, cxt->extstats)
> -        cxt->alist = lappend(cxt->alist, lfirst(lc));
> +    cxt->alist = list_copy(cxt->extstats);
>  }
>  
>  /*

But as it turns out, it's wrong!  list_concat() is what is needed here,
not list_copy.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: STATISTICS retained in CREATE TABLE ... LIKE (INCLUDING ALL)?

From
Alvaro Herrera
Date:
Pushed now, to branches master and pg10, with Tomas changes.  I made a
few changes of my own

1. you forgot to update various src/backend/nodes/ files
2. I got rid of "NameData stxname" variable in CreateStatistics, which
seems pointless now.  We can work with a cstring only.  Not sure why we
had that one ...
3. I chose not to backpatch the node->stxcomment thing.  It makes me
nervous to modify a parse node.  So cloning the comments is a PG11
thing.  Hopefully it's not *too* bad ...
4. See elsewhere in the thread about list_copy vs. list_concat :-)

Thanks,

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: STATISTICS retained in CREATE TABLE ... LIKE (INCLUDING ALL)?

From
David Rowley
Date:
On 6 March 2018 at 11:43, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
> Pushed now, to branches master and pg10, with Tomas changes.  I made a
> few changes of my own

Great! Many thanks to both of you for making those changes and thanks
Alvaro for pushing.

> 3. I chose not to backpatch the node->stxcomment thing.  It makes me
> nervous to modify a parse node.  So cloning the comments is a PG11
> thing.  Hopefully it's not *too* bad ...

Makes sense. We've had other objects previously that the comments were
lost sometimes, and I don't think they were noticed too quickly, so
perhaps this is an unlikely case that will bother too many people.

> 4. See elsewhere in the thread about list_copy vs. list_concat :-)

I saw that. Thanks for fixing. The only weird thing I see in the
changes is that the comment here claims it makes a copy, but it does
not.

+ * Right now, there's nothing to do here, so we just copy the list.
+ */
+static void
+transformExtendedStatistics(CreateStmtContext *cxt)
+{
+   cxt->alist = list_concat(cxt->alist, cxt->extstats);


-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: STATISTICS retained in CREATE TABLE ... LIKE (INCLUDING ALL)?

From
Alvaro Herrera
Date:
David Rowley wrote:
> On 6 March 2018 at 11:43, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:

> > 4. See elsewhere in the thread about list_copy vs. list_concat :-)
> 
> I saw that. Thanks for fixing. The only weird thing I see in the
> changes is that the comment here claims it makes a copy, but it does
> not.
> 
> + * Right now, there's nothing to do here, so we just copy the list.

Agreed, changed.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services