Thread: [HACKERS] WITH clause in CREATE STATISTICS
Simon just pointed out that having the WITH clause appear in the middle of the CREATE STATISTICS command looks odd; apparently somebody else already complained on list about the same. Other commands put the WITH clause at the end, so perhaps we should do likewise in the new command. Here's a patch to implement that. I verified that if I change qualified_name to qualified_name_list, bison does not complain about conflicts, so this new syntax should support extension to multiple relations without a problem. Discuss. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Attachment
Alvaro Herrera <alvherre@2ndquadrant.com> writes: > Simon just pointed out that having the WITH clause appear in the middle > of the CREATE STATISTICS command looks odd; apparently somebody else > already complained on list about the same. Other commands put the WITH > clause at the end, so perhaps we should do likewise in the new command. > Here's a patch to implement that. I verified that if I change > qualified_name to qualified_name_list, bison does not complain about > conflicts, so this new syntax should support extension to multiple > relations without a problem. Yeah, WITH is fully reserved, so as long as the clause looks like WITH ( stuff... ) you're pretty much gonna be able to drop it wherever you want. > Discuss. +1 for WITH at the end; the existing syntax looks weird to me too. regards, tom lane
On 04/21/2017 12:13 AM, Tom Lane wrote: > Alvaro Herrera <alvherre@2ndquadrant.com> writes: >> Simon just pointed out that having the WITH clause appear in the middle >> of the CREATE STATISTICS command looks odd; apparently somebody else >> already complained on list about the same. Other commands put the WITH >> clause at the end, so perhaps we should do likewise in the new command. > >> Here's a patch to implement that. I verified that if I change >> qualified_name to qualified_name_list, bison does not complain about >> conflicts, so this new syntax should support extension to multiple >> relations without a problem. > > Yeah, WITH is fully reserved, so as long as the clause looks like > WITH ( stuff... ) you're pretty much gonna be able to drop it > wherever you want. > >> Discuss. > > +1 for WITH at the end; the existing syntax looks weird to me too. > -1 from me I like the current syntax more, and WHERE ... WITH seems a bit weird to me. But more importantly, one thing Dean probably considered when proposing the current syntax was that we may add support for partial statistics, pretty much like partial indexes. And we don't allow WITH at the end (after WHERE) for indexes: test=# create index on t (a) where a < 100 with (fillfactor=10); ERROR: syntax error at or near "with" LINE 1: create index on t (a) where a < 100 with (fillfactor=10); ^ test=# create index on t (a) with (fillfactor=10) where a < 100; regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Moin, On Thu, April 20, 2017 8:21 pm, Tomas Vondra wrote: > On 04/21/2017 12:13 AM, Tom Lane wrote: >> Alvaro Herrera <alvherre@2ndquadrant.com> writes: >>> Simon just pointed out that having the WITH clause appear in the middle >>> of the CREATE STATISTICS command looks odd; apparently somebody else >>> already complained on list about the same. Other commands put the WITH >>> clause at the end, so perhaps we should do likewise in the new command. >> >>> Here's a patch to implement that. I verified that if I change >>> qualified_name to qualified_name_list, bison does not complain about >>> conflicts, so this new syntax should support extension to multiple >>> relations without a problem. >> >> Yeah, WITH is fully reserved, so as long as the clause looks like >> WITH ( stuff... ) you're pretty much gonna be able to drop it >> wherever you want. >> >>> Discuss. >> >> +1 for WITH at the end; the existing syntax looks weird to me too. >> > > -1 from me > > I like the current syntax more, and WHERE ... WITH seems a bit weird to > me. But more importantly, one thing Dean probably considered when > proposing the current syntax was that we may add support for partial > statistics, pretty much like partial indexes. And we don't allow WITH at > the end (after WHERE) for indexes: > > test=# create index on t (a) where a < 100 with (fillfactor=10); > ERROR: syntax error at or near "with" > LINE 1: create index on t (a) where a < 100 with (fillfactor=10); > ^ > test=# create index on t (a) with (fillfactor=10) where a < 100; While I'm not sure about where to put the WITH, so to speak, I do favour consistency. So I'm inclinded to keep the syntax like for the "create index". More importantly however, I'd rather see the syntax on the "ON (column) FROM relname" to be changed to match the existing examples. (Already wrote this to Simon, not sure if my email made it to the list) So instead: CREATE STATISTICS stats_name ON (columns) FROM relname I'd rather see: CREATE STATISTICS stats_name ON table(col); as this both mirrors CREATE INDEX and foreign keys with REFERENCES. It could also be extended to both more columns, expressions or other tables like so: CREATE STATISTICS stats ON t1(col1, col2 / 2), t2 (a,b); and even: CREATE STATISTICS stats ON t1(col1, col2 / 2), t2 (a,b) WITH (options) WHERE t2.a > 4; This looks easy to remember, since it compares to: CREATE INDEX idx_name ON t2(a,b) WITH (options) WHERE t2.a > 4; Or am I'm missing something? Regards, Tels
On 21 April 2017 at 22:30, Tels <nospam-pg-abuse@bloodgate.com> wrote: > I'd rather see: > > CREATE STATISTICS stats_name ON table(col); > > as this both mirrors CREATE INDEX and foreign keys with REFERENCES. It > could also be extended to both more columns, expressions or other tables > like so: > > CREATE STATISTICS stats ON t1(col1, col2 / 2), t2 (a,b); > > and even: > > CREATE STATISTICS stats ON t1(col1, col2 / 2), t2 (a,b) WITH (options) > WHERE t2.a > 4; How would you express a join condition with that syntax? > This looks easy to remember, since it compares to: > > CREATE INDEX idx_name ON t2(a,b) WITH (options) WHERE t2.a > 4; > > Or am I'm missing something? Sadly yes, you are, and it's not the first time. I seem to remember mentioning this to you already in [1]. Please, can you read over [2], it mentions exactly what you're proposing and why it's not any good. [1] https://www.postgresql.org/message-id/CAKJS1f9HMeT+7adicEaU8heOMpOB5pKkCVYZLiEZje3DVutVPw@mail.gmail.com [2] https://www.postgresql.org/message-id/CAEZATCUtGR+U5+QTwjHhe9rLG2nguEysHQ5NaqcK=VbJ78VQFA@mail.gmail.com -- David Rowley http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Moin, On Fri, April 21, 2017 7:04 am, David Rowley wrote: > On 21 April 2017 at 22:30, Tels <nospam-pg-abuse@bloodgate.com> wrote: >> I'd rather see: >> >> CREATE STATISTICS stats_name ON table(col); >> >> as this both mirrors CREATE INDEX and foreign keys with REFERENCES. It >> could also be extended to both more columns, expressions or other tables >> like so: >> >> CREATE STATISTICS stats ON t1(col1, col2 / 2), t2 (a,b); >> >> and even: >> >> CREATE STATISTICS stats ON t1(col1, col2 / 2), t2 (a,b) WITH (options) >> WHERE t2.a > 4; > > How would you express a join condition with that syntax? > >> This looks easy to remember, since it compares to: >> >> CREATE INDEX idx_name ON t2(a,b) WITH (options) WHERE t2.a > 4; >> >> Or am I'm missing something? > > Sadly yes, you are, and it's not the first time. > > I seem to remember mentioning this to you already in [1]. > > Please, can you read over [2], it mentions exactly what you're > proposing and why it's not any good. > > [1] > https://www.postgresql.org/message-id/CAKJS1f9HMeT+7adicEaU8heOMpOB5pKkCVYZLiEZje3DVutVPw@mail.gmail.com > [2] > https://www.postgresql.org/message-id/CAEZATCUtGR+U5+QTwjHhe9rLG2nguEysHQ5NaqcK=VbJ78VQFA@mail.gmail.com Ah, ok, thank you, somehow I missed your answer the first time. So, just ignore me :) Best wishes, Tels
On 21 April 2017 at 01:21, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote: > On 04/21/2017 12:13 AM, Tom Lane wrote: >> >> Alvaro Herrera <alvherre@2ndquadrant.com> writes: >>> >>> Simon just pointed out that having the WITH clause appear in the middle >>> of the CREATE STATISTICS command looks odd; apparently somebody else >>> already complained on list about the same. Other commands put the WITH >>> clause at the end, so perhaps we should do likewise in the new command. >> >> +1 for WITH at the end; the existing syntax looks weird to me too. > > -1 from me > Yeah, I'm still marginally in favour of the current syntax because it's a bit more consistent with the CREATE VIEW syntax which puts the WITH (options) clause before the query, and assuming that multi-table and partial statistics support do get added in the future, the thing that the statistics get created on is going to look more like a query. OTOH, a few people have now commented that the syntax looks weird, and not just because of the placement of the WITH clause. I don't have any better ideas, but it's not too late to change if anyone else does... Regards, Dean
On 21 April 2017 at 01:21, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote: > On 04/21/2017 12:13 AM, Tom Lane wrote: >> >> Alvaro Herrera <alvherre@2ndquadrant.com> writes: >>> >>> Simon just pointed out that having the WITH clause appear in the middle >>> of the CREATE STATISTICS command looks odd; apparently somebody else >>> already complained on list about the same. Other commands put the WITH >>> clause at the end, so perhaps we should do likewise in the new command. >> >> >>> Here's a patch to implement that. I verified that if I change >>> qualified_name to qualified_name_list, bison does not complain about >>> conflicts, so this new syntax should support extension to multiple >>> relations without a problem. >> >> >> Yeah, WITH is fully reserved, so as long as the clause looks like >> WITH ( stuff... ) you're pretty much gonna be able to drop it >> wherever you want. >> >>> Discuss. >> >> >> +1 for WITH at the end; the existing syntax looks weird to me too. >> > > -1 from me > > I like the current syntax more, and WHERE ... WITH seems a bit weird to me. > But more importantly, one thing Dean probably considered when proposing the > current syntax was that we may add support for partial statistics, pretty > much like partial indexes. And we don't allow WITH at the end (after WHERE) > for indexes: > > test=# create index on t (a) where a < 100 with (fillfactor=10); > ERROR: syntax error at or near "with" > LINE 1: create index on t (a) where a < 100 with (fillfactor=10); > ^ > test=# create index on t (a) with (fillfactor=10) where a < 100; OK, didn't know about WHERE clause; makes sense. Currently WITH is supported in two places, which feels definitely wrong. The WITH clause is used elsewhere to provide optional parameters, and if there are none present it is optional. OK, so lets try... 1. No keyword at all, just list of statistics we store (i.e. just lose the WITH) CREATE STATISTICS s1 (dependencies, ndistinct) ON (a, b) FROM t1 WHERE partial-stuff; and if there are options, use the WITH for the optional parameters like this CREATE STATISTICS s1 (dependencies, ndistinct) WITH (options) ON (a, b) FROM t1 WHERE partial-stuff; 2. USING keyword, no brackets CREATE STATISTICS s1 USING (dependencies, ndistinct) ON (a, b) FROM t1 WHERE partial-stuff; and if there are options, use the WITH for the optional parameters like this CREATE STATISTICS s1 USING (dependencies, ndistinct) WITH (options) ON (a, b) FROM t1 WHERE partial-stuff; I think I like (2) -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
2017-04-22 11:30 GMT+02:00 Simon Riggs <simon@2ndquadrant.com>:
On 21 April 2017 at 01:21, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
> On 04/21/2017 12:13 AM, Tom Lane wrote:
>>
>> Alvaro Herrera <alvherre@2ndquadrant.com> writes:
>>>
>>> Simon just pointed out that having the WITH clause appear in the middle
>>> of the CREATE STATISTICS command looks odd; apparently somebody else
>>> already complained on list about the same. Other commands put the WITH
>>> clause at the end, so perhaps we should do likewise in the new command.
>>
>>
>>> Here's a patch to implement that. I verified that if I change
>>> qualified_name to qualified_name_list, bison does not complain about
>>> conflicts, so this new syntax should support extension to multiple
>>> relations without a problem.
>>
>>
>> Yeah, WITH is fully reserved, so as long as the clause looks like
>> WITH ( stuff... ) you're pretty much gonna be able to drop it
>> wherever you want.
>>
>>> Discuss.
>>
>>
>> +1 for WITH at the end; the existing syntax looks weird to me too.
>>
>
> -1 from me
>
> I like the current syntax more, and WHERE ... WITH seems a bit weird to me.
> But more importantly, one thing Dean probably considered when proposing the
> current syntax was that we may add support for partial statistics, pretty
> much like partial indexes. And we don't allow WITH at the end (after WHERE)
> for indexes:
>
> test=# create index on t (a) where a < 100 with (fillfactor=10);
> ERROR: syntax error at or near "with"
> LINE 1: create index on t (a) where a < 100 with (fillfactor=10);
> ^
> test=# create index on t (a) with (fillfactor=10) where a < 100;
OK, didn't know about WHERE clause; makes sense.
Currently WITH is supported in two places, which feels definitely
wrong. The WITH clause is used elsewhere to provide optional
parameters, and if there are none present it is optional.
OK, so lets try...
1.
No keyword at all, just list of statistics we store (i.e. just lose the WITH)
CREATE STATISTICS s1 (dependencies, ndistinct) ON (a, b) FROM t1 WHERE
partial-stuff;
and if there are options, use the WITH for the optional parameters like this
CREATE STATISTICS s1 (dependencies, ndistinct) WITH (options) ON (a,
b) FROM t1 WHERE partial-stuff;
2.
USING keyword, no brackets
CREATE STATISTICS s1 USING (dependencies, ndistinct) ON (a, b) FROM t1
WHERE partial-stuff;
and if there are options, use the WITH for the optional parameters like this
CREATE STATISTICS s1 USING (dependencies, ndistinct) WITH (options) ON
(a, b) FROM t1 WHERE partial-stuff;
I think I like (2)
+1
Regards
Pavel
--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 22 April 2017 at 21:30, Simon Riggs <simon@2ndquadrant.com> wrote: > CREATE STATISTICS s1 USING (dependencies, ndistinct) ON (a, b) FROM t1 > WHERE partial-stuff; +1 Seems much more CREATE INDEX like, and that's pretty good given that most of the complaints so far were about it bearing enough resemblance to CREATE INDEX -- David Rowley http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Simon Riggs wrote: > 2. > USING keyword, no brackets > CREATE STATISTICS s1 USING (dependencies, ndistinct) ON (a, b) FROM t1 > WHERE partial-stuff; > > and if there are options, use the WITH for the optional parameters like this > CREATE STATISTICS s1 USING (dependencies, ndistinct) WITH (options) ON > (a, b) FROM t1 WHERE partial-stuff; > > > I think I like (2) OK, sounds sensible. Note that the USING list is also optional -- if you don't specify it, we default to creating all stat types. Also note that we currently don't have any option other than stat types, so the WITH would always be empty -- in other words we should remove it until we implement some option. (I can readily see two possible options to implement for pg11, so the omission of the WITH clause would be temporary: 1. sample size to use instead of the per-column values 2. whether to forcibly collect stats for all column for this stat object even if the column has gotten a SET STATISTICS 0 Surely there can be others.) Patch attached that adds the USING clause replacing the WITH clause, which is also optional and only accepts statistic types (it doesn't accept "foo = OFF" anymore, as it seems pointless, but I'm open to accepting it if people care about it.) (This patch removes WITH, but I verified that bison accepts having both. The second attached reversed patch is what I used for removal.) In the meantime, I noticed that pg_dump support for extstats is not covered, which I'll go fix next. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Attachment
Alvaro Herrera wrote: > In the meantime, I noticed that pg_dump support for extstats is not > covered, which I'll go fix next. Here I add one, which seems to work for me. Considering that Stephen missed a terminating semicolon for test with create_order 96 (the last one prior to my commit) in commit 31a8b77a9244, I propose that we change whatever is concatenating those strings append a terminating semicolon. (Surely we don't care about two tests stanzas writing a single SQL command by omitting the semicolon terminator.) I wonder if there's any rationale to the create_order numbers. Surely we only care for objects that depend on others. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Attachment
Alvaro, * Alvaro Herrera (alvherre@2ndquadrant.com) wrote: > Alvaro Herrera wrote: > > > In the meantime, I noticed that pg_dump support for extstats is not > > covered, which I'll go fix next. > > Here I add one, which seems to work for me. > > Considering that Stephen missed a terminating semicolon for test with > create_order 96 (the last one prior to my commit) in commit > 31a8b77a9244, I propose that we change whatever is concatenating those > strings append a terminating semicolon. (Surely we don't care about two > tests stanzas writing a single SQL command by omitting the semicolon > terminator.) Whoops, sorry about that. Yes, we could pretty easily add that. The create SQL is built up at the bottom of 002_pg_dump.pl: $create_sql .= $tests{$test}->{create_sql}; > I wonder if there's any rationale to the create_order numbers. Surely > we only care for objects that depend on others. Yes, it was just a way to manage those dependencies. If there's value in doing something more complicated then we could certainly do that, but I'm not sure why it would be necessary to add that complexity. Thanks! Stephen
Alvaro Herrera wrote: > Simon Riggs wrote: > > > 2. > > USING keyword, no brackets > > CREATE STATISTICS s1 USING (dependencies, ndistinct) ON (a, b) FROM t1 > > WHERE partial-stuff; > > > > and if there are options, use the WITH for the optional parameters like this > > CREATE STATISTICS s1 USING (dependencies, ndistinct) WITH (options) ON > > (a, b) FROM t1 WHERE partial-stuff; > > > > > > I think I like (2) > > OK, sounds sensible. Yawn. So, we have not achieved the stated goal which was to get rid of the ugly clause in the middle of the command; moreover we have gained *yet another* clause in the middle of the command. Is this really an improvement? We're trading this CREATE STATISTICS s1 WITH (dependencies, ndistinct, options) ON (a, b) FROM t1 WHERE partial-stuff; for this: CREATE STATISTICS s1 USING (dependencies, ndistinct) WITH (options) ON (a, b) FROM t1 WHERE partial-stuff; Can we decide by a show of hands, please, whether we're really on board with this change? -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Stephen Frost wrote: > > Here I add one, which seems to work for me. Pushed it -- I also added another one which specifies options, to test WITH handling in ruleutils. > > Considering that Stephen missed a terminating semicolon for test with > > create_order 96 (the last one prior to my commit) in commit > > 31a8b77a9244, I propose that we change whatever is concatenating those > > strings append a terminating semicolon. (Surely we don't care about two > > tests stanzas writing a single SQL command by omitting the semicolon > > terminator.) > > Whoops, sorry about that. Yes, we could pretty easily add that. The > create SQL is built up at the bottom of 002_pg_dump.pl: > > $create_sql .= $tests{$test}->{create_sql}; Okay, I added it. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Alvaro Herrera <alvherre@2ndquadrant.com> writes: > Yawn. So, we have not achieved the stated goal which was to get rid of > the ugly clause in the middle of the command; moreover we have gained > *yet another* clause in the middle of the command. Is this really an > improvement? We're trading this > CREATE STATISTICS s1 WITH (dependencies, ndistinct, options) ON (a, b) FROM t1 WHERE partial-stuff; > for this: > CREATE STATISTICS s1 USING (dependencies, ndistinct) WITH (options) ON (a, b) FROM t1 WHERE partial-stuff; > Can we decide by a show of hands, please, whether we're really on board > with this change? That seems totally messy :-( I can't see any good reason why "WITH (options)" can't be at the end of the query. WITH is a fully reserved word, there is not going to be any danger of a parse problem from having it follow the FROM or WHERE clauses. And the end is where we have other instances of "WITH (options)". It also seems like we don't need to have *both* fully-reserved keywords introducing each clause *and* parentheses around the lists. Maybe dropping the parens around the stats-types list and the column-names list would help to declutter? (But I'd keep parens around the WITH options, for consistency with other statements.) One other point is that as long as we've got reserved keywords introducing each clause, there isn't actually an implementation reason why we couldn't accept the clauses in any order. Not sure I want to document it that way, but it might not be a bad thing if the grammar was forgiving about whether you write the USING or ON part first ... regards, tom lane
On 05/03/2017 04:42 PM, Tom Lane wrote: > Alvaro Herrera <alvherre@2ndquadrant.com> writes: >> Yawn. So, we have not achieved the stated goal which was to get rid of >> the ugly clause in the middle of the command; moreover we have gained >> *yet another* clause in the middle of the command. Is this really an >> improvement? We're trading this >> CREATE STATISTICS s1 WITH (dependencies, ndistinct, options) ON (a, b) FROM t1 WHERE partial-stuff; >> for this: >> CREATE STATISTICS s1 USING (dependencies, ndistinct) WITH (options) ON (a, b) FROM t1 WHERE partial-stuff; >> Can we decide by a show of hands, please, whether we're really on board >> with this change? > That seems totally messy :-( > > I can't see any good reason why "WITH (options)" can't be at the end of > the query. WITH is a fully reserved word, there is not going to be any > danger of a parse problem from having it follow the FROM or WHERE clauses. > And the end is where we have other instances of "WITH (options)". > > It also seems like we don't need to have *both* fully-reserved keywords > introducing each clause *and* parentheses around the lists. Maybe > dropping the parens around the stats-types list and the column-names > list would help to declutter? (But I'd keep parens around the WITH > options, for consistency with other statements.) > > One other point is that as long as we've got reserved keywords introducing > each clause, there isn't actually an implementation reason why we couldn't > accept the clauses in any order. Not sure I want to document it that way, > but it might not be a bad thing if the grammar was forgiving about whether > you write the USING or ON part first ... > > +1 for allowing arbitrary order of clauses. I would document it with the USING clause at the end, and have that be what psql supports and pg_dump produces. Since there are no WITH options now we should leave that out until it's required. cheers andrew -- Andrew Dunstan https://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Andrew Dunstan wrote: > On 05/03/2017 04:42 PM, Tom Lane wrote: > > One other point is that as long as we've got reserved keywords introducing > > each clause, there isn't actually an implementation reason why we couldn't > > accept the clauses in any order. Not sure I want to document it that way, > > but it might not be a bad thing if the grammar was forgiving about whether > > you write the USING or ON part first ... > > +1 for allowing arbitrary order of clauses. I would document it with the > USING clause at the end, and have that be what psql supports and pg_dump > produces. Since there are no WITH options now we should leave that out > until it's required. Ok, sounds good to me. Unless there are objections I'm going to have a shot at implementing this. Thanks for the discussion. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 5/3/17 11:36 PM, Alvaro Herrera wrote: > Andrew Dunstan wrote: > >> On 05/03/2017 04:42 PM, Tom Lane wrote: > >>> One other point is that as long as we've got reserved keywords introducing >>> each clause, there isn't actually an implementation reason why we couldn't >>> accept the clauses in any order. Not sure I want to document it that way, >>> but it might not be a bad thing if the grammar was forgiving about whether >>> you write the USING or ON part first ... >> >> +1 for allowing arbitrary order of clauses. I would document it with the >> USING clause at the end, and have that be what psql supports and pg_dump >> produces. Since there are no WITH options now we should leave that out >> until it's required. > > Ok, sounds good to me. Unless there are objections I'm going to have a > shot at implementing this. Thanks for the discussion. > Works for me. Do you also plan to remove the parentheses for the USING clause? regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 3 May 2017 at 23:31, Andrew Dunstan <andrew.dunstan@2ndquadrant.com> wrote: >> It also seems like we don't need to have *both* fully-reserved keywords >> introducing each clause *and* parentheses around the lists. Maybe >> dropping the parens around the stats-types list and the column-names >> list would help to declutter? (But I'd keep parens around the WITH >> options, for consistency with other statements.) +1 >> One other point is that as long as we've got reserved keywords introducing >> each clause, there isn't actually an implementation reason why we couldn't >> accept the clauses in any order. Not sure I want to document it that way, >> but it might not be a bad thing if the grammar was forgiving about whether >> you write the USING or ON part first ... > > +1 for allowing arbitrary order of clauses. +1 > I would document it with the > USING clause at the end, and have that be what psql supports and pg_dump > produces. Since there are no WITH options now we should leave that out > until it's required. Let's record the target syntax in parser comments so we can just slot things in when needed later, without rediscussion. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Here's a patch implementing this idea. From gram.y's comment, the support syntax is now: /***************************************************************************** * * QUERY : ! * CREATE STATISTICS stats_name [(stat types)] arguments ! ! * where 'arguments' can be one or more of: ! * { ON (columns) ! * | FROM relations ! * | WITH (options) ! * | WHERE expression } Note that I removed the USING keyword in the stat types list, and also made it mandatory that that list appears immediately after the new stats name. This should make it possible to have USING in the relation list (the FROM clause), if we allow explicit multiple relations with join syntax there. The other options can appear in any order. Also, both WITH and WHERE are accepted by the grammar, but immediately throw "feature not implemented" error at parse time. I was on the fence about adding copy/equal/out support for the new StatisticArgument node; it seems pointless because that node does not leave gram.y anyway. Unless there are objections, I'll push this tomorrow. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Attachment
Alvaro Herrera <alvherre@2ndquadrant.com> writes: > Here's a patch implementing this idea. From gram.y's comment, the > support syntax is now: > /***************************************************************************** > * > * QUERY : > ! * CREATE STATISTICS stats_name [(stat types)] arguments > ! > ! * where 'arguments' can be one or more of: > ! * { ON (columns) > ! * | FROM relations > ! * | WITH (options) > ! * | WHERE expression } > Note that I removed the USING keyword in the stat types list, and also > made it mandatory that that list appears immediately after the new stats > name. This should make it possible to have USING in the relation list > (the FROM clause), if we allow explicit multiple relations with join > syntax there. The other options can appear in any order. Hmm ... I'm not sure that I buy that particular argument. If you're concerned that the grammar could not handle "FROM x JOIN y USING (z)", wouldn't it also have a problem with "FROM x JOIN y ON (z)"? It might work anyway, since the grammar should know whether ON or USING is needed to complete the JOIN clause. But I think you'd better check whether the complete join syntax works there, even if we're not going to support it now. I'm not against what you've done here, because I had no love for USING in this context anyway; it conveys approximately nothing to the mind about what is in the list it's introducing. But I'm concerned whether we're boxing ourselves in by using ON. Actually, "ON" doesn't seem all that mnemonic either. Maybe "FOR" would be a good substitute, if it turns out that "ON" has a problem? regards, tom lane
On 04.05.2017 23:13, Tom Lane wrote: > I'm not against what you've done here, because I had no love for USING > in this context anyway; it conveys approximately nothing to the mind > about what is in the list it's introducing. But I'm concerned whether > we're boxing ourselves in by using ON. > > Actually, "ON" doesn't seem all that mnemonic either. Maybe "FOR" > would be a good substitute, if it turns out that "ON" has a problem? The whole syntax reminds me of a regular SELECT clause. So, SELECT? Also considering the most generic form of statistic support mentioned in [1], one could even thing about allowing aggregates, windowing functions etc, aka the full SELECT clause in the future. Sven [1] https://www.postgresql.org/message-id/CAEZATCUtGR+U5+QTwjHhe9rLG2nguEysHQ5NaqcK=VbJ78VQFA@mail.gmail.com
Tom Lane wrote: > Hmm ... I'm not sure that I buy that particular argument. If you're > concerned that the grammar could not handle "FROM x JOIN y USING (z)", > wouldn't it also have a problem with "FROM x JOIN y ON (z)"? > > It might work anyway, since the grammar should know whether ON or USING > is needed to complete the JOIN clause. But I think you'd better check > whether the complete join syntax works there, even if we're not going > to support it now. Tomas spent some time trying to shoehorn the whole join syntax into the FROM clause, but stopped once he realized that the joined_table production uses table_ref, which allow things like TABLESAMPLE, SRFs, LATERAL, etc which presumably we don't want to accept in CREATE STATS. I didn't look into it any further. But because of the other considerations, I did end up changing the ON to FOR. So the attached is the final version which I intend to push shortly. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Attachment
Alvaro Herrera <alvherre@2ndquadrant.com> writes: > Tom Lane wrote: >> Hmm ... I'm not sure that I buy that particular argument. If you're >> concerned that the grammar could not handle "FROM x JOIN y USING (z)", >> wouldn't it also have a problem with "FROM x JOIN y ON (z)"? > Tomas spent some time trying to shoehorn the whole join syntax into the > FROM clause, but stopped once he realized that the joined_table > production uses table_ref, which allow things like TABLESAMPLE, SRFs, > LATERAL, etc which presumably we don't want to accept in CREATE STATS. > I didn't look into it any further. But because of the other > considerations, I did end up changing the ON to FOR. Have you thought further about the upthread suggestion to just borrow SELECT's syntax lock stock and barrel? That is, it'd look something like CREATE STATISTICS name [(list of stats types)] expression-list FROM ... [ WHERE ... ] [ WITH (options) ] This would certainly support any FROM stuff we might want later. Yeah, you'd need to reject any features you weren't supporting at execution, but doing that would allow issuing a friendlier message than "syntax error" anyway. If you don't have the cycles for that, I'd be willing to look into it. regards, tom lane
Tom Lane wrote: > Have you thought further about the upthread suggestion to just borrow > SELECT's syntax lock stock and barrel? That is, it'd look something > like > > CREATE STATISTICS name [(list of stats types)] expression-list FROM ... > [ WHERE ... ] [ WITH (options) ] > > This would certainly support any FROM stuff we might want later. > Yeah, you'd need to reject any features you weren't supporting at > execution, but doing that would allow issuing a friendlier message than > "syntax error" anyway. Hmm, no, I hadn't looked at this angle. > If you don't have the cycles for that, I'd be willing to look into it. I'll give it a try today, and pass the baton if I'm unable to. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Tom Lane wrote: > Have you thought further about the upthread suggestion to just borrow > SELECT's syntax lock stock and barrel? That is, it'd look something > like > > CREATE STATISTICS name [(list of stats types)] expression-list FROM ... > [ WHERE ... ] [ WITH (options) ] > > This would certainly support any FROM stuff we might want later. > Yeah, you'd need to reject any features you weren't supporting at > execution, but doing that would allow issuing a friendlier message than > "syntax error" anyway. > > If you don't have the cycles for that, I'd be willing to look into it. Bison seems to like the productions below. Is this what you had in mind? These mostly mimic joined_table and table_ref, stripping out the rules that we don't need. Note that I had to put back the FOR keyword in there; without the FOR, that is "CREATE STATS any_name opt_name_list expr_list FROM" there was one shift/reduce conflict, which I tried to solve by splitting opt_name_list using two rules (one with "'(' name_list ')'" and one without), but that gave rise to two reduce/reduce conflicts, and I didn't see any way to deal with them. (Note that I ended up realizing that stats_type_list is unnecessary since we can use name_list.) CreateStatsStmt: CREATE opt_if_not_exists STATISTICS any_name opt_name_list FOR expr_list FROM stats_joined_table { CreateStatsStmt *n = makeNode(CreateStatsStmt); n->defnames= $4; n->stat_types = $6; n->if_not_exists = $2; stmt->relation = $9; stmt->keys = $7; $$ = (Node *)n; } ; stats_joined_table: '(' stats_joined_table ')' { } | stats_table_ref CROSS JOIN stats_table_ref { } | stats_table_ref join_type JOIN stats_table_ref join_qual { } | stats_table_ref JOIN stats_table_ref join_qual { } | stats_table_ref NATURALjoin_type JOIN stats_table_ref { } | stats_table_ref NATURAL JOIN table_ref { } ; stats_table_ref: relation_expr opt_alias_clause { } | stats_joined_table { } | '(' stats_joined_table ')' alias_clause { } ; -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Alvaro Herrera <alvherre@2ndquadrant.com> writes: > Tom Lane wrote: >> Have you thought further about the upthread suggestion to just borrow >> SELECT's syntax lock stock and barrel? > Bison seems to like the productions below. Is this what you had in > mind? These mostly mimic joined_table and table_ref, stripping out the > rules that we don't need. I'd suggest just using the from_list production and then complaining at runtime if what you get is too complicated. Otherwise, you have to maintain a duplicate set of productions, and you're going to be unable to throw anything more informative than "syntax error" when somebody tries to exceed the implementation limits. > Note that I had to put back the FOR keyword in there; without the FOR, > that is "CREATE STATS any_name opt_name_list expr_list FROM" there was > one shift/reduce conflict, which I tried to solve by splitting > opt_name_list using two rules (one with "'(' name_list ')'" and one > without), but that gave rise to two reduce/reduce conflicts, and I didn't > see any way to deal with them. That's fine. I was going to suggest using ON after the stats type list just because it seemed to read better. FOR is about as good. > (Note that I ended up realizing that stats_type_list is unnecessary > since we can use name_list.) Check. regards, tom lane
Tom Lane wrote: > Alvaro Herrera <alvherre@2ndquadrant.com> writes: > > Tom Lane wrote: > >> Have you thought further about the upthread suggestion to just borrow > >> SELECT's syntax lock stock and barrel? > > > Bison seems to like the productions below. Is this what you had in > > mind? These mostly mimic joined_table and table_ref, stripping out the > > rules that we don't need. > > I'd suggest just using the from_list production and then complaining > at runtime if what you get is too complicated. Otherwise, you have > to maintain a duplicate set of productions, and you're going to be > unable to throw anything more informative than "syntax error" when > somebody tries to exceed the implementation limits. Hmm, yeah, makes sense. Here's a patch for this approach. I ended up using on ON again for the list of columns. I suppose the checks in CreateStatistics() could still be improved, but I'd go ahead and push this tomorrow morning and we can hammer those details later on, if it's still needed. Better avoid shipping beta with outdated grammar ... BTW the new castNode() family of macros don't work with Value nodes (because the tags are different depending on what's stored, but each type does not have its own struct. Oh well.) -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Attachment
Alvaro Herrera <alvherre@2ndquadrant.com> writes: > BTW the new castNode() family of macros don't work with Value nodes > (because the tags are different depending on what's stored, but each > type does not have its own struct. Oh well.) Yeah. Value nodes are pretty much of a wart --- it's not clear to me that they add anything compared to just having a separate node type for each of the possible subclasses. It's never bothered anyone enough to try to replace them, though. I'll take a look at the patch later. regards, tom lane
Alvaro Herrera <alvherre@2ndquadrant.com> writes: > Hmm, yeah, makes sense. Here's a patch for this approach. I did some code review on this patch. The attached update makes the following hopefully-uncontroversial changes: * fix inconsistencies in the set of fields in CreateStatsStmt * get rid of struct CreateStatsArgument, which seems to be a leftover * tighten up parse checks in CreateStatistics(), and improve comments * add missing check for ownership of target table; the documentation says this is checked, and surely it is a security bug that it's not * adjust regression test to compensate for the above, because it fails otherwise * change end of regression test to suppress cascade drop messages; it's astonishing that this test hasn't caused intermittent buildfarm failures due to unstable drop order I did not review the rest of the regression test changes, nor the tab-complete changes. I can however report that DROP STATISTICS <tab> doesn't successfully complete any stats object names. Also, while reading the docs changes, I got rather ill from the inconsistent and not very grammatical handling of "statistics" as a noun, particularly the inability to decide from one sentence to the next if that is singular or plural. In the attached, I fixed this in the ref/*_statistics.sgml files by always saying "statistics object" instead. If people agree that that reads better, we should make an effort to propagate that terminology elsewhere in the docs, and into error messages, objectaddress.c output, etc. Although I've not done anything about it here, I'm not happy about the handling of dependencies for stats objects. I do not think that cloning RemoveStatistics as RemoveStatisticsExt is sane at all. The former is meant to deal with cleaning up pg_statistic rows that we know will be there, so there's no real need to expend pg_depend overhead to track them. For objects that are only loosely connected, the right thing is to use the dependency system; in particular, this design has no real chance of working well with cross-table stats. Also, it's really silly to have *both* this hard-wired mechanism and a pg_depend entry; the latter is surely redundant if you have the former. IMO we should revert RemoveStatisticsExt and instead handle things by making stats objects auto-dependent on the individual column(s) they reference (not the whole table). I'm also of the opinion that having an AUTO dependency, rather than a NORMAL dependency, on the stats object's schema is the wrong semantics. There isn't any other case where you can drop a non-empty schema without saying CASCADE, and I'm mystified why this case should act that way. Lastly, I tried the example given in the CREATE STATISTICS reference page, and it doesn't seem to work. Without the stats object, the two queries are both estimated at one matching row, whereas they really produce 100 and 0 rows respectively. With the stats object, they seem to both get estimated at ~100 rows, which is a considerable improvement for one case but a considerable disimprovement for the other. If this is not broken, I'd like to know why not. What's the point of an expensive extended- stats mechanism if it can't get this right? regards, tom lane diff --git a/doc/src/sgml/perform.sgml b/doc/src/sgml/perform.sgml index b10b734..32e17ee 100644 *** a/doc/src/sgml/perform.sgml --- b/doc/src/sgml/perform.sgml *************** WHERE tablename = 'road'; *** 1132,1139 **** To inspect functional dependencies on a statistics <literal>stts</literal>, you may do this: <programlisting> ! CREATE STATISTICS stts WITH (dependencies) ! ON (zip, city) FROM zipcodes; ANALYZE zipcodes; SELECT stxname, stxkeys, stxdependencies FROM pg_statistic_ext --- 1132,1139 ---- To inspect functional dependencies on a statistics <literal>stts</literal>, you may do this: <programlisting> ! CREATE STATISTICS stts (dependencies) ! ON zip, city FROM zipcodes; ANALYZE zipcodes; SELECT stxname, stxkeys, stxdependencies FROM pg_statistic_ext *************** EXPLAIN (ANALYZE, TIMING OFF) SELECT * F *** 1219,1226 **** Continuing the above example, the n-distinct coefficients in a ZIP code table may look like the following: <programlisting> ! CREATE STATISTICS stts2 WITH (ndistinct) ! ON (zip, state, city) FROM zipcodes; ANALYZE zipcodes; SELECT stxkeys AS k, stxndistinct AS nd FROM pg_statistic_ext --- 1219,1226 ---- Continuing the above example, the n-distinct coefficients in a ZIP code table may look like the following: <programlisting> ! CREATE STATISTICS stts2 (ndistinct) ! ON zip, state, city FROM zipcodes; ANALYZE zipcodes; SELECT stxkeys AS k, stxndistinct AS nd FROM pg_statistic_ext diff --git a/doc/src/sgml/planstats.sgml b/doc/src/sgml/planstats.sgml index 11580bf..ef847b9 100644 *** a/doc/src/sgml/planstats.sgml --- b/doc/src/sgml/planstats.sgml *************** EXPLAIN (ANALYZE, TIMING OFF) SELECT * F *** 526,532 **** multivariate statistics on the two columns: <programlisting> ! CREATE STATISTICS stts WITH (dependencies) ON (a, b) FROM t; ANALYZE t; EXPLAIN (ANALYZE, TIMING OFF) SELECT * FROM t WHERE a = 1 AND b = 1; QUERY PLAN --- 526,532 ---- multivariate statistics on the two columns: <programlisting> ! CREATE STATISTICS stts (dependencies) ON a, b FROM t; ANALYZE t; EXPLAIN (ANALYZE, TIMING OFF) SELECT * FROM t WHERE a = 1 AND b = 1; QUERY PLAN *************** EXPLAIN (ANALYZE, TIMING OFF) SELECT COU *** 569,575 **** calculation, the estimate is much improved: <programlisting> DROP STATISTICS stts; ! CREATE STATISTICS stts WITH (dependencies, ndistinct) ON (a, b) FROM t; ANALYZE t; EXPLAIN (ANALYZE, TIMING OFF) SELECT COUNT(*) FROM t GROUP BY a, b; QUERY PLAN --- 569,575 ---- calculation, the estimate is much improved: <programlisting> DROP STATISTICS stts; ! CREATE STATISTICS stts (dependencies, ndistinct) ON a, b FROM t; ANALYZE t; EXPLAIN (ANALYZE, TIMING OFF) SELECT COUNT(*) FROM t GROUP BY a, b; QUERY PLAN diff --git a/doc/src/sgml/ref/alter_statistics.sgml b/doc/src/sgml/ref/alter_statistics.sgml index 3e4d286..4f25669 100644 *** a/doc/src/sgml/ref/alter_statistics.sgml --- b/doc/src/sgml/ref/alter_statistics.sgml *************** PostgreSQL documentation *** 17,23 **** <refnamediv> <refname>ALTER STATISTICS</refname> <refpurpose> ! change the definition of a extended statistics </refpurpose> </refnamediv> --- 17,23 ---- <refnamediv> <refname>ALTER STATISTICS</refname> <refpurpose> ! change the definition of an extended statistics object </refpurpose> </refnamediv> *************** ALTER STATISTICS <replaceable class="par *** 34,52 **** <para> <command>ALTER STATISTICS</command> changes the parameters of an existing ! extended statistics. Any parameters not specifically set in the <command>ALTER STATISTICS</command> command retain their prior settings. </para> <para> ! You must own the statistics to use <command>ALTER STATISTICS</>. ! To change a statistics' schema, you must also have <literal>CREATE</> ! privilege on the new schema. To alter the owner, you must also be a direct or indirect member of the new owning role, and that role must have <literal>CREATE</literal> privilege on ! the statistics' schema. (These restrictions enforce that altering the owner ! doesn't do anything you couldn't do by dropping and recreating the statistics. ! However, a superuser can alter ownership of any statistics anyway.) </para> </refsect1> --- 34,53 ---- <para> <command>ALTER STATISTICS</command> changes the parameters of an existing ! extended statistics object. Any parameters not specifically set in the <command>ALTER STATISTICS</command> command retain their prior settings. </para> <para> ! You must own the statistics object to use <command>ALTER STATISTICS</>. ! To change a statistics object's schema, you must also ! have <literal>CREATE</> privilege on the new schema. To alter the owner, you must also be a direct or indirect member of the new owning role, and that role must have <literal>CREATE</literal> privilege on ! the statistics object's schema. (These restrictions enforce that altering ! the owner doesn't do anything you couldn't do by dropping and recreating ! the statistics object. However, a superuser can alter ownership of any ! statistics object anyway.) </para> </refsect1> *************** ALTER STATISTICS <replaceable class="par *** 59,65 **** <term><replaceable class="parameter">name</replaceable></term> <listitem> <para> ! The name (optionally schema-qualified) of the statistics to be altered. </para> </listitem> </varlistentry> --- 60,67 ---- <term><replaceable class="parameter">name</replaceable></term> <listitem> <para> ! The name (optionally schema-qualified) of the statistics object to be ! altered. </para> </listitem> </varlistentry> *************** ALTER STATISTICS <replaceable class="par *** 68,74 **** <term><replaceable class="PARAMETER">new_owner</replaceable></term> <listitem> <para> ! The user name of the new owner of the statistics. </para> </listitem> </varlistentry> --- 70,76 ---- <term><replaceable class="PARAMETER">new_owner</replaceable></term> <listitem> <para> ! The user name of the new owner of the statistics object. </para> </listitem> </varlistentry> *************** ALTER STATISTICS <replaceable class="par *** 77,83 **** <term><replaceable class="parameter">new_name</replaceable></term> <listitem> <para> ! The new name for the statistics. </para> </listitem> </varlistentry> --- 79,85 ---- <term><replaceable class="parameter">new_name</replaceable></term> <listitem> <para> ! The new name for the statistics object. </para> </listitem> </varlistentry> *************** ALTER STATISTICS <replaceable class="par *** 86,92 **** <term><replaceable class="parameter">new_schema</replaceable></term> <listitem> <para> ! The new schema for the statistics. </para> </listitem> </varlistentry> --- 88,94 ---- <term><replaceable class="parameter">new_schema</replaceable></term> <listitem> <para> ! The new schema for the statistics object. </para> </listitem> </varlistentry> *************** ALTER STATISTICS <replaceable class="par *** 99,105 **** <title>Compatibility</title> <para> ! There's no <command>ALTER STATISTICS</command> command in the SQL standard. </para> </refsect1> --- 101,107 ---- <title>Compatibility</title> <para> ! There is no <command>ALTER STATISTICS</command> command in the SQL standard. </para> </refsect1> diff --git a/doc/src/sgml/ref/create_statistics.sgml b/doc/src/sgml/ref/create_statistics.sgml index edbcf58..92ee4e4 100644 *** a/doc/src/sgml/ref/create_statistics.sgml --- b/doc/src/sgml/ref/create_statistics.sgml *************** PostgreSQL documentation *** 22,29 **** <refsynopsisdiv> <synopsis> CREATE STATISTICS [ IF NOT EXISTS ] <replaceable class="PARAMETER">statistics_name</replaceable> ! WITH ( <replaceable class="PARAMETER">option</replaceable> [= <replaceable class="PARAMETER">value</replaceable>] [,... ] ) ! ON ( <replaceable class="PARAMETER">column_name</replaceable>, <replaceable class="PARAMETER">column_name</replaceable>[, ...]) FROM <replaceable class="PARAMETER">table_name</replaceable> </synopsis> --- 22,29 ---- <refsynopsisdiv> <synopsis> CREATE STATISTICS [ IF NOT EXISTS ] <replaceable class="PARAMETER">statistics_name</replaceable> ! [ ( <replaceable class="PARAMETER">statistic_type</replaceable> [, ... ] ) ] ! ON <replaceable class="PARAMETER">column_name</replaceable>, <replaceable class="PARAMETER">column_name</replaceable>[, ...] FROM <replaceable class="PARAMETER">table_name</replaceable> </synopsis> *************** CREATE STATISTICS [ IF NOT EXISTS ] <rep *** 34,50 **** <para> <command>CREATE STATISTICS</command> will create a new extended statistics ! object on the specified table, foreign table or materialized view. ! The statistics will be created in the current database and ! will be owned by the user issuing the command. </para> <para> If a schema name is given (for example, <literal>CREATE STATISTICS ! myschema.mystat ...</>) then the statistics is created in the specified ! schema. Otherwise it is created in the current schema. The name of ! the statistics must be distinct from the name of any other statistics in the ! same schema. </para> </refsect1> --- 34,50 ---- <para> <command>CREATE STATISTICS</command> will create a new extended statistics ! object tracking data about the specified table, foreign table or ! materialized view. The statistics object will be created in the current ! database and will be owned by the user issuing the command. </para> <para> If a schema name is given (for example, <literal>CREATE STATISTICS ! myschema.mystat ...</>) then the statistics object is created in the ! specified schema. Otherwise it is created in the current schema. ! The name of the statistics object must be distinct from the name of any ! other statistics object in the same schema. </para> </refsect1> *************** CREATE STATISTICS [ IF NOT EXISTS ] <rep *** 57,66 **** <term><literal>IF NOT EXISTS</></term> <listitem> <para> ! Do not throw an error if a statistics with the same name already exists. ! A notice is issued in this case. Note that only the name of the ! statistics object is considered here. The definition of the statistics is ! not considered. </para> </listitem> </varlistentry> --- 57,66 ---- <term><literal>IF NOT EXISTS</></term> <listitem> <para> ! Do not throw an error if a statistics object with the same name already ! exists. A notice is issued in this case. Note that only the name of ! the statistics object is considered here, not the details of its ! definition. </para> </listitem> </varlistentry> *************** CREATE STATISTICS [ IF NOT EXISTS ] <rep *** 69,135 **** <term><replaceable class="PARAMETER">statistics_name</replaceable></term> <listitem> <para> ! The name (optionally schema-qualified) of the statistics to be created. ! </para> ! </listitem> ! </varlistentry> ! ! <varlistentry> ! <term><replaceable class="PARAMETER">column_name</replaceable></term> ! <listitem> ! <para> ! The name of a column to be included in the statistics. </para> </listitem> </varlistentry> <varlistentry> ! <term><replaceable class="PARAMETER">table_name</replaceable></term> <listitem> <para> ! The name (optionally schema-qualified) of the table the statistics should ! be created on. </para> </listitem> </varlistentry> - </variablelist> - - <refsect2 id="SQL-CREATESTATISTICS-parameters"> - <title id="SQL-CREATESTATISTICS-parameters-title">Parameters</title> - - <indexterm zone="sql-createstatistics-parameters"> - <primary>statistics parameters</primary> - </indexterm> - - <para> - The <literal>WITH</> clause can specify <firstterm>options</> - for the statistics. Available options are listed below. - </para> - - <variablelist> - <varlistentry> ! <term><literal>dependencies</> (<type>boolean</>)</term> <listitem> <para> ! Enables functional dependencies for the statistics. </para> </listitem> </varlistentry> <varlistentry> ! <term><literal>ndistinct</> (<type>boolean</>)</term> <listitem> <para> ! Enables ndistinct coefficients for the statistics. </para> </listitem> </varlistentry> ! </variablelist> ! ! </refsect2> </refsect1> <refsect1> --- 69,113 ---- <term><replaceable class="PARAMETER">statistics_name</replaceable></term> <listitem> <para> ! The name (optionally schema-qualified) of the statistics object to be ! created. </para> </listitem> </varlistentry> <varlistentry> ! <term><replaceable class="PARAMETER">statistic_type</replaceable></term> <listitem> <para> ! A statistic type to be computed in this statistics object. Currently ! supported types are <literal>ndistinct</literal>, which enables ! n-distinct coefficient tracking, ! and <literal>dependencies</literal>, which enables functional ! dependencies. </para> </listitem> </varlistentry> <varlistentry> ! <term><replaceable class="PARAMETER">column_name</replaceable></term> <listitem> <para> ! The name of a table column to be included in the statistics object. </para> </listitem> </varlistentry> <varlistentry> ! <term><replaceable class="PARAMETER">table_name</replaceable></term> <listitem> <para> ! The name (optionally schema-qualified) of the table containing the ! column(s) the statistics are computed on. </para> </listitem> </varlistentry> ! </variablelist> </refsect1> <refsect1> *************** CREATE TABLE t1 ( *** 158,164 **** INSERT INTO t1 SELECT i/100, i/500 FROM generate_series(1,1000000) s(i); ! CREATE STATISTICS s1 WITH (dependencies) ON (a, b) FROM t1; ANALYZE t1; --- 136,142 ---- INSERT INTO t1 SELECT i/100, i/500 FROM generate_series(1,1000000) s(i); ! CREATE STATISTICS s1 (dependencies) ON a, b FROM t1; ANALYZE t1; *************** EXPLAIN ANALYZE SELECT * FROM t1 WHERE ( *** 168,173 **** --- 146,156 ---- -- invalid combination of values EXPLAIN ANALYZE SELECT * FROM t1 WHERE (a = 1) AND (b = 1); </programlisting> + + Without functional-dependency statistics, the planner would make the + same estimate of the number of matching rows for these two queries. + With such statistics, it is able to tell that one case has matches + and the other does not. </para> </refsect1> *************** EXPLAIN ANALYZE SELECT * FROM t1 WHERE ( *** 176,182 **** <title>Compatibility</title> <para> ! There's no <command>CREATE STATISTICS</command> command in the SQL standard. </para> </refsect1> --- 159,165 ---- <title>Compatibility</title> <para> ! There is no <command>CREATE STATISTICS</command> command in the SQL standard. </para> </refsect1> diff --git a/doc/src/sgml/ref/drop_statistics.sgml b/doc/src/sgml/ref/drop_statistics.sgml index 98c3381..ef659fc 100644 *** a/doc/src/sgml/ref/drop_statistics.sgml --- b/doc/src/sgml/ref/drop_statistics.sgml *************** DROP STATISTICS [ IF EXISTS ] <replaceab *** 29,37 **** <title>Description</title> <para> ! <command>DROP STATISTICS</command> removes statistics from the database. ! Only the statistics owner, the schema owner, and superuser can drop a ! statistics. </para> </refsect1> --- 29,37 ---- <title>Description</title> <para> ! <command>DROP STATISTICS</command> removes statistics object(s) from the ! database. Only the statistics object's owner, the schema owner, or a ! superuser can drop a statistics object. </para> </refsect1> *************** DROP STATISTICS [ IF EXISTS ] <replaceab *** 44,51 **** <term><literal>IF EXISTS</literal></term> <listitem> <para> ! Do not throw an error if the statistics do not exist. A notice is ! issued in this case. </para> </listitem> </varlistentry> --- 44,51 ---- <term><literal>IF EXISTS</literal></term> <listitem> <para> ! Do not throw an error if the statistics object does not exist. A notice ! is issued in this case. </para> </listitem> </varlistentry> *************** DROP STATISTICS [ IF EXISTS ] <replaceab *** 54,60 **** <term><replaceable class="PARAMETER">name</replaceable></term> <listitem> <para> ! The name (optionally schema-qualified) of the statistics to drop. </para> </listitem> </varlistentry> --- 54,60 ---- <term><replaceable class="PARAMETER">name</replaceable></term> <listitem> <para> ! The name (optionally schema-qualified) of the statistics object to drop. </para> </listitem> </varlistentry> *************** DROP STATISTICS [ IF EXISTS ] <replaceab *** 66,72 **** <title>Examples</title> <para> ! To destroy two statistics objects on different schemas, without failing if they don't exist: <programlisting> --- 66,72 ---- <title>Examples</title> <para> ! To destroy two statistics objects in different schemas, without failing if they don't exist: <programlisting> *************** DROP STATISTICS IF EXISTS *** 82,88 **** <title>Compatibility</title> <para> ! There's no <command>DROP STATISTICS</command> command in the SQL standard. </para> </refsect1> --- 82,88 ---- <title>Compatibility</title> <para> ! There is no <command>DROP STATISTICS</command> command in the SQL standard. </para> </refsect1> diff --git a/src/backend/commands/statscmds.c b/src/backend/commands/statscmds.c index 0b9c33e..662b4fa 100644 *** a/src/backend/commands/statscmds.c --- b/src/backend/commands/statscmds.c *************** CreateStatistics(CreateStatsStmt *stmt) *** 60,66 **** bool nulls[Natts_pg_statistic_ext]; int2vector *stxkeys; Relation statrel; ! Relation rel; Oid relid; ObjectAddress parentobject, childobject; --- 60,66 ---- bool nulls[Natts_pg_statistic_ext]; int2vector *stxkeys; Relation statrel; ! Relation rel = NULL; Oid relid; ObjectAddress parentobject, childobject; *************** CreateStatistics(CreateStatsStmt *stmt) *** 71,77 **** bool build_dependencies; bool requested_type = false; int i; ! ListCell *l; Assert(IsA(stmt, CreateStatsStmt)); --- 71,77 ---- bool build_dependencies; bool requested_type = false; int i; ! ListCell *cell; Assert(IsA(stmt, CreateStatsStmt)); *************** CreateStatistics(CreateStatsStmt *stmt) *** 101,135 **** } /* ! * CREATE STATISTICS will influence future execution plans but does not ! * interfere with currently executing plans. So it should be enough to ! * take only ShareUpdateExclusiveLock on relation, conflicting with ! * ANALYZE and other DDL that sets statistical information, but not with ! * normal queries. */ ! rel = relation_openrv(stmt->relation, ShareUpdateExclusiveLock); ! relid = RelationGetRelid(rel); ! ! if (rel->rd_rel->relkind != RELKIND_RELATION && ! rel->rd_rel->relkind != RELKIND_MATVIEW && ! rel->rd_rel->relkind != RELKIND_FOREIGN_TABLE && ! rel->rd_rel->relkind != RELKIND_PARTITIONED_TABLE) ereport(ERROR, ! (errcode(ERRCODE_WRONG_OBJECT_TYPE), ! errmsg("relation \"%s\" is not a table, foreign table, or materialized view", ! RelationGetRelationName(rel)))); /* ! * Transform column names to array of attnums. While at it, enforce some ! * constraints. */ ! foreach(l, stmt->keys) { ! char *attname = strVal(lfirst(l)); HeapTuple atttuple; Form_pg_attribute attForm; TypeCacheEntry *type; atttuple = SearchSysCacheAttName(relid, attname); if (!HeapTupleIsValid(atttuple)) ereport(ERROR, --- 101,181 ---- } /* ! * Examine the FROM clause. Currently, we only allow it to be a single ! * simple table, but later we'll probably allow multiple tables and JOIN ! * syntax. The grammar is already prepared for that, so we have to check ! * here that what we got is what we can support. */ ! if (list_length(stmt->relations) != 1) ereport(ERROR, ! (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), ! errmsg("only a single relation is allowed in CREATE STATISTICS"))); ! ! foreach(cell, stmt->relations) ! { ! Node *rln = (Node *) lfirst(cell); ! ! if (!IsA(rln, RangeVar)) ! ereport(ERROR, ! (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), ! errmsg("only a single relation is allowed in CREATE STATISTICS"))); ! ! /* ! * CREATE STATISTICS will influence future execution plans but does ! * not interfere with currently executing plans. So it should be ! * enough to take only ShareUpdateExclusiveLock on relation, ! * conflicting with ANALYZE and other DDL that sets statistical ! * information, but not with normal queries. ! */ ! rel = relation_openrv((RangeVar *) rln, ShareUpdateExclusiveLock); ! ! /* Restrict to allowed relation types */ ! if (rel->rd_rel->relkind != RELKIND_RELATION && ! rel->rd_rel->relkind != RELKIND_MATVIEW && ! rel->rd_rel->relkind != RELKIND_FOREIGN_TABLE && ! rel->rd_rel->relkind != RELKIND_PARTITIONED_TABLE) ! ereport(ERROR, ! (errcode(ERRCODE_WRONG_OBJECT_TYPE), ! errmsg("relation \"%s\" is not a table, foreign table, or materialized view", ! RelationGetRelationName(rel)))); ! ! /* You must own the relation to create stats on it */ ! if (!pg_class_ownercheck(RelationGetRelid(rel), GetUserId())) ! aclcheck_error(ACLCHECK_NOT_OWNER, ACL_KIND_CLASS, ! RelationGetRelationName(rel)); ! } ! ! Assert(rel); ! relid = RelationGetRelid(rel); /* ! * Currently, we only allow simple column references in the expression ! * list. That will change someday, and again the grammar already supports ! * it so we have to enforce restrictions here. For now, we can convert ! * the expression list to a simple array of attnums. While at it, enforce ! * some constraints. */ ! foreach(cell, stmt->exprs) { ! Node *expr = (Node *) lfirst(cell); ! ColumnRef *cref; ! char *attname; HeapTuple atttuple; Form_pg_attribute attForm; TypeCacheEntry *type; + if (!IsA(expr, ColumnRef)) + ereport(ERROR, + (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), + errmsg("only simple column references are allowed in CREATE STATISTICS"))); + cref = (ColumnRef *) expr; + + if (list_length(cref->fields) != 1) + ereport(ERROR, + (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), + errmsg("only simple column references are allowed in CREATE STATISTICS"))); + attname = strVal((Value *) linitial(cref->fields)); + atttuple = SearchSysCacheAttName(relid, attname); if (!HeapTupleIsValid(atttuple)) ereport(ERROR, *************** CreateStatistics(CreateStatsStmt *stmt) *** 194,223 **** stxkeys = buildint2vector(attnums, numcols); /* ! * Parse the statistics options. Currently only statistics types are ! * recognized. */ build_ndistinct = false; build_dependencies = false; ! foreach(l, stmt->options) { ! DefElem *opt = (DefElem *) lfirst(l); ! if (strcmp(opt->defname, "ndistinct") == 0) { ! build_ndistinct = defGetBoolean(opt); requested_type = true; } ! else if (strcmp(opt->defname, "dependencies") == 0) { ! build_dependencies = defGetBoolean(opt); requested_type = true; } else ereport(ERROR, (errcode(ERRCODE_SYNTAX_ERROR), ! errmsg("unrecognized STATISTICS option \"%s\"", ! opt->defname))); } /* If no statistic type was specified, build them all. */ if (!requested_type) --- 240,268 ---- stxkeys = buildint2vector(attnums, numcols); /* ! * Parse the statistics types. */ build_ndistinct = false; build_dependencies = false; ! foreach(cell, stmt->stat_types) { ! char *type = strVal((Value *) lfirst(cell)); ! if (strcmp(type, "ndistinct") == 0) { ! build_ndistinct = true; requested_type = true; } ! else if (strcmp(type, "dependencies") == 0) { ! build_dependencies = true; requested_type = true; } else ereport(ERROR, (errcode(ERRCODE_SYNTAX_ERROR), ! errmsg("unrecognized statistics type \"%s\"", ! type))); } /* If no statistic type was specified, build them all. */ if (!requested_type) *************** CreateStatistics(CreateStatsStmt *stmt) *** 268,273 **** --- 313,320 ---- /* * Add a dependency on the table, so that stats get dropped on DROP TABLE. + * + * XXX don't we need dependencies on the specific columns, instead? */ ObjectAddressSet(parentobject, RelationRelationId, relid); ObjectAddressSet(childobject, StatisticExtRelationId, statoid); diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c index 2d2a9d0..d13a6fc 100644 *** a/src/backend/nodes/copyfuncs.c --- b/src/backend/nodes/copyfuncs.c *************** _copyCreateStatsStmt(const CreateStatsSt *** 3389,3397 **** CreateStatsStmt *newnode = makeNode(CreateStatsStmt); COPY_NODE_FIELD(defnames); ! COPY_NODE_FIELD(relation); ! COPY_NODE_FIELD(keys); ! COPY_NODE_FIELD(options); COPY_SCALAR_FIELD(if_not_exists); return newnode; --- 3389,3397 ---- CreateStatsStmt *newnode = makeNode(CreateStatsStmt); COPY_NODE_FIELD(defnames); ! COPY_NODE_FIELD(stat_types); ! COPY_NODE_FIELD(exprs); ! COPY_NODE_FIELD(relations); COPY_SCALAR_FIELD(if_not_exists); return newnode; diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c index b5459cd..c9a8c34 100644 *** a/src/backend/nodes/equalfuncs.c --- b/src/backend/nodes/equalfuncs.c *************** static bool *** 1349,1357 **** _equalCreateStatsStmt(const CreateStatsStmt *a, const CreateStatsStmt *b) { COMPARE_NODE_FIELD(defnames); ! COMPARE_NODE_FIELD(relation); ! COMPARE_NODE_FIELD(keys); ! COMPARE_NODE_FIELD(options); COMPARE_SCALAR_FIELD(if_not_exists); return true; --- 1349,1357 ---- _equalCreateStatsStmt(const CreateStatsStmt *a, const CreateStatsStmt *b) { COMPARE_NODE_FIELD(defnames); ! COMPARE_NODE_FIELD(stat_types); ! COMPARE_NODE_FIELD(exprs); ! COMPARE_NODE_FIELD(relations); COMPARE_SCALAR_FIELD(if_not_exists); return true; diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c index 98f6768..3d5b09a 100644 *** a/src/backend/nodes/outfuncs.c --- b/src/backend/nodes/outfuncs.c *************** _outCreateStatsStmt(StringInfo str, cons *** 2639,2647 **** WRITE_NODE_TYPE("CREATESTATSSTMT"); WRITE_NODE_FIELD(defnames); ! WRITE_NODE_FIELD(relation); ! WRITE_NODE_FIELD(keys); ! WRITE_NODE_FIELD(options); WRITE_BOOL_FIELD(if_not_exists); } --- 2639,2647 ---- WRITE_NODE_TYPE("CREATESTATSSTMT"); WRITE_NODE_FIELD(defnames); ! WRITE_NODE_FIELD(stat_types); ! WRITE_NODE_FIELD(exprs); ! WRITE_NODE_FIELD(relations); WRITE_BOOL_FIELD(if_not_exists); } diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y index 65c004c..c60b048 100644 *** a/src/backend/parser/gram.y --- b/src/backend/parser/gram.y *************** ExistingIndex: USING INDEX index_name *** 3836,3866 **** /***************************************************************************** * * QUERY : ! * CREATE STATISTICS stats_name WITH (options) ON (columns) FROM relname * *****************************************************************************/ ! ! CreateStatsStmt: CREATE STATISTICS any_name opt_reloptions ON '(' columnList ')' FROM qualified_name ! { ! CreateStatsStmt *n = makeNode(CreateStatsStmt); ! n->defnames = $3; ! n->relation = $10; ! n->keys = $7; ! n->options = $4; ! n->if_not_exists = false; ! $$ = (Node *)n; ! } ! | CREATE STATISTICS IF_P NOT EXISTS any_name opt_reloptions ON '(' columnList ')' FROM qualified_name ! { ! CreateStatsStmt *n = makeNode(CreateStatsStmt); ! n->defnames = $6; ! n->relation = $13; ! n->keys = $10; ! n->options = $7; ! n->if_not_exists = true; ! $$ = (Node *)n; ! } ; /***************************************************************************** --- 3836,3864 ---- /***************************************************************************** * * QUERY : ! * CREATE STATISTICS stats_name [(stat types)] ! * ON expression-list FROM from_list ! * ! * Note: the expectation here is that the clauses after ON are a subset of ! * SELECT syntax, allowing for expressions and joined tables, and probably ! * someday a WHERE clause. Much less than that is currently implemented, ! * but the grammar accepts it and then we'll throw FEATURE_NOT_SUPPORTED ! * errors as necessary at execution. * *****************************************************************************/ ! CreateStatsStmt: ! CREATE opt_if_not_exists STATISTICS any_name ! opt_name_list ON expr_list FROM from_list ! { ! CreateStatsStmt *n = makeNode(CreateStatsStmt); ! n->defnames = $4; ! n->stat_types = $5; ! n->exprs = $7; ! n->relations = $9; ! n->if_not_exists = $2; ! $$ = (Node *)n; ! } ; /***************************************************************************** diff --git a/src/backend/utils/adt/ruleutils.c b/src/backend/utils/adt/ruleutils.c index cbde1ff..983b980 100644 *** a/src/backend/utils/adt/ruleutils.c --- b/src/backend/utils/adt/ruleutils.c *************** pg_get_statisticsext_worker(Oid statexti *** 1504,1518 **** } /* ! * If any option is disabled, then we'll need to append a WITH clause to ! * show which options are enabled. We omit the WITH clause on purpose * when all options are enabled, so a pg_dump/pg_restore will create all * statistics types on a newer postgres version, if the statistics had all * options enabled on the original version. */ if (!ndistinct_enabled || !dependencies_enabled) { ! appendStringInfoString(&buf, " WITH ("); if (ndistinct_enabled) appendStringInfoString(&buf, "ndistinct"); else if (dependencies_enabled) --- 1504,1518 ---- } /* ! * If any option is disabled, then we'll need to append the types clause ! * to show which options are enabled. We omit the types clause on purpose * when all options are enabled, so a pg_dump/pg_restore will create all * statistics types on a newer postgres version, if the statistics had all * options enabled on the original version. */ if (!ndistinct_enabled || !dependencies_enabled) { ! appendStringInfoString(&buf, " ("); if (ndistinct_enabled) appendStringInfoString(&buf, "ndistinct"); else if (dependencies_enabled) *************** pg_get_statisticsext_worker(Oid statexti *** 1521,1527 **** appendStringInfoChar(&buf, ')'); } ! appendStringInfoString(&buf, " ON ("); for (colno = 0; colno < statextrec->stxkeys.dim1; colno++) { --- 1521,1527 ---- appendStringInfoChar(&buf, ')'); } ! appendStringInfoString(&buf, " ON "); for (colno = 0; colno < statextrec->stxkeys.dim1; colno++) { *************** pg_get_statisticsext_worker(Oid statexti *** 1536,1542 **** appendStringInfoString(&buf, quote_identifier(attname)); } ! appendStringInfo(&buf, ") FROM %s", generate_relation_name(statextrec->stxrelid, NIL)); ReleaseSysCache(statexttup); --- 1536,1542 ---- appendStringInfoString(&buf, quote_identifier(attname)); } ! appendStringInfo(&buf, " FROM %s", generate_relation_name(statextrec->stxrelid, NIL)); ReleaseSysCache(statexttup); diff --git a/src/bin/pg_dump/t/002_pg_dump.pl b/src/bin/pg_dump/t/002_pg_dump.pl index ce0c9ef..f88832e 100644 *** a/src/bin/pg_dump/t/002_pg_dump.pl --- b/src/bin/pg_dump/t/002_pg_dump.pl *************** qr/CREATE TRANSFORM FOR integer LANGUAGE *** 4957,4965 **** catch_all => 'CREATE ... commands', create_order => 97, create_sql => 'CREATE STATISTICS dump_test.test_ext_stats_no_options ! ON (col1, col2) FROM dump_test.test_fifth_table', regexp => qr/^ ! \QCREATE STATISTICS dump_test.test_ext_stats_no_options ON (col1, col2) FROM test_fifth_table;\E /xms, like => { binary_upgrade => 1, --- 4957,4965 ---- catch_all => 'CREATE ... commands', create_order => 97, create_sql => 'CREATE STATISTICS dump_test.test_ext_stats_no_options ! ON col1, col2 FROM dump_test.test_fifth_table', regexp => qr/^ ! \QCREATE STATISTICS dump_test.test_ext_stats_no_options ON col1, col2 FROM test_fifth_table;\E /xms, like => { binary_upgrade => 1, *************** qr/CREATE TRANSFORM FOR integer LANGUAGE *** 4990,4999 **** all_runs => 1, catch_all => 'CREATE ... commands', create_order => 97, ! create_sql => 'CREATE STATISTICS dump_test.test_ext_stats_using ! WITH (ndistinct) ON (col1, col2) FROM dump_test.test_fifth_table', regexp => qr/^ ! \QCREATE STATISTICS dump_test.test_ext_stats_using WITH (ndistinct) ON (col1, col2) FROM test_fifth_table;\E /xms, like => { binary_upgrade => 1, --- 4990,4999 ---- all_runs => 1, catch_all => 'CREATE ... commands', create_order => 97, ! create_sql => 'CREATE STATISTICS dump_test.test_ext_stats_opts ! (ndistinct) ON col1, col2 FROM dump_test.test_fifth_table', regexp => qr/^ ! \QCREATE STATISTICS dump_test.test_ext_stats_opts (ndistinct) ON col1, col2 FROM test_fifth_table;\E /xms, like => { binary_upgrade => 1, diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c index 13395f5..386af61 100644 *** a/src/bin/psql/describe.c --- b/src/bin/psql/describe.c *************** describeOneTableDetails(const char *sche *** 2355,2362 **** { printfPQExpBuffer(&buf, "SELECT oid, " "stxnamespace::pg_catalog.regnamespace AS nsp, " ! "stxname, stxkeys,\n" " (SELECT pg_catalog.string_agg(pg_catalog.quote_ident(attname),', ')\n" " FROM pg_catalog.unnest(stxkeys) s(attnum)\n" " JOIN pg_catalog.pg_attribute a ON (stxrelid = a.attrelid AND\n" --- 2355,2363 ---- { printfPQExpBuffer(&buf, "SELECT oid, " + "stxrelid::pg_catalog.regclass, " "stxnamespace::pg_catalog.regnamespace AS nsp, " ! "stxname,\n" " (SELECT pg_catalog.string_agg(pg_catalog.quote_ident(attname),', ')\n" " FROM pg_catalog.unnest(stxkeys) s(attnum)\n" " JOIN pg_catalog.pg_attribute a ON (stxrelid = a.attrelid AND\n" *************** describeOneTableDetails(const char *sche *** 2385,2393 **** printfPQExpBuffer(&buf, " "); /* statistics name (qualified with namespace) */ ! appendPQExpBuffer(&buf, "\"%s.%s\" WITH (", ! PQgetvalue(result, i, 1), ! PQgetvalue(result, i, 2)); /* options */ if (strcmp(PQgetvalue(result, i, 5), "t") == 0) --- 2386,2394 ---- printfPQExpBuffer(&buf, " "); /* statistics name (qualified with namespace) */ ! appendPQExpBuffer(&buf, "\"%s\".\"%s\" (", ! PQgetvalue(result, i, 2), ! PQgetvalue(result, i, 3)); /* options */ if (strcmp(PQgetvalue(result, i, 5), "t") == 0) *************** describeOneTableDetails(const char *sche *** 2401,2408 **** appendPQExpBuffer(&buf, "%sdependencies", gotone ? ", " : ""); } ! appendPQExpBuffer(&buf, ") ON (%s)", ! PQgetvalue(result, i, 4)); printTableAddFooter(&cont, buf.data); } --- 2402,2410 ---- appendPQExpBuffer(&buf, "%sdependencies", gotone ? ", " : ""); } ! appendPQExpBuffer(&buf, ") ON %s FROM %s", ! PQgetvalue(result, i, 4), ! PQgetvalue(result, i, 1)); printTableAddFooter(&cont, buf.data); } diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c index 3bd5277..7853c25 100644 *** a/src/bin/psql/tab-complete.c --- b/src/bin/psql/tab-complete.c *************** static const SchemaQuery Query_for_list_ *** 453,458 **** --- 453,473 ---- NULL }; + static const SchemaQuery Query_for_list_of_statistics = { + /* catname */ + "pg_catalog.pg_statistic_ext s", + /* selcondition */ + NULL, + /* viscondition */ + NULL, + /* namespace */ + "s.stxnamespace", + /* result */ + "pg_catalog.quote_ident(s.stxname)", + /* qualresult */ + NULL + }; + static const SchemaQuery Query_for_list_of_tables = { /* catname */ "pg_catalog.pg_class c", *************** static const pgsql_thing_t words_after_c *** 1023,1028 **** --- 1038,1044 ---- {"SCHEMA", Query_for_list_of_schemas}, {"SEQUENCE", NULL, &Query_for_list_of_sequences}, {"SERVER", Query_for_list_of_servers}, + {"STATISTICS", NULL, &Query_for_list_of_statistics}, {"SUBSCRIPTION", Query_for_list_of_subscriptions}, {"SYSTEM", NULL, NULL, THING_NO_CREATE | THING_NO_DROP}, {"TABLE", NULL, &Query_for_list_of_tables}, *************** psql_completion(const char *text, int st *** 1783,1788 **** --- 1799,1808 ---- else if (Matches5("ALTER", "RULE", MatchAny, "ON", MatchAny)) COMPLETE_WITH_CONST("RENAME TO"); + /* ALTER STATISTICS <name> */ + else if (Matches3("ALTER", "STATISTICS", MatchAny)) + COMPLETE_WITH_LIST3("OWNER TO", "RENAME TO", "SET SCHEMA"); + /* ALTER TRIGGER <name>, add ON */ else if (Matches3("ALTER", "TRIGGER", MatchAny)) COMPLETE_WITH_CONST("ON"); *************** psql_completion(const char *text, int st *** 2119,2125 **** {"ACCESS METHOD", "CAST", "COLLATION", "CONVERSION", "DATABASE", "EVENT TRIGGER", "EXTENSION", "FOREIGN DATA WRAPPER", "FOREIGN TABLE", ! "SERVER", "INDEX", "LANGUAGE", "POLICY", "PUBLICATION", "RULE", "SCHEMA", "SEQUENCE", "SUBSCRIPTION", "TABLE", "TYPE", "VIEW", "MATERIALIZED VIEW", "COLUMN", "AGGREGATE", "FUNCTION", "OPERATOR", "TRIGGER", "CONSTRAINT", "DOMAIN", "LARGE OBJECT", "TABLESPACE", "TEXT SEARCH", "ROLE", NULL}; --- 2139,2146 ---- {"ACCESS METHOD", "CAST", "COLLATION", "CONVERSION", "DATABASE", "EVENT TRIGGER", "EXTENSION", "FOREIGN DATA WRAPPER", "FOREIGN TABLE", ! "SERVER", "INDEX", "LANGUAGE", "POLICY", "PUBLICATION", "RULE", ! "SCHEMA", "SEQUENCE", "STATISTICS", "SUBSCRIPTION", "TABLE", "TYPE", "VIEW", "MATERIALIZED VIEW", "COLUMN", "AGGREGATE", "FUNCTION", "OPERATOR", "TRIGGER", "CONSTRAINT", "DOMAIN", "LARGE OBJECT", "TABLESPACE", "TEXT SEARCH", "ROLE", NULL}; *************** psql_completion(const char *text, int st *** 2383,2388 **** --- 2404,2422 ---- else if (Matches3("CREATE", "SERVER", MatchAny)) COMPLETE_WITH_LIST3("TYPE", "VERSION", "FOREIGN DATA WRAPPER"); + /* CREATE STATISTICS <name> */ + else if (Matches3("CREATE", "STATISTICS", MatchAny)) + COMPLETE_WITH_LIST2("(", "ON"); + else if (Matches4("CREATE", "STATISTICS", MatchAny, "(")) + COMPLETE_WITH_LIST2("ndistinct", "dependencies"); + else if (HeadMatches3("CREATE", "STATISTICS", MatchAny) && + previous_words[0][0] == '(' && + previous_words[0][strlen(previous_words[0]) - 1] == ')') + COMPLETE_WITH_CONST("ON"); + else if (HeadMatches3("CREATE", "STATISTICS", MatchAny) && + TailMatches1("FROM")) + COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_tables, NULL); + /* CREATE TABLE --- is allowed inside CREATE SCHEMA, so use TailMatches */ /* Complete "CREATE TEMP/TEMPORARY" with the possible temp objects */ else if (TailMatches2("CREATE", "TEMP|TEMPORARY")) *************** psql_completion(const char *text, int st *** 2589,2595 **** /* DROP */ /* Complete DROP object with CASCADE / RESTRICT */ else if (Matches3("DROP", ! "COLLATION|CONVERSION|DOMAIN|EXTENSION|LANGUAGE|PUBLICATION|SCHEMA|SEQUENCE|SERVER|SUBSCRIPTION|TABLE|TYPE|VIEW", MatchAny) || Matches4("DROP", "ACCESS", "METHOD", MatchAny) || (Matches4("DROP", "AGGREGATE|FUNCTION", MatchAny, MatchAny) && --- 2623,2629 ---- /* DROP */ /* Complete DROP object with CASCADE / RESTRICT */ else if (Matches3("DROP", ! "COLLATION|CONVERSION|DOMAIN|EXTENSION|LANGUAGE|PUBLICATION|SCHEMA|SEQUENCE|SERVER|SUBSCRIPTION|STATISTICS|TABLE|TYPE|VIEW", MatchAny) || Matches4("DROP", "ACCESS", "METHOD", MatchAny) || (Matches4("DROP", "AGGREGATE|FUNCTION", MatchAny, MatchAny) && diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h index 46c23c2..d396be3 100644 *** a/src/include/nodes/parsenodes.h --- b/src/include/nodes/parsenodes.h *************** typedef struct CreateStatsStmt *** 2689,2698 **** { NodeTag type; List *defnames; /* qualified name (list of Value strings) */ ! RangeVar *relation; /* relation to build statistics on */ ! List *keys; /* String nodes naming referenced columns */ ! List *options; /* list of DefElem */ ! bool if_not_exists; /* do nothing if statistics already exists */ } CreateStatsStmt; /* ---------------------- --- 2689,2698 ---- { NodeTag type; List *defnames; /* qualified name (list of Value strings) */ ! List *stat_types; /* stat types (list of Value strings) */ ! List *exprs; /* expressions to build statistics on */ ! List *relations; /* rels to build stats on (list of RangeVar) */ ! bool if_not_exists; /* do nothing if stats name already exists */ } CreateStatsStmt; /* ---------------------- diff --git a/src/test/regress/expected/alter_generic.out b/src/test/regress/expected/alter_generic.out index a81a4ed..28e6916 100644 *** a/src/test/regress/expected/alter_generic.out --- b/src/test/regress/expected/alter_generic.out *************** DROP OPERATOR FAMILY alt_opf18 USING btr *** 501,508 **** -- SET SESSION AUTHORIZATION regress_alter_user1; CREATE TABLE alt_regress_1 (a INTEGER, b INTEGER); ! CREATE STATISTICS alt_stat1 ON (a, b) FROM alt_regress_1; ! CREATE STATISTICS alt_stat2 ON (a, b) FROM alt_regress_1; ALTER STATISTICS alt_stat1 RENAME TO alt_stat2; -- failed (name conflict) ERROR: statistics "alt_stat2" already exists in schema "alt_nsp1" ALTER STATISTICS alt_stat1 RENAME TO alt_stat3; -- failed (name conflict) --- 501,508 ---- -- SET SESSION AUTHORIZATION regress_alter_user1; CREATE TABLE alt_regress_1 (a INTEGER, b INTEGER); ! CREATE STATISTICS alt_stat1 ON a, b FROM alt_regress_1; ! CREATE STATISTICS alt_stat2 ON a, b FROM alt_regress_1; ALTER STATISTICS alt_stat1 RENAME TO alt_stat2; -- failed (name conflict) ERROR: statistics "alt_stat2" already exists in schema "alt_nsp1" ALTER STATISTICS alt_stat1 RENAME TO alt_stat3; -- failed (name conflict) *************** ERROR: must be member of role "regress_ *** 511,518 **** ALTER STATISTICS alt_stat2 OWNER TO regress_alter_user3; -- OK ALTER STATISTICS alt_stat2 SET SCHEMA alt_nsp2; -- OK SET SESSION AUTHORIZATION regress_alter_user2; ! CREATE STATISTICS alt_stat1 ON (a, b) FROM alt_regress_1; ! CREATE STATISTICS alt_stat2 ON (a, b) FROM alt_regress_1; ALTER STATISTICS alt_stat3 RENAME TO alt_stat4; -- failed (not owner) ERROR: must be owner of statistics alt_stat3 ALTER STATISTICS alt_stat1 RENAME TO alt_stat4; -- OK --- 511,519 ---- ALTER STATISTICS alt_stat2 OWNER TO regress_alter_user3; -- OK ALTER STATISTICS alt_stat2 SET SCHEMA alt_nsp2; -- OK SET SESSION AUTHORIZATION regress_alter_user2; ! CREATE TABLE alt_regress_2 (a INTEGER, b INTEGER); ! CREATE STATISTICS alt_stat1 ON a, b FROM alt_regress_2; ! CREATE STATISTICS alt_stat2 ON a, b FROM alt_regress_2; ALTER STATISTICS alt_stat3 RENAME TO alt_stat4; -- failed (not owner) ERROR: must be owner of statistics alt_stat3 ALTER STATISTICS alt_stat1 RENAME TO alt_stat4; -- OK *************** SELECT nspname, prsname *** 672,725 **** --- --- Cleanup resources --- DROP FOREIGN DATA WRAPPER alt_fdw2 CASCADE; - NOTICE: drop cascades to server alt_fserv2 DROP FOREIGN DATA WRAPPER alt_fdw3 CASCADE; - NOTICE: drop cascades to server alt_fserv3 DROP LANGUAGE alt_lang2 CASCADE; DROP LANGUAGE alt_lang3 CASCADE; - DROP LANGUAGE alt_lang4 CASCADE; - ERROR: language "alt_lang4" does not exist DROP SCHEMA alt_nsp1 CASCADE; - NOTICE: drop cascades to 27 other objects - DETAIL: drop cascades to function alt_func3(integer) - drop cascades to function alt_agg3(integer) - drop cascades to function alt_func4(integer) - drop cascades to function alt_func2(integer) - drop cascades to function alt_agg4(integer) - drop cascades to function alt_agg2(integer) - drop cascades to conversion alt_conv3 - drop cascades to conversion alt_conv4 - drop cascades to conversion alt_conv2 - drop cascades to operator @+@(integer,integer) - drop cascades to operator @-@(integer,integer) - drop cascades to operator family alt_opf3 for access method hash - drop cascades to operator family alt_opc1 for access method hash - drop cascades to operator family alt_opc2 for access method hash - drop cascades to operator family alt_opf4 for access method hash - drop cascades to operator family alt_opf2 for access method hash - drop cascades to table alt_regress_1 - drop cascades to text search dictionary alt_ts_dict3 - drop cascades to text search dictionary alt_ts_dict4 - drop cascades to text search dictionary alt_ts_dict2 - drop cascades to text search configuration alt_ts_conf3 - drop cascades to text search configuration alt_ts_conf4 - drop cascades to text search configuration alt_ts_conf2 - drop cascades to text search template alt_ts_temp3 - drop cascades to text search template alt_ts_temp2 - drop cascades to text search parser alt_ts_prs3 - drop cascades to text search parser alt_ts_prs2 DROP SCHEMA alt_nsp2 CASCADE; - NOTICE: drop cascades to 9 other objects - DETAIL: drop cascades to function alt_nsp2.alt_func2(integer) - drop cascades to function alt_nsp2.alt_agg2(integer) - drop cascades to conversion alt_conv2 - drop cascades to operator alt_nsp2.@-@(integer,integer) - drop cascades to operator family alt_nsp2.alt_opf2 for access method hash - drop cascades to text search dictionary alt_ts_dict2 - drop cascades to text search configuration alt_ts_conf2 - drop cascades to text search template alt_ts_temp2 - drop cascades to text search parser alt_ts_prs2 DROP USER regress_alter_user1; DROP USER regress_alter_user2; DROP USER regress_alter_user3; --- 673,685 ---- --- --- Cleanup resources --- + set client_min_messages to warning; -- suppress cascade notices DROP FOREIGN DATA WRAPPER alt_fdw2 CASCADE; DROP FOREIGN DATA WRAPPER alt_fdw3 CASCADE; DROP LANGUAGE alt_lang2 CASCADE; DROP LANGUAGE alt_lang3 CASCADE; DROP SCHEMA alt_nsp1 CASCADE; DROP SCHEMA alt_nsp2 CASCADE; DROP USER regress_alter_user1; DROP USER regress_alter_user2; DROP USER regress_alter_user3; diff --git a/src/test/regress/expected/object_address.out b/src/test/regress/expected/object_address.out index 40eeeed..dce8508 100644 *** a/src/test/regress/expected/object_address.out --- b/src/test/regress/expected/object_address.out *************** CREATE TRANSFORM FOR int LANGUAGE SQL ( *** 39,45 **** CREATE PUBLICATION addr_pub FOR TABLE addr_nsp.gentable; CREATE SUBSCRIPTION addr_sub CONNECTION '' PUBLICATION bar WITH (NOCONNECT, SLOT NAME = NONE); WARNING: tables were not subscribed, you will have to run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to subscribe thetables ! CREATE STATISTICS addr_nsp.gentable_stat ON (a,b) FROM addr_nsp.gentable; -- test some error cases SELECT pg_get_object_address('stone', '{}', '{}'); ERROR: unrecognized object type "stone" --- 39,45 ---- CREATE PUBLICATION addr_pub FOR TABLE addr_nsp.gentable; CREATE SUBSCRIPTION addr_sub CONNECTION '' PUBLICATION bar WITH (NOCONNECT, SLOT NAME = NONE); WARNING: tables were not subscribed, you will have to run ALTER SUBSCRIPTION ... REFRESH PUBLICATION to subscribe thetables ! CREATE STATISTICS addr_nsp.gentable_stat ON a, b FROM addr_nsp.gentable; -- test some error cases SELECT pg_get_object_address('stone', '{}', '{}'); ERROR: unrecognized object type "stone" diff --git a/src/test/regress/expected/stats_ext.out b/src/test/regress/expected/stats_ext.out index 92ac84a..4ccdf21 100644 *** a/src/test/regress/expected/stats_ext.out --- b/src/test/regress/expected/stats_ext.out *************** *** 5,28 **** SET max_parallel_workers = 0; SET max_parallel_workers_per_gather = 0; SET work_mem = '128kB'; -- Ensure stats are dropped sanely CREATE TABLE ab1 (a INTEGER, b INTEGER, c INTEGER); ! CREATE STATISTICS ab1_a_b_stats ON (a, b) FROM ab1; DROP STATISTICS ab1_a_b_stats; CREATE SCHEMA regress_schema_2; ! CREATE STATISTICS regress_schema_2.ab1_a_b_stats ON (a, b) FROM ab1; -- Let's also verify the pg_get_statisticsextdef output looks sane. SELECT pg_get_statisticsextdef(oid) FROM pg_statistic_ext WHERE stxname = 'ab1_a_b_stats'; ! pg_get_statisticsextdef ! --------------------------------------------------------------------- ! CREATE STATISTICS regress_schema_2.ab1_a_b_stats ON (a, b) FROM ab1 (1 row) DROP STATISTICS regress_schema_2.ab1_a_b_stats; -- Ensure statistics are dropped when columns are ! CREATE STATISTICS ab1_b_c_stats ON (b, c) FROM ab1; ! CREATE STATISTICS ab1_a_b_c_stats ON (a, b, c) FROM ab1; ! CREATE STATISTICS ab1_a_b_stats ON (a, b) FROM ab1; ALTER TABLE ab1 DROP COLUMN a; \d ab1 Table "public.ab1" --- 5,53 ---- SET max_parallel_workers = 0; SET max_parallel_workers_per_gather = 0; SET work_mem = '128kB'; + -- Verify failures + CREATE STATISTICS tst; + ERROR: syntax error at or near ";" + LINE 1: CREATE STATISTICS tst; + ^ + CREATE STATISTICS tst ON a, b; + ERROR: syntax error at or near ";" + LINE 1: CREATE STATISTICS tst ON a, b; + ^ + CREATE STATISTICS tst FROM sometab; + ERROR: syntax error at or near "FROM" + LINE 1: CREATE STATISTICS tst FROM sometab; + ^ + CREATE STATISTICS tst ON a, b FROM nonexistant; + ERROR: relation "nonexistant" does not exist + CREATE STATISTICS tst ON a, b FROM pg_class; + ERROR: column "a" referenced in statistics does not exist + CREATE STATISTICS tst ON relname, relname, relnatts FROM pg_class; + ERROR: duplicate column name in statistics definition + CREATE STATISTICS tst ON relnatts + relpages FROM pg_class; + ERROR: only simple column references are allowed in CREATE STATISTICS + CREATE STATISTICS tst ON (relpages, reltuples) FROM pg_class; + ERROR: only simple column references are allowed in CREATE STATISTICS + CREATE STATISTICS tst (unrecognized) ON relname, relnatts FROM pg_class; + ERROR: unrecognized statistics type "unrecognized" -- Ensure stats are dropped sanely CREATE TABLE ab1 (a INTEGER, b INTEGER, c INTEGER); ! CREATE STATISTICS ab1_a_b_stats ON a, b FROM ab1; DROP STATISTICS ab1_a_b_stats; CREATE SCHEMA regress_schema_2; ! CREATE STATISTICS regress_schema_2.ab1_a_b_stats ON a, b FROM ab1; -- Let's also verify the pg_get_statisticsextdef output looks sane. SELECT pg_get_statisticsextdef(oid) FROM pg_statistic_ext WHERE stxname = 'ab1_a_b_stats'; ! pg_get_statisticsextdef ! ------------------------------------------------------------------- ! CREATE STATISTICS regress_schema_2.ab1_a_b_stats ON a, b FROM ab1 (1 row) DROP STATISTICS regress_schema_2.ab1_a_b_stats; -- Ensure statistics are dropped when columns are ! CREATE STATISTICS ab1_b_c_stats ON b, c FROM ab1; ! CREATE STATISTICS ab1_a_b_c_stats ON a, b, c FROM ab1; ! CREATE STATISTICS ab1_a_b_stats ON a, b FROM ab1; ALTER TABLE ab1 DROP COLUMN a; \d ab1 Table "public.ab1" *************** ALTER TABLE ab1 DROP COLUMN a; *** 31,44 **** b | integer | | | c | integer | | | Statistics: ! "public.ab1_b_c_stats" WITH (ndistinct, dependencies) ON (b, c) DROP TABLE ab1; -- Ensure things work sanely with SET STATISTICS 0 CREATE TABLE ab1 (a INTEGER, b INTEGER); ALTER TABLE ab1 ALTER a SET STATISTICS 0; INSERT INTO ab1 SELECT a, a%23 FROM generate_series(1, 1000) a; ! CREATE STATISTICS ab1_a_b_stats ON (a, b) FROM ab1; ANALYZE ab1; WARNING: extended statistics "public.ab1_a_b_stats" could not be collected for relation public.ab1 ALTER TABLE ab1 ALTER a SET STATISTICS -1; --- 56,69 ---- b | integer | | | c | integer | | | Statistics: ! "public"."ab1_b_c_stats" (ndistinct, dependencies) ON b, c FROM ab1 DROP TABLE ab1; -- Ensure things work sanely with SET STATISTICS 0 CREATE TABLE ab1 (a INTEGER, b INTEGER); ALTER TABLE ab1 ALTER a SET STATISTICS 0; INSERT INTO ab1 SELECT a, a%23 FROM generate_series(1, 1000) a; ! CREATE STATISTICS ab1_a_b_stats ON a, b FROM ab1; ANALYZE ab1; WARNING: extended statistics "public.ab1_a_b_stats" could not be collected for relation public.ab1 ALTER TABLE ab1 ALTER a SET STATISTICS -1; *************** CREATE SERVER extstats_dummy_srv FOREIGN *** 60,83 **** CREATE FOREIGN TABLE tststats.f (a int, b int, c text) SERVER extstats_dummy_srv; CREATE TABLE tststats.pt (a int, b int, c text) PARTITION BY RANGE (a, b); CREATE TABLE tststats.pt1 PARTITION OF tststats.pt FOR VALUES FROM (-10, -10) TO (10, 10); ! CREATE STATISTICS tststats.s1 ON (a, b) FROM tststats.t; ! CREATE STATISTICS tststats.s2 ON (a, b) FROM tststats.ti; ERROR: relation "ti" is not a table, foreign table, or materialized view ! CREATE STATISTICS tststats.s3 ON (a, b) FROM tststats.s; ERROR: relation "s" is not a table, foreign table, or materialized view ! CREATE STATISTICS tststats.s4 ON (a, b) FROM tststats.v; ERROR: relation "v" is not a table, foreign table, or materialized view ! CREATE STATISTICS tststats.s5 ON (a, b) FROM tststats.mv; ! CREATE STATISTICS tststats.s6 ON (a, b) FROM tststats.ty; ERROR: relation "ty" is not a table, foreign table, or materialized view ! CREATE STATISTICS tststats.s7 ON (a, b) FROM tststats.f; ! CREATE STATISTICS tststats.s8 ON (a, b) FROM tststats.pt; ! CREATE STATISTICS tststats.s9 ON (a, b) FROM tststats.pt1; DO $$ DECLARE relname text := reltoastrelid::regclass FROM pg_class WHERE oid = 'tststats.t'::regclass; BEGIN ! EXECUTE 'CREATE STATISTICS tststats.s10 ON (a, b) FROM ' || relname; EXCEPTION WHEN wrong_object_type THEN RAISE NOTICE 'stats on toast table not created'; END; --- 85,108 ---- CREATE FOREIGN TABLE tststats.f (a int, b int, c text) SERVER extstats_dummy_srv; CREATE TABLE tststats.pt (a int, b int, c text) PARTITION BY RANGE (a, b); CREATE TABLE tststats.pt1 PARTITION OF tststats.pt FOR VALUES FROM (-10, -10) TO (10, 10); ! CREATE STATISTICS tststats.s1 ON a, b FROM tststats.t; ! CREATE STATISTICS tststats.s2 ON a, b FROM tststats.ti; ERROR: relation "ti" is not a table, foreign table, or materialized view ! CREATE STATISTICS tststats.s3 ON a, b FROM tststats.s; ERROR: relation "s" is not a table, foreign table, or materialized view ! CREATE STATISTICS tststats.s4 ON a, b FROM tststats.v; ERROR: relation "v" is not a table, foreign table, or materialized view ! CREATE STATISTICS tststats.s5 ON a, b FROM tststats.mv; ! CREATE STATISTICS tststats.s6 ON a, b FROM tststats.ty; ERROR: relation "ty" is not a table, foreign table, or materialized view ! CREATE STATISTICS tststats.s7 ON a, b FROM tststats.f; ! CREATE STATISTICS tststats.s8 ON a, b FROM tststats.pt; ! CREATE STATISTICS tststats.s9 ON a, b FROM tststats.pt1; DO $$ DECLARE relname text := reltoastrelid::regclass FROM pg_class WHERE oid = 'tststats.t'::regclass; BEGIN ! EXECUTE 'CREATE STATISTICS tststats.s10 ON a, b FROM ' || relname; EXCEPTION WHEN wrong_object_type THEN RAISE NOTICE 'stats on toast table not created'; END; *************** EXPLAIN (COSTS off) *** 158,177 **** -> Seq Scan on ndistinct (5 rows) - -- unknown column - CREATE STATISTICS s10 ON (unknown_column) FROM ndistinct; - ERROR: column "unknown_column" referenced in statistics does not exist - -- single column - CREATE STATISTICS s10 ON (a) FROM ndistinct; - ERROR: extended statistics require at least 2 columns - -- single column, duplicated - CREATE STATISTICS s10 ON (a,a) FROM ndistinct; - ERROR: duplicate column name in statistics definition - -- two columns, one duplicated - CREATE STATISTICS s10 ON (a, a, b) FROM ndistinct; - ERROR: duplicate column name in statistics definition -- correct command ! CREATE STATISTICS s10 ON (a, b, c) FROM ndistinct; ANALYZE ndistinct; SELECT stxkind, stxndistinct FROM pg_statistic_ext WHERE stxrelid = 'ndistinct'::regclass; --- 183,190 ---- -> Seq Scan on ndistinct (5 rows) -- correct command ! CREATE STATISTICS s10 ON a, b, c FROM ndistinct; ANALYZE ndistinct; SELECT stxkind, stxndistinct FROM pg_statistic_ext WHERE stxrelid = 'ndistinct'::regclass; *************** EXPLAIN (COSTS off) *** 352,358 **** -> Seq Scan on ndistinct (3 rows) - DROP TABLE ndistinct; -- functional dependencies tests CREATE TABLE functional_dependencies ( filler1 TEXT, --- 365,370 ---- *************** EXPLAIN (COSTS OFF) *** 389,395 **** (2 rows) -- create statistics ! CREATE STATISTICS func_deps_stat WITH (dependencies) ON (a, b, c) FROM functional_dependencies; ANALYZE functional_dependencies; EXPLAIN (COSTS OFF) SELECT * FROM functional_dependencies WHERE a = 1 AND b = '1'; --- 401,407 ---- (2 rows) -- create statistics ! CREATE STATISTICS func_deps_stat (dependencies) ON a, b, c FROM functional_dependencies; ANALYZE functional_dependencies; EXPLAIN (COSTS OFF) SELECT * FROM functional_dependencies WHERE a = 1 AND b = '1'; *************** EXPLAIN (COSTS OFF) *** 432,438 **** (2 rows) -- create statistics ! CREATE STATISTICS func_deps_stat WITH (dependencies) ON (a, b, c) FROM functional_dependencies; ANALYZE functional_dependencies; EXPLAIN (COSTS OFF) SELECT * FROM functional_dependencies WHERE a = 1 AND b = '1'; --- 444,450 ---- (2 rows) -- create statistics ! CREATE STATISTICS func_deps_stat (dependencies) ON a, b, c FROM functional_dependencies; ANALYZE functional_dependencies; EXPLAIN (COSTS OFF) SELECT * FROM functional_dependencies WHERE a = 1 AND b = '1'; *************** EXPLAIN (COSTS OFF) *** 456,459 **** (5 rows) RESET random_page_cost; - DROP TABLE functional_dependencies; --- 468,470 ---- diff --git a/src/test/regress/sql/alter_generic.sql b/src/test/regress/sql/alter_generic.sql index 88e8d7e..342f828 100644 *** a/src/test/regress/sql/alter_generic.sql --- b/src/test/regress/sql/alter_generic.sql *************** DROP OPERATOR FAMILY alt_opf18 USING btr *** 438,445 **** -- SET SESSION AUTHORIZATION regress_alter_user1; CREATE TABLE alt_regress_1 (a INTEGER, b INTEGER); ! CREATE STATISTICS alt_stat1 ON (a, b) FROM alt_regress_1; ! CREATE STATISTICS alt_stat2 ON (a, b) FROM alt_regress_1; ALTER STATISTICS alt_stat1 RENAME TO alt_stat2; -- failed (name conflict) ALTER STATISTICS alt_stat1 RENAME TO alt_stat3; -- failed (name conflict) --- 438,445 ---- -- SET SESSION AUTHORIZATION regress_alter_user1; CREATE TABLE alt_regress_1 (a INTEGER, b INTEGER); ! CREATE STATISTICS alt_stat1 ON a, b FROM alt_regress_1; ! CREATE STATISTICS alt_stat2 ON a, b FROM alt_regress_1; ALTER STATISTICS alt_stat1 RENAME TO alt_stat2; -- failed (name conflict) ALTER STATISTICS alt_stat1 RENAME TO alt_stat3; -- failed (name conflict) *************** ALTER STATISTICS alt_stat2 OWNER TO regr *** 448,455 **** ALTER STATISTICS alt_stat2 SET SCHEMA alt_nsp2; -- OK SET SESSION AUTHORIZATION regress_alter_user2; ! CREATE STATISTICS alt_stat1 ON (a, b) FROM alt_regress_1; ! CREATE STATISTICS alt_stat2 ON (a, b) FROM alt_regress_1; ALTER STATISTICS alt_stat3 RENAME TO alt_stat4; -- failed (not owner) ALTER STATISTICS alt_stat1 RENAME TO alt_stat4; -- OK --- 448,456 ---- ALTER STATISTICS alt_stat2 SET SCHEMA alt_nsp2; -- OK SET SESSION AUTHORIZATION regress_alter_user2; ! CREATE TABLE alt_regress_2 (a INTEGER, b INTEGER); ! CREATE STATISTICS alt_stat1 ON a, b FROM alt_regress_2; ! CREATE STATISTICS alt_stat2 ON a, b FROM alt_regress_2; ALTER STATISTICS alt_stat3 RENAME TO alt_stat4; -- failed (not owner) ALTER STATISTICS alt_stat1 RENAME TO alt_stat4; -- OK *************** SELECT nspname, prsname *** 572,583 **** --- --- Cleanup resources --- DROP FOREIGN DATA WRAPPER alt_fdw2 CASCADE; DROP FOREIGN DATA WRAPPER alt_fdw3 CASCADE; DROP LANGUAGE alt_lang2 CASCADE; DROP LANGUAGE alt_lang3 CASCADE; - DROP LANGUAGE alt_lang4 CASCADE; DROP SCHEMA alt_nsp1 CASCADE; DROP SCHEMA alt_nsp2 CASCADE; --- 573,585 ---- --- --- Cleanup resources --- + set client_min_messages to warning; -- suppress cascade notices + DROP FOREIGN DATA WRAPPER alt_fdw2 CASCADE; DROP FOREIGN DATA WRAPPER alt_fdw3 CASCADE; DROP LANGUAGE alt_lang2 CASCADE; DROP LANGUAGE alt_lang3 CASCADE; DROP SCHEMA alt_nsp1 CASCADE; DROP SCHEMA alt_nsp2 CASCADE; diff --git a/src/test/regress/sql/object_address.sql b/src/test/regress/sql/object_address.sql index 6940392..70f5f1c 100644 *** a/src/test/regress/sql/object_address.sql --- b/src/test/regress/sql/object_address.sql *************** CREATE TRANSFORM FOR int LANGUAGE SQL ( *** 41,47 **** TO SQL WITH FUNCTION int4recv(internal)); CREATE PUBLICATION addr_pub FOR TABLE addr_nsp.gentable; CREATE SUBSCRIPTION addr_sub CONNECTION '' PUBLICATION bar WITH (NOCONNECT, SLOT NAME = NONE); ! CREATE STATISTICS addr_nsp.gentable_stat ON (a,b) FROM addr_nsp.gentable; -- test some error cases SELECT pg_get_object_address('stone', '{}', '{}'); --- 41,47 ---- TO SQL WITH FUNCTION int4recv(internal)); CREATE PUBLICATION addr_pub FOR TABLE addr_nsp.gentable; CREATE SUBSCRIPTION addr_sub CONNECTION '' PUBLICATION bar WITH (NOCONNECT, SLOT NAME = NONE); ! CREATE STATISTICS addr_nsp.gentable_stat ON a, b FROM addr_nsp.gentable; -- test some error cases SELECT pg_get_object_address('stone', '{}', '{}'); diff --git a/src/test/regress/sql/stats_ext.sql b/src/test/regress/sql/stats_ext.sql index 72c7659..4050f33 100644 *** a/src/test/regress/sql/stats_ext.sql --- b/src/test/regress/sql/stats_ext.sql *************** SET max_parallel_workers = 0; *** 7,19 **** SET max_parallel_workers_per_gather = 0; SET work_mem = '128kB'; -- Ensure stats are dropped sanely CREATE TABLE ab1 (a INTEGER, b INTEGER, c INTEGER); ! CREATE STATISTICS ab1_a_b_stats ON (a, b) FROM ab1; DROP STATISTICS ab1_a_b_stats; CREATE SCHEMA regress_schema_2; ! CREATE STATISTICS regress_schema_2.ab1_a_b_stats ON (a, b) FROM ab1; -- Let's also verify the pg_get_statisticsextdef output looks sane. SELECT pg_get_statisticsextdef(oid) FROM pg_statistic_ext WHERE stxname = 'ab1_a_b_stats'; --- 7,30 ---- SET max_parallel_workers_per_gather = 0; SET work_mem = '128kB'; + -- Verify failures + CREATE STATISTICS tst; + CREATE STATISTICS tst ON a, b; + CREATE STATISTICS tst FROM sometab; + CREATE STATISTICS tst ON a, b FROM nonexistant; + CREATE STATISTICS tst ON a, b FROM pg_class; + CREATE STATISTICS tst ON relname, relname, relnatts FROM pg_class; + CREATE STATISTICS tst ON relnatts + relpages FROM pg_class; + CREATE STATISTICS tst ON (relpages, reltuples) FROM pg_class; + CREATE STATISTICS tst (unrecognized) ON relname, relnatts FROM pg_class; + -- Ensure stats are dropped sanely CREATE TABLE ab1 (a INTEGER, b INTEGER, c INTEGER); ! CREATE STATISTICS ab1_a_b_stats ON a, b FROM ab1; DROP STATISTICS ab1_a_b_stats; CREATE SCHEMA regress_schema_2; ! CREATE STATISTICS regress_schema_2.ab1_a_b_stats ON a, b FROM ab1; -- Let's also verify the pg_get_statisticsextdef output looks sane. SELECT pg_get_statisticsextdef(oid) FROM pg_statistic_ext WHERE stxname = 'ab1_a_b_stats'; *************** SELECT pg_get_statisticsextdef(oid) FROM *** 21,29 **** DROP STATISTICS regress_schema_2.ab1_a_b_stats; -- Ensure statistics are dropped when columns are ! CREATE STATISTICS ab1_b_c_stats ON (b, c) FROM ab1; ! CREATE STATISTICS ab1_a_b_c_stats ON (a, b, c) FROM ab1; ! CREATE STATISTICS ab1_a_b_stats ON (a, b) FROM ab1; ALTER TABLE ab1 DROP COLUMN a; \d ab1 DROP TABLE ab1; --- 32,40 ---- DROP STATISTICS regress_schema_2.ab1_a_b_stats; -- Ensure statistics are dropped when columns are ! CREATE STATISTICS ab1_b_c_stats ON b, c FROM ab1; ! CREATE STATISTICS ab1_a_b_c_stats ON a, b, c FROM ab1; ! CREATE STATISTICS ab1_a_b_stats ON a, b FROM ab1; ALTER TABLE ab1 DROP COLUMN a; \d ab1 DROP TABLE ab1; *************** DROP TABLE ab1; *** 32,38 **** CREATE TABLE ab1 (a INTEGER, b INTEGER); ALTER TABLE ab1 ALTER a SET STATISTICS 0; INSERT INTO ab1 SELECT a, a%23 FROM generate_series(1, 1000) a; ! CREATE STATISTICS ab1_a_b_stats ON (a, b) FROM ab1; ANALYZE ab1; ALTER TABLE ab1 ALTER a SET STATISTICS -1; -- partial analyze doesn't build stats either --- 43,49 ---- CREATE TABLE ab1 (a INTEGER, b INTEGER); ALTER TABLE ab1 ALTER a SET STATISTICS 0; INSERT INTO ab1 SELECT a, a%23 FROM generate_series(1, 1000) a; ! CREATE STATISTICS ab1_a_b_stats ON a, b FROM ab1; ANALYZE ab1; ALTER TABLE ab1 ALTER a SET STATISTICS -1; -- partial analyze doesn't build stats either *************** CREATE FOREIGN TABLE tststats.f (a int, *** 55,74 **** CREATE TABLE tststats.pt (a int, b int, c text) PARTITION BY RANGE (a, b); CREATE TABLE tststats.pt1 PARTITION OF tststats.pt FOR VALUES FROM (-10, -10) TO (10, 10); ! CREATE STATISTICS tststats.s1 ON (a, b) FROM tststats.t; ! CREATE STATISTICS tststats.s2 ON (a, b) FROM tststats.ti; ! CREATE STATISTICS tststats.s3 ON (a, b) FROM tststats.s; ! CREATE STATISTICS tststats.s4 ON (a, b) FROM tststats.v; ! CREATE STATISTICS tststats.s5 ON (a, b) FROM tststats.mv; ! CREATE STATISTICS tststats.s6 ON (a, b) FROM tststats.ty; ! CREATE STATISTICS tststats.s7 ON (a, b) FROM tststats.f; ! CREATE STATISTICS tststats.s8 ON (a, b) FROM tststats.pt; ! CREATE STATISTICS tststats.s9 ON (a, b) FROM tststats.pt1; DO $$ DECLARE relname text := reltoastrelid::regclass FROM pg_class WHERE oid = 'tststats.t'::regclass; BEGIN ! EXECUTE 'CREATE STATISTICS tststats.s10 ON (a, b) FROM ' || relname; EXCEPTION WHEN wrong_object_type THEN RAISE NOTICE 'stats on toast table not created'; END; --- 66,85 ---- CREATE TABLE tststats.pt (a int, b int, c text) PARTITION BY RANGE (a, b); CREATE TABLE tststats.pt1 PARTITION OF tststats.pt FOR VALUES FROM (-10, -10) TO (10, 10); ! CREATE STATISTICS tststats.s1 ON a, b FROM tststats.t; ! CREATE STATISTICS tststats.s2 ON a, b FROM tststats.ti; ! CREATE STATISTICS tststats.s3 ON a, b FROM tststats.s; ! CREATE STATISTICS tststats.s4 ON a, b FROM tststats.v; ! CREATE STATISTICS tststats.s5 ON a, b FROM tststats.mv; ! CREATE STATISTICS tststats.s6 ON a, b FROM tststats.ty; ! CREATE STATISTICS tststats.s7 ON a, b FROM tststats.f; ! CREATE STATISTICS tststats.s8 ON a, b FROM tststats.pt; ! CREATE STATISTICS tststats.s9 ON a, b FROM tststats.pt1; DO $$ DECLARE relname text := reltoastrelid::regclass FROM pg_class WHERE oid = 'tststats.t'::regclass; BEGIN ! EXECUTE 'CREATE STATISTICS tststats.s10 ON a, b FROM ' || relname; EXCEPTION WHEN wrong_object_type THEN RAISE NOTICE 'stats on toast table not created'; END; *************** EXPLAIN (COSTS off) *** 113,132 **** EXPLAIN (COSTS off) SELECT COUNT(*) FROM ndistinct GROUP BY b, c, d; - -- unknown column - CREATE STATISTICS s10 ON (unknown_column) FROM ndistinct; - - -- single column - CREATE STATISTICS s10 ON (a) FROM ndistinct; - - -- single column, duplicated - CREATE STATISTICS s10 ON (a,a) FROM ndistinct; - - -- two columns, one duplicated - CREATE STATISTICS s10 ON (a, a, b) FROM ndistinct; - -- correct command ! CREATE STATISTICS s10 ON (a, b, c) FROM ndistinct; ANALYZE ndistinct; --- 124,131 ---- EXPLAIN (COSTS off) SELECT COUNT(*) FROM ndistinct GROUP BY b, c, d; -- correct command ! CREATE STATISTICS s10 ON a, b, c FROM ndistinct; ANALYZE ndistinct; *************** EXPLAIN (COSTS off) *** 202,209 **** EXPLAIN (COSTS off) SELECT COUNT(*) FROM ndistinct GROUP BY a, d; - DROP TABLE ndistinct; - -- functional dependencies tests CREATE TABLE functional_dependencies ( filler1 TEXT, --- 201,206 ---- *************** EXPLAIN (COSTS OFF) *** 233,239 **** SELECT * FROM functional_dependencies WHERE a = 1 AND b = '1' AND c = 1; -- create statistics ! CREATE STATISTICS func_deps_stat WITH (dependencies) ON (a, b, c) FROM functional_dependencies; ANALYZE functional_dependencies; --- 230,236 ---- SELECT * FROM functional_dependencies WHERE a = 1 AND b = '1' AND c = 1; -- create statistics ! CREATE STATISTICS func_deps_stat (dependencies) ON a, b, c FROM functional_dependencies; ANALYZE functional_dependencies; *************** EXPLAIN (COSTS OFF) *** 259,265 **** SELECT * FROM functional_dependencies WHERE a = 1 AND b = '1' AND c = 1; -- create statistics ! CREATE STATISTICS func_deps_stat WITH (dependencies) ON (a, b, c) FROM functional_dependencies; ANALYZE functional_dependencies; --- 256,262 ---- SELECT * FROM functional_dependencies WHERE a = 1 AND b = '1' AND c = 1; -- create statistics ! CREATE STATISTICS func_deps_stat (dependencies) ON a, b, c FROM functional_dependencies; ANALYZE functional_dependencies; *************** EXPLAIN (COSTS OFF) *** 270,273 **** SELECT * FROM functional_dependencies WHERE a = 1 AND b = '1' AND c = 1; RESET random_page_cost; - DROP TABLE functional_dependencies; --- 267,269 ---- -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
On 5/12/17 4:46 AM, Tom Lane wrote: > Alvaro Herrera <alvherre@2ndquadrant.com> writes: >> Hmm, yeah, makes sense. Here's a patch for this approach. > > ... > > Also, while reading the docs changes, I got rather ill from the > inconsistent and not very grammatical handling of "statistics" as a > noun, particularly the inability to decide from one sentence to the next > if that is singular or plural. In the attached, I fixed this in the > ref/*_statistics.sgml files by always saying "statistics object" instead. > If people agree that that reads better, we should make an effort to > propagate that terminology elsewhere in the docs, and into error messages, > objectaddress.c output, etc. > I'm fine with the 'statistics object' wording. I've been struggling with this bit while working on the patch, and I agree it sounds better and getting it consistent is definitely worthwhile. > > Although I've not done anything about it here, I'm not happy about the > handling of dependencies for stats objects. I do not think that cloning > RemoveStatistics as RemoveStatisticsExt is sane at all. The former is > meant to deal with cleaning up pg_statistic rows that we know will be > there, so there's no real need to expend pg_depend overhead to track them. > For objects that are only loosely connected, the right thing is to use > the dependency system; in particular, this design has no real chance of > working well with cross-table stats. Also, it's really silly to have > *both* this hard-wired mechanism and a pg_depend entry; the latter is > surely redundant if you have the former. IMO we should revert > RemoveStatisticsExt and instead handle things by making stats objects > auto-dependent on the individual column(s) they reference (not the whole > table). > > I'm also of the opinion that having an AUTO dependency, rather than > a NORMAL dependency, on the stats object's schema is the wrong semantics. > There isn't any other case where you can drop a non-empty schema without > saying CASCADE, and I'm mystified why this case should act that way. > Yeah, it's a bit frankensteinian ... > > Lastly, I tried the example given in the CREATE STATISTICS reference page, > and it doesn't seem to work. Without the stats object, the two queries > are both estimated at one matching row, whereas they really produce 100 > and 0 rows respectively. With the stats object, they seem to both get > estimated at ~100 rows, which is a considerable improvement for one case > but a considerable disimprovement for the other. If this is not broken, > I'd like to know why not. What's the point of an expensive extended- > stats mechanism if it can't get this right? > I assume you're talking about the functional dependencies and in that case that's expected behavior, because f. dependencies do assume the conditions are "consistent" with the functional dependencies. This is an inherent limitation of functional dependencies, and perhaps a price for the simplicity of this statistics type. If your application executes queries that are likely not consistent with this, don't use functional dependencies. The simplicity is why dependencies were implemented first, mostly to introduce all the infrastructure. The other statistics types (MCV, histograms) don't have this limitation, but those did not make it into pg10. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Tomas Vondra <tomas.vondra@2ndquadrant.com> writes: > On 5/12/17 4:46 AM, Tom Lane wrote: >> If people agree that that reads better, we should make an effort to >> propagate that terminology elsewhere in the docs, and into error messages, >> objectaddress.c output, etc. > I'm fine with the 'statistics object' wording. I've been struggling with > this bit while working on the patch, and I agree it sounds better and > getting it consistent is definitely worthwhile. OK. We can work on that as a followon patch. >> Although I've not done anything about it here, I'm not happy about the >> handling of dependencies for stats objects. > Yeah, it's a bit frankensteinian ... I'm prepared to create a fix for that, but it'd be easier to commit the current patch first, to avoid merge conflicts. >> Lastly, I tried the example given in the CREATE STATISTICS reference page, >> and it doesn't seem to work. > I assume you're talking about the functional dependencies and in that > case that's expected behavior, because f. dependencies do assume the > conditions are "consistent" with the functional dependencies. Hm. OK, but then that example is pretty misleading, because it leads the reader to suppose that the planner can tell the difference between the selectivities of the two queries. Maybe what's lacking is an explanation of how you'd use this statistics type. regards, tom lane
Tom Lane wrote: > Tomas Vondra <tomas.vondra@2ndquadrant.com> writes: > >> Although I've not done anything about it here, I'm not happy about the > >> handling of dependencies for stats objects. > > > Yeah, it's a bit frankensteinian ... > > I'm prepared to create a fix for that, but it'd be easier to commit the > current patch first, to avoid merge conflicts. It seems we're mostly in agreement regarding the parts I was touching. Do you want to push your version of the patch? > >> Lastly, I tried the example given in the CREATE STATISTICS reference page, > >> and it doesn't seem to work. > > > I assume you're talking about the functional dependencies and in that > > case that's expected behavior, because f. dependencies do assume the > > conditions are "consistent" with the functional dependencies. > > Hm. OK, but then that example is pretty misleading, because it leads > the reader to suppose that the planner can tell the difference between > the selectivities of the two queries. Maybe what's lacking is an > explanation of how you'd use this statistics type. I think we should remove the complex example from that page, and instead refer the reader to chapters 14 and 69. The CREATE STATISTICS page can just give some overview of what the syntax looks like, and all explanations of complex topics such as "what does it mean for a query to be consistent with the functional dependencies" can be given at length elsewhere. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Alvaro Herrera <alvherre@2ndquadrant.com> writes: > Tom Lane wrote: >> I'm prepared to create a fix for that, but it'd be easier to commit the >> current patch first, to avoid merge conflicts. > It seems we're mostly in agreement regarding the parts I was touching. > Do you want to push your version of the patch? It's still almost all your work, so I figured you should push it. >> Hm. OK, but then that example is pretty misleading, because it leads >> the reader to suppose that the planner can tell the difference between >> the selectivities of the two queries. Maybe what's lacking is an >> explanation of how you'd use this statistics type. > I think we should remove the complex example from that page, and instead > refer the reader to chapters 14 and 69. Seems reasonable. If we did add enough material there to be a standalone description, it would be duplicative probably. However, that does put the onus on the other chapters to offer a complete definition, rather than leaving details to the man page as we do in many other cases. regards, tom lane
Tom Lane wrote: > Alvaro Herrera <alvherre@2ndquadrant.com> writes: > > Tom Lane wrote: > >> I'm prepared to create a fix for that, but it'd be easier to commit the > >> current patch first, to avoid merge conflicts. > > > It seems we're mostly in agreement regarding the parts I was touching. > > Do you want to push your version of the patch? > > It's still almost all your work, so I figured you should push it. OK, I'll do that shortly then. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Tom Lane wrote: > I did not review the rest of the regression test changes, nor the > tab-complete changes. I can however report that DROP STATISTICS <tab> > doesn't successfully complete any stats object names. The attached patch fixes this problem. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Attachment
I wrote: > Tomas Vondra <tomas.vondra@2ndquadrant.com> writes: >> On 5/12/17 4:46 AM, Tom Lane wrote: >>> Although I've not done anything about it here, I'm not happy about the >>> handling of dependencies for stats objects. >> Yeah, it's a bit frankensteinian ... > I'm prepared to create a fix for that, but it'd be easier to commit the > current patch first, to avoid merge conflicts. Done. I noted that we weren't making an owner dependency either :-(. Also, we might want to think about letting stats objects be extension members sometime. As things stand, if an extension script does CREATE STATISTICS, the stats object will just be "loose" rather than bound into the extension. We still have docs and nomenclature issues to work on. Anyone want to take point on those? regards, tom lane
Tom Lane wrote: > Although I've not done anything about it here, I'm not happy about the > handling of dependencies for stats objects. I do not think that cloning > RemoveStatistics as RemoveStatisticsExt is sane at all. The former is > meant to deal with cleaning up pg_statistic rows that we know will be > there, so there's no real need to expend pg_depend overhead to track them. > For objects that are only loosely connected, the right thing is to use > the dependency system; in particular, this design has no real chance of > working well with cross-table stats. Also, it's really silly to have > *both* this hard-wired mechanism and a pg_depend entry; the latter is > surely redundant if you have the former. IMO we should revert > RemoveStatisticsExt and instead handle things by making stats objects > auto-dependent on the individual column(s) they reference (not the whole > table). > > I'm also of the opinion that having an AUTO dependency, rather than > a NORMAL dependency, on the stats object's schema is the wrong semantics. > There isn't any other case where you can drop a non-empty schema without > saying CASCADE, and I'm mystified why this case should act that way. Here are two patches regarding handling of dependencies. The first one implements your first suggestion: add a NORMAL dependency on each column, and do away with RemoveStatisticsExt. This works well and should uncontroversial. If we only do this, then DROP TABLE needs to say CASCADE if there's a statistics object in the table. This seems pointless to me, so the second patch throws in an additional dependency on the table as a whole, AUTO this time, so that if the table is dropped, the statistics goes away without requiring cascade. There is no point in forcing CASCADE for this case, ISTM. This works well too, but I admit there might be resistance to doing it. OTOH this is how CREATE INDEX works. (I considered what would happen with a stats object covering multiple tables. I think it's reasonable to drop the stats too in that case, under the rationale that the object is no longer useful. Not really sure about this.) -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Attachment
Alvaro Herrera <alvherre@2ndquadrant.com> writes: > Tom Lane wrote: >> Although I've not done anything about it here, I'm not happy about the >> handling of dependencies for stats objects. > Here are two patches regarding handling of dependencies. Oh, sorry --- I already pushed something about this. > The first one > implements your first suggestion: add a NORMAL dependency on each > column, and do away with RemoveStatisticsExt. This works well and > should uncontroversial. No, I wanted an AUTO dependency there, for exactly the reason you mention: > If we only do this, then DROP TABLE needs to say CASCADE if there's a > statistics object in the table. I don't think we really want that, do we? A stats object can't be of any value if the underlying table is gone. Perhaps that calculus would change for cross-table stats but I don't see why. > This seems pointless to me, so the > second patch throws in an additional dependency on the table as a whole, > AUTO this time, so that if the table is dropped, the statistics goes > away without requiring cascade. There is no point in forcing CASCADE > for this case, ISTM. This works well too, but I admit there might be > resistance to doing it. OTOH this is how CREATE INDEX works. Uh, no it isn't. Indexes have auto dependencies on the individual columns they index, and *not* on the table as a whole. regards, tom lane
Tom Lane wrote: > Alvaro Herrera <alvherre@2ndquadrant.com> writes: > > Tom Lane wrote: > >> Although I've not done anything about it here, I'm not happy about the > >> handling of dependencies for stats objects. > > > Here are two patches regarding handling of dependencies. > > Oh, sorry --- I already pushed something about this. That's fine, thanks. I'm glad that this thread got you interested in look over the extended statistics stuff. Thanks for the input. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Alvaro Herrera <alvherre@2ndquadrant.com> writes: > Tom Lane wrote: >> Oh, sorry --- I already pushed something about this. > That's fine, thanks. Oh, I see your patch also fixes missing code in getObjectDescription(). We need that. Is there a decent way to get the compiler to warn about that oversight? regards, tom lane
Tom Lane wrote: > Alvaro Herrera <alvherre@2ndquadrant.com> writes: > > Tom Lane wrote: > >> Oh, sorry --- I already pushed something about this. > > > That's fine, thanks. > > Oh, I see your patch also fixes missing code in getObjectDescription(). > We need that. Is there a decent way to get the compiler to warn about > that oversight? We could remove the default clause. That results in the expected warning, and no others (i.e. the switch was already complete): /pgsql/source/master/src/backend/catalog/objectaddress.c:2657:2: warning: enumeration value 'OCLASS_STATISTIC_EXT' not handledin switch [-Wswitch] -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Alvaro Herrera <alvherre@2ndquadrant.com> writes: > Tom Lane wrote: >> Oh, I see your patch also fixes missing code in getObjectDescription(). >> We need that. Is there a decent way to get the compiler to warn about >> that oversight? > We could remove the default clause. That results in the expected > warning, and no others (i.e. the switch was already complete): > /pgsql/source/master/src/backend/catalog/objectaddress.c:2657:2: warning: enumeration value 'OCLASS_STATISTIC_EXT' nothandled in switch [-Wswitch] Hm. That would behave less than desirably if getObjectClass() could return a value that wasn't a member of the enum, but it's hard to credit that happening. I guess I'd vote for removing the default: case from all of the places that have "switch (getObjectClass(object))", as forgetting to add an entry seems like a much larger hazard. regards, tom lane
Tom Lane wrote: > Alvaro Herrera <alvherre@2ndquadrant.com> writes: > > Tom Lane wrote: > >> Oh, I see your patch also fixes missing code in getObjectDescription(). > >> We need that. Is there a decent way to get the compiler to warn about > >> that oversight? > > > We could remove the default clause. That results in the expected > > warning, and no others (i.e. the switch was already complete): > > > /pgsql/source/master/src/backend/catalog/objectaddress.c:2657:2: warning: enumeration value 'OCLASS_STATISTIC_EXT' nothandled in switch [-Wswitch] > > Hm. That would behave less than desirably if getObjectClass() could > return a value that wasn't a member of the enum, but it's hard to > credit that happening. I guess I'd vote for removing the default: > case from all of the places that have "switch (getObjectClass(object))", > as forgetting to add an entry seems like a much larger hazard. Ignoring the one in alter.c's AlterObjectNamespace_oid, which only handles a small subset of the enum values, that gives the following warnings: /pgsql/source/master/src/backend/catalog/dependency.c: In function 'doDeletion': /pgsql/source/master/src/backend/catalog/dependency.c:1106:2: warning: enumeration value 'OCLASS_ROLE' not handled in switch[-Wswitch] switch (getObjectClass(object)) ^ /pgsql/source/master/src/backend/catalog/dependency.c:1106:2: warning: enumeration value 'OCLASS_DATABASE' not handled inswitch [-Wswitch] /pgsql/source/master/src/backend/catalog/dependency.c:1106:2: warning: enumeration value 'OCLASS_TBLSPACE' not handled inswitch [-Wswitch] /pgsql/source/master/src/backend/catalog/dependency.c:1106:2: warning: enumeration value 'OCLASS_SUBSCRIPTION' not handledin switch [-Wswitch] /pgsql/source/master/src/backend/commands/tablecmds.c: In function 'ATExecAlterColumnType': /pgsql/source/master/src/backend/commands/tablecmds.c:9112:3: warning: enumeration value 'OCLASS_AM' not handled in switch[-Wswitch] switch (getObjectClass(&foundObject)) ^ /pgsql/source/master/src/backend/commands/tablecmds.c:9112:3: warning: enumeration value 'OCLASS_STATISTIC_EXT' not handledin switch [-Wswitch] /pgsql/source/master/src/backend/commands/tablecmds.c:9112:3: warning: enumeration value 'OCLASS_EVENT_TRIGGER' not handledin switch [-Wswitch] /pgsql/source/master/src/backend/commands/tablecmds.c:9112:3: warning: enumeration value 'OCLASS_PUBLICATION' not handledin switch [-Wswitch] /pgsql/source/master/src/backend/commands/tablecmds.c:9112:3: warning: enumeration value 'OCLASS_PUBLICATION_REL' not handledin switch [-Wswitch] /pgsql/source/master/src/backend/commands/tablecmds.c:9112:3: warning: enumeration value 'OCLASS_SUBSCRIPTION' not handledin switch [-Wswitch] /pgsql/source/master/src/backend/commands/tablecmds.c:9112:3: warning: enumeration value 'OCLASS_TRANSFORM' not handled inswitch [-Wswitch] -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Alvaro Herrera <alvherre@2ndquadrant.com> writes: > Tom Lane wrote: >> Hm. That would behave less than desirably if getObjectClass() could >> return a value that wasn't a member of the enum, but it's hard to >> credit that happening. I guess I'd vote for removing the default: >> case from all of the places that have "switch (getObjectClass(object))", >> as forgetting to add an entry seems like a much larger hazard. > Ignoring the one in alter.c's AlterObjectNamespace_oid, which only > handles a small subset of the enum values, that gives the following > warnings: > /pgsql/source/master/src/backend/catalog/dependency.c: In function 'doDeletion': > /pgsql/source/master/src/backend/catalog/dependency.c:1106:2: warning: enumeration value 'OCLASS_ROLE' not handled in switch[-Wswitch] > switch (getObjectClass(object)) > ^ > /pgsql/source/master/src/backend/catalog/dependency.c:1106:2: warning: enumeration value 'OCLASS_DATABASE' not handledin switch [-Wswitch] > /pgsql/source/master/src/backend/catalog/dependency.c:1106:2: warning: enumeration value 'OCLASS_TBLSPACE' not handledin switch [-Wswitch] > /pgsql/source/master/src/backend/catalog/dependency.c:1106:2: warning: enumeration value 'OCLASS_SUBSCRIPTION' not handledin switch [-Wswitch] Hm. I think it's intentional that shared objects are not handled there, but I wonder whether the lack of SUBSCRIPTION represents a bug. Anyway I think it would be fine to add those to the switch as explicit error cases. > /pgsql/source/master/src/backend/commands/tablecmds.c: In function 'ATExecAlterColumnType': > /pgsql/source/master/src/backend/commands/tablecmds.c:9112:3: warning: enumeration value 'OCLASS_AM' not handled in switch[-Wswitch] > switch (getObjectClass(&foundObject)) > ^ > /pgsql/source/master/src/backend/commands/tablecmds.c:9112:3: warning: enumeration value 'OCLASS_STATISTIC_EXT' not handledin switch [-Wswitch] > /pgsql/source/master/src/backend/commands/tablecmds.c:9112:3: warning: enumeration value 'OCLASS_EVENT_TRIGGER' not handledin switch [-Wswitch] > /pgsql/source/master/src/backend/commands/tablecmds.c:9112:3: warning: enumeration value 'OCLASS_PUBLICATION' not handledin switch [-Wswitch] > /pgsql/source/master/src/backend/commands/tablecmds.c:9112:3: warning: enumeration value 'OCLASS_PUBLICATION_REL' not handledin switch [-Wswitch] > /pgsql/source/master/src/backend/commands/tablecmds.c:9112:3: warning: enumeration value 'OCLASS_SUBSCRIPTION' not handledin switch [-Wswitch] > /pgsql/source/master/src/backend/commands/tablecmds.c:9112:3: warning: enumeration value 'OCLASS_TRANSFORM' not handledin switch [-Wswitch] All of those ought to go into the category that prints "unexpected object depending on column", I think; certainly "unrecognized object class" is not a better report, and the fact that we've evidently not touched this switch in the last few additions of object types provides fodder for the idea that the default: is unhelpful. (Not to mention that not having OCLASS_STATISTIC_EXT here is an outright bug as of today...) So losing the default: case here seems good to me. Alternatively, we could drop the explicit listing of oclasses we're not expecting to see, and let the default: case be "unexpected object depending on column". Treating the default separately is only of value if we'd expect getObjectDescription to fail, but there's no good reason for this code to be passing judgment on whether that might happen. What do we want this to do about statistics objects? We could either throw a feature-not-implemented error or try to update the stats object for the new column type. regards, tom lane