Thread: Assignment of valid collation for SET operations on queries with UNKNOWN types.
Assignment of valid collation for SET operations on queries with UNKNOWN types.
From
Rahila Syed
Date:
Following UNION of two queries with constant literals runs successfully.
CASE 1:
postgres=# SELECT 'abc' UNION SELECT 'bcd' ;
?column?
----------
abc
bcd
(2 rows)
whereas when these literals are part of a view, the UNION fails.
CASE 2:
postgres=# create view v as select 'abc' a;
2016-11-16 15:28:48 IST WARNING: column "a" has type "unknown"
2016-11-16 15:28:48 IST DETAIL: Proceeding with relation creation anyway.
WARNING: column "a" has type "unknown"
DETAIL: Proceeding with relation creation anyway.
CREATE VIEW
postgres=# create view v1 as select 'bcd' a;
2016-11-16 15:28:56 IST WARNING: column "a" has type "unknown"
2016-11-16 15:28:56 IST DETAIL: Proceeding with relation creation anyway.
WARNING: column "a" has type "unknown"
DETAIL: Proceeding with relation creation anyway.
CREATE VIEW
postgres=# select a from v UNION select a from v1;
2016-11-16 15:25:28 IST ERROR: could not determine which collation to use for string comparison
2016-11-16 15:25:28 IST HINT: Use the COLLATE clause to set the collation explicitly.
2016-11-16 15:25:28 IST STATEMENT: select a from v UNION select a from v1;
ERROR: could not determine which collation to use for string comparison
HINT: Use the COLLATE clause to set the collation explicitly.
When UNION of queries with constant literals as in CASE 1 is allowed shouldn't a UNION of queries with literals in a view as in CASE 2 be allowed?
In transformSetOperationTree, while determining the result type of the merged
output columns, if the left and right column types are UNKNOWNs the result type
is resolved to TEXT.
The difference of behaviour in above two cases arises because the result collation
assigned is not valid in CASE 2.
When the left and the right inputs are literal constants i.e UNKNOWN as in Case 1
the collation of result column is correctly assigned to a valid value.
Whereas when the left and the right inputs are columns of UNKNOWN type as in Case 2,
the result collation is InvalidOid.
So if we ensure assignment of a valid collation when the left and the right columns/inputs
are UNKNOWN, the above can be resolved.
Attached WIP patch does that. Kindly let me know your opinion.
Thank you,
Rahila SyedCASE 1:
postgres=# SELECT 'abc' UNION SELECT 'bcd' ;
?column?
----------
abc
bcd
(2 rows)
whereas when these literals are part of a view, the UNION fails.
CASE 2:
postgres=# create view v as select 'abc' a;
2016-11-16 15:28:48 IST WARNING: column "a" has type "unknown"
2016-11-16 15:28:48 IST DETAIL: Proceeding with relation creation anyway.
WARNING: column "a" has type "unknown"
DETAIL: Proceeding with relation creation anyway.
CREATE VIEW
postgres=# create view v1 as select 'bcd' a;
2016-11-16 15:28:56 IST WARNING: column "a" has type "unknown"
2016-11-16 15:28:56 IST DETAIL: Proceeding with relation creation anyway.
WARNING: column "a" has type "unknown"
DETAIL: Proceeding with relation creation anyway.
CREATE VIEW
postgres=# select a from v UNION select a from v1;
2016-11-16 15:25:28 IST ERROR: could not determine which collation to use for string comparison
2016-11-16 15:25:28 IST HINT: Use the COLLATE clause to set the collation explicitly.
2016-11-16 15:25:28 IST STATEMENT: select a from v UNION select a from v1;
ERROR: could not determine which collation to use for string comparison
HINT: Use the COLLATE clause to set the collation explicitly.
When UNION of queries with constant literals as in CASE 1 is allowed shouldn't a UNION of queries with literals in a view as in CASE 2 be allowed?
In transformSetOperationTree, while determining the result type of the merged
output columns, if the left and right column types are UNKNOWNs the result type
is resolved to TEXT.
The difference of behaviour in above two cases arises because the result collation
assigned is not valid in CASE 2.
When the left and the right inputs are literal constants i.e UNKNOWN as in Case 1
the collation of result column is correctly assigned to a valid value.
Whereas when the left and the right inputs are columns of UNKNOWN type as in Case 2,
the result collation is InvalidOid.
So if we ensure assignment of a valid collation when the left and the right columns/inputs
are UNKNOWN, the above can be resolved.
Attached WIP patch does that. Kindly let me know your opinion.
Thank you,
Attachment
Re: Assignment of valid collation for SET operations on queries with UNKNOWN types.
From
Tom Lane
Date:
Rahila Syed <rahilasyed90@gmail.com> writes: > CASE 2: > postgres=# create view v as select 'abc' a; > 2016-11-16 15:28:48 IST WARNING: column "a" has type "unknown" > 2016-11-16 15:28:48 IST DETAIL: Proceeding with relation creation anyway. > WARNING: column "a" has type "unknown" > DETAIL: Proceeding with relation creation anyway. > CREATE VIEW We really ought to make that a hard error. And ideally fix things so that the type of the view column will be resolved as text, so that you don't hit this condition in the first place; but there is no good that comes out of allowing a view to be created like this. > Attached WIP patch does that. Kindly let me know your opinion. This is a seriously ugly kluge that's attacking the symptom not the problem. Or really, a symptom not the problem. There are lots of other symptoms, for instance regression=# select * from v order by 1; ERROR: failed to find conversion function from unknown to text regards, tom lane
Re: Assignment of valid collation for SET operations on queries with UNKNOWN types.
From
Rahila Syed
Date:
Hello,
>And ideally fix things so
>that the type of the view column will be resolved as text, so that you>And ideally fix things so
>don't hit this condition in the first place; but there is no good that
>comes out of allowing a view to be created like this
Thank you for suggestion. Attached is a patch which resolves the columns
postgres=# create view v as select 'abc' a;
CREATE VIEW
postgres=# create view v1 as select 'def' a;
CREATE VIEW
postgres=# \d+ v1;
View "public.v1"
Column | Type | Collation | Nullable | Default | Storage | Description
--------+------+-----------+--
a | text | | | | extended |
View definition:
SELECT 'def'::text AS a;
postgres=# select a from v UNION select a from v1;
a
-----
abc
def
(2 rows)
AND
postgres=# select * from v order by 1;
a
-----
abc
(1 row)
postgres=# select * from v order by 1;
a
-----
abc
(1 row)
Kindly give your opinion.
Thank you,
Rahila Syed.
On Thu, Nov 17, 2016 at 8:59 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Rahila Syed <rahilasyed90@gmail.com> writes:
> CASE 2:
> postgres=# create view v as select 'abc' a;
> 2016-11-16 15:28:48 IST WARNING: column "a" has type "unknown"
> 2016-11-16 15:28:48 IST DETAIL: Proceeding with relation creation anyway.
> WARNING: column "a" has type "unknown"
> DETAIL: Proceeding with relation creation anyway.
> CREATE VIEW
We really ought to make that a hard error. And ideally fix things so
that the type of the view column will be resolved as text, so that you
don't hit this condition in the first place; but there is no good that
comes out of allowing a view to be created like this.
> Attached WIP patch does that. Kindly let me know your opinion.
This is a seriously ugly kluge that's attacking the symptom not the
problem. Or really, a symptom not the problem. There are lots of
other symptoms, for instance
regression=# select * from v order by 1;
ERROR: failed to find conversion function from unknown to text
regards, tom lane
Attachment
Re: Assignment of valid collation for SET operations on queries with UNKNOWN types.
From
Michael Paquier
Date:
On Tue, Dec 6, 2016 at 10:42 PM, Rahila Syed <rahilasyed90@gmail.com> wrote: > Hello, > >>And ideally fix things so >>that the type of the view column will be resolved as text, so that you >>don't hit this condition in the first place; but there is no good that >>comes out of allowing a view to be created like this > > Thank you for suggestion. Attached is a patch which resolves the columns > with literal constants as TEXT for view creation. > > Following views can be created with literal columns resolved as TEXT. > > postgres=# create view v as select 'abc' a; > CREATE VIEW > postgres=# create view v1 as select 'def' a; > CREATE VIEW There is a similar code pattern for materialized views, see create_ctas_nodata() where the attribute list is built. Even with your patch, I get that: =# create materialized view m as select 'abc' a; WARNING: 42P16: column "a" has type "unknown" DETAIL: Proceeding with relation creation anyway. LOCATION: CheckAttributeType, heap.c:499 SELECT 1 Time: 6.566 ms =# select * from m order by 1; ERROR: XX000: failed to find conversion function from unknown to text Your patch has no regression tests, surely you want some to stress this code path. And actually, shouldn't this be just a plain error? -- Michael
Re: Assignment of valid collation for SET operations on queries with UNKNOWN types.
From
Michael Paquier
Date:
On Wed, Dec 7, 2016 at 4:24 PM, Michael Paquier <michael.paquier@gmail.com> wrote: > On Tue, Dec 6, 2016 at 10:42 PM, Rahila Syed <rahilasyed90@gmail.com> wrote: >> Thank you for suggestion. Attached is a patch which resolves the columns >> with literal constants as TEXT for view creation. You may also want to add your patch to a CF so as it does not fall into the cracks. -- Michael
Re: [HACKERS] Assignment of valid collation for SET operations onqueries with UNKNOWN types.
From
Rahila Syed
Date:
Hello,
Thank you for comments. >There is a similar code pattern for materialized views, see
>create_ctas_nodata() where the attribute list is built
For other materialized views and CREATE TABLE AS, column definitions are built in
intorel_startup() function which has different code from that of CREATE VIEW which
intorel_startup() function which has different code from that of CREATE VIEW which
the patch deals with.
Limiting the scope of the patch to include changing the type of literal constants
to text only for plain views. Also, error out when column with UNKNOWN type is
being created for other relations like tables and materialized views.
being created for other relations like tables and materialized views.
>And actually, shouldn't this be just a plain error?
Changed it to error in the attached patch.
>Your patch has no regression tests, surely you want some to stress
>this code path
>Your patch has no regression tests, surely you want some to stress
>this code path
Added regression tests in the attached patch.
Also adding this patch to CF 2017-01
Thank you,
Rahila Syed
Attachment
Re: [HACKERS] Assignment of valid collation for SET operations onqueries with UNKNOWN types.
From
Michael Paquier
Date:
On Wed, Dec 14, 2016 at 7:02 PM, Rahila Syed <rahilasyed90@gmail.com> wrote: >>There is a similar code pattern for materialized views, see >>create_ctas_nodata() where the attribute list is built > create_ctas_nodata() is for creation of materialized views WITH NO DATA. > For other materialized views and CREATE TABLE AS, column definitions are > built in > intorel_startup() function which has different code from that of CREATE VIEW > which > the patch deals with. > > Limiting the scope of the patch to include changing the type of literal > constants > to text only for plain views. Also, error out when column with UNKNOWN type > is > being created for other relations like tables and materialized views. Matviews is the same way of thinking as views in terms of definition. It is inconsistent to try to address a problem only partially if the same behavior shows up in different code paths. -- Michael
Re: [HACKERS] Assignment of valid collation for SET operations onqueries with UNKNOWN types.
From
Ashutosh Bapat
Date:
The way this patch has been written, it doesn't allow creating tables with unknown type columns, which was allowed earlier. That breaks backward compatibility. Users, who have created such tables will face problems while loading dumps from earlier versions. pg_upgrade might be an issue, but we may modify it to convert those columns to text. Given that a column with unknown type is pretty much useless unless casted to some other type, there may not be many such users out there. But we should probably add a notice in release notes. +-- check coercion of UNKNOWN type to text for literal constants +create view v as select 'abc' a; +create view v11 as select 'def' a; +select a from v UNION select a from v11; + The comment says that it's testing coercion of unknown to text, but nothing in the test guarantees that constant literals were casted to text. The union could give the expected result if code merging the two views knew how to handle unknown types. Instead \d v or \d v11 would be a better test to test what the comment says. If we want to test such a union the test is fine, but the comment needs to change. For a materialized view you may have to modify transformCreateTableAsStmt() to modify at targetlist of the query similar to DefineVirtualRelation(), but I think that can be done as a separate patch. You might want to add some testcases to test the error report e.g. (not necessarily in the same form) create view sv as select relname::unknown from pg_class; ERROR: column "relname" has type "unknown" On Wed, Dec 14, 2016 at 3:32 PM, Rahila Syed <rahilasyed90@gmail.com> wrote: > Hello, > > Thank you for comments. > >>There is a similar code pattern for materialized views, see >>create_ctas_nodata() where the attribute list is built > create_ctas_nodata() is for creation of materialized views WITH NO DATA. > For other materialized views and CREATE TABLE AS, column definitions are > built in > intorel_startup() function which has different code from that of CREATE VIEW > which > the patch deals with. > > Limiting the scope of the patch to include changing the type of literal > constants > to text only for plain views. Also, error out when column with UNKNOWN type > is > being created for other relations like tables and materialized views. > >>And actually, shouldn't this be just a plain error? > Changed it to error in the attached patch. > >>Your patch has no regression tests, surely you want some to stress >>this code path > Added regression tests in the attached patch. > > Also adding this patch to CF 2017-01 > > Thank you, > Rahila Syed > > > > > -- > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-hackers > -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company
Re: [HACKERS] Assignment of valid collation for SET operations on queries with UNKNOWN types.
From
Tom Lane
Date:
Ashutosh Bapat <ashutosh.bapat@enterprisedb.com> writes: > The way this patch has been written, it doesn't allow creating tables > with unknown type columns, which was allowed earlier. Yes, that's an intentional change; creating such tables (or views) has never been anything but a foot-gun. However, I thought the idea was to silently coerce affected columns from unknown to text. This doesn't look like the behavior we want: > You might want to add some testcases to test the error report e.g. > (not necessarily in the same form) create view sv as select > relname::unknown from pg_class; > ERROR: column "relname" has type "unknown" regards, tom lane
Re: [HACKERS] Assignment of valid collation for SET operations onqueries with UNKNOWN types.
From
Ashutosh Bapat
Date:
On Thu, Dec 29, 2016 at 8:18 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Ashutosh Bapat <ashutosh.bapat@enterprisedb.com> writes: >> The way this patch has been written, it doesn't allow creating tables >> with unknown type columns, which was allowed earlier. > > Yes, that's an intentional change; creating such tables (or views) has > never been anything but a foot-gun. > > However, I thought the idea was to silently coerce affected columns from > unknown to text. This doesn't look like the behavior we want: Do you mean to say that when creating a table with a column of unknown type, that column type should be silently converted (there's nothing to coerce when the table is being created) to text? instead of throwing an error? The patch does that when creating a view with constant literals, which are known to be binary compatible with text and hence coercible. It looks like any "unknown' type data should be coercible to text, so it shouldn't matter whether those are constants or non-constant nodes. > >> You might want to add some testcases to test the error report e.g. >> (not necessarily in the same form) create view sv as select >> relname::unknown from pg_class; >> ERROR: column "relname" has type "unknown" > > regards, tom lane -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company
Re: [HACKERS] Assignment of valid collation for SET operations onqueries with UNKNOWN types.
From
Michael Paquier
Date:
On Fri, Dec 30, 2016 at 1:30 PM, Ashutosh Bapat <ashutosh.bapat@enterprisedb.com> wrote: > On Thu, Dec 29, 2016 at 8:18 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Ashutosh Bapat <ashutosh.bapat@enterprisedb.com> writes: >>> The way this patch has been written, it doesn't allow creating tables >>> with unknown type columns, which was allowed earlier. >> >> Yes, that's an intentional change; creating such tables (or views) has >> never been anything but a foot-gun. >> >> However, I thought the idea was to silently coerce affected columns from >> unknown to text. This doesn't look like the behavior we want: > > Do you mean to say that when creating a table with a column of unknown > type, that column type should be silently converted (there's nothing > to coerce when the table is being created) to text? instead of > throwing an error? FWIW that's what I understood: the patch should switch unknown columns to text. A bunch of side effects when converting types are avoided this way. -- Michael
Re: [HACKERS] Assignment of valid collation for SET operations onqueries with UNKNOWN types.
From
Rahila Syed
Date:
Thank you all for inputs.
Kindly help me clarify the scope of the patch.
>However, I thought the idea was to silently coerce affected columns from
>unknown to text. This doesn't look like the behavior we want:
Kindly help me clarify the scope of the patch.
>However, I thought the idea was to silently coerce affected columns from
>unknown to text. This doesn't look like the behavior we want:
This patch prevents creation of relation with unknown columns and
in addition fixes the particular case of CREATE VIEW with literal columns
by coercing unknown to text only in this particular case.
Are you suggesting extending the patch to include coercing from unknown to
text for all possible cases where a column of unknown type is being created?
Thank you,
Rahila Syed
On Fri, Dec 30, 2016 at 7:21 PM, Michael Paquier <michael.paquier@gmail.com> wrote:
On Fri, Dec 30, 2016 at 1:30 PM, Ashutosh Bapat
<ashutosh.bapat@enterprisedb.com> wrote:
> On Thu, Dec 29, 2016 at 8:18 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Ashutosh Bapat <ashutosh.bapat@enterprisedb.com> writes:
>>> The way this patch has been written, it doesn't allow creating tables
>>> with unknown type columns, which was allowed earlier.
>>
>> Yes, that's an intentional change; creating such tables (or views) has
>> never been anything but a foot-gun.
>>
>> However, I thought the idea was to silently coerce affected columns from
>> unknown to text. This doesn't look like the behavior we want:
>
> Do you mean to say that when creating a table with a column of unknown
> type, that column type should be silently converted (there's nothing
> to coerce when the table is being created) to text? instead of
> throwing an error?
FWIW that's what I understood: the patch should switch unknown columns
to text. A bunch of side effects when converting types are avoided
this way.
--
Michael
Re: [HACKERS] Assignment of valid collation for SET operations onqueries with UNKNOWN types.
From
Ashutosh Bapat
Date:
On Tue, Jan 3, 2017 at 5:57 PM, Rahila Syed <rahilasyed90@gmail.com> wrote: > Thank you all for inputs. > Kindly help me clarify the scope of the patch. > >>However, I thought the idea was to silently coerce affected columns from >>unknown to text. This doesn't look like the behavior we want: > > This patch prevents creation of relation with unknown columns and > in addition fixes the particular case of CREATE VIEW with literal columns > by coercing unknown to text only in this particular case. > > Are you suggesting extending the patch to include coercing from unknown to > text for all possible cases where a column of unknown type is being created? > I guess, that's what Tom is suggesting. We need to do something to avoid throwing an error in the cases which do not throw error in earlier releases. We may just leave that warning as warning for now, and just tackle the view case in this patch. I would like everything being done in one patch, rather than this piece-meal approach. But delaying fix for views because it takes longer to work on other pieces may not be good either. -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company
Re: [HACKERS] Assignment of valid collation for SET operations on queries with UNKNOWN types.
From
Tom Lane
Date:
Ashutosh Bapat <ashutosh.bapat@enterprisedb.com> writes: > On Tue, Jan 3, 2017 at 5:57 PM, Rahila Syed <rahilasyed90@gmail.com> wrote: >> Are you suggesting extending the patch to include coercing from unknown to >> text for all possible cases where a column of unknown type is being created? > I guess, that's what Tom is suggesting. Yes; I think the point is we should change the semantics we assume for "SELECT 'foo'", not only for views but across the board. There are plenty of non-view-related cases where it doesn't behave well, for example regression=# select * from (select 'foo' from generate_series(1,3)) ss order by 1; ERROR: failed to find conversion function from unknown to text Furthermore, it seems like a seriously bad idea for "SELECT 'foo'" to mean something different in the context of CREATE VIEW than it means elsewhere. The trick is to not break cases where we've already hacked things to allow external influence on the resolved types of SELECT-list items. AFAICT, these cases are just INSERT/SELECT and set operations (eg unions): regression=# create table target (f1 int); CREATE TABLE regression=# explain verbose insert into target select '42' from generate_series(1,3); QUERY PLAN -------------------------------------------------------------------------------- --------- Insert on public.target (cost=0.00..10.00 rows=1000 width=4) -> Function Scan on pg_catalog.generate_series (cost=0.00..10.00 rows=1000 width=4) Output: 42 Function Call: generate_series(1, 3) (4 rows) regression=# explain verbose select '42' union all select 43; QUERY PLAN ------------------------------------------------ Append (cost=0.00..0.04 rows=2 width=4) -> Result (cost=0.00..0.01 rows=1 width=4) Output: 42 -> Result (cost=0.00..0.01 rows=1 width=4) Output: 43 (5 rows) In both of the above cases, we're getting an integer constant not a text or "unknown" constant, because the type information gets imported from outside the "SELECT". I spent some time fooling with this today and came up with the attached patch. I think this is basically the direction we should go in, but there are various loose ends: 1. I adjusted a couple of existing regression tests whose results are affected, but didn't do anything towards adding new tests showing the desirable results of this change (as per the examples in this thread). 2. I didn't do anything about docs, either, though maybe no change is needed. One user-visible change from this is that queries should never return any "unknown"-type columns to clients anymore. But I think that is not documented now, so maybe there's nothing to change. 3. We need to look at whether pg_upgrade is affected. I think we might be able to let it just ignore the issue for views, since they'd get properly updated during the dump and reload anyway. But if someone had an actual table (or matview) with an "unknown" column, that would be a pg_upgrade hazard, because "unknown" doesn't store like "text". 4. If "unknown" were marked as a pseudotype not base type in pg_type (ie, typtype "p" not "b") then the special case for it in CheckAttributeType could go away completely. I'm not really sure why it's "b" today --- maybe specifically because of this point that it's currently possible to create tables with unknown columns? But I've not looked at what all the implications of changing that might be, and anyway it could be left for a followon patch. 5. I noticed that the "resolveUnknown" arguments of transformSortClause and other functions in parse_clause.c could go away: there's really no reason not to just treat them as "true" always. But that could be separate cleanup as well. Anybody want to hack on those loose ends? regards, tom lane diff --git a/src/backend/catalog/heap.c b/src/backend/catalog/heap.c index 72aa0dd..af6ba47 100644 *** a/src/backend/catalog/heap.c --- b/src/backend/catalog/heap.c *************** CheckAttributeType(const char *attname, *** 490,507 **** char att_typtype = get_typtype(atttypid); Oid att_typelem; ! if (atttypid == UNKNOWNOID) ! { ! /* ! * Warn user, but don't fail, if column to be created has UNKNOWN type ! * (usually as a result of a 'retrieve into' - jolly) ! */ ! ereport(WARNING, ! (errcode(ERRCODE_INVALID_TABLE_DEFINITION), ! errmsg("column \"%s\" has type \"unknown\"", attname), ! errdetail("Proceeding with relation creation anyway."))); ! } ! else if (att_typtype == TYPTYPE_PSEUDO) { /* * Refuse any attempt to create a pseudo-type column, except for a --- 490,497 ---- char att_typtype = get_typtype(atttypid); Oid att_typelem; ! if (atttypid == UNKNOWNOID || ! att_typtype == TYPTYPE_PSEUDO) { /* * Refuse any attempt to create a pseudo-type column, except for a diff --git a/src/backend/parser/analyze.c b/src/backend/parser/analyze.c index 5116cbb..a3be333 100644 *** a/src/backend/parser/analyze.c --- b/src/backend/parser/analyze.c *************** parse_analyze_varparams(Node *parseTree, *** 155,167 **** Query * parse_sub_analyze(Node *parseTree, ParseState *parentParseState, CommonTableExpr *parentCTE, ! bool locked_from_parent) { ParseState *pstate = make_parsestate(parentParseState); Query *query; pstate->p_parent_cte = parentCTE; pstate->p_locked_from_parent = locked_from_parent; query = transformStmt(pstate, parseTree); --- 155,169 ---- Query * parse_sub_analyze(Node *parseTree, ParseState *parentParseState, CommonTableExpr *parentCTE, ! bool locked_from_parent, ! bool resolve_unknowns) { ParseState *pstate = make_parsestate(parentParseState); Query *query; pstate->p_parent_cte = parentCTE; pstate->p_locked_from_parent = locked_from_parent; + pstate->p_resolve_unknowns = resolve_unknowns; query = transformStmt(pstate, parseTree); *************** transformInsertStmt(ParseState *pstate, *** 552,561 **** --- 554,570 ---- * otherwise the behavior of SELECT within INSERT might be different * from a stand-alone SELECT. (Indeed, Postgres up through 6.5 had * bugs of just that nature...) + * + * The sole exception is that we prevent resolving unknown-type + * outputs as TEXT. This does not change the semantics since if the + * column type matters semantically, it would have been resolved to + * something else anyway. Doing this lets us resolve such outputs as + * the target column's type, which we handle below. */ sub_pstate->p_rtable = sub_rtable; sub_pstate->p_joinexprs = NIL; /* sub_rtable has no joins */ sub_pstate->p_namespace = sub_namespace; + sub_pstate->p_resolve_unknowns = false; selectQuery = transformStmt(sub_pstate, stmt->selectStmt); *************** transformSelectStmt(ParseState *pstate, *** 1252,1257 **** --- 1261,1270 ---- pstate->p_windowdefs, &qry->targetList); + /* resolve any still-unresolved output columns as being type text */ + if (pstate->p_resolve_unknowns) + resolveTargetListUnknowns(pstate, qry->targetList); + qry->rtable = pstate->p_rtable; qry->jointree = makeFromExpr(pstate->p_joinlist, qual); *************** transformSetOperationTree(ParseState *ps *** 1826,1836 **** /* * Transform SelectStmt into a Query. * * Note: previously transformed sub-queries don't affect the parsing * of this sub-query, because they are not in the toplevel pstate's * namespace list. */ ! selectQuery = parse_sub_analyze((Node *) stmt, pstate, NULL, false); /* * Check for bogus references to Vars on the current query level (but --- 1839,1857 ---- /* * Transform SelectStmt into a Query. * + * This works the same as SELECT transformation normally would, except + * that we prevent resolving unknown-type outputs as TEXT. This does + * not change the subquery's semantics since if the column type + * matters semantically, it would have been resolved to something else + * anyway. Doing this lets us resolve such outputs using + * select_common_type(), below. + * * Note: previously transformed sub-queries don't affect the parsing * of this sub-query, because they are not in the toplevel pstate's * namespace list. */ ! selectQuery = parse_sub_analyze((Node *) stmt, pstate, ! NULL, false, false); /* * Check for bogus references to Vars on the current query level (but *************** transformReturningList(ParseState *pstat *** 2333,2338 **** --- 2354,2363 ---- /* mark column origins */ markTargetListOrigins(pstate, rlist); + /* resolve any still-unresolved output columns as being type text */ + if (pstate->p_resolve_unknowns) + resolveTargetListUnknowns(pstate, rlist); + /* restore state */ pstate->p_next_resno = save_next_resno; diff --git a/src/backend/parser/parse_clause.c b/src/backend/parser/parse_clause.c index d788ffd..3a01e6c 100644 *** a/src/backend/parser/parse_clause.c --- b/src/backend/parser/parse_clause.c *************** transformRangeSubselect(ParseState *psta *** 471,477 **** * Analyze and transform the subquery. */ query = parse_sub_analyze(r->subquery, pstate, NULL, ! isLockedRefname(pstate, r->alias->aliasname)); /* Restore state */ pstate->p_lateral_active = false; --- 471,478 ---- * Analyze and transform the subquery. */ query = parse_sub_analyze(r->subquery, pstate, NULL, ! isLockedRefname(pstate, r->alias->aliasname), ! true); /* Restore state */ pstate->p_lateral_active = false; diff --git a/src/backend/parser/parse_cte.c b/src/backend/parser/parse_cte.c index fc8c15b..dfbcaa2 100644 *** a/src/backend/parser/parse_cte.c --- b/src/backend/parser/parse_cte.c *************** analyzeCTE(ParseState *pstate, CommonTab *** 241,247 **** /* Analysis not done already */ Assert(!IsA(cte->ctequery, Query)); ! query = parse_sub_analyze(cte->ctequery, pstate, cte, false); cte->ctequery = (Node *) query; /* --- 241,247 ---- /* Analysis not done already */ Assert(!IsA(cte->ctequery, Query)); ! query = parse_sub_analyze(cte->ctequery, pstate, cte, false, true); cte->ctequery = (Node *) query; /* *************** analyzeCTETargetList(ParseState *pstate, *** 393,403 **** /* * If the CTE is recursive, force the exposed column type of any ! * "unknown" column to "text". This corresponds to the fact that ! * SELECT 'foo' UNION SELECT 'bar' will ultimately produce text. We ! * might see "unknown" as a result of an untyped literal in the ! * non-recursive term's select list, and if we don't convert to text ! * then we'll have a mismatch against the UNION result. * * The column might contain 'foo' COLLATE "bar", so don't override * collation if it's already set. --- 393,402 ---- /* * If the CTE is recursive, force the exposed column type of any ! * "unknown" column to "text". We must deal with this here because ! * we're called on the non-recursive term before there's been any ! * attempt to force unknown output columns to some other type. We ! * have to resolve unknowns before looking at the recursive term. * * The column might contain 'foo' COLLATE "bar", so don't override * collation if it's already set. diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c index f62e45f..db7fb1e 100644 *** a/src/backend/parser/parse_expr.c --- b/src/backend/parser/parse_expr.c *************** transformSubLink(ParseState *pstate, Sub *** 1845,1851 **** /* * OK, let's transform the sub-SELECT. */ ! qtree = parse_sub_analyze(sublink->subselect, pstate, NULL, false); /* * Check that we got something reasonable. Many of these conditions are --- 1845,1851 ---- /* * OK, let's transform the sub-SELECT. */ ! qtree = parse_sub_analyze(sublink->subselect, pstate, NULL, false, true); /* * Check that we got something reasonable. Many of these conditions are diff --git a/src/backend/parser/parse_node.c b/src/backend/parser/parse_node.c index 73e7d65..2a5f147 100644 *** a/src/backend/parser/parse_node.c --- b/src/backend/parser/parse_node.c *************** make_parsestate(ParseState *parentParseS *** 51,56 **** --- 51,57 ---- /* Fill in fields that don't start at null/false/zero */ pstate->p_next_resno = 1; + pstate->p_resolve_unknowns = true; if (parentParseState) { diff --git a/src/backend/parser/parse_target.c b/src/backend/parser/parse_target.c index 081a8dd..2576e31 100644 *** a/src/backend/parser/parse_target.c --- b/src/backend/parser/parse_target.c *************** transformExpressionList(ParseState *psta *** 289,300 **** /* * markTargetListOrigins() * Mark targetlist columns that are simple Vars with the source * table's OID and column number. * ! * Currently, this is done only for SELECT targetlists, since we only ! * need the info if we are going to send it to the frontend. */ void markTargetListOrigins(ParseState *pstate, List *targetlist) --- 289,329 ---- /* + * resolveTargetListUnknowns() + * Convert any unknown-type targetlist entries to type TEXT. + * + * We do this after we've exhausted all other ways of identifying the output + * column types of a query. + */ + void + resolveTargetListUnknowns(ParseState *pstate, List *targetlist) + { + ListCell *l; + + foreach(l, targetlist) + { + TargetEntry *tle = (TargetEntry *) lfirst(l); + Oid restype = exprType((Node *) tle->expr); + + if (restype == UNKNOWNOID) + { + tle->expr = (Expr *) coerce_type(pstate, (Node *) tle->expr, + restype, TEXTOID, -1, + COERCION_IMPLICIT, + COERCE_IMPLICIT_CAST, + -1); + } + } + } + + + /* * markTargetListOrigins() * Mark targetlist columns that are simple Vars with the source * table's OID and column number. * ! * Currently, this is done only for SELECT targetlists and RETURNING lists, ! * since we only need the info if we are going to send it to the frontend. */ void markTargetListOrigins(ParseState *pstate, List *targetlist) diff --git a/src/include/parser/analyze.h b/src/include/parser/analyze.h index 3bf4edd..705f833 100644 *** a/src/include/parser/analyze.h --- b/src/include/parser/analyze.h *************** extern Query *parse_analyze_varparams(No *** 29,35 **** extern Query *parse_sub_analyze(Node *parseTree, ParseState *parentParseState, CommonTableExpr *parentCTE, ! bool locked_from_parent); extern Query *transformTopLevelStmt(ParseState *pstate, Node *parseTree); extern Query *transformStmt(ParseState *pstate, Node *parseTree); --- 29,36 ---- extern Query *parse_sub_analyze(Node *parseTree, ParseState *parentParseState, CommonTableExpr *parentCTE, ! bool locked_from_parent, ! bool resolve_unknowns); extern Query *transformTopLevelStmt(ParseState *pstate, Node *parseTree); extern Query *transformStmt(ParseState *pstate, Node *parseTree); diff --git a/src/include/parser/parse_node.h b/src/include/parser/parse_node.h index 7cdf142..dda108e 100644 *** a/src/include/parser/parse_node.h --- b/src/include/parser/parse_node.h *************** typedef Node *(*CoerceParamHook) (ParseS *** 149,154 **** --- 149,157 ---- * p_locked_from_parent: true if parent query level applies FOR UPDATE/SHARE * to this subquery as a whole. * + * p_resolve_unknowns: resolve unknown-type SELECT output columns as type TEXT + * (this is true by default). + * * p_hasAggs, p_hasWindowFuncs, etc: true if we've found any of the indicated * constructs in the query. * *************** struct ParseState *** 181,186 **** --- 184,191 ---- List *p_locking_clause; /* raw FOR UPDATE/FOR SHARE info */ bool p_locked_from_parent; /* parent has marked this subquery * with FOR UPDATE/FOR SHARE */ + bool p_resolve_unknowns; /* resolve unknown-type SELECT outputs + * as type text */ /* Flags telling about things found in the query: */ bool p_hasAggs; diff --git a/src/include/parser/parse_target.h b/src/include/parser/parse_target.h index e035aac..d06a235 100644 *** a/src/include/parser/parse_target.h --- b/src/include/parser/parse_target.h *************** extern List *transformTargetList(ParseSt *** 21,26 **** --- 21,27 ---- ParseExprKind exprKind); extern List *transformExpressionList(ParseState *pstate, List *exprlist, ParseExprKind exprKind, bool allowDefault); + extern void resolveTargetListUnknowns(ParseState *pstate, List *targetlist); extern void markTargetListOrigins(ParseState *pstate, List *targetlist); extern TargetEntry *transformTargetEntry(ParseState *pstate, Node *node, Node *expr, ParseExprKind exprKind, diff --git a/src/test/regress/expected/with.out b/src/test/regress/expected/with.out index 02fa08e..3b7f689 100644 *** a/src/test/regress/expected/with.out --- b/src/test/regress/expected/with.out *************** SELECT * FROM t LIMIT 10; *** 133,141 **** -- Test behavior with an unknown-type literal in the WITH WITH q AS (SELECT 'foo' AS x) ! SELECT x, x IS OF (unknown) as is_unknown FROM q; ! x | is_unknown ! -----+------------ foo | t (1 row) --- 133,141 ---- -- Test behavior with an unknown-type literal in the WITH WITH q AS (SELECT 'foo' AS x) ! SELECT x, x IS OF (text) AS is_text FROM q; ! x | is_text ! -----+--------- foo | t (1 row) *************** WITH RECURSIVE t(n) AS ( *** 144,150 **** UNION ALL SELECT n || ' bar' FROM t WHERE length(n) < 20 ) ! SELECT n, n IS OF (text) as is_text FROM t; n | is_text -------------------------+--------- foo | t --- 144,150 ---- UNION ALL SELECT n || ' bar' FROM t WHERE length(n) < 20 ) ! SELECT n, n IS OF (text) AS is_text FROM t; n | is_text -------------------------+--------- foo | t *************** SELECT n, n IS OF (text) as is_text FROM *** 155,160 **** --- 155,172 ---- foo bar bar bar bar bar | t (6 rows) + -- In a perfect world, this would work and resolve the literal as int ... + -- but for now, we have to be content with resolving to text too soon. + WITH RECURSIVE t(n) AS ( + SELECT '7' + UNION ALL + SELECT n+1 FROM t WHERE n < 10 + ) + SELECT n, n IS OF (int) AS is_int FROM t; + ERROR: operator does not exist: text + integer + LINE 4: SELECT n+1 FROM t WHERE n < 10 + ^ + HINT: No operator matches the given name and argument type(s). You might need to add explicit type casts. -- -- Some examples with a tree -- diff --git a/src/test/regress/output/create_function_1.source b/src/test/regress/output/create_function_1.source index 30c2936..957595c 100644 *** a/src/test/regress/output/create_function_1.source --- b/src/test/regress/output/create_function_1.source *************** CREATE FUNCTION test_atomic_ops() *** 59,65 **** CREATE FUNCTION test1 (int) RETURNS int LANGUAGE SQL AS 'SELECT ''not an integer'';'; ERROR: return type mismatch in function declared to return integer ! DETAIL: Actual return type is unknown. CONTEXT: SQL function "test1" CREATE FUNCTION test1 (int) RETURNS int LANGUAGE SQL AS 'not even SQL'; --- 59,65 ---- CREATE FUNCTION test1 (int) RETURNS int LANGUAGE SQL AS 'SELECT ''not an integer'';'; ERROR: return type mismatch in function declared to return integer ! DETAIL: Actual return type is text. CONTEXT: SQL function "test1" CREATE FUNCTION test1 (int) RETURNS int LANGUAGE SQL AS 'not even SQL'; diff --git a/src/test/regress/sql/with.sql b/src/test/regress/sql/with.sql index 7ee32ba..08ddc8b 100644 *** a/src/test/regress/sql/with.sql --- b/src/test/regress/sql/with.sql *************** SELECT * FROM t LIMIT 10; *** 69,82 **** -- Test behavior with an unknown-type literal in the WITH WITH q AS (SELECT 'foo' AS x) ! SELECT x, x IS OF (unknown) as is_unknown FROM q; WITH RECURSIVE t(n) AS ( SELECT 'foo' UNION ALL SELECT n || ' bar' FROM t WHERE length(n) < 20 ) ! SELECT n, n IS OF (text) as is_text FROM t; -- -- Some examples with a tree --- 69,91 ---- -- Test behavior with an unknown-type literal in the WITH WITH q AS (SELECT 'foo' AS x) ! SELECT x, x IS OF (text) AS is_text FROM q; WITH RECURSIVE t(n) AS ( SELECT 'foo' UNION ALL SELECT n || ' bar' FROM t WHERE length(n) < 20 ) ! SELECT n, n IS OF (text) AS is_text FROM t; ! ! -- In a perfect world, this would work and resolve the literal as int ... ! -- but for now, we have to be content with resolving to text too soon. ! WITH RECURSIVE t(n) AS ( ! SELECT '7' ! UNION ALL ! SELECT n+1 FROM t WHERE n < 10 ! ) ! SELECT n, n IS OF (int) AS is_int FROM t; -- -- Some examples with a tree -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Assignment of valid collation for SET operations onqueries with UNKNOWN types.
From
Michael Paquier
Date:
On Sun, Jan 8, 2017 at 10:55 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Ashutosh Bapat <ashutosh.bapat@enterprisedb.com> writes: >> On Tue, Jan 3, 2017 at 5:57 PM, Rahila Syed <rahilasyed90@gmail.com> wrote: >>> Are you suggesting extending the patch to include coercing from unknown to >>> text for all possible cases where a column of unknown type is being created? > >> I guess, that's what Tom is suggesting. > > Yes; I think the point is we should change the semantics we assume for > "SELECT 'foo'", not only for views but across the board. There are plenty > of non-view-related cases where it doesn't behave well, for example > > regression=# select * from > (select 'foo' from generate_series(1,3)) ss order by 1; > ERROR: failed to find conversion function from unknown to text > > Furthermore, it seems like a seriously bad idea for "SELECT 'foo'" to mean > something different in the context of CREATE VIEW than it means elsewhere. And this offers the same semantics for any DDL using SELECT's target list to build the list of column's types, which is consistent. > The trick is to not break cases where we've already hacked things to allow > external influence on the resolved types of SELECT-list items. AFAICT, > these cases are just INSERT/SELECT and set operations (eg unions): > > regression=# create table target (f1 int); > CREATE TABLE > regression=# explain verbose insert into target select '42' from generate_series(1,3); > QUERY PLAN > > -------------------------------------------------------------------------------- > --------- > Insert on public.target (cost=0.00..10.00 rows=1000 width=4) > -> Function Scan on pg_catalog.generate_series (cost=0.00..10.00 rows=1000 > width=4) > Output: 42 > Function Call: generate_series(1, 3) > (4 rows) > > regression=# explain verbose select '42' union all select 43; > QUERY PLAN > ------------------------------------------------ > Append (cost=0.00..0.04 rows=2 width=4) > -> Result (cost=0.00..0.01 rows=1 width=4) > Output: 42 > -> Result (cost=0.00..0.01 rows=1 width=4) > Output: 43 > (5 rows) > > In both of the above cases, we're getting an integer constant not a > text or "unknown" constant, because the type information gets imported > from outside the "SELECT". > > I spent some time fooling with this today and came up with the attached > patch. I think this is basically the direction we should go in, but > there are various loose ends: > > 1. I adjusted a couple of existing regression tests whose results are > affected, but didn't do anything towards adding new tests showing the > desirable results of this change (as per the examples in this thread). > > 2. I didn't do anything about docs, either, though maybe no change > is needed. One user-visible change from this is that queries should > never return any "unknown"-type columns to clients anymore. But I > think that is not documented now, so maybe there's nothing to change. > > 3. We need to look at whether pg_upgrade is affected. I think we > might be able to let it just ignore the issue for views, since they'd > get properly updated during the dump and reload anyway. But if someone > had an actual table (or matview) with an "unknown" column, that would > be a pg_upgrade hazard, because "unknown" doesn't store like "text". > > 4. If "unknown" were marked as a pseudotype not base type in pg_type > (ie, typtype "p" not "b") then the special case for it in > CheckAttributeType could go away completely. I'm not really sure > why it's "b" today --- maybe specifically because of this point that > it's currently possible to create tables with unknown columns? But > I've not looked at what all the implications of changing that might > be, and anyway it could be left for a followon patch. > > 5. I noticed that the "resolveUnknown" arguments of transformSortClause > and other functions in parse_clause.c could go away: there's really no > reason not to just treat them as "true" always. But that could be > separate cleanup as well. > > Anybody want to hack on those loose ends? Ashutosh, Rahila, do you have plans to review things here? -- Michael
Re: [HACKERS] Assignment of valid collation for SET operations onqueries with UNKNOWN types.
From
Ashutosh Bapat
Date:
On Wed, Jan 18, 2017 at 10:55 AM, Michael Paquier <michael.paquier@gmail.com> wrote: > On Sun, Jan 8, 2017 at 10:55 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Ashutosh Bapat <ashutosh.bapat@enterprisedb.com> writes: >>> On Tue, Jan 3, 2017 at 5:57 PM, Rahila Syed <rahilasyed90@gmail.com> wrote: >>>> Are you suggesting extending the patch to include coercing from unknown to >>>> text for all possible cases where a column of unknown type is being created? >> >>> I guess, that's what Tom is suggesting. >> >> Yes; I think the point is we should change the semantics we assume for >> "SELECT 'foo'", not only for views but across the board. There are plenty >> of non-view-related cases where it doesn't behave well, for example >> >> regression=# select * from >> (select 'foo' from generate_series(1,3)) ss order by 1; >> ERROR: failed to find conversion function from unknown to text >> >> Furthermore, it seems like a seriously bad idea for "SELECT 'foo'" to mean >> something different in the context of CREATE VIEW than it means elsewhere. > > And this offers the same semantics for any DDL using SELECT's target > list to build the list of column's types, which is consistent. > >> The trick is to not break cases where we've already hacked things to allow >> external influence on the resolved types of SELECT-list items. AFAICT, >> these cases are just INSERT/SELECT and set operations (eg unions): >> >> regression=# create table target (f1 int); >> CREATE TABLE >> regression=# explain verbose insert into target select '42' from generate_series(1,3); >> QUERY PLAN >> >> -------------------------------------------------------------------------------- >> --------- >> Insert on public.target (cost=0.00..10.00 rows=1000 width=4) >> -> Function Scan on pg_catalog.generate_series (cost=0.00..10.00 rows=1000 >> width=4) >> Output: 42 >> Function Call: generate_series(1, 3) >> (4 rows) >> >> regression=# explain verbose select '42' union all select 43; >> QUERY PLAN >> ------------------------------------------------ >> Append (cost=0.00..0.04 rows=2 width=4) >> -> Result (cost=0.00..0.01 rows=1 width=4) >> Output: 42 >> -> Result (cost=0.00..0.01 rows=1 width=4) >> Output: 43 >> (5 rows) >> >> In both of the above cases, we're getting an integer constant not a >> text or "unknown" constant, because the type information gets imported >> from outside the "SELECT". >> >> I spent some time fooling with this today and came up with the attached >> patch. I think this is basically the direction we should go in, but >> there are various loose ends: >> >> 1. I adjusted a couple of existing regression tests whose results are >> affected, but didn't do anything towards adding new tests showing the >> desirable results of this change (as per the examples in this thread). >> >> 2. I didn't do anything about docs, either, though maybe no change >> is needed. One user-visible change from this is that queries should >> never return any "unknown"-type columns to clients anymore. But I >> think that is not documented now, so maybe there's nothing to change. >> >> 3. We need to look at whether pg_upgrade is affected. I think we >> might be able to let it just ignore the issue for views, since they'd >> get properly updated during the dump and reload anyway. But if someone >> had an actual table (or matview) with an "unknown" column, that would >> be a pg_upgrade hazard, because "unknown" doesn't store like "text". >> >> 4. If "unknown" were marked as a pseudotype not base type in pg_type >> (ie, typtype "p" not "b") then the special case for it in >> CheckAttributeType could go away completely. I'm not really sure >> why it's "b" today --- maybe specifically because of this point that >> it's currently possible to create tables with unknown columns? But >> I've not looked at what all the implications of changing that might >> be, and anyway it could be left for a followon patch. >> >> 5. I noticed that the "resolveUnknown" arguments of transformSortClause >> and other functions in parse_clause.c could go away: there's really no >> reason not to just treat them as "true" always. But that could be >> separate cleanup as well. >> >> Anybody want to hack on those loose ends? > > Ashutosh, Rahila, do you have plans to review things here? I won't be able to work on creating patches for TODOs listed by Tom. But I can review if someone volunteers to produce the patches. -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company
Re: [HACKERS] Assignment of valid collation for SET operations on queries with UNKNOWN types.
From
Tom Lane
Date:
I wrote: > I spent some time fooling with this today and came up with the attached > patch. I think this is basically the direction we should go in, but > there are various loose ends: Here's an updated patch that's rebased against today's HEAD and addresses most of the loose ends: > 1. I adjusted a couple of existing regression tests whose results are > affected, but didn't do anything towards adding new tests showing the > desirable results of this change (as per the examples in this thread). Added some regression test cases. These are mostly designed to prove that coercion to text occurs where expected, but I did include a couple of queries that would have failed outright before. > 2. I didn't do anything about docs, either, though maybe no change > is needed. One user-visible change from this is that queries should > never return any "unknown"-type columns to clients anymore. But I > think that is not documented now, so maybe there's nothing to change. The only thing I could find in the SGML docs that directly addresses unknown-type columns was a remark in the CREATE VIEW man page, which I've updated. I also added a section to typeconv.sgml to document the behavior. > 3. We need to look at whether pg_upgrade is affected. I think we > might be able to let it just ignore the issue for views, since they'd > get properly updated during the dump and reload anyway. But if someone > had an actual table (or matview) with an "unknown" column, that would > be a pg_upgrade hazard, because "unknown" doesn't store like "text". I tested and found that simple views with unknown columns seem to update sanely if we just let pg_upgrade dump and restore them. So I suggest we allow that to happen. There might be cases where dependent views behave unexpectedly after such a conversion, but I think that would be rare enough that it's not worth forcing users to fix these views by hand before updating. However, tables with unknown columns would fail the upgrade (since we'd reject the CREATE TABLE command) while matviews would be accepted but then the DDL wouldn't match the physical data storage. So I added code to pg_upgrade to check for those cases and refuse to proceed. This is almost a straight copy-and-paste of the existing pg_upgrade code for checking for "line" columns. I think this is committable now; the other loose ends can be dealt with in follow-on patches. Does anyone want to review it? regards, tom lane diff --git a/doc/src/sgml/ref/create_view.sgml b/doc/src/sgml/ref/create_view.sgml index 8641e19..a83d956 100644 *** a/doc/src/sgml/ref/create_view.sgml --- b/doc/src/sgml/ref/create_view.sgml *************** CREATE VIEW [ <replaceable>schema</> . ] *** 251,259 **** <programlisting> CREATE VIEW vista AS SELECT 'Hello World'; </programlisting> ! is bad form in two ways: the column name defaults to <literal>?column?</>, ! and the column data type defaults to <type>unknown</>. If you want a ! string literal in a view's result, use something like: <programlisting> CREATE VIEW vista AS SELECT text 'Hello World' AS hello; </programlisting> --- 251,260 ---- <programlisting> CREATE VIEW vista AS SELECT 'Hello World'; </programlisting> ! is bad form because the column name defaults to <literal>?column?</>; ! also, the column data type defaults to <type>text</>, which might not ! be what you wanted. Better style for a string literal in a view's ! result is something like: <programlisting> CREATE VIEW vista AS SELECT text 'Hello World' AS hello; </programlisting> diff --git a/doc/src/sgml/typeconv.sgml b/doc/src/sgml/typeconv.sgml index c031c01..63d41f0 100644 *** a/doc/src/sgml/typeconv.sgml --- b/doc/src/sgml/typeconv.sgml *************** domain's base type for all subsequent st *** 984,990 **** <para> If all inputs are of type <type>unknown</type>, resolve as type <type>text</type> (the preferred type of the string category). ! Otherwise, <type>unknown</type> inputs are ignored. </para> </step> --- 984,991 ---- <para> If all inputs are of type <type>unknown</type>, resolve as type <type>text</type> (the preferred type of the string category). ! Otherwise, <type>unknown</type> inputs are ignored for the purposes ! of the remaining rules. </para> </step> *************** but <type>integer</> can be implicitly c *** 1076,1081 **** --- 1077,1129 ---- result type is resolved as <type>real</>. </para> </example> + </sect1> + + <sect1 id="typeconv-select"> + <title><literal>SELECT</literal> Output Columns</title> + + <indexterm zone="typeconv-select"> + <primary>SELECT</primary> + <secondary>determination of result type</secondary> + </indexterm> + + <para> + The rules given in the preceding sections will result in assignment + of non-<type>unknown</> data types to all expressions in a SQL query, + except for unspecified-type literals that appear as simple output + columns of a <command>SELECT</> command. For example, in + + <screen> + SELECT 'Hello World'; + </screen> + + there is nothing to identify what type the string literal should be + taken as. In this situation <productname>PostgreSQL</> will fall back + to resolving the literal's type as <type>text</>. + </para> + + <para> + When the <command>SELECT</> is one arm of a <literal>UNION</> + (or <literal>INTERSECT</> or <literal>EXCEPT</>) construct, or when it + appears within <command>INSERT ... SELECT</>, this rule is not applied + since rules given in preceding sections take precedence. The type of an + unspecified-type literal can be taken from the other <literal>UNION</> arm + in the first case, or from the destination column in the second case. + </para> + + <para> + <literal>RETURNING</> lists are treated the same as <command>SELECT</> + output lists for this purpose. + </para> + + <note> + <para> + Prior to <productname>PostgreSQL</> 10, this rule did not exist, and + unspecified-type literals in a <command>SELECT</> output list were + left as type <type>unknown</>. That had assorted bad consequences, + so it's been changed. + </para> + </note> </sect1> </chapter> diff --git a/src/backend/catalog/heap.c b/src/backend/catalog/heap.c index bfc54a8..af6ba47 100644 *** a/src/backend/catalog/heap.c --- b/src/backend/catalog/heap.c *************** CheckAttributeType(const char *attname, *** 490,507 **** char att_typtype = get_typtype(atttypid); Oid att_typelem; ! if (atttypid == UNKNOWNOID) ! { ! /* ! * Warn user, but don't fail, if column to be created has UNKNOWN type ! * (usually as a result of a 'retrieve into' - jolly) ! */ ! ereport(WARNING, ! (errcode(ERRCODE_INVALID_TABLE_DEFINITION), ! errmsg("column \"%s\" has type %s", attname, "unknown"), ! errdetail("Proceeding with relation creation anyway."))); ! } ! else if (att_typtype == TYPTYPE_PSEUDO) { /* * Refuse any attempt to create a pseudo-type column, except for a --- 490,497 ---- char att_typtype = get_typtype(atttypid); Oid att_typelem; ! if (atttypid == UNKNOWNOID || ! att_typtype == TYPTYPE_PSEUDO) { /* * Refuse any attempt to create a pseudo-type column, except for a diff --git a/src/backend/parser/analyze.c b/src/backend/parser/analyze.c index a02a77a..f954dc1 100644 *** a/src/backend/parser/analyze.c --- b/src/backend/parser/analyze.c *************** parse_analyze_varparams(RawStmt *parseTr *** 156,168 **** Query * parse_sub_analyze(Node *parseTree, ParseState *parentParseState, CommonTableExpr *parentCTE, ! bool locked_from_parent) { ParseState *pstate = make_parsestate(parentParseState); Query *query; pstate->p_parent_cte = parentCTE; pstate->p_locked_from_parent = locked_from_parent; query = transformStmt(pstate, parseTree); --- 156,170 ---- Query * parse_sub_analyze(Node *parseTree, ParseState *parentParseState, CommonTableExpr *parentCTE, ! bool locked_from_parent, ! bool resolve_unknowns) { ParseState *pstate = make_parsestate(parentParseState); Query *query; pstate->p_parent_cte = parentCTE; pstate->p_locked_from_parent = locked_from_parent; + pstate->p_resolve_unknowns = resolve_unknowns; query = transformStmt(pstate, parseTree); *************** transformInsertStmt(ParseState *pstate, *** 570,579 **** --- 572,588 ---- * otherwise the behavior of SELECT within INSERT might be different * from a stand-alone SELECT. (Indeed, Postgres up through 6.5 had * bugs of just that nature...) + * + * The sole exception is that we prevent resolving unknown-type + * outputs as TEXT. This does not change the semantics since if the + * column type matters semantically, it would have been resolved to + * something else anyway. Doing this lets us resolve such outputs as + * the target column's type, which we handle below. */ sub_pstate->p_rtable = sub_rtable; sub_pstate->p_joinexprs = NIL; /* sub_rtable has no joins */ sub_pstate->p_namespace = sub_namespace; + sub_pstate->p_resolve_unknowns = false; selectQuery = transformStmt(sub_pstate, stmt->selectStmt); *************** transformSelectStmt(ParseState *pstate, *** 1269,1274 **** --- 1278,1287 ---- pstate->p_windowdefs, &qry->targetList); + /* resolve any still-unresolved output columns as being type text */ + if (pstate->p_resolve_unknowns) + resolveTargetListUnknowns(pstate, qry->targetList); + qry->rtable = pstate->p_rtable; qry->jointree = makeFromExpr(pstate->p_joinlist, qual); *************** transformSetOperationTree(ParseState *ps *** 1843,1853 **** /* * Transform SelectStmt into a Query. * * Note: previously transformed sub-queries don't affect the parsing * of this sub-query, because they are not in the toplevel pstate's * namespace list. */ ! selectQuery = parse_sub_analyze((Node *) stmt, pstate, NULL, false); /* * Check for bogus references to Vars on the current query level (but --- 1856,1874 ---- /* * Transform SelectStmt into a Query. * + * This works the same as SELECT transformation normally would, except + * that we prevent resolving unknown-type outputs as TEXT. This does + * not change the subquery's semantics since if the column type + * matters semantically, it would have been resolved to something else + * anyway. Doing this lets us resolve such outputs using + * select_common_type(), below. + * * Note: previously transformed sub-queries don't affect the parsing * of this sub-query, because they are not in the toplevel pstate's * namespace list. */ ! selectQuery = parse_sub_analyze((Node *) stmt, pstate, ! NULL, false, false); /* * Check for bogus references to Vars on the current query level (but *************** transformReturningList(ParseState *pstat *** 2350,2355 **** --- 2371,2380 ---- /* mark column origins */ markTargetListOrigins(pstate, rlist); + /* resolve any still-unresolved output columns as being type text */ + if (pstate->p_resolve_unknowns) + resolveTargetListUnknowns(pstate, rlist); + /* restore state */ pstate->p_next_resno = save_next_resno; diff --git a/src/backend/parser/parse_clause.c b/src/backend/parser/parse_clause.c index 624ab41..4769e78 100644 *** a/src/backend/parser/parse_clause.c --- b/src/backend/parser/parse_clause.c *************** transformRangeSubselect(ParseState *psta *** 471,477 **** * Analyze and transform the subquery. */ query = parse_sub_analyze(r->subquery, pstate, NULL, ! isLockedRefname(pstate, r->alias->aliasname)); /* Restore state */ pstate->p_lateral_active = false; --- 471,478 ---- * Analyze and transform the subquery. */ query = parse_sub_analyze(r->subquery, pstate, NULL, ! isLockedRefname(pstate, r->alias->aliasname), ! true); /* Restore state */ pstate->p_lateral_active = false; diff --git a/src/backend/parser/parse_cte.c b/src/backend/parser/parse_cte.c index fc8c15b..dfbcaa2 100644 *** a/src/backend/parser/parse_cte.c --- b/src/backend/parser/parse_cte.c *************** analyzeCTE(ParseState *pstate, CommonTab *** 241,247 **** /* Analysis not done already */ Assert(!IsA(cte->ctequery, Query)); ! query = parse_sub_analyze(cte->ctequery, pstate, cte, false); cte->ctequery = (Node *) query; /* --- 241,247 ---- /* Analysis not done already */ Assert(!IsA(cte->ctequery, Query)); ! query = parse_sub_analyze(cte->ctequery, pstate, cte, false, true); cte->ctequery = (Node *) query; /* *************** analyzeCTETargetList(ParseState *pstate, *** 393,403 **** /* * If the CTE is recursive, force the exposed column type of any ! * "unknown" column to "text". This corresponds to the fact that ! * SELECT 'foo' UNION SELECT 'bar' will ultimately produce text. We ! * might see "unknown" as a result of an untyped literal in the ! * non-recursive term's select list, and if we don't convert to text ! * then we'll have a mismatch against the UNION result. * * The column might contain 'foo' COLLATE "bar", so don't override * collation if it's already set. --- 393,402 ---- /* * If the CTE is recursive, force the exposed column type of any ! * "unknown" column to "text". We must deal with this here because ! * we're called on the non-recursive term before there's been any ! * attempt to force unknown output columns to some other type. We ! * have to resolve unknowns before looking at the recursive term. * * The column might contain 'foo' COLLATE "bar", so don't override * collation if it's already set. diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c index add3be6..c43ef19 100644 *** a/src/backend/parser/parse_expr.c --- b/src/backend/parser/parse_expr.c *************** transformSubLink(ParseState *pstate, Sub *** 1846,1852 **** /* * OK, let's transform the sub-SELECT. */ ! qtree = parse_sub_analyze(sublink->subselect, pstate, NULL, false); /* * Check that we got a SELECT. Anything else should be impossible given --- 1846,1852 ---- /* * OK, let's transform the sub-SELECT. */ ! qtree = parse_sub_analyze(sublink->subselect, pstate, NULL, false, true); /* * Check that we got a SELECT. Anything else should be impossible given diff --git a/src/backend/parser/parse_node.c b/src/backend/parser/parse_node.c index 73e7d65..2a5f147 100644 *** a/src/backend/parser/parse_node.c --- b/src/backend/parser/parse_node.c *************** make_parsestate(ParseState *parentParseS *** 51,56 **** --- 51,57 ---- /* Fill in fields that don't start at null/false/zero */ pstate->p_next_resno = 1; + pstate->p_resolve_unknowns = true; if (parentParseState) { diff --git a/src/backend/parser/parse_target.c b/src/backend/parser/parse_target.c index 081a8dd..2576e31 100644 *** a/src/backend/parser/parse_target.c --- b/src/backend/parser/parse_target.c *************** transformExpressionList(ParseState *psta *** 289,300 **** /* * markTargetListOrigins() * Mark targetlist columns that are simple Vars with the source * table's OID and column number. * ! * Currently, this is done only for SELECT targetlists, since we only ! * need the info if we are going to send it to the frontend. */ void markTargetListOrigins(ParseState *pstate, List *targetlist) --- 289,329 ---- /* + * resolveTargetListUnknowns() + * Convert any unknown-type targetlist entries to type TEXT. + * + * We do this after we've exhausted all other ways of identifying the output + * column types of a query. + */ + void + resolveTargetListUnknowns(ParseState *pstate, List *targetlist) + { + ListCell *l; + + foreach(l, targetlist) + { + TargetEntry *tle = (TargetEntry *) lfirst(l); + Oid restype = exprType((Node *) tle->expr); + + if (restype == UNKNOWNOID) + { + tle->expr = (Expr *) coerce_type(pstate, (Node *) tle->expr, + restype, TEXTOID, -1, + COERCION_IMPLICIT, + COERCE_IMPLICIT_CAST, + -1); + } + } + } + + + /* * markTargetListOrigins() * Mark targetlist columns that are simple Vars with the source * table's OID and column number. * ! * Currently, this is done only for SELECT targetlists and RETURNING lists, ! * since we only need the info if we are going to send it to the frontend. */ void markTargetListOrigins(ParseState *pstate, List *targetlist) diff --git a/src/bin/pg_upgrade/check.c b/src/bin/pg_upgrade/check.c index b6efad4..ff1d7a2 100644 *** a/src/bin/pg_upgrade/check.c --- b/src/bin/pg_upgrade/check.c *************** check_and_dump_old_cluster(bool live_che *** 99,104 **** --- 99,108 ---- check_for_reg_data_type_usage(&old_cluster); check_for_isn_and_int8_passing_mismatch(&old_cluster); + /* Pre-PG 10 allowed tables with 'unknown' type columns */ + if (GET_MAJOR_VERSION(old_cluster.major_version) <= 906) + old_9_6_check_for_unknown_data_type_usage(&old_cluster); + /* 9.5 and below should not have roles starting with pg_ */ if (GET_MAJOR_VERSION(old_cluster.major_version) <= 905) check_for_pg_role_prefix(&old_cluster); diff --git a/src/bin/pg_upgrade/pg_upgrade.h b/src/bin/pg_upgrade/pg_upgrade.h index 42e7aeb..968ab8a 100644 *** a/src/bin/pg_upgrade/pg_upgrade.h --- b/src/bin/pg_upgrade/pg_upgrade.h *************** void pg_putenv(const char *var, const c *** 442,447 **** --- 442,448 ---- void new_9_0_populate_pg_largeobject_metadata(ClusterInfo *cluster, bool check_mode); void old_9_3_check_for_line_data_type_usage(ClusterInfo *cluster); + void old_9_6_check_for_unknown_data_type_usage(ClusterInfo *cluster); /* parallel.c */ void parallel_exec_prog(const char *log_file, const char *opt_log_file, diff --git a/src/bin/pg_upgrade/version.c b/src/bin/pg_upgrade/version.c index fb56fea..aa462da 100644 *** a/src/bin/pg_upgrade/version.c --- b/src/bin/pg_upgrade/version.c *************** old_9_3_check_for_line_data_type_usage(C *** 185,187 **** --- 185,284 ---- else check_ok(); } + + + /* + * old_9_6_check_for_unknown_data_type_usage() + * 9.6 -> 10 + * It's no longer allowed to create tables or views with "unknown"-type + * columns. We do not complain about views with such columns, because + * they should get silently converted to "text" columns during the DDL + * dump and reload; it seems unlikely to be worth making users do that + * by hand. However, if there's a table with such a column, the DDL + * reload will fail, so we should pre-detect that rather than failing + * mid-upgrade. Worse, if there's a matview with such a column, the + * DDL reload will silently change it to "text" which won't match the + * on-disk storage (which is like "cstring"). So we *must* reject that. + * Also check composite types, in case they are used for table columns. + * We needn't check indexes, because "unknown" has no opclasses. + */ + void + old_9_6_check_for_unknown_data_type_usage(ClusterInfo *cluster) + { + int dbnum; + FILE *script = NULL; + bool found = false; + char output_path[MAXPGPATH]; + + prep_status("Checking for invalid \"unknown\" user columns"); + + snprintf(output_path, sizeof(output_path), "tables_using_unknown.txt"); + + for (dbnum = 0; dbnum < cluster->dbarr.ndbs; dbnum++) + { + PGresult *res; + bool db_used = false; + int ntups; + int rowno; + int i_nspname, + i_relname, + i_attname; + DbInfo *active_db = &cluster->dbarr.dbs[dbnum]; + PGconn *conn = connectToServer(cluster, active_db->db_name); + + res = executeQueryOrDie(conn, + "SELECT n.nspname, c.relname, a.attname " + "FROM pg_catalog.pg_class c, " + " pg_catalog.pg_namespace n, " + " pg_catalog.pg_attribute a " + "WHERE c.oid = a.attrelid AND " + " NOT a.attisdropped AND " + " a.atttypid = 'pg_catalog.unknown'::pg_catalog.regtype AND " + " c.relkind IN ('r', 'c', 'm') AND " + " c.relnamespace = n.oid AND " + /* exclude possible orphaned temp tables */ + " n.nspname !~ '^pg_temp_' AND " + " n.nspname !~ '^pg_toast_temp_' AND " + " n.nspname NOT IN ('pg_catalog', 'information_schema')"); + + ntups = PQntuples(res); + i_nspname = PQfnumber(res, "nspname"); + i_relname = PQfnumber(res, "relname"); + i_attname = PQfnumber(res, "attname"); + for (rowno = 0; rowno < ntups; rowno++) + { + found = true; + if (script == NULL && (script = fopen_priv(output_path, "w")) == NULL) + pg_fatal("could not open file \"%s\": %s\n", output_path, + strerror(errno)); + if (!db_used) + { + fprintf(script, "Database: %s\n", active_db->db_name); + db_used = true; + } + fprintf(script, " %s.%s.%s\n", + PQgetvalue(res, rowno, i_nspname), + PQgetvalue(res, rowno, i_relname), + PQgetvalue(res, rowno, i_attname)); + } + + PQclear(res); + + PQfinish(conn); + } + + if (script) + fclose(script); + + if (found) + { + pg_log(PG_REPORT, "fatal\n"); + pg_fatal("Your installation contains the \"unknown\" data type in user tables. This\n" + "data type is no longer allowed in tables, so this cluster cannot currently\n" + "be upgraded. You can remove the problem tables and restart the upgrade.\n" + "A list of the problem columns is in the file:\n" + " %s\n\n", output_path); + } + else + check_ok(); + } diff --git a/src/include/parser/analyze.h b/src/include/parser/analyze.h index a7e5c55..1725940 100644 *** a/src/include/parser/analyze.h --- b/src/include/parser/analyze.h *************** extern Query *parse_analyze_varparams(Ra *** 29,35 **** extern Query *parse_sub_analyze(Node *parseTree, ParseState *parentParseState, CommonTableExpr *parentCTE, ! bool locked_from_parent); extern Query *transformTopLevelStmt(ParseState *pstate, RawStmt *parseTree); extern Query *transformStmt(ParseState *pstate, Node *parseTree); --- 29,36 ---- extern Query *parse_sub_analyze(Node *parseTree, ParseState *parentParseState, CommonTableExpr *parentCTE, ! bool locked_from_parent, ! bool resolve_unknowns); extern Query *transformTopLevelStmt(ParseState *pstate, RawStmt *parseTree); extern Query *transformStmt(ParseState *pstate, Node *parseTree); diff --git a/src/include/parser/parse_node.h b/src/include/parser/parse_node.h index bc3eea9..3a25d95 100644 *** a/src/include/parser/parse_node.h --- b/src/include/parser/parse_node.h *************** typedef Node *(*CoerceParamHook) (ParseS *** 150,155 **** --- 150,158 ---- * p_locked_from_parent: true if parent query level applies FOR UPDATE/SHARE * to this subquery as a whole. * + * p_resolve_unknowns: resolve unknown-type SELECT output columns as type TEXT + * (this is true by default). + * * p_hasAggs, p_hasWindowFuncs, etc: true if we've found any of the indicated * constructs in the query. * *************** struct ParseState *** 182,187 **** --- 185,192 ---- List *p_locking_clause; /* raw FOR UPDATE/FOR SHARE info */ bool p_locked_from_parent; /* parent has marked this subquery * with FOR UPDATE/FOR SHARE */ + bool p_resolve_unknowns; /* resolve unknown-type SELECT outputs + * as type text */ /* Flags telling about things found in the query: */ bool p_hasAggs; diff --git a/src/include/parser/parse_target.h b/src/include/parser/parse_target.h index e035aac..d06a235 100644 *** a/src/include/parser/parse_target.h --- b/src/include/parser/parse_target.h *************** extern List *transformTargetList(ParseSt *** 21,26 **** --- 21,27 ---- ParseExprKind exprKind); extern List *transformExpressionList(ParseState *pstate, List *exprlist, ParseExprKind exprKind, bool allowDefault); + extern void resolveTargetListUnknowns(ParseState *pstate, List *targetlist); extern void markTargetListOrigins(ParseState *pstate, List *targetlist); extern TargetEntry *transformTargetEntry(ParseState *pstate, Node *node, Node *expr, ParseExprKind exprKind, diff --git a/src/test/regress/expected/create_table.out b/src/test/regress/expected/create_table.out index 6caa9c2..36266f0 100644 *** a/src/test/regress/expected/create_table.out --- b/src/test/regress/expected/create_table.out *************** CREATE TABLE array_index_op_test ( *** 199,204 **** --- 199,212 ---- CREATE TABLE testjsonb ( j jsonb ); + CREATE TABLE unknowntab ( + u unknown -- fail + ); + ERROR: column "u" has pseudo-type unknown + CREATE TYPE unknown_comptype AS ( + u unknown -- fail + ); + ERROR: column "u" has pseudo-type unknown CREATE TABLE IF NOT EXISTS test_tsvector( t text, a tsvector diff --git a/src/test/regress/expected/create_view.out b/src/test/regress/expected/create_view.out index 096bfc3..ce0c8ce 100644 *** a/src/test/regress/expected/create_view.out --- b/src/test/regress/expected/create_view.out *************** SELECT relname, relkind, reloptions FROM *** 288,293 **** --- 288,319 ---- mysecview4 | v | {security_barrier=false} (4 rows) + -- Check that unknown literals are converted to "text" in CREATE VIEW, + -- so that we don't end up with unknown-type columns. + CREATE VIEW unspecified_types AS + SELECT 42 as i, 42.5 as num, 'foo' as u, 'foo'::unknown as u2, null as n; + \d+ unspecified_types + View "testviewschm2.unspecified_types" + Column | Type | Collation | Nullable | Default | Storage | Description + --------+---------+-----------+----------+---------+----------+------------- + i | integer | | | | plain | + num | numeric | | | | main | + u | text | | | | extended | + u2 | text | | | | extended | + n | text | | | | extended | + View definition: + SELECT 42 AS i, + 42.5 AS num, + 'foo'::text AS u, + 'foo'::text AS u2, + NULL::text AS n; + + SELECT * FROM unspecified_types; + i | num | u | u2 | n + ----+------+-----+-----+--- + 42 | 42.5 | foo | foo | + (1 row) + -- This test checks that proper typmods are assigned in a multi-row VALUES CREATE VIEW tt1 AS SELECT * FROM ( diff --git a/src/test/regress/expected/matview.out b/src/test/regress/expected/matview.out index 7a2eaa0..4ae4460 100644 *** a/src/test/regress/expected/matview.out --- b/src/test/regress/expected/matview.out *************** DETAIL: drop cascades to materialized v *** 508,513 **** --- 508,540 ---- drop cascades to materialized view mvtest_mv_v_2 drop cascades to materialized view mvtest_mv_v_3 drop cascades to materialized view mvtest_mv_v_4 + -- Check that unknown literals are converted to "text" in CREATE MATVIEW, + -- so that we don't end up with unknown-type columns. + CREATE MATERIALIZED VIEW mv_unspecified_types AS + SELECT 42 as i, 42.5 as num, 'foo' as u, 'foo'::unknown as u2, null as n; + \d+ mv_unspecified_types + Materialized view "public.mv_unspecified_types" + Column | Type | Collation | Nullable | Default | Storage | Stats target | Description + --------+---------+-----------+----------+---------+----------+--------------+------------- + i | integer | | | | plain | | + num | numeric | | | | main | | + u | text | | | | extended | | + u2 | text | | | | extended | | + n | text | | | | extended | | + View definition: + SELECT 42 AS i, + 42.5 AS num, + 'foo'::text AS u, + 'foo'::text AS u2, + NULL::text AS n; + + SELECT * FROM mv_unspecified_types; + i | num | u | u2 | n + ----+------+-----+-----+--- + 42 | 42.5 | foo | foo | + (1 row) + + DROP MATERIALIZED VIEW mv_unspecified_types; -- make sure that create WITH NO DATA does not plan the query (bug #13907) create materialized view mvtest_error as select 1/0 as x; -- fail ERROR: division by zero diff --git a/src/test/regress/expected/subselect.out b/src/test/regress/expected/subselect.out index abd3217..47afdc3 100644 *** a/src/test/regress/expected/subselect.out --- b/src/test/regress/expected/subselect.out *************** SELECT '' AS five, f1 AS "Correlated Fie *** 196,201 **** --- 196,232 ---- | 3 (5 rows) + -- Unspecified-type literals in output columns should resolve as text + SELECT *, pg_typeof(f1) FROM + (SELECT 'foo' AS f1 FROM generate_series(1,3)) ss ORDER BY 1; + f1 | pg_typeof + -----+----------- + foo | text + foo | text + foo | text + (3 rows) + + -- ... unless there's context to suggest differently + explain verbose select '42' union all select '43'; + QUERY PLAN + ------------------------------------------------- + Append (cost=0.00..0.04 rows=2 width=32) + -> Result (cost=0.00..0.01 rows=1 width=32) + Output: '42'::text + -> Result (cost=0.00..0.01 rows=1 width=32) + Output: '43'::text + (5 rows) + + explain verbose select '42' union all select 43; + QUERY PLAN + ------------------------------------------------ + Append (cost=0.00..0.04 rows=2 width=4) + -> Result (cost=0.00..0.01 rows=1 width=4) + Output: 42 + -> Result (cost=0.00..0.01 rows=1 width=4) + Output: 43 + (5 rows) + -- -- Use some existing tables in the regression test -- diff --git a/src/test/regress/expected/with.out b/src/test/regress/expected/with.out index 02fa08e..3b7f689 100644 *** a/src/test/regress/expected/with.out --- b/src/test/regress/expected/with.out *************** SELECT * FROM t LIMIT 10; *** 133,141 **** -- Test behavior with an unknown-type literal in the WITH WITH q AS (SELECT 'foo' AS x) ! SELECT x, x IS OF (unknown) as is_unknown FROM q; ! x | is_unknown ! -----+------------ foo | t (1 row) --- 133,141 ---- -- Test behavior with an unknown-type literal in the WITH WITH q AS (SELECT 'foo' AS x) ! SELECT x, x IS OF (text) AS is_text FROM q; ! x | is_text ! -----+--------- foo | t (1 row) *************** WITH RECURSIVE t(n) AS ( *** 144,150 **** UNION ALL SELECT n || ' bar' FROM t WHERE length(n) < 20 ) ! SELECT n, n IS OF (text) as is_text FROM t; n | is_text -------------------------+--------- foo | t --- 144,150 ---- UNION ALL SELECT n || ' bar' FROM t WHERE length(n) < 20 ) ! SELECT n, n IS OF (text) AS is_text FROM t; n | is_text -------------------------+--------- foo | t *************** SELECT n, n IS OF (text) as is_text FROM *** 155,160 **** --- 155,172 ---- foo bar bar bar bar bar | t (6 rows) + -- In a perfect world, this would work and resolve the literal as int ... + -- but for now, we have to be content with resolving to text too soon. + WITH RECURSIVE t(n) AS ( + SELECT '7' + UNION ALL + SELECT n+1 FROM t WHERE n < 10 + ) + SELECT n, n IS OF (int) AS is_int FROM t; + ERROR: operator does not exist: text + integer + LINE 4: SELECT n+1 FROM t WHERE n < 10 + ^ + HINT: No operator matches the given name and argument type(s). You might need to add explicit type casts. -- -- Some examples with a tree -- diff --git a/src/test/regress/output/create_function_1.source b/src/test/regress/output/create_function_1.source index 30c2936..957595c 100644 *** a/src/test/regress/output/create_function_1.source --- b/src/test/regress/output/create_function_1.source *************** CREATE FUNCTION test_atomic_ops() *** 59,65 **** CREATE FUNCTION test1 (int) RETURNS int LANGUAGE SQL AS 'SELECT ''not an integer'';'; ERROR: return type mismatch in function declared to return integer ! DETAIL: Actual return type is unknown. CONTEXT: SQL function "test1" CREATE FUNCTION test1 (int) RETURNS int LANGUAGE SQL AS 'not even SQL'; --- 59,65 ---- CREATE FUNCTION test1 (int) RETURNS int LANGUAGE SQL AS 'SELECT ''not an integer'';'; ERROR: return type mismatch in function declared to return integer ! DETAIL: Actual return type is text. CONTEXT: SQL function "test1" CREATE FUNCTION test1 (int) RETURNS int LANGUAGE SQL AS 'not even SQL'; diff --git a/src/test/regress/sql/create_table.sql b/src/test/regress/sql/create_table.sql index 8242e73..6314aa4 100644 *** a/src/test/regress/sql/create_table.sql --- b/src/test/regress/sql/create_table.sql *************** CREATE TABLE testjsonb ( *** 236,241 **** --- 236,249 ---- j jsonb ); + CREATE TABLE unknowntab ( + u unknown -- fail + ); + + CREATE TYPE unknown_comptype AS ( + u unknown -- fail + ); + CREATE TABLE IF NOT EXISTS test_tsvector( t text, a tsvector diff --git a/src/test/regress/sql/create_view.sql b/src/test/regress/sql/create_view.sql index 5fe8b94..c27f103 100644 *** a/src/test/regress/sql/create_view.sql --- b/src/test/regress/sql/create_view.sql *************** SELECT relname, relkind, reloptions FROM *** 224,229 **** --- 224,237 ---- 'mysecview3'::regclass, 'mysecview4'::regclass) ORDER BY relname; + -- Check that unknown literals are converted to "text" in CREATE VIEW, + -- so that we don't end up with unknown-type columns. + + CREATE VIEW unspecified_types AS + SELECT 42 as i, 42.5 as num, 'foo' as u, 'foo'::unknown as u2, null as n; + \d+ unspecified_types + SELECT * FROM unspecified_types; + -- This test checks that proper typmods are assigned in a multi-row VALUES CREATE VIEW tt1 AS diff --git a/src/test/regress/sql/matview.sql b/src/test/regress/sql/matview.sql index 65a743c..1164b4c 100644 *** a/src/test/regress/sql/matview.sql --- b/src/test/regress/sql/matview.sql *************** SELECT * FROM mvtest_mv_v_3; *** 198,203 **** --- 198,211 ---- SELECT * FROM mvtest_mv_v_4; DROP TABLE mvtest_v CASCADE; + -- Check that unknown literals are converted to "text" in CREATE MATVIEW, + -- so that we don't end up with unknown-type columns. + CREATE MATERIALIZED VIEW mv_unspecified_types AS + SELECT 42 as i, 42.5 as num, 'foo' as u, 'foo'::unknown as u2, null as n; + \d+ mv_unspecified_types + SELECT * FROM mv_unspecified_types; + DROP MATERIALIZED VIEW mv_unspecified_types; + -- make sure that create WITH NO DATA does not plan the query (bug #13907) create materialized view mvtest_error as select 1/0 as x; -- fail create materialized view mvtest_error as select 1/0 as x with no data; diff --git a/src/test/regress/sql/subselect.sql b/src/test/regress/sql/subselect.sql index 08eb825..9c2a73d 100644 *** a/src/test/regress/sql/subselect.sql --- b/src/test/regress/sql/subselect.sql *************** SELECT '' AS five, f1 AS "Correlated Fie *** 80,85 **** --- 80,95 ---- WHERE (f1, f2) IN (SELECT f2, CAST(f3 AS int4) FROM SUBSELECT_TBL WHERE f3 IS NOT NULL); + -- Unspecified-type literals in output columns should resolve as text + + SELECT *, pg_typeof(f1) FROM + (SELECT 'foo' AS f1 FROM generate_series(1,3)) ss ORDER BY 1; + + -- ... unless there's context to suggest differently + + explain verbose select '42' union all select '43'; + explain verbose select '42' union all select 43; + -- -- Use some existing tables in the regression test -- diff --git a/src/test/regress/sql/with.sql b/src/test/regress/sql/with.sql index 7ee32ba..08ddc8b 100644 *** a/src/test/regress/sql/with.sql --- b/src/test/regress/sql/with.sql *************** SELECT * FROM t LIMIT 10; *** 69,82 **** -- Test behavior with an unknown-type literal in the WITH WITH q AS (SELECT 'foo' AS x) ! SELECT x, x IS OF (unknown) as is_unknown FROM q; WITH RECURSIVE t(n) AS ( SELECT 'foo' UNION ALL SELECT n || ' bar' FROM t WHERE length(n) < 20 ) ! SELECT n, n IS OF (text) as is_text FROM t; -- -- Some examples with a tree --- 69,91 ---- -- Test behavior with an unknown-type literal in the WITH WITH q AS (SELECT 'foo' AS x) ! SELECT x, x IS OF (text) AS is_text FROM q; WITH RECURSIVE t(n) AS ( SELECT 'foo' UNION ALL SELECT n || ' bar' FROM t WHERE length(n) < 20 ) ! SELECT n, n IS OF (text) AS is_text FROM t; ! ! -- In a perfect world, this would work and resolve the literal as int ... ! -- but for now, we have to be content with resolving to text too soon. ! WITH RECURSIVE t(n) AS ( ! SELECT '7' ! UNION ALL ! SELECT n+1 FROM t WHERE n < 10 ! ) ! SELECT n, n IS OF (int) AS is_int FROM t; -- -- Some examples with a tree -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Assignment of valid collation for SET operations onqueries with UNKNOWN types.
From
Michael Paquier
Date:
On Mon, Jan 23, 2017 at 7:53 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> 2. I didn't do anything about docs, either, though maybe no change >> is needed. One user-visible change from this is that queries should >> never return any "unknown"-type columns to clients anymore. But I >> think that is not documented now, so maybe there's nothing to change. > > The only thing I could find in the SGML docs that directly addresses > unknown-type columns was a remark in the CREATE VIEW man page, which > I've updated. I also added a section to typeconv.sgml to document > the behavior. This looks good. >> 3. We need to look at whether pg_upgrade is affected. I think we >> might be able to let it just ignore the issue for views, since they'd >> get properly updated during the dump and reload anyway. But if someone >> had an actual table (or matview) with an "unknown" column, that would >> be a pg_upgrade hazard, because "unknown" doesn't store like "text". > > I tested and found that simple views with unknown columns seem to update > sanely if we just let pg_upgrade dump and restore them. So I suggest we > allow that to happen. There might be cases where dependent views behave > unexpectedly after such a conversion, but I think that would be rare > enough that it's not worth forcing users to fix these views by hand before > updating. However, tables with unknown columns would fail the upgrade > (since we'd reject the CREATE TABLE command) while matviews would be > accepted but then the DDL wouldn't match the physical data storage. > So I added code to pg_upgrade to check for those cases and refuse to > proceed. This is almost a straight copy-and-paste of the existing > pg_upgrade code for checking for "line" columns. > > I think this is committable now; the other loose ends can be dealt > with in follow-on patches. Does anyone want to review it? I have spent a couple of hours looking at it, and it looks in pretty good shape. The concept of doing the checks in the parser is far cleaner than what was proposed upthread to tweak the column lists, and more generic. One thing though: even with this patch, it is still possible to define a domain with unknown as underlying type and have a table grab it: create domain name as unknown; create table foo_name (a name); I think that this case should be restricted as well, and pg_upgrade had better complain with a lookup at typbasetype in pg_type. -- Michael
Re: [HACKERS] Assignment of valid collation for SET operations onqueries with UNKNOWN types.
From
Ashutosh Bapat
Date:
On Mon, Jan 23, 2017 at 4:23 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > I wrote: >> I spent some time fooling with this today and came up with the attached >> patch. I think this is basically the direction we should go in, but >> there are various loose ends: > > Here's an updated patch that's rebased against today's HEAD and addresses > most of the loose ends: > >> 1. I adjusted a couple of existing regression tests whose results are >> affected, but didn't do anything towards adding new tests showing the >> desirable results of this change (as per the examples in this thread). > > Added some regression test cases. These are mostly designed to prove > that coercion to text occurs where expected, but I did include a couple > of queries that would have failed outright before. +1. Thanks. > >> 2. I didn't do anything about docs, either, though maybe no change >> is needed. One user-visible change from this is that queries should >> never return any "unknown"-type columns to clients anymore. But I >> think that is not documented now, so maybe there's nothing to change. > > The only thing I could find in the SGML docs that directly addresses > unknown-type columns was a remark in the CREATE VIEW man page, which > I've updated. I also added a section to typeconv.sgml to document > the behavior. > >> 3. We need to look at whether pg_upgrade is affected. I think we >> might be able to let it just ignore the issue for views, since they'd >> get properly updated during the dump and reload anyway. But if someone >> had an actual table (or matview) with an "unknown" column, that would >> be a pg_upgrade hazard, because "unknown" doesn't store like "text". > > I tested and found that simple views with unknown columns seem to update > sanely if we just let pg_upgrade dump and restore them. So I suggest we > allow that to happen. There might be cases where dependent views behave > unexpectedly after such a conversion, but I think that would be rare > enough that it's not worth forcing users to fix these views by hand before > updating. However, tables with unknown columns would fail the upgrade > (since we'd reject the CREATE TABLE command) while matviews would be > accepted but then the DDL wouldn't match the physical data storage. > So I added code to pg_upgrade to check for those cases and refuse to > proceed. This is almost a straight copy-and-paste of the existing > pg_upgrade code for checking for "line" columns. > Following error message might be misleading, postgres=# create table t1 (a unknown); ERROR: column "a" has pseudo-type unknown UNKNOWN is not exactly a pseudo-type. postgres=# select typname, typtype from pg_type where typname = 'unknown' or typname = 'any';typname | typtype ---------+---------unknown | bany | p (2 rows) In your earlier mail, you had raised the point about marking unknown as a pseudo-type. But till we actually do that, it would be better not to mention 'unknown' as a pseudo-type. -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company
Re: [HACKERS] Assignment of valid collation for SET operations on queries with UNKNOWN types.
From
Tom Lane
Date:
Michael Paquier <michael.paquier@gmail.com> writes: > One thing though: even with this patch, it is still possible to define > a domain with unknown as underlying type and have a table grab it: > create domain name as unknown; > create table foo_name (a name); > I think that this case should be restricted as well, and pg_upgrade > had better complain with a lookup at typbasetype in pg_type. Yeah, this is the sort of corner case that I think we ought to plug by turning "unknown" into a true pseudotype. DefineDomain already has a defense against that: regression=# create domain d2 as anyelement; ERROR: "anyelement" is not a valid base type for a domain regression=# \errverbose ERROR: 42804: "anyelement" is not a valid base type for a domain LOCATION: DefineDomain, typecmds.c:812 Some prodding indicates that the PLs are not uniformly preventing making functions with unknown input or result type, either. That's another case that would be less likely to get missed if it were a true pseudotype. However, I think that's all material for a separate patch. This patch is just concerned with what the main parser should do with "unknown", not with what utility commands should do with it. regards, tom lane
Re: [HACKERS] Assignment of valid collation for SET operations on queries with UNKNOWN types.
From
Tom Lane
Date:
Ashutosh Bapat <ashutosh.bapat@enterprisedb.com> writes: > Following error message might be misleading, > postgres=# create table t1 (a unknown); > ERROR: column "a" has pseudo-type unknown > UNKNOWN is not exactly a pseudo-type. Well, as I said to Michael just now, I think we should turn it into one now that we're disallowing it in tables, because "cannot be used as a table column" is more or less the definition of a pseudotype. In any case, the average user probably thinks UNKNOWN is a pseudotype, if they think about it at all. So I think this message is fine even if we left it as-is. regards, tom lane
Re: [HACKERS] Assignment of valid collation for SET operations on queries with UNKNOWN types.
From
Tom Lane
Date:
I wrote: > Ashutosh Bapat <ashutosh.bapat@enterprisedb.com> writes: >> UNKNOWN is not exactly a pseudo-type. > Well, as I said to Michael just now, I think we should turn it into one > now that we're disallowing it in tables, because "cannot be used as a > table column" is more or less the definition of a pseudotype. I experimented with this, and it actually doesn't seem to be any harder than the attached: there's one type_sanity query that changes results, and otherwise all the regression tests pass. I've grepped the code for references to UNKNOWNOID and TYPTYPE_PSEUDO, and I can't find any places where the behavior would change in a way that we don't want. Basically it looks like we'd disallow UNKNOWN as a domain base, a PL function argument or result, and a plpgsql local variable; and all of those seem like good things from here. Have not checked the docs. regards, tom lane diff --git a/src/backend/catalog/heap.c b/src/backend/catalog/heap.c index af6ba47..51db1c6 100644 *** a/src/backend/catalog/heap.c --- b/src/backend/catalog/heap.c *************** CheckAttributeType(const char *attname, *** 490,497 **** char att_typtype = get_typtype(atttypid); Oid att_typelem; ! if (atttypid == UNKNOWNOID || ! att_typtype == TYPTYPE_PSEUDO) { /* * Refuse any attempt to create a pseudo-type column, except for a --- 490,496 ---- char att_typtype = get_typtype(atttypid); Oid att_typelem; ! if (att_typtype == TYPTYPE_PSEUDO) { /* * Refuse any attempt to create a pseudo-type column, except for a diff --git a/src/include/catalog/pg_type.h b/src/include/catalog/pg_type.h index c2350f3..6e4c65e 100644 *** a/src/include/catalog/pg_type.h --- b/src/include/catalog/pg_type.h *************** DESCR("relative, limited-range time inte *** 418,424 **** DATA(insert OID = 704 ( tinterval PGNSP PGUID 12 f b T f t \054 0 0 1025 tintervalin tintervalout tintervalrecv tintervalsend- - - i p f 0 -1 0 0 _null_ _null_ _null_ )); DESCR("(abstime,abstime), time interval"); #define TINTERVALOID 704 ! DATA(insert OID = 705 ( unknown PGNSP PGUID -2 f b X f t \054 0 0 0 unknownin unknownout unknownrecv unknownsend- - - c p f 0 -1 0 0 _null_ _null_ _null_ )); DESCR(""); #define UNKNOWNOID 705 --- 418,424 ---- DATA(insert OID = 704 ( tinterval PGNSP PGUID 12 f b T f t \054 0 0 1025 tintervalin tintervalout tintervalrecv tintervalsend- - - i p f 0 -1 0 0 _null_ _null_ _null_ )); DESCR("(abstime,abstime), time interval"); #define TINTERVALOID 704 ! DATA(insert OID = 705 ( unknown PGNSP PGUID -2 f p X f t \054 0 0 0 unknownin unknownout unknownrecv unknownsend- - - c p f 0 -1 0 0 _null_ _null_ _null_ )); DESCR(""); #define UNKNOWNOID 705 diff --git a/src/test/regress/expected/type_sanity.out b/src/test/regress/expected/type_sanity.out index e5adfba..312d290 100644 *** a/src/test/regress/expected/type_sanity.out --- b/src/test/regress/expected/type_sanity.out *************** WHERE (p1.typtype = 'c' AND p1.typrelid *** 59,65 **** -- Look for types that should have an array type according to their typtype, -- but don't. We exclude composites here because we have not bothered to -- make array types corresponding to the system catalogs' rowtypes. ! -- NOTE: as of 9.1, this check finds pg_node_tree, smgr, and unknown. SELECT p1.oid, p1.typname FROM pg_type as p1 WHERE p1.typtype not in ('c','d','p') AND p1.typname NOT LIKE E'\\_%' --- 59,65 ---- -- Look for types that should have an array type according to their typtype, -- but don't. We exclude composites here because we have not bothered to -- make array types corresponding to the system catalogs' rowtypes. ! -- NOTE: as of v10, this check finds pg_node_tree and smgr. SELECT p1.oid, p1.typname FROM pg_type as p1 WHERE p1.typtype not in ('c','d','p') AND p1.typname NOT LIKE E'\\_%' *************** WHERE p1.typtype not in ('c','d','p') AN *** 71,78 **** -----+-------------- 194 | pg_node_tree 210 | smgr ! 705 | unknown ! (3 rows) -- Make sure typarray points to a varlena array type of our own base SELECT p1.oid, p1.typname as basetype, p2.typname as arraytype, --- 71,77 ---- -----+-------------- 194 | pg_node_tree 210 | smgr ! (2 rows) -- Make sure typarray points to a varlena array type of our own base SELECT p1.oid, p1.typname as basetype, p2.typname as arraytype, diff --git a/src/test/regress/sql/type_sanity.sql b/src/test/regress/sql/type_sanity.sql index f7c5c9d..0282f84 100644 *** a/src/test/regress/sql/type_sanity.sql --- b/src/test/regress/sql/type_sanity.sql *************** WHERE (p1.typtype = 'c' AND p1.typrelid *** 53,59 **** -- Look for types that should have an array type according to their typtype, -- but don't. We exclude composites here because we have not bothered to -- make array types corresponding to the system catalogs' rowtypes. ! -- NOTE: as of 9.1, this check finds pg_node_tree, smgr, and unknown. SELECT p1.oid, p1.typname FROM pg_type as p1 --- 53,59 ---- -- Look for types that should have an array type according to their typtype, -- but don't. We exclude composites here because we have not bothered to -- make array types corresponding to the system catalogs' rowtypes. ! -- NOTE: as of v10, this check finds pg_node_tree and smgr. SELECT p1.oid, p1.typname FROM pg_type as p1 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Assignment of valid collation for SET operations onqueries with UNKNOWN types.
From
Michael Paquier
Date:
On Tue, Jan 24, 2017 at 1:26 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > I wrote: >> Ashutosh Bapat <ashutosh.bapat@enterprisedb.com> writes: >>> UNKNOWN is not exactly a pseudo-type. > >> Well, as I said to Michael just now, I think we should turn it into one >> now that we're disallowing it in tables, because "cannot be used as a >> table column" is more or less the definition of a pseudotype. > > I experimented with this, and it actually doesn't seem to be any harder > than the attached: there's one type_sanity query that changes results, > and otherwise all the regression tests pass. Indeed. > I've grepped the code for references to UNKNOWNOID and TYPTYPE_PSEUDO, > and I can't find any places where the behavior would change in a way > that we don't want. Basically it looks like we'd disallow UNKNOWN as > a domain base, a PL function argument or result, and a plpgsql local > variable; and all of those seem like good things from here. This has the merit to fix the ugly check in heap.c, so you may want to merge both patches. At least I'd suggest to do so. > Have not checked the docs. Just looked at that. As unknown is a pseudo type, I don't think you need TYPCATEGORY_UNKNOWN in pg_type.h or even the mention to the unknown type in catalogs.sgml as that becomes a pseudo-type. The table of Pseudo-Types needs to be updated as well with unknown in datatype.sgml. For domains, it is still necessary to add an extra check in pg_upgrade and fail the upgrade if one of the domains declared uses the type unknown. Domains are not listed in pg_class, and are only present in pg_type. If you don't do that, the binary restore would just fail. -- Michael
Re: [HACKERS] Assignment of valid collation for SET operations onqueries with UNKNOWN types.
From
Ashutosh Bapat
Date:
On Mon, Jan 23, 2017 at 9:56 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > I wrote: >> Ashutosh Bapat <ashutosh.bapat@enterprisedb.com> writes: >>> UNKNOWN is not exactly a pseudo-type. > >> Well, as I said to Michael just now, I think we should turn it into one >> now that we're disallowing it in tables, because "cannot be used as a >> table column" is more or less the definition of a pseudotype. > > I experimented with this, and it actually doesn't seem to be any harder > than the attached: there's one type_sanity query that changes results, > and otherwise all the regression tests pass. > > I've grepped the code for references to UNKNOWNOID and TYPTYPE_PSEUDO, > and I can't find any places where the behavior would change in a way > that we don't want. Basically it looks like we'd disallow UNKNOWN as > a domain base, a PL function argument or result, and a plpgsql local > variable; and all of those seem like good things from here. Thanks. I think this brings unknown in line with record, internal, void etc. and that's good. That's really where it should be. I thought this code in CreateCast would create problem. /* No pseudo-types allowed */ if (sourcetyptype == TYPTYPE_PSEUDO) ereport(ERROR, (errcode(ERRCODE_WRONG_OBJECT_TYPE), errmsg("source data type %s is a pseudo-type", TypeNameToString(stmt->sourcetype)))); if (targettyptype == TYPTYPE_PSEUDO) ereport(ERROR, (errcode(ERRCODE_WRONG_OBJECT_TYPE), errmsg("target data type %s is a pseudo-type", TypeNameToString(stmt->targettype)))); This means that the user can not create a cast to or from unknown type. But then there's following code in can_coerce_type() /* * If input is an untyped string constant, assumewe can convert it to * anything. */ if (inputTypeId == UNKNOWNOID) continue; which would allow any kind of cast. But in coerce_type(), we seem to handle case of unknown constants explicitly. But I think with the earlier patch, we will be left with only the constant literals of unknown type. So, we are fine. But I think we will have to watch for any such casts created by users in pg_dump and pg_upgrade. Similarly for transforms, range(?). -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company
Re: [HACKERS] Assignment of valid collation for SET operations on queries with UNKNOWN types.
From
Tom Lane
Date:
Michael Paquier <michael.paquier@gmail.com> writes: > As unknown is a pseudo type, I don't think you need > TYPCATEGORY_UNKNOWN in pg_type.h or even the mention to the unknown > type in catalogs.sgml as that becomes a pseudo-type. I wondered whether to remove TYPCATEGORY_UNKNOWN but thought it was an unnecessary change. "unknown" is different from the other pseudotypes in that type resolution treats it very specially, so it doesn't seem unreasonable for it to continue to have its own typcategory. Also, since type resolution sometimes takes into account whether types are of the same category or not, I'm a bit worried about whether moving "unknown" into the pseudotype category might have unexpected side effects. > The table of Pseudo-Types needs to be updated as well with unknown in > datatype.sgml. Check. > For domains, it is still necessary to add an extra check in pg_upgrade > and fail the upgrade if one of the domains declared uses the type > unknown. Domains are not listed in pg_class, and are only present in > pg_type. If you don't do that, the binary restore would just fail. Meh. I think this would largely be a useless check --- who would create such a domain? Also, it's not like the system will crash and burn if we don't check for it, it will just fail a bit further into the pg_upgrade process. That's not like the matview situation where it would appear to go through and then you'd have a broken matview. regards, tom lane
Re: [HACKERS] Assignment of valid collation for SET operations on queries with UNKNOWN types.
From
Tom Lane
Date:
Ashutosh Bapat <ashutosh.bapat@enterprisedb.com> writes: > On Mon, Jan 23, 2017 at 9:56 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> I've grepped the code for references to UNKNOWNOID and TYPTYPE_PSEUDO, >> and I can't find any places where the behavior would change in a way >> that we don't want. Basically it looks like we'd disallow UNKNOWN as >> a domain base, a PL function argument or result, and a plpgsql local >> variable; and all of those seem like good things from here. > Thanks. I think this brings unknown in line with record, internal, > void etc. and that's good. That's really where it should be. > I thought this code in CreateCast would create problem. Ah, I forgot to mention that: we'd also be disallowing creation of casts to and from unknown. This is also a good thing. > This means that the user can not create a cast to or from unknown > type. But then there's following code in can_coerce_type() Right, the system's notion of what to do with unknown is hard-wired. We do not want people to get the idea that they can override it by defining a cast. (Also, if anyone has done that, I don't think it actually had any effect.) > But I think we will have to watch for > any such casts created by users in pg_dump and pg_upgrade. Similarly > for transforms, range(?). As I said to Michael w.r.t. the same point for domains, I doubt this is worth spending cycles on to make a separate check for. It seems pretty unlikely that anyone has actually done that, and if they did, they'll still get a clean failure with an understandable message. regards, tom lane
Re: [HACKERS] Assignment of valid collation for SET operations on queries with UNKNOWN types.
From
Tom Lane
Date:
I wrote: > Michael Paquier <michael.paquier@gmail.com> writes: >> The table of Pseudo-Types needs to be updated as well with unknown in >> datatype.sgml. > Check. Here's an updated patch with doc changes. Aside from that one, I tried to spell "pseudo-type" consistently, and I changed one place where we were calling something a pseudo-type that isn't. regards, tom lane diff --git a/doc/src/sgml/datatype.sgml b/doc/src/sgml/datatype.sgml index 3bc6854..9ef7b4a 100644 *** a/doc/src/sgml/datatype.sgml --- b/doc/src/sgml/datatype.sgml *************** SELECT * FROM pg_attribute *** 4661,4666 **** --- 4661,4670 ---- </indexterm> <indexterm zone="datatype-pseudo"> + <primary>unknown</primary> + </indexterm> + + <indexterm zone="datatype-pseudo"> <primary>opaque</primary> </indexterm> *************** SELECT * FROM pg_attribute *** 4782,4789 **** </row> <row> <entry><type>opaque</></entry> ! <entry>An obsolete type name that formerly served all the above purposes.</entry> </row> </tbody> </tgroup> --- 4786,4800 ---- </row> <row> + <entry><type>unknown</></entry> + <entry>Identifies a not-yet-resolved type, e.g. of an undecorated + string literal.</entry> + </row> + + <row> <entry><type>opaque</></entry> ! <entry>An obsolete type name that formerly served many of the above ! purposes.</entry> </row> </tbody> </tgroup> diff --git a/doc/src/sgml/ddl.sgml b/doc/src/sgml/ddl.sgml index d7117cb..aebe898 100644 *** a/doc/src/sgml/ddl.sgml --- b/doc/src/sgml/ddl.sgml *************** WHERE c.altitude > 500 AND c.tableoid *** 2579,2585 **** <para> Another way to get the same effect is to use the <type>regclass</> ! pseudo-type, which will print the table OID symbolically: <programlisting> SELECT c.tableoid::regclass, c.name, c.altitude --- 2579,2585 ---- <para> Another way to get the same effect is to use the <type>regclass</> ! alias type, which will print the table OID symbolically: <programlisting> SELECT c.tableoid::regclass, c.name, c.altitude diff --git a/doc/src/sgml/plhandler.sgml b/doc/src/sgml/plhandler.sgml index 0fc5d7b..57a2a05 100644 *** a/doc/src/sgml/plhandler.sgml --- b/doc/src/sgml/plhandler.sgml *************** *** 26,32 **** language such as C, using the version-1 interface, and registered with <productname>PostgreSQL</productname> as taking no arguments and returning the type <type>language_handler</type>. This ! special pseudotype identifies the function as a call handler and prevents it from being called directly in SQL commands. For more details on C language calling conventions and dynamic loading, see <xref linkend="xfunc-c">. --- 26,32 ---- language such as C, using the version-1 interface, and registered with <productname>PostgreSQL</productname> as taking no arguments and returning the type <type>language_handler</type>. This ! special pseudo-type identifies the function as a call handler and prevents it from being called directly in SQL commands. For more details on C language calling conventions and dynamic loading, see <xref linkend="xfunc-c">. diff --git a/doc/src/sgml/protocol.sgml b/doc/src/sgml/protocol.sgml index 5f89db5..e315548 100644 *** a/doc/src/sgml/protocol.sgml --- b/doc/src/sgml/protocol.sgml *************** *** 668,674 **** number of parameter symbols (<literal>$</><replaceable>n</>) used in the query string. Another special case is that a parameter's type can be specified as <type>void</> (that is, the OID of the ! <type>void</> pseudotype). This is meant to allow parameter symbols to be used for function parameters that are actually OUT parameters. Ordinarily there is no context in which a <type>void</> parameter could be used, but if such a parameter symbol appears in a function's --- 668,674 ---- number of parameter symbols (<literal>$</><replaceable>n</>) used in the query string. Another special case is that a parameter's type can be specified as <type>void</> (that is, the OID of the ! <type>void</> pseudo-type). This is meant to allow parameter symbols to be used for function parameters that are actually OUT parameters. Ordinarily there is no context in which a <type>void</> parameter could be used, but if such a parameter symbol appears in a function's diff --git a/doc/src/sgml/queries.sgml b/doc/src/sgml/queries.sgml index ef623d5..30792f4 100644 *** a/doc/src/sgml/queries.sgml --- b/doc/src/sgml/queries.sgml *************** SELECT * FROM vw_getfoo; *** 762,768 **** In some cases it is useful to define table functions that can return different column sets depending on how they are invoked. To support this, the table function can be declared as returning ! the pseudotype <type>record</>. When such a function is used in a query, the expected row structure must be specified in the query itself, so that the system can know how to parse and plan the query. This syntax looks like: --- 762,768 ---- In some cases it is useful to define table functions that can return different column sets depending on how they are invoked. To support this, the table function can be declared as returning ! the pseudo-type <type>record</>. When such a function is used in a query, the expected row structure must be specified in the query itself, so that the system can know how to parse and plan the query. This syntax looks like: diff --git a/doc/src/sgml/ref/create_function.sgml b/doc/src/sgml/ref/create_function.sgml index 8108a43..e705778 100644 *** a/doc/src/sgml/ref/create_function.sgml --- b/doc/src/sgml/ref/create_function.sgml *************** CREATE [ OR REPLACE ] FUNCTION *** 160,167 **** </para> <para> Depending on the implementation language it might also be allowed ! to specify <quote>pseudotypes</> such as <type>cstring</>. ! Pseudotypes indicate that the actual argument type is either incompletely specified, or outside the set of ordinary SQL data types. </para> <para> --- 160,167 ---- </para> <para> Depending on the implementation language it might also be allowed ! to specify <quote>pseudo-types</> such as <type>cstring</>. ! Pseudo-types indicate that the actual argument type is either incompletely specified, or outside the set of ordinary SQL data types. </para> <para> *************** CREATE [ OR REPLACE ] FUNCTION *** 199,205 **** can be a base, composite, or domain type, or can reference the type of a table column. Depending on the implementation language it might also be allowed ! to specify <quote>pseudotypes</> such as <type>cstring</>. If the function is not supposed to return a value, specify <type>void</> as the return type. </para> --- 199,205 ---- can be a base, composite, or domain type, or can reference the type of a table column. Depending on the implementation language it might also be allowed ! to specify <quote>pseudo-types</> such as <type>cstring</>. If the function is not supposed to return a value, specify <type>void</> as the return type. </para> diff --git a/doc/src/sgml/ref/create_type.sgml b/doc/src/sgml/ref/create_type.sgml index 5a09f19..7146c4a 100644 *** a/doc/src/sgml/ref/create_type.sgml --- b/doc/src/sgml/ref/create_type.sgml *************** CREATE TYPE <replaceable class="paramete *** 824,830 **** In <productname>PostgreSQL</productname> versions before 7.3, it was customary to avoid creating a shell type at all, by replacing the functions' forward references to the type name with the placeholder ! pseudotype <type>opaque</>. The <type>cstring</> arguments and results also had to be declared as <type>opaque</> before 7.3. To support loading of old dump files, <command>CREATE TYPE</> will accept I/O functions declared using <type>opaque</>, but it will issue --- 824,830 ---- In <productname>PostgreSQL</productname> versions before 7.3, it was customary to avoid creating a shell type at all, by replacing the functions' forward references to the type name with the placeholder ! pseudo-type <type>opaque</>. The <type>cstring</> arguments and results also had to be declared as <type>opaque</> before 7.3. To support loading of old dump files, <command>CREATE TYPE</> will accept I/O functions declared using <type>opaque</>, but it will issue diff --git a/src/backend/catalog/heap.c b/src/backend/catalog/heap.c index af6ba47..51db1c6 100644 *** a/src/backend/catalog/heap.c --- b/src/backend/catalog/heap.c *************** CheckAttributeType(const char *attname, *** 490,497 **** char att_typtype = get_typtype(atttypid); Oid att_typelem; ! if (atttypid == UNKNOWNOID || ! att_typtype == TYPTYPE_PSEUDO) { /* * Refuse any attempt to create a pseudo-type column, except for a --- 490,496 ---- char att_typtype = get_typtype(atttypid); Oid att_typelem; ! if (att_typtype == TYPTYPE_PSEUDO) { /* * Refuse any attempt to create a pseudo-type column, except for a diff --git a/src/include/catalog/pg_type.h b/src/include/catalog/pg_type.h index c2350f3..6e4c65e 100644 *** a/src/include/catalog/pg_type.h --- b/src/include/catalog/pg_type.h *************** DESCR("relative, limited-range time inte *** 418,424 **** DATA(insert OID = 704 ( tinterval PGNSP PGUID 12 f b T f t \054 0 0 1025 tintervalin tintervalout tintervalrecv tintervalsend- - - i p f 0 -1 0 0 _null_ _null_ _null_ )); DESCR("(abstime,abstime), time interval"); #define TINTERVALOID 704 ! DATA(insert OID = 705 ( unknown PGNSP PGUID -2 f b X f t \054 0 0 0 unknownin unknownout unknownrecv unknownsend- - - c p f 0 -1 0 0 _null_ _null_ _null_ )); DESCR(""); #define UNKNOWNOID 705 --- 418,424 ---- DATA(insert OID = 704 ( tinterval PGNSP PGUID 12 f b T f t \054 0 0 1025 tintervalin tintervalout tintervalrecv tintervalsend- - - i p f 0 -1 0 0 _null_ _null_ _null_ )); DESCR("(abstime,abstime), time interval"); #define TINTERVALOID 704 ! DATA(insert OID = 705 ( unknown PGNSP PGUID -2 f p X f t \054 0 0 0 unknownin unknownout unknownrecv unknownsend- - - c p f 0 -1 0 0 _null_ _null_ _null_ )); DESCR(""); #define UNKNOWNOID 705 diff --git a/src/test/regress/expected/type_sanity.out b/src/test/regress/expected/type_sanity.out index e5adfba..312d290 100644 *** a/src/test/regress/expected/type_sanity.out --- b/src/test/regress/expected/type_sanity.out *************** WHERE (p1.typtype = 'c' AND p1.typrelid *** 59,65 **** -- Look for types that should have an array type according to their typtype, -- but don't. We exclude composites here because we have not bothered to -- make array types corresponding to the system catalogs' rowtypes. ! -- NOTE: as of 9.1, this check finds pg_node_tree, smgr, and unknown. SELECT p1.oid, p1.typname FROM pg_type as p1 WHERE p1.typtype not in ('c','d','p') AND p1.typname NOT LIKE E'\\_%' --- 59,65 ---- -- Look for types that should have an array type according to their typtype, -- but don't. We exclude composites here because we have not bothered to -- make array types corresponding to the system catalogs' rowtypes. ! -- NOTE: as of v10, this check finds pg_node_tree and smgr. SELECT p1.oid, p1.typname FROM pg_type as p1 WHERE p1.typtype not in ('c','d','p') AND p1.typname NOT LIKE E'\\_%' *************** WHERE p1.typtype not in ('c','d','p') AN *** 71,78 **** -----+-------------- 194 | pg_node_tree 210 | smgr ! 705 | unknown ! (3 rows) -- Make sure typarray points to a varlena array type of our own base SELECT p1.oid, p1.typname as basetype, p2.typname as arraytype, --- 71,77 ---- -----+-------------- 194 | pg_node_tree 210 | smgr ! (2 rows) -- Make sure typarray points to a varlena array type of our own base SELECT p1.oid, p1.typname as basetype, p2.typname as arraytype, diff --git a/src/test/regress/sql/type_sanity.sql b/src/test/regress/sql/type_sanity.sql index f7c5c9d..0282f84 100644 *** a/src/test/regress/sql/type_sanity.sql --- b/src/test/regress/sql/type_sanity.sql *************** WHERE (p1.typtype = 'c' AND p1.typrelid *** 53,59 **** -- Look for types that should have an array type according to their typtype, -- but don't. We exclude composites here because we have not bothered to -- make array types corresponding to the system catalogs' rowtypes. ! -- NOTE: as of 9.1, this check finds pg_node_tree, smgr, and unknown. SELECT p1.oid, p1.typname FROM pg_type as p1 --- 53,59 ---- -- Look for types that should have an array type according to their typtype, -- but don't. We exclude composites here because we have not bothered to -- make array types corresponding to the system catalogs' rowtypes. ! -- NOTE: as of v10, this check finds pg_node_tree and smgr. SELECT p1.oid, p1.typname FROM pg_type as p1 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Assignment of valid collation for SET operations onqueries with UNKNOWN types.
From
Michael Paquier
Date:
On Wed, Jan 25, 2017 at 12:46 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > I wrote: >> Michael Paquier <michael.paquier@gmail.com> writes: >>> The table of Pseudo-Types needs to be updated as well with unknown in >>> datatype.sgml. > >> Check. > > Here's an updated patch with doc changes. Aside from that one, > I tried to spell "pseudo-type" consistently, and I changed one > place where we were calling something a pseudo-type that isn't. Thanks for the updated version, all the comments have been addressed. You have left a lot of code comments using "pseudotype" instead of "pseudo-type" in the code. I am guessing that this is on purpose, which is fine for me. There is no point to make back-patching more complicated for just that. The CF app has been updated to ready for committer for the record. -- Michael
Re: [HACKERS] Assignment of valid collation for SET operations onqueries with UNKNOWN types.
From
Ashutosh Bapat
Date:
On Wed, Jan 25, 2017 at 10:54 AM, Michael Paquier <michael.paquier@gmail.com> wrote: > On Wed, Jan 25, 2017 at 12:46 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> I wrote: >>> Michael Paquier <michael.paquier@gmail.com> writes: >>>> The table of Pseudo-Types needs to be updated as well with unknown in >>>> datatype.sgml. >> >>> Check. >> >> Here's an updated patch with doc changes. Aside from that one, >> I tried to spell "pseudo-type" consistently, and I changed one >> place where we were calling something a pseudo-type that isn't. > > Thanks for the updated version, all the comments have been addressed. > You have left a lot of code comments using "pseudotype" instead of > "pseudo-type" in the code. I am guessing that this is on purpose, > which is fine for me. There is no point to make back-patching more > complicated for just that. > > The CF app has been updated to ready for committer for the record. I have listed myself as reviewer for this commitfest entry and I am yet to review the patch. Can you please wait for the listed reviewers, esp. when those reviewers are active, for changing the commitfest entry status? -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company
Re: [HACKERS] Assignment of valid collation for SET operations onqueries with UNKNOWN types.
From
Michael Paquier
Date:
On Wed, Jan 25, 2017 at 2:28 PM, Ashutosh Bapat <ashutosh.bapat@enterprisedb.com> wrote: > On Wed, Jan 25, 2017 at 10:54 AM, Michael Paquier > <michael.paquier@gmail.com> wrote: >> On Wed, Jan 25, 2017 at 12:46 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> I wrote: >>>> Michael Paquier <michael.paquier@gmail.com> writes: >>>>> The table of Pseudo-Types needs to be updated as well with unknown in >>>>> datatype.sgml. >>> >>>> Check. >>> >>> Here's an updated patch with doc changes. Aside from that one, >>> I tried to spell "pseudo-type" consistently, and I changed one >>> place where we were calling something a pseudo-type that isn't. >> >> Thanks for the updated version, all the comments have been addressed. >> You have left a lot of code comments using "pseudotype" instead of >> "pseudo-type" in the code. I am guessing that this is on purpose, >> which is fine for me. There is no point to make back-patching more >> complicated for just that. >> >> The CF app has been updated to ready for committer for the record. > > I have listed myself as reviewer for this commitfest entry and I am > yet to review the patch. Can you please wait for the listed reviewers, > esp. when those reviewers are active, for changing the commitfest > entry status? If you want to have an extra look, please be my guest. To be consistent, I have added my name as well in the list of reviewers. -- Michael
Re: [HACKERS] Assignment of valid collation for SET operations onqueries with UNKNOWN types.
From
Ashutosh Bapat
Date:
On Wed, Jan 25, 2017 at 10:54 AM, Michael Paquier <michael.paquier@gmail.com> wrote: > On Wed, Jan 25, 2017 at 12:46 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> I wrote: >>> Michael Paquier <michael.paquier@gmail.com> writes: >>>> The table of Pseudo-Types needs to be updated as well with unknown in >>>> datatype.sgml. >> >>> Check. >> >> Here's an updated patch with doc changes. Aside from that one, >> I tried to spell "pseudo-type" consistently, and I changed one >> place where we were calling something a pseudo-type that isn't. I think, those changes, even though small, deserve their own commit. The changes themselves look good. -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company
Re: [HACKERS] Assignment of valid collation for SET operations on queries with UNKNOWN types.
From
Tom Lane
Date:
Ashutosh Bapat <ashutosh.bapat@enterprisedb.com> writes: > On Wed, Jan 25, 2017 at 10:54 AM, Michael Paquier > <michael.paquier@gmail.com> wrote: >> On Wed, Jan 25, 2017 at 12:46 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> Here's an updated patch with doc changes. Aside from that one, >>> I tried to spell "pseudo-type" consistently, and I changed one >>> place where we were calling something a pseudo-type that isn't. > I think, those changes, even though small, deserve their own commit. > The changes themselves look good. Pushed, thanks for the reviews! regards, tom lane
Re: [HACKERS] Assignment of valid collation for SET operations onqueries with UNKNOWN types.
From
Robert Haas
Date:
On Wed, Jan 25, 2017 at 9:37 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Pushed, thanks for the reviews! I think this is a nice improvement. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company