Thread: [PATCH] Proposal for HIDDEN/INVISIBLE column
Hi, Here is a proposal to implement HIDDEN columns feature in PostgreSQL. The user defined columns are always visible in the PostgreSQL. If user wants to hide some column(s) from a SELECT * returned values then the hidden columns feature is useful. Hidden column can always be used and returned by explicitly referring it in the query. I agree that views are done for that or that using a SELECT * is a bad practice but sometime we could need to "technically" prevent some columns to be part of a star expansion and nbot be forced to use view+rules. For example when upgrading a database schema where a column have been added to a table, this will break any old version of the application that is using a SELECT * on this table. Being able to "hide" this column to such query will make migration easier. An other common use case for this feature is to implements temporal tables or row versionning. On my side I see a direct interest in Oracle to PostgreSQL migration to emulate the ROWID system column without the hassle of creating views, it will save lot of time. The other advantage over views is that the hidden column can still be used in JOIN, WHERE, ORDER BY or GROUP BY clause which is not possible otherwise. I don't talk about writing to complex view which would require a RULE. Hidden column is not part of the SQL standard but is implemented in all other RDBMS which is also called invisible columns [1] [2] [3] [4]. In all these RDBMS the feature is quite the same. [1] https://www.ibm.com/docs/en/db2/10.5?topic=concepts-hidden-columns [2] https://oracle-base.com/articles/12c/invisible-columns-12cr1 [3] https://docs.microsoft.com/en-us/sql/t-sql/statements/create-table-transact-sql?view=sql-server-ver15 [4] https://dev.mysql.com/doc/refman/8.0/en/invisible-columns.html Here is the full description of the proposal with a patch attached that implements the feature: 1) Creating hidden columns: A column visibility attribute is added to the column definition of CREATE TABLE and ALTER TABLE statements. For example: CREATE TABLE htest1 (a bigserial HIDDEN, b text); ALTER TABLE htest1 ADD COLUMN c integer HIDDEN; Columns are visible by default. 2) Altering column visibility attribute: The ALTER TABLE statement can be used to change hidden columns to not hidden and the opposite. Example: ALTER TABLE htest1 ALTER COLUMN c DROP HIDDEN; 3) Insert and hidden columns: If the column list of INSERT or COPY statements is empty then while expanding column list hidden columns are NOT included. DEFAULT or NULL values are inserted for hidden columns in this case. Hidden column should be explicitly referenced in the column list of INSERT and COPY statement to insert a value. Example: -- Value 'one' is stored in column b and 1 in hidden column. INSERT INTO t1 VALUES ('one'); -- Value 2 is stored in hidden column and 'two' in b. INSERT INTO htest1 (a, b) VALUES (2, 'two'); 4) Star expansion for SELECT * statements: Hidden columns are not included in a column list while expanding wild card '*' in the SELECT statement. Example: SELECT * FROM htest1; b ------ one two Hidden columns are accessible when explicitly referenced in the query. Example: SELECT f1, f2 FROM t1; a | b ------+------ 1 | one 2 | two 5) psql extended describe lists hidden columns. postgres=# \d+ htest1 Table "public.htest1" Column | Type | Collation | Nullable | Default | Visible | ... --------+--------+-----------+----------+------------+---------+ ... a | bigint | | not null | nextval... | hidden | ... b | text | | | | | ... 6) When a column is flagged as hidden the attishidden column value of table pg_attribute is set to true. 7) For hidden attributes, column is_hidden of table information_schema.columns is set to YES. By default the column is visible and the value is 'NO'. For a complete description of the feature, see chapter "Hidden columns" in file doc/src/sgml/ddl.sgml after applying the patch. The patch is a full implementation of this feture except that I sill have to prevent a ALTER ... SET HIDDEN to be applied of there is no more visible columns in the table after the change. I will do that when I will recover more time. I have choose HIDDEN vs INVISIBLE but this could be a minor change or we could use NOT EXPANDABLE. Personnaly I prefer the HIDDEN attribute. Any though and interest in this feature? -- Gilles Darold http://www.migops.com/
Attachment
Hi Gilles, > Any though and interest in this feature? Personally, I wouldn't call this feature particularly useful. `SELECT *` is intended for people who are working with DBMS directly e.g. via psql and want to see ALL columns. The applications should never use `SELECT *`. So I can't see any real benefits of adding this feature to PostgreSQL. It will only make the existing code and the existing user interface even more complicated than they are now. Also, every yet another feature is x N corner cases when this feature works with other N features of PostgreSQL. How should it work with partitioned or inherited tables? Or with logical replication? With pg_dump? With COPY? So all in all, -1. This being said, I very much appreciate your attempt to improve PostgreSQL. However next time before writing the code I suggest submitting an RFC first. -- Best regards, Aleksander Alekseev
> So all in all, -1. [...]
Here is something I would like to add:
1. As far as I know, "all the rest of DBMS have this" was never a good argument in the PostgreSQL community. Generally, using it will turn people against you.
2. I recall there was a proposal of making the SQL syntax itself extendable. To my knowledge, this is still a wanted feature [1]. In theory, that would allow you to implement the feature you want in an extension.
[1]: https://wiki.postgresql.org/wiki/Todo#Exotic_Features
--
Best regards,
Aleksander Alekseev
On 10/14/21 1:47 PM, Aleksander Alekseev wrote: > Hi Gilles, > >> Any though and interest in this feature? > > Personally, I wouldn't call this feature particularly useful. `SELECT > *` is intended for people who are working with DBMS directly e.g. via > psql and want to see ALL columns. I disagree strongly with this. It is really annoying when working interactively with psql on a table that has a PostGIS geometry column, or any other large blobby type column. I have not looked at the patch, but +1 for the feature. -- Vik Fearing
Hi Vik, > I have not looked at the patch, but +1 for the feature. Maybe you could describe your use case in a little more detail? How did you end up working with PostGIS geometry via psql on regular basis? What exactly do you find of annoyance? How will the proposed patch help? I find it great that we have people with polar opinions in the discussion. But to reach any consensus you should make the opponent understand your situation. Also, please don't simply discard the disadvantages stated above. If you don't believe these are significant disadvantages, please explain why do you think so. -- Best regards, Aleksander Alekseev
On 10/14/21 1:47 PM, Aleksander Alekseev wrote:
> Hi Gilles,
>
>> Any though and interest in this feature?
>
> Personally, I wouldn't call this feature particularly useful. `SELECT
> *` is intended for people who are working with DBMS directly e.g. via
> psql and want to see ALL columns.
I disagree strongly with this. It is really annoying when working
interactively with psql on a table that has a PostGIS geometry column,
or any other large blobby type column.
I have not looked at the patch, but +1 for the feature.
--
Vik Fearing
The user defined columns are always visible in the PostgreSQL. If user
wants to hide some column(s) from a SELECT * returned values then the
hidden columns feature is useful. Hidden column can always be used and
returned by explicitly referring it in the query.
Le 14/10/2021 à 13:47, Aleksander Alekseev a écrit : > Hi Gilles, > >> Any though and interest in this feature? > Personally, I wouldn't call this feature particularly useful. `SELECT > *` is intended for people who are working with DBMS directly e.g. via > psql and want to see ALL columns. The applications should never use > `SELECT *`. So I can't see any real benefits of adding this feature to > PostgreSQL. It will only make the existing code and the existing user > interface even more complicated than they are now. Thanks for your comments Aleksander. This was also my thougth at begining but unfortunately there is cases where things are not so simple and just relying on SELECT * is dirty or forbidden. The hidden column are not only useful for SELECT * but also for INSERT without column list, but INSERT without column list is also a bad practice. > Also, every yet another feature is x N corner cases when this feature > works with other N features of PostgreSQL. How should it work with > partitioned or inherited tables? Or with logical replication? With > pg_dump? With COPY? I recommand you to have look to my patch because the partitioned and inherited case are covered, you can have a . For logical replication I guess that any change in pg_attribute is also replicated so I I would said that it is fully supported. But obviously I may miss something. pg_dump and COPY are also supported. Actually the patch only prevent an hidden column to be part of a star expansion for the returned column, I don't think there is corner case with the other part of the code outside that we need to prevent a table to have all columns hidden. But I could miss something, I agree. > So all in all, -1. This being said, I very much appreciate your > attempt to improve PostgreSQL. However next time before writing the > code I suggest submitting an RFC first. Don't worry about my time spent for the PG community, this patch is a dust in my contribution to open source :-) If I have provided the patch to show the concept and how it can be easily implemented. Also it can be used in some PostgreSQL forks if one is interested by this feature. -- Gilles Darold
Le 14/10/2021 à 14:09, Aleksander Alekseev a écrit : > Hi again, > > > So all in all, -1. [...] > > Here is something I would like to add: > > 1. As far as I know, "all the rest of DBMS have this" was never a good > argument in the PostgreSQL community. Generally, using it will turn > people against you. I have cited the implementation in the other RDBMS because it helps to understand the feature, it shows the state of the art on it and illustrates my needs. If making references to other implementation turns people against me I think that they have the wrong approach on this proposal and if we refuse feature because they are implemented in other RDBMS this is even worst. I'm not agree with this comment. > 2. I recall there was a proposal of making the SQL syntax itself > extendable. To my knowledge, this is still a wanted feature [1]. In > theory, that would allow you to implement the feature you want in an > extension. For what I've read in this thread https://www.postgresql.org/message-id/flat/20210501072458.adqjoaqnmhg4l34l%40nol there is no real consensus in how implementing this feature should be done. But I agree that if the implementation through an extension was possible I would not write a patch to core but an extension, this is my common behavior. Best regards, -- Gilles Darold http://www.darold.net/
čt 14. 10. 2021 v 14:13 odesílatel Vik Fearing <vik@postgresfriends.org> napsal:On 10/14/21 1:47 PM, Aleksander Alekseev wrote:
> Hi Gilles,
>
>> Any though and interest in this feature?
>
> Personally, I wouldn't call this feature particularly useful. `SELECT
> *` is intended for people who are working with DBMS directly e.g. via
> psql and want to see ALL columns.
I disagree strongly with this. It is really annoying when working
interactively with psql on a table that has a PostGIS geometry column,
or any other large blobby type column.
I have not looked at the patch, but +1 for the feature.Cannot be better to redefine some strategies for output for some types.I can agree so sometimes in some environments proposed features can be nice, but it can be a strong footgun too.Maybe some strange data can be filtered in psql and it can be better solution. I agree, so usually print long geometry in psql is useless.
Pavel this doesn't concern only output but input too, think about the INSERT or COPY without a column list. We can add such filter in psql but how about other clients? They all have to implement their own filtering method. I think the HIDDEN attribute provide a common and basic way to implement that in all client application.
-- Gilles Darold http://www.darold.net/
Le 14/10/2021 à 14:28, Pavel Stehule a écrit :čt 14. 10. 2021 v 14:13 odesílatel Vik Fearing <vik@postgresfriends.org> napsal:On 10/14/21 1:47 PM, Aleksander Alekseev wrote:
> Hi Gilles,
>
>> Any though and interest in this feature?
>
> Personally, I wouldn't call this feature particularly useful. `SELECT
> *` is intended for people who are working with DBMS directly e.g. via
> psql and want to see ALL columns.
I disagree strongly with this. It is really annoying when working
interactively with psql on a table that has a PostGIS geometry column,
or any other large blobby type column.
I have not looked at the patch, but +1 for the feature.Cannot be better to redefine some strategies for output for some types.I can agree so sometimes in some environments proposed features can be nice, but it can be a strong footgun too.Maybe some strange data can be filtered in psql and it can be better solution. I agree, so usually print long geometry in psql is useless.
Pavel this doesn't concern only output but input too, think about the INSERT or COPY without a column list. We can add such filter in psql but how about other clients? They all have to implement their own filtering method. I think the HIDDEN attribute provide a common and basic way to implement that in all client application.
Hi,
Here is a proposal to implement HIDDEN columns feature in PostgreSQL.
The user defined columns are always visible in the PostgreSQL. If user
wants to hide some column(s) from a SELECT * returned values then the
hidden columns feature is useful. Hidden column can always be used and
returned by explicitly referring it in the query.
On Thu, Oct 14, 2021 at 01:16:45PM +0200, Gilles Darold wrote: > Hi, > > > Here is a proposal to implement HIDDEN columns feature in PostgreSQL. > Great! Actually I found this very useful, especially for those people using big fields (geometry, files, large texts). > The user defined columns are always visible in the PostgreSQL. If user > wants to hide some column(s) from a SELECT * returned values then the > hidden columns feature is useful. Hidden column can always be used and > returned by explicitly referring it in the query. > > I agree that views are done for that or that using a SELECT * is a bad > practice An a common one, even if we want to think otherwise. I have found that in almost every customer I have the bad luck to get to see code or SELECTs. Not counting that sometimes we have columns for optimization like Dave saved about hidden a ts_vector column. Another use case I can think of is not covered in this patch, but it could be (I hope!) or even if not I would like opinions on this idea. What about a boolean GUC log_hidden_column that throws a LOG message when a hidden column is used directly? The intention is to mark a to-be-deleted column as HIDDEN and then check the logs to understand if is still being used somewhere. I know systems where they carry the baggage of deprecated columns only because they don't know if some system is still using them. I know this would be extending your original proposal, and understand if you decide is not a first patch material. Anyway, a +1 to your proposal. -- Jaime Casanova Director de Servicios Profesionales SystemGuards - Consultores de PostgreSQL
Le 14/10/2021 à 17:38, Jaime Casanova a écrit : > On Thu, Oct 14, 2021 at 01:16:45PM +0200, Gilles Darold wrote: >> Hi, >> >> >> Here is a proposal to implement HIDDEN columns feature in PostgreSQL. >> > Great! Actually I found this very useful, especially for those people > using big fields (geometry, files, large texts). > >> The user defined columns are always visible in the PostgreSQL. If user >> wants to hide some column(s) from a SELECT * returned values then the >> hidden columns feature is useful. Hidden column can always be used and >> returned by explicitly referring it in the query. >> >> I agree that views are done for that or that using a SELECT * is a bad >> practice > An a common one, even if we want to think otherwise. I have found that > in almost every customer I have the bad luck to get to see code or > SELECTs. > > Not counting that sometimes we have columns for optimization like Dave > saved about hidden a ts_vector column. > > Another use case I can think of is not covered in this patch, but it > could be (I hope!) or even if not I would like opinions on this idea. > What about a boolean GUC log_hidden_column that throws a LOG message when > a hidden column is used directly? > > The intention is to mark a to-be-deleted column as HIDDEN and then check > the logs to understand if is still being used somewhere. I know systems > where they carry the baggage of deprecated columns only because they > don't know if some system is still using them. > > I know this would be extending your original proposal, and understand if > you decide is not a first patch material. Why not, I will add it if there is a consencus about logging hidden column use, this is not a big work. -- Gilles Darold
Gilles Darold <gilles@migops.com> writes: > Le 14/10/2021 à 17:38, Jaime Casanova a écrit : >> On Thu, Oct 14, 2021 at 01:16:45PM +0200, Gilles Darold wrote: >>> Here is a proposal to implement HIDDEN columns feature in PostgreSQL. >> Another use case I can think of is not covered in this patch, but it >> could be (I hope!) or even if not I would like opinions on this idea. >> What about a boolean GUC log_hidden_column that throws a LOG message when >> a hidden column is used directly? > Why not, I will add it if there is a consencus about logging hidden > column use, this is not a big work. This seems like a completely orthogonal idea. If you are trying to figure out whether you have any applications that depend on column X (without breaking anything), you should absolutely not start by marking the column "hidden", because that'll break the case where the apps are expecting "SELECT *" to return the column. But if you're okay with breaking things, you might as well just drop the column, or else revoke SELECT privilege on it, and see what happens. I'm not sure about the utility of logging explicit references to a specific column --- seems like grepping the results of "log_statement" would serve. But in any case I think it is not a good idea to tie it to this proposal. As for the proposal itself, I'm kind of allergic to the terminology you've suggested, because the column is in no way hidden. It's still visible in the catalogs, you can still select it explicitly, etc. Anybody who thinks this is useful from a security standpoint is mistaken, but these words suggest that it is. Perhaps some terminology like "not expanded" or "unexpanded" would serve better to indicate that "SELECT *" doesn't expand to include the column. Or STAR versus NO STAR, maybe. I also do not care for the syntax you propose: AFAICS the only reason you've gotten away with making HIDDEN not fully reserved is that you require it to be the last attribute of a column, which is something that will trip users up all the time. Plus, it does not scale to the next thing we might want to add. So if you can't make it a regular, position-independent element of the ColQualList you shouldn't do it at all. What I think is actually important is the ALTER COLUMN syntax. We could easily get away with having that be the only syntax for this --- compare the precedent of ALTER COLUMN SET STATISTICS. BTW, you do NOT get to add an information_schema column for this. The information_schema is defined in the SQL standard. Yes, I'm aware that mysql feels free to "extend" the standard in that area; but our policy is that the only point of having the information_schema views at all is if they're standard-compliant. regards, tom lane
čt 14. 10. 2021 v 13:17 odesílatel Gilles Darold <gilles@migops.com> napsal: > > Hi, > > > Here is a proposal to implement HIDDEN columns feature in PostgreSQL. > > The user defined columns are always visible in the PostgreSQL. If user > wants to hide some column(s) from a SELECT * returned values then the > hidden columns feature is useful. Hidden column can always be used and > returned by explicitly referring it in the query. > > I agree that views are done for that or that using a SELECT * is a bad > practice > but sometime we could need to "technically" prevent some columns to be part > of a star expansion and nbot be forced to use view+rules. Just to remind here, there was recently a proposal to handle this problem another way - provide a list of columns to skip for "star selection" aka "SELECT * EXCEPT col1...". https://postgrespro.com/list/id/d51371a2-f221-1cf3-4a7d-b2242d4dafdb@gmail.com > For example when > upgrading a database schema where a column have been added to a table, > this will break any old version of the application that is using a > SELECT * on > this table. Being able to "hide" this column to such query will make > migration > easier. > > An other common use case for this feature is to implements temporal tables > or row versionning. On my side I see a direct interest in Oracle to > PostgreSQL > migration to emulate the ROWID system column without the hassle of creating > views, it will save lot of time. > > The other advantage over views is that the hidden column can still be used > in JOIN, WHERE, ORDER BY or GROUP BY clause which is not possible otherwise. > I don't talk about writing to complex view which would require a RULE. > > Hidden column is not part of the SQL standard but is implemented in all > other > RDBMS which is also called invisible columns [1] [2] [3] [4]. In all > these RDBMS > the feature is quite the same. > > [1] https://www.ibm.com/docs/en/db2/10.5?topic=concepts-hidden-columns > [2] https://oracle-base.com/articles/12c/invisible-columns-12cr1 > [3] > https://docs.microsoft.com/en-us/sql/t-sql/statements/create-table-transact-sql?view=sql-server-ver15 > [4] https://dev.mysql.com/doc/refman/8.0/en/invisible-columns.html > > > Here is the full description of the proposal with a patch attached that > implements > the feature: > > 1) Creating hidden columns: > > A column visibility attribute is added to the column definition > of CREATE TABLE and ALTER TABLE statements. For example: > > CREATE TABLE htest1 (a bigserial HIDDEN, b text); > > ALTER TABLE htest1 ADD COLUMN c integer HIDDEN; > > Columns are visible by default. > > 2) Altering column visibility attribute: > > The ALTER TABLE statement can be used to change hidden columns to not > hidden and the opposite. Example: > > ALTER TABLE htest1 ALTER COLUMN c DROP HIDDEN; > > 3) Insert and hidden columns: > > If the column list of INSERT or COPY statements is empty > then while expanding column list hidden columns are NOT > included. DEFAULT or NULL values are inserted for hidden > columns in this case. Hidden column should be explicitly > referenced in the column list of INSERT and COPY statement > to insert a value. > > Example: > > -- Value 'one' is stored in column b and 1 in hidden column. > INSERT INTO t1 VALUES ('one'); > > -- Value 2 is stored in hidden column and 'two' in b. > INSERT INTO htest1 (a, b) VALUES (2, 'two'); > > 4) Star expansion for SELECT * statements: > > Hidden columns are not included in a column list while > expanding wild card '*' in the SELECT statement. > > Example: > > SELECT * FROM htest1; > b > ------ > one > two > > Hidden columns are accessible when explicitly referenced > in the query. > > Example: > SELECT f1, f2 FROM t1; > a | b > ------+------ > 1 | one > 2 | two > > 5) psql extended describe lists hidden columns. > > postgres=# \d+ htest1 > Table "public.htest1" > Column | Type | Collation | Nullable | Default | Visible | ... > --------+--------+-----------+----------+------------+---------+ ... > a | bigint | | not null | nextval... | hidden | ... > b | text | | | | | ... > > 6) When a column is flagged as hidden the attishidden column value of > table pg_attribute is set to true. > > 7) For hidden attributes, column is_hidden of table > information_schema.columns > is set to YES. By default the column is visible and the value is 'NO'. > > For a complete description of the feature, see chapter "Hidden columns" in > file doc/src/sgml/ddl.sgml after applying the patch. > > > The patch is a full implementation of this feture except that I sill have to > prevent a ALTER ... SET HIDDEN to be applied of there is no more visible > columns in the table after the change. I will do that when I will recover > more time. > > I have choose HIDDEN vs INVISIBLE but this could be a minor change or > we could use NOT EXPANDABLE. Personnaly I prefer the HIDDEN attribute. > > > Any though and interest in this feature? > > -- > Gilles Darold > http://www.migops.com/ >
Gilles Darold <gilles@migops.com> writes:
> Le 14/10/2021 à 17:38, Jaime Casanova a écrit :
>> On Thu, Oct 14, 2021 at 01:16:45PM +0200, Gilles Darold wrote:
> Why not, I will add it if there is a consencus about logging hidden
> column use, this is not a big work.
This seems like a completely orthogonal idea.
As for the proposal itself, I'm kind of allergic to the terminology
you've suggested, because the column is in no way hidden. It's
still visible in the catalogs, you can still select it explicitly,
etc. Anybody who thinks this is useful from a security standpoint
is mistaken, but these words suggest that it is. Perhaps some
terminology like "not expanded" or "unexpanded" would serve better
to indicate that "SELECT *" doesn't expand to include the column.
Or STAR versus NO STAR, maybe.
What I think is actually important is the ALTER COLUMN syntax.
We could easily get away with having that be the only syntax for
this --- compare the precedent of ALTER COLUMN SET STATISTICS.
BTW, you do NOT get to add an information_schema column for
this.
"David G. Johnston" <david.g.johnston@gmail.com> writes: > Taking this a bit further, I dislike tying the suppression of the column > from the select-list star to the behavior of insert without a column list > provided. I’m not fully on board with having an attribute that is not > fundamental to the data model but rather an instruction about how that > column interacts with SQL; separating the two aspects, though, would help. > I accept the desire to avoid star expansion much more than default columns > for insert. Yeah, me too. I think it would add a lot of clarity if we defined this as "this affects the behavior of SELECT * and nothing else" ... although even then, there are squishy questions about how much it affects the behavior of composite datums that are using the column's rowtype. But as soon as you want it to bleed into INSERT, you start having a lot of questions about what else it should bleed into, as Aleksander already mentioned. regards, tom lane
Le 14/10/2021 à 19:44, Tom Lane a écrit : > As for the proposal itself, I'm kind of allergic to the terminology > you've suggested, because the column is in no way hidden. It's > still visible in the catalogs, you can still select it explicitly, > etc. Anybody who thinks this is useful from a security standpoint > is mistaken, but these words suggest that it is. Perhaps some > terminology like "not expanded" or "unexpanded" would serve better > to indicate that "SELECT *" doesn't expand to include the column. > Or STAR versus NO STAR, maybe. Agree, I also had this feeling. I decide to use HIDDEN like in DB2 just because UNEXPANDED looks to me difficult to understand by users and that hidden or Invisible column are well known. This is a kind of "vendor standard" now. But I agree that it can confuse uninformed people and doesn't reflect the real feature. I will rename the keyword as "UNEXPANDED", will do. > I also do not care for the syntax you propose: AFAICS the only reason > you've gotten away with making HIDDEN not fully reserved is that you > require it to be the last attribute of a column, which is something > that will trip users up all the time. Plus, it does not scale to the > next thing we might want to add. So if you can't make it a regular, > position-independent element of the ColQualList you shouldn't do it > at all. Yes I have also noted that and wanted to improve this later if the proposal was accepted. > What I think is actually important is the ALTER COLUMN syntax. > We could easily get away with having that be the only syntax for > this --- compare the precedent of ALTER COLUMN SET STATISTICS. Ok great, I'm fine with that, especially for the previous point :-) I will remove it from the CREATE TABLE syntax except in the INCLUDING like option. > BTW, you do NOT get to add an information_schema column for > this. The information_schema is defined in the SQL standard. > Yes, I'm aware that mysql feels free to "extend" the standard > in that area; but our policy is that the only point of having the > information_schema views at all is if they're standard-compliant. Ok, I will remove it. -- Gilles Darold
I wrote: > Yeah, me too. I think it would add a lot of clarity if we defined this > as "this affects the behavior of SELECT * and nothing else" ... although > even then, there are squishy questions about how much it affects the > behavior of composite datums that are using the column's rowtype. Re-reading that, I realize I probably left way too much unstated, so let me spell it out. Should this feature affect SELECT * FROM my_table t; ? Yes, absolutely. How about SELECT t.* FROM my_table t; ? Yup, one would think so. Now how about SELECT row_to_json(t.*) FROM my_table t; ? All of a sudden, I'm a lot less sure --- not least because we *can't* simply omit some columns, without the composite datum suddenly not being of the table's rowtype anymore, which could have unexpected effects on query semantics. In particular, if we have a user-defined function that's defined to accept composite type my_table, I don't think we can suppress columns in SELECT myfunction(t.*) FROM my_table t; And don't forget that these can also be spelled like SELECT row_to_json(t) FROM my_table t; without any star visible anywhere. So the more I think about this, the squishier it gets. I'm now sharing the fears expressed upthread about whether it's even possible to define this in a way that won't have a lot of gotchas. regards, tom lane
Le 14/10/2021 à 20:43, Tom Lane a écrit : > Re-reading that, I realize I probably left way too much unstated, > so let me spell it out. > > Should this feature affect > SELECT * FROM my_table t; > ? Yes, absolutely. > > How about > SELECT t.* FROM my_table t; > ? Yup, one would think so. > > Now how about > SELECT row_to_json(t.*) FROM my_table t; > ? All of a sudden, I'm a lot less sure --- not least because we *can't* > simply omit some columns, without the composite datum suddenly not being > of the table's rowtype anymore, which could have unexpected effects on > query semantics. In particular, if we have a user-defined function > that's defined to accept composite type my_table, I don't think we can > suppress columns in > SELECT myfunction(t.*) FROM my_table t; > > And don't forget that these can also be spelled like > SELECT row_to_json(t) FROM my_table t; > without any star visible anywhere. > > So the more I think about this, the squishier it gets. I'm now sharing > the fears expressed upthread about whether it's even possible to define > this in a way that won't have a lot of gotchas. > > regards, tom lane You mean this ? :-) gilles=# CREATE TABLE htest0 (a int PRIMARY KEY, b text NOT NULL HIDDEN); CREATE TABLE gilles=# INSERT INTO htest0 (a, b) VALUES (1, 'htest0 one'); INSERT 0 1 gilles=# INSERT INTO htest0 (a, b) VALUES (2, 'htest0 two'); INSERT 0 1 gilles=# SELECT * FROM htest0 t; a --- 1 2 (2 rows) gilles=# SELECT t.* FROM htest0 t; a --- 1 2 (2 rows) gilles=# SELECT row_to_json(t.*) FROM htest0 t; row_to_json -------------------------- {"a":1,"b":"htest0 one"} {"a":2,"b":"htest0 two"} (2 rows) gilles=# SELECT row_to_json(t) FROM htest0 t; row_to_json -------------------------- {"a":1,"b":"htest0 one"} {"a":2,"b":"htest0 two"} (2 rows) You should have a look at the patch, I don't think that the way it is done there could have gotchas. -- Gilles Darold
Le 14/10/2021 à 20:26, Tom Lane a écrit : > "David G. Johnston" <david.g.johnston@gmail.com> writes: >> Taking this a bit further, I dislike tying the suppression of the column >> from the select-list star to the behavior of insert without a column list >> provided. I’m not fully on board with having an attribute that is not >> fundamental to the data model but rather an instruction about how that >> column interacts with SQL; separating the two aspects, though, would help. >> I accept the desire to avoid star expansion much more than default columns >> for insert. > Yeah, me too. I think it would add a lot of clarity if we defined this > as "this affects the behavior of SELECT * and nothing else" ... although > even then, there are squishy questions about how much it affects the > behavior of composite datums that are using the column's rowtype. > But as soon as you want it to bleed into INSERT, you start having a > lot of questions about what else it should bleed into, as Aleksander > already mentioned. I not agree, expansion in executed when there is no column list provided and this affect SELECT and INSERT. It cover the same needs: being able to remove a column for the target list when it is not explicitly set. This feature is known like this and I'm not in favor to tear off a leg. -- Gilles Darold
Le 14/10/2021 à 20:55, Gilles Darold a écrit : > > gilles=# SELECT row_to_json(t.*) FROM htest0 t; > row_to_json > -------------------------- > {"a":1,"b":"htest0 one"} > {"a":2,"b":"htest0 two"} > (2 rows) > > gilles=# SELECT row_to_json(t) FROM htest0 t; > row_to_json > -------------------------- > {"a":1,"b":"htest0 one"} > {"a":2,"b":"htest0 two"} > (2 rows) Tom, I have probably not well understood what you said about do the cases above. Do you mean that the column should not be visible too? I have though not but maybe I'm wrong, I will fix that. -- Gilles Darold
On 15/10/21 07:01, Josef Šimánek wrote: > čt 14. 10. 2021 v 13:17 odesílatel Gilles Darold <gilles@migops.com> napsal: >> Hi, >> >> >> Here is a proposal to implement HIDDEN columns feature in PostgreSQL. >> >> The user defined columns are always visible in the PostgreSQL. If user >> wants to hide some column(s) from a SELECT * returned values then the >> hidden columns feature is useful. Hidden column can always be used and >> returned by explicitly referring it in the query. >> >> I agree that views are done for that or that using a SELECT * is a bad >> practice >> but sometime we could need to "technically" prevent some columns to be part >> of a star expansion and nbot be forced to use view+rules. > Just to remind here, there was recently a proposal to handle this > problem another way - provide a list of columns to skip for "star > selection" aka "SELECT * EXCEPT col1...". > > https://postgrespro.com/list/id/d51371a2-f221-1cf3-4a7d-b2242d4dafdb@gmail.com [...] I feel using EXCEPT would be a lot clearer, no one is likely to be mislead into thinking that its is a security feature unlike 'HIDDEN'. Also you know that SELECT * will select all columns. If this kind of feature were to be added, then I'd give a +1 to use the EXCEPT syntax. Cheers, Gavin
Le 14/10/2021 à 22:01, Gavin Flower a écrit : > On 15/10/21 07:01, Josef Šimánek wrote: >> čt 14. 10. 2021 v 13:17 odesílatel Gilles Darold <gilles@migops.com> >> napsal: >>> Hi, >>> >>> >>> Here is a proposal to implement HIDDEN columns feature in PostgreSQL. >>> >>> The user defined columns are always visible in the PostgreSQL. If user >>> wants to hide some column(s) from a SELECT * returned values then the >>> hidden columns feature is useful. Hidden column can always be used and >>> returned by explicitly referring it in the query. >>> >>> I agree that views are done for that or that using a SELECT * is a bad >>> practice >>> but sometime we could need to "technically" prevent some columns to >>> be part >>> of a star expansion and nbot be forced to use view+rules. >> Just to remind here, there was recently a proposal to handle this >> problem another way - provide a list of columns to skip for "star >> selection" aka "SELECT * EXCEPT col1...". >> >> https://postgrespro.com/list/id/d51371a2-f221-1cf3-4a7d-b2242d4dafdb@gmail.com >> > > [...] > > I feel using EXCEPT would be a lot clearer, no one is likely to be > mislead into thinking that its is a security feature unlike 'HIDDEN'. > Also you know that SELECT * will select all columns. > > If this kind of feature were to be added, then I'd give a +1 to use > the EXCEPT syntax. I don't think that the EXCEPT syntax will be adopted as it change the SQL syntax for SELECT in a non standard way. This is not the case of the hidden column feature which doesn't touch of the SELECT or INSERT syntax.
Hi hackers, > > Just to remind here, there was recently a proposal to handle this > > problem another way - provide a list of columns to skip for "star > > selection" aka "SELECT * EXCEPT col1...". > > > > https://postgrespro.com/list/id/d51371a2-f221-1cf3-4a7d-b2242d4dafdb@gmail.com > > [...] > > I feel using EXCEPT would be a lot clearer, no one is likely to be > mislead into thinking that its is a security feature unlike 'HIDDEN'. > Also you know that SELECT * will select all columns. > > If this kind of feature were to be added, then I'd give a +1 to use the > EXCEPT syntax. +1 to that, personally I would love to have SELECT * EXCEPT ... syntax in PostgreSQL. Also, I discovered this feature was requested even earlier, in 2007 [1] > I don't think that the EXCEPT syntax will be adopted as it change the > SQL syntax for SELECT in a non standard way. This is not the case of the > hidden column feature which doesn't touch of the SELECT or INSERT syntax. HIDDEN columns affect SELECT and INSERT behaviour in the same non-standard way, although maybe without changing the syntax. Personally, I believe this is even worse. The difference is that with `SELECT * EXCEPT` you explicitly state what you want, while HIDDEN columns do this implicitly. Extending the syntax beyond standards in a reasonable way doesn't seem to be a problem. As a recent example in this thread [2] the community proposed to change the syntax in multiple places at the same time. `SELECT * EXCEPT` solves the same problem as HIDDEN columns, but is much easier to implement and maintain. Since it's a simple syntax sugar it doesn't affect the rest of the system. [1]: https://www.postgresql.org/message-id/flat/8A38B86D9187B34FA18766E261AB3AEA0D2072%40sageograma.GEO-I.local [2]: https://www.postgresql.org/message-id/flat/CAJ7c6TPx7N-bVw0dZ1ASCDQKZJHhBYkT6w4HV1LzfS%2BUUTUfmA%40mail.gmail.com -- Best regards, Aleksander Alekseev
Le 15/10/2021 à 09:47, Aleksander Alekseev a écrit : > >>> Just to remind here, there was recently a proposal to handle this >>> problem another way - provide a list of columns to skip for "star >>> selection" aka "SELECT * EXCEPT col1...". >>> >>> https://postgrespro.com/list/id/d51371a2-f221-1cf3-4a7d-b2242d4dafdb@gmail.com >> [...] >> >> I feel using EXCEPT would be a lot clearer, no one is likely to be >> mislead into thinking that its is a security feature unlike 'HIDDEN'. >> Also you know that SELECT * will select all columns. >> >> If this kind of feature were to be added, then I'd give a +1 to use the >> EXCEPT syntax. > +1 to that, personally I would love to have SELECT * EXCEPT ... syntax > in PostgreSQL. Also, I discovered this feature was requested even > earlier, in 2007 [1] > >> I don't think that the EXCEPT syntax will be adopted as it change the >> SQL syntax for SELECT in a non standard way. This is not the case of the >> hidden column feature which doesn't touch of the SELECT or INSERT syntax. > HIDDEN columns affect SELECT and INSERT behaviour in the same > non-standard way, although maybe without changing the syntax. > Personally, I believe this is even worse. The difference is that with > `SELECT * EXCEPT` you explicitly state what you want, while HIDDEN > columns do this implicitly. Extending the syntax beyond standards in a > reasonable way doesn't seem to be a problem. As a recent example in > this thread [2] the community proposed to change the syntax in > multiple places at the same time. > > `SELECT * EXCEPT` solves the same problem as HIDDEN columns, but is > much easier to implement and maintain. Since it's a simple syntax > sugar it doesn't affect the rest of the system. That's not true, this is not the same feature. the EXCEPT clause will not return column that you don't want in a specific request. I have nothing against that but you have to explicitly name them. I think about kind of bad design that we can find commonly like a table with attribute1 ... attribute20. If we can use regexp with EXCEPT like 'attribute\d+' that could be helpful too. But this is another thread. The hidden column feature hidden the column for all queries using the wilcard on the concerned table. For example if I have to import a database with OID enabled from an old dump and I want to prevent the OID column to be returned through the star use, I can turn the column hidden and I will not have to modify my old very good application. I caricature but this is the kind of thing that could happen. I see several other possible use of this feature with extensions that could use a technical column that the user must not see using the wildcard. Also as Vik or Dave mention being able to hide all tsvector columns from query without having to specify it as exception in each query used can save some time. IMHO this is definitively not the same feature. -- Gilles Darold
> having to specify it as exception in each query used can save some time.
--
Best regards,
Aleksander Alekseev
Le 15/10/2021 à 10:37, Aleksander Alekseev a écrit : > Hi Gilles, > > > I can turn the column hidden and I will not have to modify my old > very good application. > > I see your point. At the same time, I believe the statement above > shows the root reason why we have a different view on this feature. > The application should have never use SELECT * in the first place. > This is a terrible design - you add a column or change their order and > the application is broken. And I don't believe the DBMS core is the > right place for placing hacks for applications like this. This should > be solved in the application itself or in some sort of proxy server > between the application and DBMS. SELECT * is intended to be used by > people e.g. DBA. Yes I understand this point. Personally I have always used PostgreSQL and exclusively PostgreSQL in 25 years so I am aware of that and try to give my best to SQL code quality. But we have more and more application coming from others RDBMS with sometime no real possibility to modify the code or which requires lot of work. To give an other use case, some time ago I have written an extension (https://github.com/darold/pgtt-rsl) which use a technical column based on a composite type based on the backend start time and pid to emulate Global Temporary Table. To be able to hide this column from the user query point of view, I had to create a view and route any action on this view to the real underlying table in the extension C code. If the hidden feature was implemented it would have same me some time. I see several other possible extensions that could benefit of this feature. As I said when you develop an extension you can not just say to the user to never used SELECT * if he want to use your extension. At least this is something I will never said, even if this is a bad practice so I have to find a solution to avoid showing technical columns. If we really want SELECT * to be reserved to DBA then why not removing the star from PG unless you have the admin privilege? -- Gilles Darold
On Thu, 2021-10-14 at 13:16 +0200, Gilles Darold wrote: > Here is a proposal to implement HIDDEN columns feature in PostgreSQL. > > The user defined columns are always visible in the PostgreSQL. If user > wants to hide some column(s) from a SELECT * returned values then the > hidden columns feature is useful. Hidden column can always be used and > returned by explicitly referring it in the query. When I read your proposal, I had strangely mixed feelings: "This is cute!" versus "Do we need that?". After some thinking, I think that it boils down to the following: That feature is appealing to people who type SQL statements into psql, which is probably the majority of the readers on this list. It is immediately clear that this can be used for all kinds of nice things. On the other hand: a relational database is not a spreadsheet, where I want to hide or highlight columns. Sure, the interactive user may use it in that way, but that is not the target of a relational database. Databases usually are not user visible, but used by an application. So the appeal for the interactive user is really pretty irrelevant. Now this patch makes certain things easier, but it adds no substantially new functionality: I can exclude a column from display as it is, simply by listing all the other columns. Sure, that's a pain for the interactive user, but it is irrelevant for a query in an application. This together with the fact that it poses complicated questions when we dig deeper, such as "what about whole-row references?", tilts my vote. If it were for free, I would say +1. But given the ratio of potential headache versus added real-life benefit, I find myself voting -1. Still, it is cute! Yours, Laurenz Albe
Hi Gilles, > But we have more and more application coming from others RDBMS with sometime > no real possibility to modify the code or which requires lot of work. Somehow I feel everyone here very well understood the real motivation behind this proposal from the beginning, considering the e-mail of the author. And came to his or her own conclusions. > If we really want SELECT * to be reserved to DBA then why not removing the > star from PG unless you have the admin privilege? Respectfully, I perceive this as a trolling (presumably, non-intentional one) and not going to answer this. -- Best regards, Aleksander Alekseev
Le 15/10/2021 à 14:24, Aleksander Alekseev a écrit : > Hi Gilles, > >> If we really want SELECT * to be reserved to DBA then why not removing the >> star from PG unless you have the admin privilege? > Respectfully, I perceive this as a trolling (presumably, non-intentional one) > and not going to answer this. Yes, I don't wanted to offend you or to troll. This was just to point that the position of "SELECT * is bad practice" is not a good argument in my point of view, just because it is allowed for every one. I mean that in an extension or a client which allow user query input we must handle the case. -- Gilles Darold
Le 15/10/2021 à 14:24, Aleksander Alekseev a écrit :
> Hi Gilles,
>
>> If we really want SELECT * to be reserved to DBA then why not removing the
>> star from PG unless you have the admin privilege?
> Respectfully, I perceive this as a trolling (presumably, non-intentional one)
> and not going to answer this.
Yes, I don't wanted to offend you or to troll. This was just to point
that the position of "SELECT * is bad practice" is not a good argument
in my point of view, just because it is allowed for every one. I mean
that in an extension or a client which allow user query input we must
handle the case.
Hi Gilles, > Yes, I don't wanted to offend you or to troll. This was just to point > that the position of "SELECT * is bad practice" is not a good argument > in my point of view, just because it is allowed for every one. I mean > that in an extension or a client which allow user query input we must > handle the case. Sure, no worries. And my apologies if my feedback seemed a little harsh. I'm sure our goal is mutual - to make PostgreSQL even better than it is now. Finding a consensus occasionally can take time though. -- Best regards, Aleksander Alekseev
On Fri, Oct 15, 2021 at 9:40 PM Dave Cramer <davecramer@postgres.rocks> wrote: > > On Fri, 15 Oct 2021 at 09:29, Gilles Darold <gilles@migops.com> wrote: >> >> Yes, I don't wanted to offend you or to troll. This was just to point >> that the position of "SELECT * is bad practice" is not a good argument >> in my point of view, just because it is allowed for every one. I mean >> that in an extension or a client which allow user query input we must >> handle the case. > > This would break an awful lot of apps. Which is also why allowing to hide some custom columns from a "SELECT *" powerful. It's no doubt a niche usage, but as Gilles mentioned extensions can make use of that to build interesting things. If DBA can also make use of it to ease manual queries if the client apps are correctly written, that's icing on the cake.
On Fri, Oct 15, 2021 at 11:32:53AM +0200, Laurenz Albe wrote: > On Thu, 2021-10-14 at 13:16 +0200, Gilles Darold wrote: > > Here is a proposal to implement HIDDEN columns feature in PostgreSQL. > > > > The user defined columns are always visible in the PostgreSQL. If user > > wants to hide some column(s) from a SELECT * returned values then the > > hidden columns feature is useful. Hidden column can always be used and > > returned by explicitly referring it in the query. > > When I read your proposal, I had strangely mixed feelings: > "This is cute!" versus "Do we need that?". After some thinking, I think > that it boils down to the following: > > That feature is appealing to people who type SQL statements into psql, > which is probably the majority of the readers on this list. It is > immediately clear that this can be used for all kinds of nice things. > > On the other hand: a relational database is not a spreadsheet, where > I want to hide or highlight columns. Sure, the interactive user may > use it in that way, but that is not the target of a relational database. > Databases usually are not user visible, but used by an application. > So the appeal for the interactive user is really pretty irrelevant. > > Now this patch makes certain things easier, but it adds no substantially > new functionality: I can exclude a column from display as it is, simply > by listing all the other columns. Sure, that's a pain for the interactive > user, but it is irrelevant for a query in an application. > > This together with the fact that it poses complicated questions when > we dig deeper, such as "what about whole-row references?", tilts my vote. > If it were for free, I would say +1. But given the ratio of potential > headache versus added real-life benefit, I find myself voting -1. I can see the usefulness of this, though UNEXPANDED seems clearer. However, it also is likely to confuse someone who does SELECT * and then can't figure out why another query is showing a column that doesn't appear in SELECT *. I do think SELECT * EXCEPT is the better and less confusing solution. I can imagine people using different EXCEPT columns for different queries, which HIDDEN/UNEXPANDED does not allow. I frankly can't think of a single case where output is specified at the DDL level. Why is this not better addressed by creating a view on the original table, even perhaps renaming the original table and create a view using the old table name. -- Bruce Momjian <bruce@momjian.us> https://momjian.us EDB https://enterprisedb.com If only the physical world exists, free will is an illusion.
On 10/15/21 2:51 PM, Bruce Momjian wrote: > On Fri, Oct 15, 2021 at 11:32:53AM +0200, Laurenz Albe wrote: >> On Thu, 2021-10-14 at 13:16 +0200, Gilles Darold wrote: >>> Here is a proposal to implement HIDDEN columns feature in PostgreSQL. >>> >>> The user defined columns are always visible in the PostgreSQL. If user >>> wants to hide some column(s) from a SELECT * returned values then the >>> hidden columns feature is useful. Hidden column can always be used and >>> returned by explicitly referring it in the query. >> When I read your proposal, I had strangely mixed feelings: >> "This is cute!" versus "Do we need that?". After some thinking, I think >> that it boils down to the following: >> >> That feature is appealing to people who type SQL statements into psql, >> which is probably the majority of the readers on this list. It is >> immediately clear that this can be used for all kinds of nice things. >> >> On the other hand: a relational database is not a spreadsheet, where >> I want to hide or highlight columns. Sure, the interactive user may >> use it in that way, but that is not the target of a relational database. >> Databases usually are not user visible, but used by an application. >> So the appeal for the interactive user is really pretty irrelevant. >> >> Now this patch makes certain things easier, but it adds no substantially >> new functionality: I can exclude a column from display as it is, simply >> by listing all the other columns. Sure, that's a pain for the interactive >> user, but it is irrelevant for a query in an application. >> >> This together with the fact that it poses complicated questions when >> we dig deeper, such as "what about whole-row references?", tilts my vote. >> If it were for free, I would say +1. But given the ratio of potential >> headache versus added real-life benefit, I find myself voting -1. > I can see the usefulness of this, though UNEXPANDED seems clearer. > However, it also is likely to confuse someone who does SELECT * and then > can't figure out why another query is showing a column that doesn't > appear in SELECT *. I do think SELECT * EXCEPT is the better and less > confusing solution. I can imagine people using different EXCEPT columns > for different queries, which HIDDEN/UNEXPANDED does not allow. I > frankly can't think of a single case where output is specified at the > DDL level. > > Why is this not better addressed by creating a view on the original > table, even perhaps renaming the original table and create a view using > the old table name. That's pretty much my feeling. This seems a bit too cute. I have a little function I use to create a skeleton query on tables with lots of columns just so I can delete a few and leave the rest, a problem that would be solved neatly by the EXCEPT proposal and not but the HIDDEN proposal. cheers andrew -- Andrew Dunstan EDB: https://www.enterprisedb.com
On 10/15/21 2:51 PM, Bruce Momjian wrote:On Fri, Oct 15, 2021 at 11:32:53AM +0200, Laurenz Albe wrote:On Thu, 2021-10-14 at 13:16 +0200, Gilles Darold wrote:Here is a proposal to implement HIDDEN columns feature in PostgreSQL. The user defined columns are always visible in the PostgreSQL. If user wants to hide some column(s) from a SELECT * returned values then the hidden columns feature is useful. Hidden column can always be used and returned by explicitly referring it in the query.When I read your proposal, I had strangely mixed feelings: "This is cute!" versus "Do we need that?". After some thinking, I think that it boils down to the following: That feature is appealing to people who type SQL statements into psql, which is probably the majority of the readers on this list. It is immediately clear that this can be used for all kinds of nice things. On the other hand: a relational database is not a spreadsheet, where I want to hide or highlight columns. Sure, the interactive user may use it in that way, but that is not the target of a relational database. Databases usually are not user visible, but used by an application. So the appeal for the interactive user is really pretty irrelevant. Now this patch makes certain things easier, but it adds no substantially new functionality: I can exclude a column from display as it is, simply by listing all the other columns. Sure, that's a pain for the interactive user, but it is irrelevant for a query in an application. This together with the fact that it poses complicated questions when we dig deeper, such as "what about whole-row references?", tilts my vote. If it were for free, I would say +1. But given the ratio of potential headache versus added real-life benefit, I find myself voting -1.I can see the usefulness of this, though UNEXPANDED seems clearer. However, it also is likely to confuse someone who does SELECT * and then can't figure out why another query is showing a column that doesn't appear in SELECT *. I do think SELECT * EXCEPT is the better and less confusing solution. I can imagine people using different EXCEPT columns for different queries, which HIDDEN/UNEXPANDED does not allow. I frankly can't think of a single case where output is specified at the DDL level. Why is this not better addressed by creating a view on the original table, even perhaps renaming the original table and create a view using the old table name.That's pretty much my feeling. This seems a bit too cute. I have a little function I use to create a skeleton query on tables with lots of columns just so I can delete a few and leave the rest, a problem that would be solved neatly by the EXCEPT proposal and not but the HIDDEN proposal.
I have nothing against seeing the EXCEPT included into core except that this is a big sprain to the SQL standard and I doubt that personally I will used it for portability reason. Saying that, by this syntax we will also encourage the use of SELECT * which is in contradiction with the common opinion.
But again I don't think this is the same feature, the only thing where SELECT * EXCEPT is useful is for a single non portable statement. It does not help to extend PostgreSQL through extensions or can solves application migration issues. I'm a bit surprise by this confusion with the EXCEPT syntax.
-- Gilles Darold
Le 15/10/2021 à 18:42, Aleksander Alekseev a écrit : > Hi Gilles, > >> Yes, I don't wanted to offend you or to troll. This was just to point >> that the position of "SELECT * is bad practice" is not a good argument >> in my point of view, just because it is allowed for every one. I mean >> that in an extension or a client which allow user query input we must >> handle the case. > Sure, no worries. And my apologies if my feedback seemed a little harsh. > > I'm sure our goal is mutual - to make PostgreSQL even better than it > is now. Finding a consensus occasionally can take time though. > Right, no problem Aleksander, my english speaking and understanding is not very good so it doesn't help too. Let's have a beer next time :-)
Le 15/10/2021 à 20:51, Bruce Momjian a écrit : > Why is this not better addressed by creating a view on the original > table, even perhaps renaming the original table and create a view using > the old table name. Because when you use the view for the select you can not use the "hidden" column in your query, for example in the WHERE or ORDER BY clause. Also if you have a hundred of tables, let's says with a ts_vector column that you want to unexpand, you will have to create a hundred of view. The other problem it for write in the view, it you have a complex modification involving other tables in the query you have to define rules. Handling a technical column through a view over the real table require lot of work, this feature will help a lot to save this time. -- Gilles Darold
Hi, Here is a new version of the patch for the hidden column feature with the following changes: - Rename the HIDDEN into UNEXPANDED and replace all references to hidden column into unexpanded column - Remove changes in the information_schema - Limit use of the UNEXPANDED attribute to ALTER COLUMN SET/DROP commands. - Add a check into SET UNEXPANDED code to verify that there is at least one column expanded. - Verify that INSERT INTO table SELECT * FROM table respect the unexpanded column feature. - Verify that RETURNING * clause also respect the unexpanded column feature. I have kept the behavior on function using the wildcard * which does not take care of the unexpanded column attribute. I have not though of other gotcha for the moment, I will update the patch if other cases come. In psql the Expended information is displayed when using \d+, perhaps it could be better to see this information directly with \d so that the information comes to the eyes immediately. -- Gilles Darold
Attachment
On 10/17/21 11:01 PM, Gilles Darold wrote: > > - Add a check into SET UNEXPANDED code to verify that there is at > least one column expanded. What is the point of this? Postgres allows column-less tables. Both of these statements are valid: - CREATE TABLE nada (); - SELECT; -- Vik Fearing
Le 17/10/2021 à 23:04, Vik Fearing a écrit : > On 10/17/21 11:01 PM, Gilles Darold wrote: >> - Add a check into SET UNEXPANDED code to verify that there is at >> least one column expanded. > What is the point of this? Postgres allows column-less tables. > > Both of these statements are valid: > > - CREATE TABLE nada (); > - SELECT; Yes, my first though was to allow all columns to be unexpandable like a table without column, but the the problem is that when you execute "SELECT * FROM nada" it returns no rows which is not the case of a table with hidden column. I could fix that to return no rows if all columns are unexpandable but I think that all column hidden is a nonsens so I have prefered to not allow it and an error is raised. Also I've just though that applying unexpandable column feature to plpgsql breaks the use of ROWTYPE. It contains all columns so when use as a variable to receive a SELECT * or RETURNING * INTO it will not works, I will try to fix that. -- Gilles Darold
Yes, my first though was to allow all columns to be unexpandable like a
table without column, but the the problem is that when you execute
"SELECT * FROM nada" it returns no rows which is not the case of a table
with hidden column. I could fix that to return no rows if all columns
are unexpandable but I think that all column hidden is a nonsens so I
have prefered to not allow it and an error is raised.
Type "help" for help.
postgres=# create table nada ();
CREATE TABLE
postgres=# insert into nada default values;
INSERT 0 1
postgres=# insert into nada default values;
INSERT 0 1
postgres=# table nada;
--
(2 rows)
postgres=#
On Sun, 17 Oct 2021 at 17:42, Gilles Darold <gilles@migops.com> wrote:Perhaps I misunderstand what you are saying, but a no-columns table definitely can return rows:psql (12.2)
Type "help" for help.
postgres=# create table nada ();
CREATE TABLE
postgres=# insert into nada default values;
INSERT 0 1
postgres=# insert into nada default values;
INSERT 0 1
postgres=# table nada;
--
(2 rows)
postgres=#Note that psql doesn't display a separate line for each row in this case, but the actual result coming back from the server does contain the appropriate number of rows.
I was not aware of that. In this case perhaps that we can remove the restriction on having at least on expandable column and we will have the same behavior but I can't think of an interest to allow that.
-- Gilles Darold
On 10/18/21 8:44 AM, Gilles Darold wrote: > Le 17/10/2021 à 23:48, Isaac Morland a écrit : >> On Sun, 17 Oct 2021 at 17:42, Gilles Darold <gilles@migops.com >> <mailto:gilles@migops.com>> wrote: >> >> Note that psql doesn't display a separate line for each row in this >> case, but the actual result coming back from the server does contain >> the appropriate number of rows. > > I was not aware of that. In this case perhaps that we can remove the > restriction on having at least on expandable column and we will have the > same behavior but I can't think of an interest to allow that. Allowing no-column tables removed the need to handle a bunch of corner cases. Useful for users or not, the precedent is set. -- Vik Fearing
Le 18/10/2021 à 17:24, Vik Fearing a écrit : > On 10/18/21 8:44 AM, Gilles Darold wrote: >> Le 17/10/2021 à 23:48, Isaac Morland a écrit : >>> On Sun, 17 Oct 2021 at 17:42, Gilles Darold <gilles@migops.com >>> <mailto:gilles@migops.com>> wrote: >>> >>> Note that psql doesn't display a separate line for each row in this >>> case, but the actual result coming back from the server does contain >>> the appropriate number of rows. >> I was not aware of that. In this case perhaps that we can remove the >> restriction on having at least on expandable column and we will have the >> same behavior but I can't think of an interest to allow that. > Allowing no-column tables removed the need to handle a bunch of corner > cases. Useful for users or not, the precedent is set. I agree, now that I know that this is perfectly possible to return N rows without any data/column I also think that we should allow it in respect to PostgreSQL behavior with a table with no column. I will remove the check at SET UNEXPANDED. -- Gilles Darold
I suggest to look for output test files that are being massively modified by this patch. I think those are likely unintended: > diff --git a/src/test/regress/expected/collate.icu.utf8.out b/src/test/regress/expected/collate.icu.utf8.out > diff --git a/src/test/regress/expected/collate.linux.utf8.out b/src/test/regress/expected/collate.linux.utf8.out > diff --git a/src/test/regress/expected/compression.out b/src/test/regress/expected/compression.out > diff --git a/src/test/regress/expected/xml.out b/src/test/regress/expected/xml.out > diff --git a/src/test/regress/expected/xmlmap.out b/src/test/regress/expected/xmlmap.out -- Álvaro Herrera Valdivia, Chile — https://www.EnterpriseDB.com/
Le 18/10/2021 à 18:54, Alvaro Herrera a écrit : > I suggest to look for output test files that are being massively > modified by this patch. I think those are likely unintended: > >> diff --git a/src/test/regress/expected/collate.icu.utf8.out b/src/test/regress/expected/collate.icu.utf8.out >> diff --git a/src/test/regress/expected/collate.linux.utf8.out b/src/test/regress/expected/collate.linux.utf8.out >> diff --git a/src/test/regress/expected/compression.out b/src/test/regress/expected/compression.out >> diff --git a/src/test/regress/expected/xml.out b/src/test/regress/expected/xml.out >> diff --git a/src/test/regress/expected/xmlmap.out b/src/test/regress/expected/xmlmap.out My bad, thanks for the report Alvaro. New patch version v3 should fix that. -- Gilles Darold
Attachment
On 2021-Oct-18, Gilles Darold wrote: > Le 18/10/2021 à 18:54, Alvaro Herrera a écrit : > > I suggest to look for output test files that are being massively > > modified by this patch. I think those are likely unintended: > > > >> diff --git a/src/test/regress/expected/collate.icu.utf8.out b/src/test/regress/expected/collate.icu.utf8.out > >> diff --git a/src/test/regress/expected/collate.linux.utf8.out b/src/test/regress/expected/collate.linux.utf8.out > >> diff --git a/src/test/regress/expected/compression.out b/src/test/regress/expected/compression.out > >> diff --git a/src/test/regress/expected/xml.out b/src/test/regress/expected/xml.out > >> diff --git a/src/test/regress/expected/xmlmap.out b/src/test/regress/expected/xmlmap.out > > My bad, thanks for the report Alvaro. New patch version v3 should fix that. Hmm, the attachment was 500kB before, about 30% of that was the collate.*.out files, and it is 2.2 MB now. Something is still not right. -- Álvaro Herrera 39°49'30"S 73°17'W — https://www.EnterpriseDB.com/ "Porque francamente, si para saber manejarse a uno mismo hubiera que rendir examen... ¿Quién es el machito que tendría carnet?" (Mafalda)
Le 18/10/2021 à 22:36, Alvaro Herrera a écrit : > On 2021-Oct-18, Gilles Darold wrote: > >> Le 18/10/2021 à 18:54, Alvaro Herrera a écrit : >>> I suggest to look for output test files that are being massively >>> modified by this patch. I think those are likely unintended: >>> >>>> diff --git a/src/test/regress/expected/collate.icu.utf8.out b/src/test/regress/expected/collate.icu.utf8.out >>>> diff --git a/src/test/regress/expected/collate.linux.utf8.out b/src/test/regress/expected/collate.linux.utf8.out >>>> diff --git a/src/test/regress/expected/compression.out b/src/test/regress/expected/compression.out >>>> diff --git a/src/test/regress/expected/xml.out b/src/test/regress/expected/xml.out >>>> diff --git a/src/test/regress/expected/xmlmap.out b/src/test/regress/expected/xmlmap.out >> My bad, thanks for the report Alvaro. New patch version v3 should fix that. > Hmm, the attachment was 500kB before, about 30% of that was the > collate.*.out files, and it is 2.2 MB now. Something is still not > right. Right I don't know what I have done yesterday, look like I have included tests output autogenerated files. However I've attached a new version v4 of the patch that include the right list of files changed and some fixes: - Allow a table to have all columns unexpanded, doc updated. - Add a note to documentation about use of ROWTYPE when there is an unexpanded column. - Fix documentation about some sgml tag broken. About ROWTYPE generating an error when SELECT * INTO or RETURNING * INTO is used with unexpanded column, I have kept things like that because it is the normal behavior. I have checked on others database engine and this is the same. -- Gilles Darold
Attachment
Le 19/10/2021 à 07:43, Gilles Darold a écrit : > Le 18/10/2021 à 22:36, Alvaro Herrera a écrit : >> On 2021-Oct-18, Gilles Darold wrote: >> >>> Le 18/10/2021 à 18:54, Alvaro Herrera a écrit : >>>> I suggest to look for output test files that are being massively >>>> modified by this patch. I think those are likely unintended: >>>> >>>>> diff --git a/src/test/regress/expected/collate.icu.utf8.out b/src/test/regress/expected/collate.icu.utf8.out >>>>> diff --git a/src/test/regress/expected/collate.linux.utf8.out b/src/test/regress/expected/collate.linux.utf8.out >>>>> diff --git a/src/test/regress/expected/compression.out b/src/test/regress/expected/compression.out >>>>> diff --git a/src/test/regress/expected/xml.out b/src/test/regress/expected/xml.out >>>>> diff --git a/src/test/regress/expected/xmlmap.out b/src/test/regress/expected/xmlmap.out >>> My bad, thanks for the report Alvaro. New patch version v3 should fix that. >> Hmm, the attachment was 500kB before, about 30% of that was the >> collate.*.out files, and it is 2.2 MB now. Something is still not >> right. > > Right I don't know what I have done yesterday, look like I have included > tests output autogenerated files. However I've attached a new version v4 > of the patch that include the right list of files changed and some fixes: > > > - Allow a table to have all columns unexpanded, doc updated. > > - Add a note to documentation about use of ROWTYPE when there is an > unexpanded column. > > - Fix documentation about some sgml tag broken. > > > About ROWTYPE generating an error when SELECT * INTO or RETURNING * INTO > is used with unexpanded column, I have kept things like that because it > is the normal behavior. I have checked on others database engine and > this is the same.1 And finally I found the reason of the diff on compression.out and collate.linux.utf8.out, new version v5 of the patch attached. -- Gilles Darold
Attachment
Op 27-10-2021 om 16:33 schreef Gilles Darold: >> >> - Fix documentation about some sgml tag broken. >> >> >> About ROWTYPE generating an error when SELECT * INTO or RETURNING * INTO >> is used with unexpanded column, I have kept things like that because it >> is the normal behavior. I have checked on others database engine and >> this is the same.1 > > > And finally I found the reason of the diff on compression.out and > collate.linux.utf8.out, new version v5 of the patch attached. > > [ 0001-hidden-column-v5.patch ] This warning during compile from gcc 11.2: pg_dump.c: In function ‘dumpTableSchema’: pg_dump.c:16327:56: warning: comparison of constant ‘0’ with boolean expression is always true [-Wbool-compare] 16327 | if (tbinfo->attisunexpanded[j] >= 0) | ^~ Otherwise, build, make check, chekc-world are OK. Also the pdf builds ok. Thanks, Erik Rijkers
Le 27/10/2021 à 17:47, Erik Rijkers a écrit : > Op 27-10-2021 om 16:33 schreef Gilles Darold: >>> >>> - Fix documentation about some sgml tag broken. >>> >>> >>> About ROWTYPE generating an error when SELECT * INTO or RETURNING * >>> INTO >>> is used with unexpanded column, I have kept things like that because it >>> is the normal behavior. I have checked on others database engine and >>> this is the same.1 >> >> >> And finally I found the reason of the diff on compression.out and >> collate.linux.utf8.out, new version v5 of the patch attached. >> > > [ 0001-hidden-column-v5.patch ] > > > This warning during compile from gcc 11.2: > > pg_dump.c: In function ‘dumpTableSchema’: > pg_dump.c:16327:56: warning: comparison of constant ‘0’ with boolean > expression is always true [-Wbool-compare] > 16327 | if (tbinfo->attisunexpanded[j] >= 0) > | ^~ > > Otherwise, build, make check, chekc-world are OK. Also the pdf builds > ok. > > Thanks, > > Erik Rijkers Thanks Erik, new version v6 attached. -- Gilles Darold
Attachment
Op 27-10-2021 om 18:02 schreef Gilles Darold: >> >> Otherwise, build, make check, chekc-world are OK. Also the pdf builds >> ok. > > Thanks Erik, new version v6 attached. Hi, Anther small thing: the test_decoding module was overlooked, I think. Below is output from make check-world (this error does not occur in master) Erik ============== running regression test queries ============== test ddl ... FAILED 1210 ms test xact ... ok 22 ms test rewrite ... ok 176 ms test toast ... ok 292 ms test permissions ... ok 24 ms test decoding_in_xact ... ok 23 ms test decoding_into_rel ... ok 33 ms test binary ... ok 16 ms test prepared ... ok 21 ms test replorigin ... ok 23 ms test time ... ok 22 ms test messages ... ok 26 ms test spill ... ok 2407 ms test slot ... ok 424 ms test truncate ... ok 21 ms test stream ... ok 31 ms test stats ... ok 1097 ms test twophase ... ok 46 ms test twophase_stream ... ok 28 ms ============== shutting down postmaster ============== ======================= 1 of 19 tests failed. ======================= The differences that caused some tests to fail can be viewed in the file "/home/aardvark/pg_stuff/pg_sandbox/pgsql.hide_column/contrib/test_decoding/regression.diffs". A copy of the test summary that you see above is saved in the file "/home/aardvark/pg_stuff/pg_sandbox/pgsql.hide_column/contrib/test_decoding/regression.out". ../../src/makefiles/pgxs.mk:451: recipe for target 'check' failed make[2]: *** [check] Error 1 make[2]: Leaving directory '/home/aardvark/pg_stuff/pg_sandbox/pgsql.hide_column/contrib/test_decoding' Makefile:94: recipe for target 'check-test_decoding-recurse' failed make[1]: *** [check-test_decoding-recurse] Error 2 make[1]: Leaving directory '/home/aardvark/pg_stuff/pg_sandbox/pgsql.hide_column/contrib' GNUmakefile:71: recipe for target 'check-world-contrib-recurse' failed make: *** [check-world-contrib-recurse] Error 2
Attachment
Le 28/10/2021 à 09:29, Erik Rijkers a écrit : > Op 27-10-2021 om 18:02 schreef Gilles Darold: >>> >>> Otherwise, build, make check, chekc-world are OK. Also the pdf builds >>> ok. >> >> Thanks Erik, new version v6 attached. > > Hi, > > Anther small thing: the test_decoding module was overlooked, I think. > Below is output from make check-world (this error does not occur in > master) > > > Erik > Fixed with new patch version v7 attached. It also fixes unwanted change of some regression tests output reported by the cfbot because I forgot to change my locale. I will also add a pg_dump test to verify that ALTER ... SET UNEXPANDED statements are well generated in the dump. -- Gilles Darold
Attachment
Le 28/10/2021 à 16:31, Bruce Momjian a écrit : > On Thu, Oct 28, 2021 at 11:30:27AM +0200, Gilles Darold wrote: >> Fixed with new patch version v7 attached. It also fixes unwanted change >> of some regression tests output reported by the cfbot because I forgot >> to change my locale. >> >> >> I will also add a pg_dump test to verify that ALTER ... SET UNEXPANDED >> statements are well generated in the dump. > I want to state I still think this feature is not generally desired, and > is better implemented at the query level. I think that with an implementation at query level we will cover the user need but not the developer need to "hide" technical columns, and also it does not cover the INSERT statement without column. Personally I will not try to convince more I'm lacking of arguments, I just wanted to attach a full working patch to test the proposal. So unless there is more persons interested by this feature I suggest us to not waste more time on this proposal. -- Gilles Darold