Thread: Feature suggestion: auto-prefixing SELECT query column names withtable/alias names

Hello,

I've seen some situations where it would be very handy if PostgreSQL 
could automatically prefix column names in a SELECT query with their 
table names (or its alias).   So for two tables, a & b, each with 
columns "id" and "title":

     SELECT * FROM a , b;

instead of returning ambiguously-named columns:

     id, title, id, title

would return:

     a.id, a.title, b.id, b.title

Of course, this can be achieved by avoiding the (often frowned-upon) 
SELECT * syntax in the first place and using explicit column names, but 
that makes otherwise short queries much longer and harder to maintain.  
Also this doesn't only apply to SELECT * queries: if tables a & b had 
more columns, but I wanted just the "title" column of each, to avoid 
ambiguous column names or having to use column position indexes, I have 
to give explicit aliases to those columns (which could be anything, but 
in practice is often the table prefix as already written in the select 
(SELECT a.title AS "a.title". b.title as "b.title" etc).

Could there be an (admittedly non-standard) option to automatically 
prefix table/alias names to columns? From searching, it seems a number 
of people would find this useful[1].  Perhaps something like:   SELECT 
#* FROM a, b    -->   a.id, a.title, b.id, d.title.  This would allow 
use per-table:   SELECT a.id, #b.*   -->   id, b.id, b.title.   Or per 
column  E.g:   SELECT a.id, #a.title, #b.title   -->   id, a.title. 
b.title.   Or alternatively it could be specified in the FROM clause as 
to which tables have their columns prefixed.

I have no idea if this is viable/sensible/anathema, especially given the 
understandable reluctance to stray from the SQL standard, but thought 
I'd ask, especially as it seems like something others would find useful too.

Thanks,

Guy

[1]

https://stackoverflow.com/questions/329931/sql-select-join-is-it-possible-to-prefix-all-columns-as-prefix

https://stackoverflow.com/questions/13153344/in-a-join-how-to-prefix-all-column-names-with-the-table-it-came-from

https://stackoverflow.com/questions/53947657/how-to-display-table-name-as-prefix-to-column-name-in-sql-result

https://stackoverflow.com/questions/33364392/sql-join-prefix-fields-with-table-name




On Mon, 2020-06-22 at 14:16 +1200, Guy Burgess wrote:
> I've seen some situations where it would be very handy if PostgreSQL 
> could automatically prefix column names in a SELECT query with their 
> table names (or its alias).   So for two tables, a & b, each with 
> columns "id" and "title":
> 
>      SELECT * FROM a , b;
> 
> instead of returning ambiguously-named columns:
> 
>      id, title, id, title
> 
> would return:
> 
>      a.id, a.title, b.id, b.title
> 
> Of course, this can be achieved by avoiding the (often frowned-upon) 
> SELECT * syntax in the first place and using explicit column names, but 
> that makes otherwise short queries much longer and harder to maintain.  
> Also this doesn't only apply to SELECT * queries: if tables a & b had 
> more columns, but I wanted just the "title" column of each, to avoid 
> ambiguous column names or having to use column position indexes, I have 
> to give explicit aliases to those columns (which could be anything, but 
> in practice is often the table prefix as already written in the select 
> (SELECT a.title AS "a.title". b.title as "b.title" etc).
> 
> Could there be an (admittedly non-standard) option to automatically 
> prefix table/alias names to columns? From searching, it seems a number 
> of people would find this useful[1].  Perhaps something like:   SELECT 
> #* FROM a, b    -->   a.id, a.title, b.id, d.title.  This would allow 
> use per-table:   SELECT a.id, #b.*   -->   id, b.id, b.title.   Or per 
> column  E.g:   SELECT a.id, #a.title, #b.title   -->   id, a.title. 
> b.title.   Or alternatively it could be specified in the FROM clause as 
> to which tables have their columns prefixed.
> 
> I have no idea if this is viable/sensible/anathema, especially given the 
> understandable reluctance to stray from the SQL standard, but thought 
> I'd ask, especially as it seems like something others would find useful too.

I'd say that this idea is not so desirable, for one because the resulting
names would not follow the name rules for SQL identifiers, so that you'd
have to double quote them whenever you reference them.
That could of course be avoided by using "a_id", "a_title" and so on.

Then there is the case of "a JOIN b USING (somecol)".
Here, "somecol" will appear in the output only once.  How should it be
labeled?  If you say "not at all", then what do you want to happen for

  SELECT * FROM a JOIN b USING (id) JOIN c USING (x)

where all three tables have a column "id"?  You see, it is not so simple.

But the real objection I have is that the problem can be easily avoided
by spelling out the SELECT list and using aliases.  Either you are talking
about ad-hoc queries, or you want to use "SELECT *" in your code.

- For ad-hoc queries I don't see the problem so much.  Here, the tables
  will be returned in the order you specify them in the query.
  So if you have "b JOIN a", the result columns will always be
  first all columns from "b", then all columns from "a".
  So you can easily figure out which column belongs to which table.

- For queries in your code, using "SELECT *" is a very bad idea.
  There is a real risk of your code breaking if the database changes.
  In code, you are probably not ready to deal with a changing number
  of columns.

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com




On 22/06/2020 3:25 pm, Laurenz Albe wrote:
Then there is the case of "a JOIN b USING (somecol)".
Here, "somecol" will appear in the output only once.  How should it be
labeled?  If you say "not at all", then what do you want to happen for
 SELECT * FROM a JOIN b USING (id) JOIN c USING (x)

where all three tables have a column "id"?
Thanks, I've learned a tip about USING :)  In such case, could it not simply return all columns prefixed with their table/alias names, as that is what the table-prefix option would mean? To pretend-modify the documentation: "Also, USING implies that only one of each pair of equivalent columns will be included in the join output, not both; unless table-prefixing is enabled for one or both of columns."  In other words, if table-prefixing is enabled for a column in a USING clause, it behaves like a regular LEFT JOIN.  So SELECT #* FROM a JOIN b USING (id) would give:

    a.id, a.title, b.id, b.title

But the real objection I have is that the problem can be easily avoided
by spelling out the SELECT list and using aliases.  Either you are talking
about ad-hoc queries, or you want to use "SELECT *" in your code.

- For ad-hoc queries I don't see the problem so much.  Here, the tables will be returned in the order you specify them in the query. So if you have "b JOIN a", the result columns will always be first all columns from "b", then all columns from "a". So you can easily figure out which column belongs to which table.
True, but it would be nice to have an easy 'foolproof' way to see the table name a column belongs to (especially for wide tables).
- For queries in your code, using "SELECT *" is a very bad idea. There is a real risk of your code breaking if the database changes. In code, you are probably not ready to deal with a changing number of columns.

Yes. Though in the case of explicit SELECT lists, duplicate column names usually requires an alias to be given (unless using column index), and this is often just the table prefix anyway, so it would be nice to be able to automatically prefix the table name in the query.  It would be rather like the USING clause itself - just a nice convenience/shorthand to have.

Thanks for the comments!

On Sunday, June 21, 2020, Guy Burgess <guy@burgess.co.nz> wrote:
    a.id, a.title, b.id, b.title

You are missing some double-quotes there.

Of course, this can be achieved by avoiding the (often frowned-upon) SELECT * syntax in the first place and using explicit column names,

Or choose better, distinguishing, column names.
 
Could there be an (admittedly non-standard) option to automatically prefix table/alias names to columns

The cost/benefit here seems quite unfavorable.

SQL is verbose and SQL doesn't make fields inseparable from the relations they are attached to.  So while "object.name" makes sense in an object-oriented world it doesn't make as much sense in SQL, object.object_name does.

At least at the top level of query, when dealing with physical relations at least, client software has the ability to inspect the result metadata and for those output columns that are from relations it can lookup related information by OID.  If anything I would restrict manipulation to this subset of problem space.  What I'd want is some way to attach a "human friendly" label to a column and have the server replace the output column name for a physical column with that human readable label instead.  I'd still have to alias derived values using "AS" but that's fine.

Tying this back to the original request, if we do focus on top-level tlist names in most production cases you'd want "Object Name" instead of "object.name" anyway - so absent something like I describe above you are just back to writing:
SELECT object.name AS "Object Name"
instead of
SELECT *
and getting something besides "name" back through some behind the scenes logic.

For ad-hoc queries I'd suggest that the incidence of desiring "SELECT *" is considerably higher but that the query author has a considerably greater propensity to know which name is which.

So, in short, this feels like something with a small but real audience but a non-trivial design, let alone implementation, and has a large degree of mitigation by adopting technology driven coding and naming standards.  For production outputs it doesn't even do that great a job as final output names should strive to be human friendly as opposed to the developer friendly names used in data models.

David J.