Feature suggestion: auto-prefixing SELECT query column names with table/alias names
Hello,
I've seen some situations where it would be very handy if PostgreSQL
could automatically prefix column names in a SELECT query with their
table names (or its alias). So for two tables, a & b, each with
columns "id" and "title":
SELECT * FROM a , b;
instead of returning ambiguously-named columns:
id, title, id, title
would return:
a.id, a.title, b.id, b.title
Of course, this can be achieved by avoiding the (often frowned-upon)
SELECT * syntax in the first place and using explicit column names, but
that makes otherwise short queries much longer and harder to maintain.
Also this doesn't only apply to SELECT * queries: if tables a & b had
more columns, but I wanted just the "title" column of each, to avoid
ambiguous column names or having to use column position indexes, I have
to give explicit aliases to those columns (which could be anything, but
in practice is often the table prefix as already written in the select
(SELECT a.title AS "a.title". b.title as "b.title" etc).
Could there be an (admittedly non-standard) option to automatically
prefix table/alias names to columns? From searching, it seems a number
of people would find this useful[1]https://stackoverflow.com/questions/329931/sql-select-join-is-it-possible-to-prefix-all-columns-as-prefix. Perhaps something like: SELECT
#* FROM a, b --> a.id, a.title, b.id, d.title. This would allow
use per-table: SELECT a.id, #b.* --> id, b.id, b.title. Or per
column E.g: SELECT a.id, #a.title, #b.title --> id, a.title.
b.title. Or alternatively it could be specified in the FROM clause as
to which tables have their columns prefixed.
I have no idea if this is viable/sensible/anathema, especially given the
understandable reluctance to stray from the SQL standard, but thought
I'd ask, especially as it seems like something others would find useful too.
Thanks,
Guy
https://stackoverflow.com/questions/33364392/sql-join-prefix-fields-with-table-name
On Mon, 2020-06-22 at 14:16 +1200, Guy Burgess wrote:
I've seen some situations where it would be very handy if PostgreSQL
could automatically prefix column names in a SELECT query with their
table names (or its alias). So for two tables, a & b, each with
columns "id" and "title":SELECT * FROM a , b;
instead of returning ambiguously-named columns:
id, title, id, title
would return:
a.id, a.title, b.id, b.title
Of course, this can be achieved by avoiding the (often frowned-upon)
SELECT * syntax in the first place and using explicit column names, but
that makes otherwise short queries much longer and harder to maintain.
Also this doesn't only apply to SELECT * queries: if tables a & b had
more columns, but I wanted just the "title" column of each, to avoid
ambiguous column names or having to use column position indexes, I have
to give explicit aliases to those columns (which could be anything, but
in practice is often the table prefix as already written in the select
(SELECT a.title AS "a.title". b.title as "b.title" etc).Could there be an (admittedly non-standard) option to automatically
prefix table/alias names to columns? From searching, it seems a number
of people would find this useful[1]. Perhaps something like: SELECT
#* FROM a, b --> a.id, a.title, b.id, d.title. This would allow
use per-table: SELECT a.id, #b.* --> id, b.id, b.title. Or per
column E.g: SELECT a.id, #a.title, #b.title --> id, a.title.
b.title. Or alternatively it could be specified in the FROM clause as
to which tables have their columns prefixed.I have no idea if this is viable/sensible/anathema, especially given the
understandable reluctance to stray from the SQL standard, but thought
I'd ask, especially as it seems like something others would find useful too.
I'd say that this idea is not so desirable, for one because the resulting
names would not follow the name rules for SQL identifiers, so that you'd
have to double quote them whenever you reference them.
That could of course be avoided by using "a_id", "a_title" and so on.
Then there is the case of "a JOIN b USING (somecol)".
Here, "somecol" will appear in the output only once. How should it be
labeled? If you say "not at all", then what do you want to happen for
SELECT * FROM a JOIN b USING (id) JOIN c USING (x)
where all three tables have a column "id"? You see, it is not so simple.
But the real objection I have is that the problem can be easily avoided
by spelling out the SELECT list and using aliases. Either you are talking
about ad-hoc queries, or you want to use "SELECT *" in your code.
- For ad-hoc queries I don't see the problem so much. Here, the tables
will be returned in the order you specify them in the query.
So if you have "b JOIN a", the result columns will always be
first all columns from "b", then all columns from "a".
So you can easily figure out which column belongs to which table.
- For queries in your code, using "SELECT *" is a very bad idea.
There is a real risk of your code breaking if the database changes.
In code, you are probably not ready to deal with a changing number
of columns.
Yours,
Laurenz Albe
--
Cybertec | https://www.cybertec-postgresql.com
On 22/06/2020 3:25 pm, Laurenz Albe wrote:
Then there is the case of "a JOIN b USING (somecol)".
Here, "somecol" will appear in the output only once. How should it be
labeled? If you say "not at all", then what do you want to happen forSELECT * FROM a JOIN b USING (id) JOIN c USING (x)
where all three tables have a column "id"?
Thanks, I've learned a tip about USING :) In such case, could it not
simply return all columns prefixed with their table/alias names, as that
is what the table-prefix option would mean? To pretend-modify the
documentation: "Also, |USING| implies that only one of each pair of
equivalent columns will be included in the join output, not both;
_unless table-prefixing is enabled for one or both of columns_." In
other words, if table-prefixing is enabled for a column in a USING
clause, it behaves like a regular LEFT JOIN. So SELECT #* FROM a JOIN b
USING (id) would give:
a.id, a.title, b.id, b.title
But the real objection I have is that the problem can be easily avoided
by spelling out the SELECT list and using aliases. Either you are talking
about ad-hoc queries, or you want to use "SELECT *" in your code.- For ad-hoc queries I don't see the problem so much. Here, the tables
will be returned in the order you specify them in the query.
So if you have "b JOIN a", the result columns will always be
first all columns from "b", then all columns from "a".
So you can easily figure out which column belongs to which table.
True, but it would be nice to have an easy 'foolproof' way to see the
table name a column belongs to (especially for wide tables).
- For queries in your code, using "SELECT *" is a very bad idea.
There is a real risk of your code breaking if the database changes.
In code, you are probably not ready to deal with a changing number
of columns.
Yes. Though in the case of explicit SELECT lists, duplicate column names
usually requires an alias to be given (unless using column index), and
this is often just the table prefix anyway, so it would be nice to be
able to automatically prefix the table name in the query. It would be
rather like the USING clause itself - just a nice convenience/shorthand
to have.
Thanks for the comments!
On Sunday, June 21, 2020, Guy Burgess <guy@burgess.co.nz> wrote:
a.id, a.title, b.id, b.title
You are missing some double-quotes there.
Of course, this can be achieved by avoiding the (often frowned-upon) SELECT
* syntax in the first place and using explicit column names,
Or choose better, distinguishing, column names.
Could there be an (admittedly non-standard) option to automatically prefix
table/alias names to columns
The cost/benefit here seems quite unfavorable.
SQL is verbose and SQL doesn't make fields inseparable from the relations
they are attached to. So while "object.name" makes sense in an
object-oriented world it doesn't make as much sense in SQL,
object.object_name does.
At least at the top level of query, when dealing with physical relations at
least, client software has the ability to inspect the result metadata and
for those output columns that are from relations it can lookup related
information by OID. If anything I would restrict manipulation to this
subset of problem space. What I'd want is some way to attach a "human
friendly" label to a column and have the server replace the output column
name for a physical column with that human readable label instead. I'd
still have to alias derived values using "AS" but that's fine.
Tying this back to the original request, if we do focus on top-level tlist
names in most production cases you'd want "Object Name" instead of "
object.name" anyway - so absent something like I describe above you are
just back to writing:
SELECT object.name AS "Object Name"
instead of
SELECT *
and getting something besides "name" back through some behind the scenes
logic.
For ad-hoc queries I'd suggest that the incidence of desiring "SELECT *" is
considerably higher but that the query author has a considerably greater
propensity to know which name is which.
So, in short, this feels like something with a small but real audience but
a non-trivial design, let alone implementation, and has a large degree of
mitigation by adopting technology driven coding and naming standards. For
production outputs it doesn't even do that great a job as final output
names should strive to be human friendly as opposed to the developer
friendly names used in data models.
David J.