pg14 psql broke \d datname.nspname.relname
This commit broke psql \d datname.nspname.relname
commit 2c8726c4b0a496608919d1f78a5abc8c9b6e0868
Author: Robert Haas <rhaas@postgresql.org>
Date: Wed Feb 3 13:19:41 2021 -0500
Factor pattern-construction logic out of processSQLNamePattern.
...
patternToSQLRegex is a little more general than what is required
by processSQLNamePattern. That function is only interested in
patterns that can have up to 2 parts, a schema and a relation;
but patternToSQLRegex can limit the maximum number of parts to
between 1 and 3, so that patterns can look like either
"database.schema.relation", "schema.relation", or "relation"
depending on how it's invoked and what the user specifies.
processSQLNamePattern only passes two buffers, so works exactly
the same as before, always interpreting the pattern as either
a "schema.relation" pattern or a "relation" pattern. But,
future callers can use this function in other ways.
|$ LD_LIBRARY_PATH=tmp_install/usr/local/pgsql/lib/ src/bin/psql/psql -h /tmp regression
|psql (15devel)
|Type "help" for help.
|regression=# \d regresion.public.bit_defaults
|Did not find any relation named "regresion.public.bit_defaults".
|regression=# \d public.bit_defaults
| Table "public.bit_defaults"
|...
This worked before v14 (even though the commit message says otherwise).
|$ /usr/lib/postgresql/13/bin/psql -h /tmp regression
|psql (13.2 (Debian 13.2-1.pgdg100+1), server 15devel)
|...
|regression=# \d regresion.public.bit_defaults
| Table "public.bit_defaults"
|...
--
Justin
On Oct 11, 2021, at 2:24 PM, Justin Pryzby <pryzby@telsasoft.com> wrote:
This commit broke psql \d datname.nspname.relname
commit 2c8726c4b0a496608919d1f78a5abc8c9b6e0868
Author: Robert Haas <rhaas@postgresql.org>
Date: Wed Feb 3 13:19:41 2021 -0500Factor pattern-construction logic out of processSQLNamePattern.
...
patternToSQLRegex is a little more general than what is required
by processSQLNamePattern. That function is only interested in
patterns that can have up to 2 parts, a schema and a relation;
but patternToSQLRegex can limit the maximum number of parts to
between 1 and 3, so that patterns can look like either
"database.schema.relation", "schema.relation", or "relation"
depending on how it's invoked and what the user specifies.processSQLNamePattern only passes two buffers, so works exactly
the same as before, always interpreting the pattern as either
a "schema.relation" pattern or a "relation" pattern. But,
future callers can use this function in other ways.|$ LD_LIBRARY_PATH=tmp_install/usr/local/pgsql/lib/ src/bin/psql/psql -h /tmp regression
|psql (15devel)
|Type "help" for help.
|regression=# \d regresion.public.bit_defaults
|Did not find any relation named "regresion.public.bit_defaults".
|regression=# \d public.bit_defaults
| Table "public.bit_defaults"
|...This worked before v14 (even though the commit message says otherwise).
|$ /usr/lib/postgresql/13/bin/psql -h /tmp regression
|psql (13.2 (Debian 13.2-1.pgdg100+1), server 15devel)
|...
|regression=# \d regresion.public.bit_defaults
| Table "public.bit_defaults"
|...
I can only assume that you are intentionally misspelling "regression" as "regresion" (with only one "s") as part of the test. I have not checked if that worked before v14, but if it ignored the misspelled database name before v14, and it rejects it now, I'm not sure that counts as a bug.
Am I misunderstanding your bug report?
—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Mark Dilger <mark.dilger@enterprisedb.com> writes:
I can only assume that you are intentionally misspelling "regression" as "regresion" (with only one "s") as part of the test. I have not checked if that worked before v14, but if it ignored the misspelled database name before v14, and it rejects it now, I'm not sure that counts as a bug.
Doesn't work with the correct DB name, either:
regression=# \d public.bit_defaults
Table "public.bit_defaults"
Column | Type | Collation | Nullable | Default
--------+----------------+-----------+----------+---------------------
b1 | bit(4) | | | '1001'::"bit"
b2 | bit(4) | | | '0101'::"bit"
b3 | bit varying(5) | | | '1001'::bit varying
b4 | bit varying(5) | | | '0101'::"bit"
regression=# \d regression.public.bit_defaults
Did not find any relation named "regression.public.bit_defaults".
regards, tom lane
On Oct 11, 2021, at 3:04 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Doesn't work with the correct DB name, either:
regression=# \d public.bit_defaults
Table "public.bit_defaults"
Column | Type | Collation | Nullable | Default
--------+----------------+-----------+----------+---------------------
b1 | bit(4) | | | '1001'::"bit"
b2 | bit(4) | | | '0101'::"bit"
b3 | bit varying(5) | | | '1001'::bit varying
b4 | bit varying(5) | | | '0101'::"bit"regression=# \d regression.public.bit_defaults
Did not find any relation named "regression.public.bit_defaults".
REL_13_STABLE appears to accept any amount of nonsense you like:
foo=# \d nonesuch.foo.a.b.c.d.bar.baz
Table "bar.baz"
Column | Type | Collation | Nullable | Default
--------+---------+-----------+----------+---------
i | integer | | |
Is this something we're intentionally supporting? There is no regression test covering this, else we'd have seen breakage in the build-farm.
—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Mon, Oct 11, 2021 at 02:47:59PM -0700, Mark Dilger wrote:
|$ LD_LIBRARY_PATH=tmp_install/usr/local/pgsql/lib/ src/bin/psql/psql -h /tmp regression
|psql (15devel)
|Type "help" for help.
|regression=# \d regresion.public.bit_defaults
|Did not find any relation named "regresion.public.bit_defaults".
|regression=# \d public.bit_defaults
| Table "public.bit_defaults"
|...This worked before v14 (even though the commit message says otherwise).
|$ /usr/lib/postgresql/13/bin/psql -h /tmp regression
|psql (13.2 (Debian 13.2-1.pgdg100+1), server 15devel)
|...
|regression=# \d regresion.public.bit_defaults
| Table "public.bit_defaults"
|...I can only assume that you are intentionally misspelling "regression" as "regresion" (with only one "s") as part of the test. I have not checked if that worked before v14, but if it ignored the misspelled database name before v14, and it rejects it now, I'm not sure that counts as a bug.
Am I misunderstanding your bug report?
It's not intentional but certainly confusing to put a typo there.
Sorry for that (and good eyes, BTW).
In v15/master:
regression=# \d regression.public.bit_defaults
Did not find any relation named "regression.public.bit_defaults".
After reverting that commit and recompiling psql:
regression=# \d regression.public.bit_defaults
Table "public.bit_defaults"
...
In v13 psql:
regression=# \d regression.public.bit_defaults
Table "public.bit_defaults"
...
It looks like before v13 any "datname" prefix was ignored.
But now it fails to show the table because it does:
WHERE c.relname OPERATOR(pg_catalog.~) '^(public.bit_defaults)$' COLLATE pg_catalog.default
AND n.nspname OPERATOR(pg_catalog.~) '^(regression)$' COLLATE pg_catalog.default
--
Justin
On Oct 11, 2021, at 3:26 PM, Justin Pryzby <pryzby@telsasoft.com> wrote:
It looks like before v13 any "datname" prefix was ignored.
The evidence so far suggests that something is broken in v14, but it is less clear to me what the appropriate behavior is. The v14 psql is rejecting even a correctly named database.schema.table, but v13 psql accepted lots.of.nonsense.schema.table, and neither of those seems at first glance to be correct. But perhaps there are good reasons for ignoring the nonsense prefixes?
—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Mark Dilger <mark.dilger@enterprisedb.com> writes:
On Oct 11, 2021, at 3:04 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Doesn't work with the correct DB name, either:
regression=# \d regression.public.bit_defaults
Did not find any relation named "regression.public.bit_defaults".
REL_13_STABLE appears to accept any amount of nonsense you like:
Yeah, I'm pretty sure that the old rule was to just ignore whatever
appeared in the database-name position. While we could tighten that
up to insist that it match the current DB's name, I'm not sure that
I see the point. There's no near-term prospect of doing anything
useful with some other DB's name there, so being more restrictive
seems like it'll probably break peoples' scripts to little purpose.
regards, tom lane
On Oct 11, 2021, at 3:37 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
REL_13_STABLE appears to accept any amount of nonsense you like:
Yeah, I'm pretty sure that the old rule was to just ignore whatever
appeared in the database-name position. While we could tighten that
up to insist that it match the current DB's name, I'm not sure that
I see the point. There's no near-term prospect of doing anything
useful with some other DB's name there, so being more restrictive
seems like it'll probably break peoples' scripts to little purpose.
You appear correct about the old behavior. It's unclear how intentional it was. There was a schema buffer and a name buffer, and while parsing the name, if a dot was encountered, the contents just parsed were copied into the schema buffer. If multiple dots were encountered, that had the consequence of blowing away the earlier ones.
But since we allow tables and schemas with dotted names in them, I'm uncertain what \d foo.bar.baz is really asking. That could be "foo.bar"."baz", or "foo"."bar"."baz", or "foo"."bar.baz", or even "public"."foo.bar.baz". The old behavior seems a bit dangerous. There may be tables with all those names, and the user may not have meant the one that we gave them.
The v14 code is no better. It just assumes that is "foo"."bar.baz". So (with debugging statements included):
foo=# create table "foo.bar.baz" (i integer);
CREATE TABLE
foo=# \d public.foo.bar.baz
Converting "public.foo.bar.baz"
GOT "^(public)$" . "^(foo.bar.baz)$"
Table "public.foo.bar.baz"
Column | Type | Collation | Nullable | Default
--------+---------+-----------+----------+---------
i | integer | | |
I expect I'll have to submit a patch restoring the old behavior, but I wonder if that's the best direction to go.
—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Mon, 11 Oct 2021 at 19:35, Mark Dilger <mark.dilger@enterprisedb.com>
wrote:
But since we allow tables and schemas with dotted names in them, I'm
uncertain what \d foo.bar.baz is really asking.
FWIW, it’s absolutely clear to me that "." is a special character which has
to be quoted in order to be in an identifier. In other words, a.b.c is
three identifiers separated by two period punctuation marks; what exactly
those periods mean is another question. If somebody uses periods in their
names, they have to quote those names just as if they used capital letters
etc.
But that's just my impression. I comment at all because I remember looking
at something to do with the grammar (I think I wanted to implement ALTER …
RENAME TO newschema.newname) and noticed that a database name could be
given in the syntax.
Mark Dilger <mark.dilger@enterprisedb.com> writes:
But since we allow tables and schemas with dotted names in them, I'm uncertain what \d foo.bar.baz is really asking. That could be "foo.bar"."baz", or "foo"."bar"."baz", or "foo"."bar.baz", or even "public"."foo.bar.baz". The old behavior seems a bit dangerous. There may be tables with all those names, and the user may not have meant the one that we gave them.
You are attacking a straw man here. To use a period in an identifier,
you have to double-quote it; that's the same in SQL or \d.
regression=# create table "foo.bar" (f1 int);
CREATE TABLE
regression=# \d foo.bar
Did not find any relation named "foo.bar".
regression=# \d "foo.bar"
Table "public.foo.bar"
Column | Type | Collation | Nullable | Default
--------+---------+-----------+----------+---------
f1 | integer | | |
According to a quick test, you did not manage to break that in v14.
I expect I'll have to submit a patch restoring the old behavior, but I wonder if that's the best direction to go.
I do not understand why you're even questioning that. The old
behavior had stood for a decade or two without complaints.
regards, tom lane
On Oct 11, 2021, at 4:49 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
You are attacking a straw man here. To use a period in an identifier,
you have to double-quote it; that's the same in SQL or \d.
That's a strange argument. If somebody gives an invalid identifier, we shouldn't assume they know the proper use of quotations. Somebody asking for a.b.c.d.e is clearly in the dark about something. Maybe it's the need to quote the "a.b" part separately from the "c.d.e" part, or maybe it's something else. There are lots of reasonable guesses about what they meant, and for backward compatibility reasons we define using the suffix d.e and ignoring the prefix a.b.c as the correct answer. That's a pretty arbitrary thing to do, but it has the advantage of being backwards compatible.
I expect I'll have to submit a patch restoring the old behavior, but I wonder if that's the best direction to go.
I do not understand why you're even questioning that. The old
behavior had stood for a decade or two without complaints.
I find the backward compatibility argument appealing, but since we have clients that understand the full database.schema.relation format without ignoring the database portion, our client behavior is getting inconsistent. I'd like to leave the door open for someday supporting server.database.schema.relation format, too. I was just wondering when it might be time to stop being lenient in psql and instead reject malformed identifiers.
—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Mon, Oct 11, 2021 at 7:09 PM Mark Dilger
<mark.dilger@enterprisedb.com> wrote:
I was just wondering when it might be time to stop being lenient in psql and instead reject malformed identifiers.
I suppose that I probably wouldn't have chosen this behavior in a
green field situation. But Hyrum's law tells us that there are bound
to be some number of users relying on it. I don't think that it's
worth inconveniencing those people without getting a clear benefit in
return.
Being lenient here just doesn't have much downside in practice, as
evidenced by the total lack of complaints about that lenience.
--
Peter Geoghegan
On Mon, Oct 11, 2021 at 10:33 PM Peter Geoghegan <pg@bowt.ie> wrote:
On Mon, Oct 11, 2021 at 7:09 PM Mark Dilger
<mark.dilger@enterprisedb.com> wrote:I was just wondering when it might be time to stop being lenient in psql and instead reject malformed identifiers.
I suppose that I probably wouldn't have chosen this behavior in a
green field situation. But Hyrum's law tells us that there are bound
to be some number of users relying on it. I don't think that it's
worth inconveniencing those people without getting a clear benefit in
return.Being lenient here just doesn't have much downside in practice, as
evidenced by the total lack of complaints about that lenience.
I find it kind of surprising to find everyone agreeing with this
argument. I mean, PostgreSQL users are often quick to criticize MySQL
for accepting 0000-00-00 as a date, because it isn't, and you
shouldn't accept garbage and do stuff with it as if it were valid
data. But by the same argument, accepting a database name that we know
is not correct as a request to show data in the current database seems
wrong to me.
I completely agree that somebody might be relying on the fact that \d
thisdb.someschema.sometable does something sensible when logged into
thisdb, but surely no user is relying on \d
jgldslghksdghjsgkhsdgjhskg.someschema.sometable is going to just
ignore the leading gibberish. Nor do I understand why we'd want to
ignore the leading gibberish. Saying, as Tom did, that nobody has
complained about that behavior is just another way of saying that
nobody tested it. Surely if someone had, it wouldn't be like this.
--
Robert Haas
EDB: http://www.enterprisedb.com
Robert Haas <robertmhaas@gmail.com> writes:
On Mon, Oct 11, 2021 at 10:33 PM Peter Geoghegan <pg@bowt.ie> wrote:
Being lenient here just doesn't have much downside in practice, as
evidenced by the total lack of complaints about that lenience.
I find it kind of surprising to find everyone agreeing with this
argument.
If the behavior v14 had implemented were "throw an error if the
first word doesn't match the current database name", perhaps nobody
would have questioned it. But that's not what we have. It's fairly
clear that neither you nor Mark thought very much about this case,
let alone tested it. Given that, I am not very pleased that you
are retroactively trying to justify breaking it by claiming that
it was already broken. It's been that way since 7.3 implemented
schemas, more or less, and nobody's complained about it. Therefore
I see little argument for changing that behavior. Changing it in
an already-released branch is especially suspect.
regards, tom lane
On Oct 12, 2021, at 7:30 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
If the behavior v14 had implemented were "throw an error if the
first word doesn't match the current database name", perhaps nobody
would have questioned it. But that's not what we have. It's fairly
clear that neither you nor Mark thought very much about this case,
let alone tested it. Given that, I am not very pleased that you
are retroactively trying to justify breaking it by claiming that
it was already broken. It's been that way since 7.3 implemented
schemas, more or less, and nobody's complained about it. Therefore
I see little argument for changing that behavior. Changing it in
an already-released branch is especially suspect.
I completely agree that we need to fix this. My question was only whether "fix" means to make it accept database.schema.table or whether it means to accept any.prefix.at.all.schema.table. It sounds like more people like the latter, so I'll go with that unless this debate rages on and a different conclusion is reached.
—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Tue, Oct 12, 2021 at 10:31 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
If the behavior v14 had implemented were "throw an error if the
first word doesn't match the current database name", perhaps nobody
would have questioned it. But that's not what we have. It's fairly
clear that neither you nor Mark thought very much about this case,
let alone tested it. Given that, I am not very pleased that you
are retroactively trying to justify breaking it by claiming that
it was already broken. It's been that way since 7.3 implemented
schemas, more or less, and nobody's complained about it. Therefore
I see little argument for changing that behavior. Changing it in
an already-released branch is especially suspect.
Oh, give me a break. The previous behavior obviously hasn't been
tested either, and is broken on its face. If someone *had* complained
about it, I imagine you would have promptly fixed it and likely
back-patched the fix, probably in under 24 hours from the time of the
report. I find it difficult to take seriously the contention that
anyone is expecting \d dlsgjdsghj.sdhg.l.dsg.jkhsdg.foo.bar to work
like \d foo.bar, or that they would even prefer that behavior over an
error message. You're carefully avoiding addressing that question in
favor of having a discussion of backward compatibility, but a better
term for what we're talking about here would be bug-compatibility.
--
Robert Haas
EDB: http://www.enterprisedb.com
Greetings,
* Robert Haas (robertmhaas@gmail.com) wrote:
On Tue, Oct 12, 2021 at 10:31 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
If the behavior v14 had implemented were "throw an error if the
first word doesn't match the current database name", perhaps nobody
would have questioned it. But that's not what we have. It's fairly
clear that neither you nor Mark thought very much about this case,
let alone tested it. Given that, I am not very pleased that you
are retroactively trying to justify breaking it by claiming that
it was already broken. It's been that way since 7.3 implemented
schemas, more or less, and nobody's complained about it. Therefore
I see little argument for changing that behavior. Changing it in
an already-released branch is especially suspect.Oh, give me a break. The previous behavior obviously hasn't been
tested either, and is broken on its face. If someone *had* complained
about it, I imagine you would have promptly fixed it and likely
back-patched the fix, probably in under 24 hours from the time of the
report. I find it difficult to take seriously the contention that
anyone is expecting \d dlsgjdsghj.sdhg.l.dsg.jkhsdg.foo.bar to work
like \d foo.bar, or that they would even prefer that behavior over an
error message. You're carefully avoiding addressing that question in
favor of having a discussion of backward compatibility, but a better
term for what we're talking about here would be bug-compatibility.
I tend to agree with Robert on this particular case. Accepting random
nonsense there isn't a feature or something which really needs to be
preserved. For my 2c, I would hope that one day we will be able to
accept other database names there and if that happens, what then? We'd
"break" these cases anyway. Better to be clear that such nonsense isn't
intended to be accepted and clean that up. I do think it'd be good to
accept the current database name there as that's reasonable.
Thanks,
Stephen
On Tue, Oct 12, 2021 at 7:41 AM Robert Haas <robertmhaas@gmail.com> wrote:
Oh, give me a break. The previous behavior obviously hasn't been
tested either, and is broken on its face. If someone *had* complained
about it, I imagine you would have promptly fixed it and likely
back-patched the fix, probably in under 24 hours from the time of the
report.
You're asking us to imagine a counterfactual. But this counterfactual
bug report would have to describe a real practical problem. The
details would matter. It's reasonable to suppose that we haven't seen
such a bug report for a reason.
I can't speak for Tom. My position on this is that it's better to
leave it alone at this time, given the history, and the lack of
complaints from users.
I find it difficult to take seriously the contention that
anyone is expecting \d dlsgjdsghj.sdhg.l.dsg.jkhsdg.foo.bar to work
like \d foo.bar, or that they would even prefer that behavior over an
error message. You're carefully avoiding addressing that question in
favor of having a discussion of backward compatibility, but a better
term for what we're talking about here would be bug-compatibility.
Let's assume that it is bug compatibility. Is that intrinsically a bad thing?
--
Peter Geoghegan
On 10/12/21 5:19 PM, Stephen Frost wrote:
Greetings,
* Robert Haas (robertmhaas@gmail.com) wrote:
On Tue, Oct 12, 2021 at 10:31 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
If the behavior v14 had implemented were "throw an error if the
first word doesn't match the current database name", perhaps nobody
would have questioned it. But that's not what we have. It's fairly
clear that neither you nor Mark thought very much about this case,
let alone tested it. Given that, I am not very pleased that you
are retroactively trying to justify breaking it by claiming that
it was already broken. It's been that way since 7.3 implemented
schemas, more or less, and nobody's complained about it. Therefore
I see little argument for changing that behavior. Changing it in
an already-released branch is especially suspect.Oh, give me a break. The previous behavior obviously hasn't been
tested either, and is broken on its face. If someone *had* complained
about it, I imagine you would have promptly fixed it and likely
back-patched the fix, probably in under 24 hours from the time of the
report. I find it difficult to take seriously the contention that
anyone is expecting \d dlsgjdsghj.sdhg.l.dsg.jkhsdg.foo.bar to work
like \d foo.bar, or that they would even prefer that behavior over an
error message. You're carefully avoiding addressing that question in
favor of having a discussion of backward compatibility, but a better
term for what we're talking about here would be bug-compatibility.I tend to agree with Robert on this particular case. Accepting random
nonsense there isn't a feature or something which really needs to be
preserved. For my 2c, I would hope that one day we will be able to
accept other database names there and if that happens, what then? We'd
"break" these cases anyway. Better to be clear that such nonsense isn't
intended to be accepted and clean that up. I do think it'd be good to
accept the current database name there as that's reasonable.
I am going to throw my hat in with Robert and Stephen, too. At least
for 15 if we don't want to change this behavior in back branches.
--
Vik Fearing
I understand Tom's position to be that the behavior should be changed back,
since it was 1) unintentional; and 2) breaks legitimate use (when the datname
matches current_database).
I think there's an easy answer here that would satisfy everyone; two patches:
0001 to fix the unintentional behavior change;
0002 to reject garbage input: anything with more than 3 dot-separated
components, or with 3 components where the first doesn't match
current_database.
0001 would be backpatched to v14.
If it turns out there's no consensus on 0002, or if it were really hard for
some reason, or (more likely) nobody went to the bother to implement it this
year, then that's okay.
I would prefer if it errored if the datname didn't match the current database.
After all, it would've helped me to avoid making a confusing problem report.
--
Justin