issue about information_schema REFERENTIAL_CONSTRAINTS
Hello,
I haven't found a bug management system about postgresql, so here is a
mail. Maybe this issue was already reported, sorry if it is the case.
I have seen anything about the information_schema in pg todo list.
This is tested on postgresql 8.4.4.
The REFERENTIAL_CONSTRAINTS table in the information_schema references a
constaint through its database/schema/name, but this information is not
unique, so it may identify several constraints, thus the information
derived may not be consistent.
CREATE TABLE destination(id SERIAL PRIMARY KEY);
CREATE TABLE source1
(id SERIAL PRIMARY KEY,
fk INTEGER CONSTRAINT to_destination REFERENCES destination);
CREATE TABLE source2
(id SERIAL PRIMARY KEY,
fk INTEGER CONSTRAINT to_destination REFERENCES destination);
SELECT * FROM information_schema.REFERENTIAL_CONSTRAINTS;
-- contains two identical lines
Other tables about constraints may have the same issue.
The direct result is that this table leads to false result on joins, thus
is pretty useless. Usually I have plenty of "$1" constraints.
Suggestion: constraint names could be systematically prefixed with their
corresponding table so that they are indeed unique, possibly with some
escaping: '"<double-quoted-table-name>"."constraint_name"'. I'm not sure
about the possible consequences of changing the constraint names, but as
the information is a set of views on top of pg_catalog, there may be none.
--
Fabien.
Fabien COELHO <coelho@cri.ensmp.fr> writes:
The REFERENTIAL_CONSTRAINTS table in the information_schema references a
constaint through its database/schema/name, but this information is not
unique, so it may identify several constraints, thus the information
derived may not be consistent.
Postgres does not enforce that constraints have unique names within a
schema. The SQL spec does say that they should be unique per-schema,
and the information_schema views are designed on that assumption.
If you use spec-compliant names for your constraints, you won't have a
problem. If you don't, well, the information_schema views will be of
limited use to you.
Suggestion: constraint names could be systematically prefixed with their
corresponding table so that they are indeed unique,
We are not going to try to enforce uniqueness. This has been debated
before, and most people like the current behavior just fine, or at least
better than the alternatives.
regards, tom lane
Dear Tom,
The REFERENTIAL_CONSTRAINTS table in the information_schema references a
constaint through its database/schema/name, but this information is not
unique, so it may identify several constraints, thus the information
derived may not be consistent.Postgres does not enforce that constraints have unique names within a
schema. The SQL spec does say that they should be unique per-schema,
and the information_schema views are designed on that assumption.
Hence a contradiction.
If you use spec-compliant names for your constraints, you won't have a
problem. If you don't, well, the information_schema views will be of
limited use to you.
I'm writing a schema analyzer which gives false results. I do not write
the constraints, I'm analyzing existing schemas. I cannot change it.
Suggestion: constraint names could be systematically prefixed with their
corresponding table so that they are indeed unique,We are not going to try to enforce uniqueness.
I'm not asking for uniqueness in "pg_catalog", esp as that would break
existing applications.
I'm suggesting uniqueness in the "information_schema", which can be
provided independently by some tweaking in the view construction, I think,
for instance by adding the oid of the constraint or maybe the table_name.
This has been debated before, and most people like the current behavior
just fine, or at least better than the alternatives.
I do not know "most people". I guess "most people" just do not use the
"information_schema", so they really do not care!
For the "few people" who do use the information_schema, I can assure you
that having a false information is a severe drawback, and it is called a
"bug".
So at least please fill in this as a "bug" somewhere, even if you do not
want to fix it.
--
Fabien.
On ons, 2010-09-01 at 16:22 +0200, Fabien COELHO wrote:
I'm suggesting uniqueness in the "information_schema", which can be
provided independently by some tweaking in the view construction, I
think, for instance by adding the oid of the constraint or maybe the
table_name.
The view is defined by the SQL standard. We cannot change it.
Dear Peter,
I'm suggesting uniqueness in the "information_schema", which can be
provided independently by some tweaking in the view construction, I
think, for instance by adding the oid of the constraint or maybe the
table_name.The view is defined by the SQL standard.
No. The result of the view (the definition of the expected attributes) is
defined in the standard. But it is really a view on top of "pg_catalog".
You are right that some views of the information_schema are defined in the
standard, but they deal with restrictions of other relations, say the
privileges for the current user...
We cannot change it.
Yes we can! It, it is 100% postgresql:
\d information_schema.referential_constraints
View "information_schema.referential_constraints"
...
FROM pg_namespace ncon
JOIN pg_constraint con ON ncon.oid = con.connamespace
JOIN pg_class c ON con.conrelid = c.oid
--
Fabien Coelho - CRI, Maths & Systèmes, MINES ParisTech
From pgsql-bugs-owner@postgresql.org Thu Sep 2 07:41:38 2010
Received: from maia.hub.org (maia-2.hub.org [200.46.204.251])
by mail.postgresql.org (Postfix) with ESMTP id 16994632AF0
for <pgsql-bugs-postgresql.org@mail.postgresql.org>; Thu, 2 Sep 2010 07:41:38 -0300 (ADT)
Received: from mail.postgresql.org ([200.46.204.86])
by maia.hub.org (mx1.hub.org [200.46.204.251]) (amavisd-maia, port 10024)
with ESMTP id 81079-04 for <pgsql-bugs-postgresql.org@mail.postgresql.org>;
Thu, 2 Sep 2010 10:41:36 +0000 (UTC)
X-Greylist: from auto-whitelisted by SQLgrey-1.7.6
Received: from wwwmaster.postgresql.org (wwwmaster.postgresql.org [217.196.146.204])
by mail.postgresql.org (Postfix) with ESMTP id 598D2632A2B
for <pgsql-bugs@postgresql.org>; Thu, 2 Sep 2010 07:41:37 -0300 (ADT)
Received: from wwwmaster.postgresql.org (wwwmaster.postgresql.org [217.196.146.204])
by wwwmaster.postgresql.org (8.14.3/8.14.3) with ESMTP id o82AfXM0085948
for <pgsql-bugs@postgresql.org>; Thu, 2 Sep 2010 10:41:33 GMT
(envelope-from www@wwwmaster.postgresql.org)
Received: (from www@localhost)
by wwwmaster.postgresql.org (8.14.3/8.14.3/Submit) id o82AfX1n085947;
Thu, 2 Sep 2010 10:41:33 GMT
(envelope-from www)
Date: Thu, 2 Sep 2010 10:41:33 GMT
Message-Id: <201009021041.o82AfX1n085947@wwwmaster.postgresql.org>
To: pgsql-bugs@postgresql.org
Subject: BUG #5636: extra "(" or missing ")"
From: "KOIZUMI Satoru" <koizumistr@minos.ocn.ne.jp>
Content-Type: text/plain; charset=utf-8
X-Virus-Scanned: Maia Mailguard 1.0.1
X-Archive-Number: 201009/8
X-Sequence-Number: 27972
The following bug has been logged online:
Bug reference: 5636
Logged by: KOIZUMI Satoru
Email address: koizumistr@minos.ocn.ne.jp
PostgreSQL version: 9.0RC1
Operating system: MacOS X
Description: extra "(" or missing ")"
Details:
At 33.8.5. The decimal type of PostgreSQL 9.0RC1 Documentation, "(" is
before "by means".
But there is no ")".
| the decimal type can be created either on the stack
| or on the heap (by means of the functions
| PGTYPESdecimal_new() and PGTYPESdecimal_free().
Maybe "(" is not needed or ")" is needed after "PGTYPESdecimal_free()".
On tor, 2010-09-02 at 07:54 +0200, Fabien COELHO wrote:
We cannot change it.
Yes we can! It, it is 100% postgresql:
\d information_schema.referential_constraints
View "information_schema.referential_constraints"
...
FROM pg_namespace ncon
JOIN pg_constraint con ON ncon.oid = con.connamespace
JOIN pg_class c ON con.conrelid = c.oid
Well, for the sake of argument, how would you propose to change it?
Hallo,
\d information_schema.referential_constraints
View "information_schema.referential_constraints"
...
FROM pg_namespace ncon
JOIN pg_constraint con ON ncon.oid = con.connamespace
JOIN pg_class c ON con.conrelid = c.oidWell, for the sake of argument, how would you propose to change it?
For the sake of argument:
For the information_schema to be useful, expected keys & foreign keys must
work properly so that the constraints can be joined meaningfully
(otherwise, what is the point?). So any change would have to be be
consistent over all definitions.
The standard expects the triplet <catalog,schema,constraint> to be unique,
but in Pg, <catalog,schema,table,constraint> is unique (I think). Too late
to change that, obviously. So that let building a unique constraint name
just for the information_schema, which is fine with me.
(1) use the OID as the "constraint name" everywhere, it would work, it
would not look so good for display, but it is simple and fast.
(2) otherwise something built on top of <table,constraint>. To be on the
safe side, I would build a string (sql_identifier?) with something like:
quote_ident(<table name>) || '.' || quote_ident(<constraint name>)
And the display would be reasonnable, like : "TableName"."$1" instead of a
big bunch of $1.
The affected information_schema views would be, as far as I can see:
- check_constraint_routine_usage
- check_constraints
- constraint_column_usage
- constraint_table_usage
- domain_constraints
- key_column_usage
- referential_constraints
- table_constraints
With one or two affected columns each.
Note that there may be the same issue with 'triggers' which are also
identified by a <catalog,schema,trigger> triplet. Maybe others.
I can send a patch on the information_schema definition in the coming days
if someone feel that this may deserve a "test". Otherwise, ISTM that it is
a "bug" to be recorded somewhere.
Have a nice day,
--
Fabien.
Fabien COELHO <coelho@cri.ensmp.fr> writes:
(1) use the OID as the "constraint name" everywhere, it would work, it
would not look so good for display, but it is simple and fast.
(2) otherwise something built on top of <table,constraint>. To be on the
safe side, I would build a string (sql_identifier?) with something like:
quote_ident(<table name>) || '.' || quote_ident(<constraint name>)
Note that (2) fails for long names; you have to do something to
compress to NAMEDATALEN.
The big problem with either of these is that it's no longer easily
possible to extract the actual constraint name from the view.
In any case, I am fairly sure that not having the constraint_name column
show the actual constraint name is a violation of the spirit of the SQL
spec, whether or not you can claim that it meets the letter.
regards, tom lane
Dear Tom,
Still for the sake of argument:
Note that (2) fails for long names; you have to do something to
compress to NAMEDATALEN.
Indeed.
What if the type is changed to TEXT? It is just a view after all.
How important is it to stick to "sql_identifier"?
The big problem with either of these is that it's no longer easily
possible to extract the actual constraint name from the view.
Sure. A function is provided to do so, say
information_schema_constraint_name_to_pg_constraint_name.
In any case, I am fairly sure that not having the constraint_name column
show the actual constraint name is a violation of the spirit of the SQL
spec, whether or not you can claim that it meets the letter.
Well, one must choose between to evil:
(1) the constraint_name is changed in the view to be unique as expected
by the spec, and the data can be joined meaningfully, and some reliable
information can be derived.
(2) the constraint_name looks nice but is not unique, and
the information in the view is ambiguous and cannot be relied upon,
so one is back to square "postgresql supports the information_schema,
but there is no point to query it and expecting the results to
reflect the contents of the catalogs".
If you want to stick to both the letter and the spirit of the spec, that
would mean enforcing unique constraint names in pg and break every
applications. Not good.
ISTM that the "spirit" of the information schema is more to be useful (1)
than to look beautiful (2).
Another technical proposal, a little more subtle and with possible
underlying issues I cannot foresee: have the constraint_name be a "pair of
sql_identifiers".
--
Fabien.
Fabien COELHO <coelho@cri.ensmp.fr> writes:
Note that (2) fails for long names; you have to do something to
compress to NAMEDATALEN.
Indeed.
What if the type is changed to TEXT? It is just a view after all.
How important is it to stick to "sql_identifier"?
It's a view defined by the SQL standard, and one of the properties
defined by the standard is the type of that column.
Well, one must choose between to evil:
Yeah, exactly. I think that the current tradeoff is just fine.
If you want SQL-standard behavior, pick SQL-standard constraint names,
and there you are.
regards, tom lane
Tom Lane <tgl@sss.pgh.pa.us> wrote:
Yeah, exactly. I think that the current tradeoff is just fine.
If you want SQL-standard behavior, pick SQL-standard constraint
names, and there you are.
I see that as the crux if it -- the current implementation *allows*
standard-conforming behavior, even though it doesn't *enforce*
conforming naming. The proposed alternative does not allow
standard-conforming behavior. If you're going to use something
which is PostgreSQL-specific, you may as well write your own views
or use the "native" tables and views directly.
-Kevin
Well, one must choose between to evil:
Yeah, exactly. I think that the current tradeoff is just fine.
Hmmm. I think exactly the contrary. There is no point in having a non
reliable feature.
ISTM that very few people use the information schema, and if the query
results is not reliable, it will stay this way. If you have zero user, no
one will complain, which prooves that everything is fine:-) QED.
By the way, do you use the information schema?
If you want SQL-standard behavior, pick SQL-standard constraint names,
and there you are.
I tried to explained that I'm analyzing other people's schemas. I cannot
ask all other people on the planet to rewrite their schemas, I pick them
as they are.
Could you register this "bug" somewhere please?
Thanks for your time.
--
Fabien.
Excerpts from Fabien COELHO's message of vie sep 03 13:39:19 -0400 2010:
I tried to explained that I'm analyzing other people's schemas. I cannot
ask all other people on the planet to rewrite their schemas, I pick them
as they are.
Maybe you shouldn't be using the information_schema in the first place.
--
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
If you're going to use something which is PostgreSQL-specific, you may
as well write your own views or use the "native" tables and views
directly.
I wish I could write portable code, if possible:-)
I'm basically writing views on top of the information_schema under the
assumption that what is expected to be a key is a key. The information
schema is *useless* otherwise as wrong tuples are built on join, and
derived information is not reliable.
I guess I must the only actual user of the information_schema, and it will
soon be back to zero user, which will be fine from the developers point of
view.
--
Fabien.
Alvaro Herrera <alvherre@commandprompt.com> writes:
Excerpts from Fabien COELHO's message of vie sep 03 13:39:19 -0400 2010:
I tried to explained that I'm analyzing other people's schemas. I cannot
ask all other people on the planet to rewrite their schemas, I pick them
as they are.
Maybe you shouldn't be using the information_schema in the first place.
It would probably be reasonable to put something into chapter 34 of the
docs pointing out this limitation. I'm not sure exactly where though.
Should we invent a "compatibility issues" section in that chapter,
analogous to the ones we have for individual SQL commands? Are there
other issues worth documenting for the information_schema views?
regards, tom lane
Maybe you shouldn't be using the information_schema in the first place.
Sure, I could write non standard code for every database instead of trying
to write a portable code which work on all of them directly:-)
I think that trying to do the portable way, under the assumption that the
standard implementation would be okay, was a reasonnable choice, even if
proved wrong afterwards.
--
Fabien.
On fre, 2010-09-03 at 13:53 -0400, Tom Lane wrote:
Alvaro Herrera <alvherre@commandprompt.com> writes:
Excerpts from Fabien COELHO's message of vie sep 03 13:39:19 -0400 2010:
I tried to explained that I'm analyzing other people's schemas. I cannot
ask all other people on the planet to rewrite their schemas, I pick them
as they are.Maybe you shouldn't be using the information_schema in the first place.
It would probably be reasonable to put something into chapter 34 of the
docs pointing out this limitation. I'm not sure exactly where though.
Should we invent a "compatibility issues" section in that chapter,
analogous to the ones we have for individual SQL commands? Are there
other issues worth documenting for the information_schema views?
A very similar issue is already documented here:
http://developer.postgresql.org/pgdocs/postgres/infoschema-triggers.html
[ redirecting to pgsql-docs ]
Peter Eisentraut <peter_e@gmx.net> writes:
On fre, 2010-09-03 at 13:53 -0400, Tom Lane wrote:
It would probably be reasonable to put something into chapter 34 of the
docs pointing out this limitation. I'm not sure exactly where though.
Should we invent a "compatibility issues" section in that chapter,
analogous to the ones we have for individual SQL commands? Are there
other issues worth documenting for the information_schema views?
A very similar issue is already documented here:
http://developer.postgresql.org/pgdocs/postgres/infoschema-triggers.html
That sort of annotation was my first thought, but we'd have to repeat
ourselves in multiple places because there are several views that are
affected by the constraint-name issue. (Come to think of it, isn't
triggered_update_columns also affected by the trigger-name issue?)
Is that the direction we want to go, or would it be better to factor
the information out into a separate page about compatibility gotchas?
regards, tom lane
On sön, 2010-09-05 at 22:16 -0400, Tom Lane wrote:
A very similar issue is already documented here:
http://developer.postgresql.org/pgdocs/postgres/infoschema-triggers.htmlThat sort of annotation was my first thought, but we'd have to repeat
ourselves in multiple places because there are several views that are
affected by the constraint-name issue. (Come to think of it, isn't
triggered_update_columns also affected by the trigger-name issue?)Is that the direction we want to go, or would it be better to factor
the information out into a separate page about compatibility gotchas?
It would probably be better to explain globally applicable issues in a
separate section.
Is that the direction we want to go, or would it be better to factor
the information out into a separate page about compatibility gotchas?It would probably be better to explain globally applicable issues in a
separate section.
I agree that a general caveat is better, together with a one line
reference in the documentation of each table with an issue.
--
Fabien.