Fixing busted citext function declarations
In
/messages/by-id/BN1PR04MB37467AA1D412223B3D4A595DFD20@BN1PR04MB374.namprd04.prod.outlook.com
it's revealed that the citext extension misdeclares its versions of
regexp_matches(): they should return SETOF text[] but they're marked
as returning just text[].
We know generally how to fix this sort of thing: create a new version of
the citext extension and provide an upgrade script that repairs the error.
However there are a couple of points that deserve discussion.
* We can't use CREATE OR REPLACE FUNCTION in the upgrade script because
that intentionally doesn't let you change the result type of an existing
function. I considered doing a manual UPDATE of the pg_proc entry, but
then remembered why CREATE OR REPLACE FUNCTION is picky about this: the
result type, including set-ness, is embedded in the parse tree of any view
referencing the function. So AFAICS we need to actually drop and recreate
the citext regexp_matches() functions in the upgrade script. That means
"ALTER EXTENSION citext UPDATE" will fail if these functions are being
used in any views. That's annoying but I see no way around it. (We
could have the upgrade script do DROP CASCADE, but that seems way too
destructive.)
* Is anyone concerned about back-patching this fix? I suppose you could
make an argument that some app somewhere might be relying on the current
behavior of citext regexp_matches(), which effectively is to return only
the first match even if you said 'g'. One answer would be to keep on
supplying the citext 1.0 script in the back branches, so that anyone who
really needed it could still install 1.0 not 1.1. That's not our usual
practice though, so unless there's serious concern that such a problem
really exists, I'd rather not do that.
Comments?
regards, tom lane
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Tom Lane wrote:
* We can't use CREATE OR REPLACE FUNCTION in the upgrade script because
that intentionally doesn't let you change the result type of an existing
function. I considered doing a manual UPDATE of the pg_proc entry, but
then remembered why CREATE OR REPLACE FUNCTION is picky about this: the
result type, including set-ness, is embedded in the parse tree of any view
referencing the function. So AFAICS we need to actually drop and recreate
the citext regexp_matches() functions in the upgrade script. That means
"ALTER EXTENSION citext UPDATE" will fail if these functions are being
used in any views. That's annoying but I see no way around it. (We
could have the upgrade script do DROP CASCADE, but that seems way too
destructive.)
I think we do need to have the upgrade script drop/recreate without
cascade. Then, users can "alter extension upgrade", note the
problematic views (which should be part of the error message), drop
them, then retry the extension update and re-create their views. This
is necessarily a manual procedure -- I don't think we can re-create
views using the function automatically. CASCADE seems pretty dangerous.
(I think it is possible that the behavior change is actually problematic
as opposed to just behaving differently. For instance, if the function
is used in a subselect that's expected to return only one row, and it
suddenly starts returning more, the query would raise an error. Seems
better to err on the side of caution.)
* Is anyone concerned about back-patching this fix? I suppose you could
make an argument that some app somewhere might be relying on the current
behavior of citext regexp_matches(), which effectively is to return only
the first match even if you said 'g'. One answer would be to keep on
supplying the citext 1.0 script in the back branches, so that anyone who
really needed it could still install 1.0 not 1.1. That's not our usual
practice though, so unless there's serious concern that such a problem
really exists, I'd rather not do that.
I think we should keep the 1.0 version this time, in back branches.
This can be invoked manually only (i.e. the default version would still
be 1.1), and it wouldn't be provided in the master branch, so it would
not cause long-term maintenance pain. But it should be possible for
people to re-create their current databases, in case they need to
upgrade for unrelated reasons and have views dependant on this behavior.
--
�lvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On May 5, 2015, at 10:07 AM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
So AFAICS we need to actually drop and recreate
the citext regexp_matches() functions in the upgrade script. That means
"ALTER EXTENSION citext UPDATE" will fail if these functions are being
used in any views. That's annoying but I see no way around it. (We
could have the upgrade script do DROP CASCADE, but that seems way too
destructive.)I think we do need to have the upgrade script drop/recreate without
cascade. Then, users can "alter extension upgrade", note the
problematic views (which should be part of the error message), drop
them, then retry the extension update and re-create their views. This
is necessarily a manual procedure -- I don't think we can re-create
views using the function automatically. CASCADE seems pretty dangerous.
FWIW, this is a challenge inherent in all extension upgrade scripts. It’d be great if there was a way to defer such dependency errors to COMMIT time, so if a function is replaced with a new one that’s compatible with the old, the dependency tree could be updated automatically.
Best,
David
Attachments:
Alvaro Herrera <alvherre@2ndquadrant.com> writes:
(I think it is possible that the behavior change is actually problematic
as opposed to just behaving differently. For instance, if the function
is used in a subselect that's expected to return only one row, and it
suddenly starts returning more, the query would raise an error. Seems
better to err on the side of caution.)
Yeah. Also, I realized from the citext regression tests that there's a
behavioral change even if you *don't* use the 'g' flag: the previous
behavior was to return a null on no match, but now you get zero rows out
instead. That's a fairly significant change.
I think we should keep the 1.0 version this time, in back branches.
Agreed. Maybe we shouldn't even make 1.1 the default in the back
branches.
regards, tom lane
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Tue, May 5, 2015 at 10:20 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Alvaro Herrera <alvherre@2ndquadrant.com> writes:
(I think it is possible that the behavior change is actually problematic
as opposed to just behaving differently. For instance, if the function
is used in a subselect that's expected to return only one row, and it
suddenly starts returning more, the query would raise an error. Seems
better to err on the side of caution.)Yeah. Also, I realized from the citext regression tests that there's a
behavioral change even if you *don't* use the 'g' flag: the previous
behavior was to return a null on no match, but now you get zero rows out
instead. That's a fairly significant change.I think we should keep the 1.0 version this time, in back branches.
Agreed. Maybe we shouldn't even make 1.1 the default in the back
branches.
Does 9.0 get different treatment?
If (I'm not sure this is the case - or must be...) a pg_dump/pg_restore
sequence against a back-branch database installs the default version of the
extension for that PostgreSQL version I would agree; and, to clarify, we
would still provide the ability to upgrade to citext-1.1 in back-branches.
Alvaro >> and it (1.0) wouldn't be provided in the master branch
Why wouldn't it? The whole point of versioning is to mark a release as
stable/immutable and therefore eliminate any maintenance burden by simply
refusing to maintain it. There is no maintainability reason to avoid
shipping the previous versions that I can think of if the extension
infrastructure works as advertised.
Is there anywhere besides the source code that a user can read how
extensions and pg_dump/pg_restore inter-operate in this manner? Neither
pg_dump/restore nor CREATE EXTENSION cover the topic and there isn't a
chapter called "upgrading" that would be a nice place to put such
information...
David J
"David G. Johnston" <david.g.johnston@gmail.com> writes:
On Tue, May 5, 2015 at 10:20 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Alvaro Herrera <alvherre@2ndquadrant.com> writes:
I think we should keep the 1.0 version this time, in back branches.
Agreed. Maybe we shouldn't even make 1.1 the default in the back
branches.
Does 9.0 get different treatment?
Given the lack of previous complaints, I'm inclined to not touch 9.0
at all. We don't have any mechanism like multiple extension versions
to let users control what happens in 9.0, and this seems like rather a
large behavior change to set loose in such an old branch without that.
If (I'm not sure this is the case - or must be...) a pg_dump/pg_restore
sequence against a back-branch database installs the default version of the
extension for that PostgreSQL version I would agree; and, to clarify, we
would still provide the ability to upgrade to citext-1.1 in back-branches.
Right.
Alvaro >> and it (1.0) wouldn't be provided in the master branch
Why wouldn't it?
The current behavior is without question broken, so I don't see a good
argument for preserving it forever.
regards, tom lane
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Tue, May 5, 2015 at 02:07:08PM -0300, Alvaro Herrera wrote:
Tom Lane wrote:
* We can't use CREATE OR REPLACE FUNCTION in the upgrade script because
that intentionally doesn't let you change the result type of an existing
function. I considered doing a manual UPDATE of the pg_proc entry, but
then remembered why CREATE OR REPLACE FUNCTION is picky about this: the
result type, including set-ness, is embedded in the parse tree of any view
referencing the function. So AFAICS we need to actually drop and recreate
the citext regexp_matches() functions in the upgrade script. That means
"ALTER EXTENSION citext UPDATE" will fail if these functions are being
used in any views. That's annoying but I see no way around it. (We
could have the upgrade script do DROP CASCADE, but that seems way too
destructive.)I think we do need to have the upgrade script drop/recreate without
cascade. Then, users can "alter extension upgrade", note the
problematic views (which should be part of the error message), drop
them, then retry the extension update and re-create their views. This
is necessarily a manual procedure -- I don't think we can re-create
views using the function automatically. CASCADE seems pretty dangerous.
Just a reality check but this will break a pg_upgrade, and will not be
detected by --check.
--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com
+ Everyone has their own god. +
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Thu, May 7, 2015 at 04:19:52PM -0400, Bruce Momjian wrote:
On Tue, May 5, 2015 at 02:07:08PM -0300, Alvaro Herrera wrote:
Tom Lane wrote:
* We can't use CREATE OR REPLACE FUNCTION in the upgrade script because
that intentionally doesn't let you change the result type of an existing
function. I considered doing a manual UPDATE of the pg_proc entry, but
then remembered why CREATE OR REPLACE FUNCTION is picky about this: the
result type, including set-ness, is embedded in the parse tree of any view
referencing the function. So AFAICS we need to actually drop and recreate
the citext regexp_matches() functions in the upgrade script. That means
"ALTER EXTENSION citext UPDATE" will fail if these functions are being
used in any views. That's annoying but I see no way around it. (We
could have the upgrade script do DROP CASCADE, but that seems way too
destructive.)I think we do need to have the upgrade script drop/recreate without
cascade. Then, users can "alter extension upgrade", note the
problematic views (which should be part of the error message), drop
them, then retry the extension update and re-create their views. This
is necessarily a manual procedure -- I don't think we can re-create
views using the function automatically. CASCADE seems pretty dangerous.Just a reality check but this will break a pg_upgrade, and will not be
detected by --check.
Actually, pg_upgrade might be OK because the views would be recreated
with the new functions already installed.
--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com
+ Everyone has their own god. +
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Bruce Momjian <bruce@momjian.us> writes:
On Thu, May 7, 2015 at 04:19:52PM -0400, Bruce Momjian wrote:
Just a reality check but this will break a pg_upgrade, and will not be
detected by --check.
Actually, pg_upgrade might be OK because the views would be recreated
with the new functions already installed.
pg_upgrade is okay in any case because it dumps and reloads the current
extension's components. Doesn't matter whether there's another version
that is not compatible.
regards, tom lane
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Thu, May 7, 2015 at 4:19 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Bruce Momjian <bruce@momjian.us> writes:
On Thu, May 7, 2015 at 04:19:52PM -0400, Bruce Momjian wrote:
Just a reality check but this will break a pg_upgrade, and will not be
detected by --check.Actually, pg_upgrade might be OK because the views would be recreated
with the new functions already installed.pg_upgrade is okay in any case because it dumps and reloads the current
extension's components. Doesn't matter whether there's another version
that is not compatible.
For clarity - which one is "current" in this context?
1. The existing database's (previous extension version)
2. The target database's (current default extension version in the new
PostgreSQL version)
The answer has to be #2 since the version in the existing database no
longer exists in the new PostgreSQL version.
David J.
"David G. Johnston" <david.g.johnston@gmail.com> writes:
On Thu, May 7, 2015 at 4:19 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
pg_upgrade is okay in any case because it dumps and reloads the current
extension's components. Doesn't matter whether there's another version
that is not compatible.
For clarity - which one is "current" in this context?
1. The existing database's (previous extension version)
2. The target database's (current default extension version in the new
PostgreSQL version)
The answer has to be #2 since the version in the existing database no
longer exists in the new PostgreSQL version.
You're mistaken. pg_dump --binary_upgrade does not care whether the
target database thinks that version exists or not. (It does care that
there's a compatible shared-library object, but that's not at issue
in this case.)
regards, tom lane
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Tom,
On May 5, 2015, at 9:40 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
In
/messages/by-id/BN1PR04MB37467AA1D412223B3D4A595DFD20@BN1PR04MB374.namprd04.prod.outlook.com
it's revealed that the citext extension misdeclares its versions of
regexp_matches(): they should return SETOF text[] but they're marked
as returning just text[].
I wanted to make sure my backport was fixed for this, but it turns out it was already fixed as of this commit:
https://github.com/theory/citext/commit/99c925f
Note that I credited you for the spot --- way back in October 2009! Pretty confused how the same change wasn’t made to the core contrib module back then.
Best,
David
Attachments:
"David E. Wheeler" <david@justatheory.com> writes:
On May 5, 2015, at 9:40 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
In
/messages/by-id/BN1PR04MB37467AA1D412223B3D4A595DFD20@BN1PR04MB374.namprd04.prod.outlook.com
it's revealed that the citext extension misdeclares its versions of
regexp_matches(): they should return SETOF text[] but they're marked
as returning just text[].
I wanted to make sure my backport was fixed for this, but it turns out it was already fixed as of this commit:
Note that I credited you for the spot --- way back in October 2009! Pretty confused how the same change wasn’t made to the core contrib module back then.
Me too. Something fell through the cracks rather badly there :-(.
Would you check your commit history to see if anything else got missed?
regards, tom lane
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On May 11, 2015, at 5:01 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Me too. Something fell through the cracks rather badly there :-(.
Would you check your commit history to see if anything else got missed?
Let’s see…
In https://github.com/theory/citext/commit/4030b4e1ad9fd9f994a6cdca1126a903682acae4 I copied your use of specifying the full path to pg_catalog function, which is still in core.
In https://github.com/theory/citext/commit/c24132c098a822f5a8669ed522e747e01e1c0835, I made some tweaks based on you change you made to some version of my patch. Most are minor, or just for functions needed for 8.4 and not later versions.
In https://github.com/theory/citext/commit/2c7e997fd60e2b708d06c128e5fd2db51c7a9f33, I added a cast to bpchar, which is in core.
In https://github.com/theory/citext/commit/cf988024d18a6ddd9a8146ab8cabfe6e0167ba26 and https://github.com/theory/citext/commit/22f91a0d50003a0c1c27d1fbf0bb5c0a1e3a3cad I switched from VARSIZE_ANY_EXHDR() to strlen() at your suggestion. Also still there.
Anyway, those are all from 2008 and pretty much just copy changes you made to core. The return value of regexp_matches() is the only significant change since then. So I think we’re good.
Best,
David\