BUG #15344: pg_proc.proisagg was removed incompatibly in PostgreSQL 11

Started by PG Bug reporting formalmost 8 years ago7 messagesbugs

noreply@postgresql.org

almost 8 years ago

The following bug has been logged on the website:

Bug reference: 15344
Logged by: Lukas Eder
Email address: lukas.eder@gmail.com
PostgreSQL version: 11beta2
Operating system: Debian
Description:

When comparing the current version (10) [1]https://www.postgresql.org/docs/10/static/catalog-pg-proc.html and the developer version (11)
[2]: https://www.postgresql.org/docs/11/static/catalog-pg-proc.html
pg_proc.proisagg column was removed backwards incompatibly. The
documentation states for [1]https://www.postgresql.org/docs/10/static/catalog-pg-proc.html:

The table contains data for aggregate functions as well as plain

functions. If proisagg is true, there should be a matching row in
pg_aggregate.

And for [2]https://www.postgresql.org/docs/11/static/catalog-pg-proc.html:

If prokind indicates that the entry is for an aggregate function, there

should be a matching row in pg_aggregate.

While I appreciate that an encoded enumeration is more powerful than a
simple boolean value, I think it would have been better if this change had
been done backwards compatibly. For example, there could still be a
pg_proc.proisagg column that defaults to (prokind = 'a').

[1]: https://www.postgresql.org/docs/10/static/catalog-pg-proc.html
[2]: https://www.postgresql.org/docs/11/static/catalog-pg-proc.html

Andres Freund

andres@anarazel.de

almost 8 years ago

In reply to: PG Bug reporting form (#1)

Re: BUG #15344: pg_proc.proisagg was removed incompatibly in PostgreSQL 11

Hi,

On 2018-08-21 14:23:45 +0000, PG Bug reporting form wrote:

When comparing the current version (10) [1] and the developer version (11)
[2] of the pg_proc documentation, then it can be seen that the
pg_proc.proisagg column was removed backwards incompatibly. The
documentation states for [1]:

Please note that the pg_catalog.* tables (and views) are *NOT* intended
to backwards compatible between major versions. We change them in ways
backward incompatible all the time.

While I appreciate that an encoded enumeration is more powerful than a
simple boolean value, I think it would have been better if this change had
been done backwards compatibly. For example, there could still be a
pg_proc.proisagg column that defaults to (prokind = 'a').

When could we have removed that? The few people querying it would've
continued to rely on it until we've removed it, leading to the same
outcry. And supporting both would've required more code, because we
would have to make sure they're consistent, and raise errors etc if not.

Greetings,

Andres Freund

Lukas Eder

lukas.eder@gmail.com

almost 8 years ago

In reply to: Andres Freund (#2)

Re: BUG #15344: pg_proc.proisagg was removed incompatibly in PostgreSQL 11

On Tue, Aug 21, 2018 at 4:28 PM Andres Freund <andres@anarazel.de> wrote:

Hi,

On 2018-08-21 14:23:45 +0000, PG Bug reporting form wrote:

When comparing the current version (10) [1] and the developer version

(11)

[2] of the pg_proc documentation, then it can be seen that the
pg_proc.proisagg column was removed backwards incompatibly. The
documentation states for [1]:

Please note that the pg_catalog.* tables (and views) are *NOT* intended
to backwards compatible between major versions. We change them in ways
backward incompatible all the time.

The pg_catalog tables do seem to be the only way to reverse engineer some
more sophisticated things in the database. I imagine that this is being
done by tool vendors like myself (jOOQ) quite a bit. And there are tons of
Stack Overflow answers that show how to query the pg_catalog tables, all of
them risking to be outdated between major versions. These queries are
probably used by quite a few people in some home grown build tool, reverse
engineering tool, etc.

I understand that backwards compatibility is quite a bit of extra work, but
in cases like this particular one, the price to pay seems relatively low.
Perhaps a new strategy could be to break things only if there is really no
other solution?

Thanks,
Lukas

Andres Freund

andres@anarazel.de

almost 8 years ago

In reply to: Lukas Eder (#3)

Re: BUG #15344: pg_proc.proisagg was removed incompatibly in PostgreSQL 11

On 2018-08-21 16:39:18 +0200, Lukas Eder wrote:

On Tue, Aug 21, 2018 at 4:28 PM Andres Freund <andres@anarazel.de> wrote:

Hi,

On 2018-08-21 14:23:45 +0000, PG Bug reporting form wrote:

When comparing the current version (10) [1] and the developer version

(11)

[2] of the pg_proc documentation, then it can be seen that the
pg_proc.proisagg column was removed backwards incompatibly. The
documentation states for [1]:

Please note that the pg_catalog.* tables (and views) are *NOT* intended
to backwards compatible between major versions. We change them in ways
backward incompatible all the time.

The pg_catalog tables do seem to be the only way to reverse engineer some
more sophisticated things in the database.

Yes, there's some things that aren't represented in a standardized way
in information_schema.*. If that's the case it's good for tool vendors
to pipe up and ask for something intended to be externally visible.

I imagine that this is being
done by tool vendors like myself (jOOQ) quite a bit. And there are tons of
Stack Overflow answers that show how to query the pg_catalog tables, all of
them risking to be outdated between major versions.

People doing bad things on stackoverflow isn't very convincing.... ;)

I understand that backwards compatibility is quite a bit of extra work, but
in cases like this particular one, the price to pay seems relatively low.
Perhaps a new strategy could be to break things only if there is really no
other solution?

I mean we don't break things willy-nilly already. And there's plenty
cases where we kept things around for backward compatibility. But
usually the problem is that that means you have to keep the shim around
forever. Even if better solutions have been around for many years,
there'll be complaints about removing them. So IMO the backward compat
price for pg_catalog.* has to be low enough that there's essentially no
point in keeping it around forever, or so painful that it'd cost a lot
of people a lot.

Greetings,

Andres Freund

Lukas Eder

lukas.eder@gmail.com

almost 8 years ago

In reply to: Andres Freund (#4)

Re: BUG #15344: pg_proc.proisagg was removed incompatibly in PostgreSQL 11

On Tue, Aug 21, 2018 at 4:45 PM Andres Freund <andres@anarazel.de> wrote:

On 2018-08-21 16:39:18 +0200, Lukas Eder wrote:

On Tue, Aug 21, 2018 at 4:28 PM Andres Freund <andres@anarazel.de>

wrote:

Hi,

On 2018-08-21 14:23:45 +0000, PG Bug reporting form wrote:

When comparing the current version (10) [1] and the developer version

(11)

[2] of the pg_proc documentation, then it can be seen that the
pg_proc.proisagg column was removed backwards incompatibly. The
documentation states for [1]:

Please note that the pg_catalog.* tables (and views) are *NOT* intended
to backwards compatible between major versions. We change them in ways
backward incompatible all the time.

The pg_catalog tables do seem to be the only way to reverse engineer some
more sophisticated things in the database.

Yes, there's some things that aren't represented in a standardized way
in information_schema.*. If that's the case it's good for tool vendors
to pipe up and ask for something intended to be externally visible.

I have mixed feelings about this. The information_schema is part of the SQL
standard. I reckon that this particular information would belong in
INFORMATION_SCHEMA.ROUTINES.ROUTINE_TYPE. The SQL:2011 standard (I don't
have newer versions of the SQL/Schemata document) mentions these possible
values for that column:

'PROCEDURE', 'FUNCTION', 'INSTANCE METHOD', 'STATIC METHOD', 'CONSTRUCTOR
METHOD'

So, no aggregate functions or window functions as in pg_proc.prokind. When
extending the standard, the functionality becomes a bit less standard, and
risks breaking as well in the future, e.g. when the standard *does* add
aggregate functions as a possible value, but not using the name PostgreSQL
chooses now.

Which is why the vendor specific pg_catalog is so useful. Any value is
acceptable in those tables as you do not have to coordinate their layout
with the standard committee.

In Oracle, the dictionary views aren't following the information_schema
standard, but are vastly richer than what the standard supports - just like
pg_catalog. They are definitely kept backwards compatible for the same
reason I've mentioned: Tool support.

A database product thrives on the quality of the tools supporting it.
Making it hard for the tool vendors might mean there's less support for
advanced features. From a market adoption perspective, in the long run,
there is no option but to be more backwards compatible.

I imagine that this is being
done by tool vendors like myself (jOOQ) quite a bit. And there are tons

of

Stack Overflow answers that show how to query the pg_catalog tables, all

of

them risking to be outdated between major versions.

People doing bad things on stackoverflow isn't very convincing.... ;)

Your perception of "bad" is biased of course, just like mine. People use
what's available, this has always been the case with any product. If there
were "internal" and "public" catalog tables / views, then it would be more
understandable that using the (internal) pg_catalog is being dismissed, but
given that there is no option...

Here, have a quick google search for questions on Stack Overflow involving
pg_proc.proisagg:
https://www.google.ch/search?q=site%3Astackoverflow.com+proisagg

It returns 107 results on Stack Overflow alone. Some examples:

- https://stackoverflow.com/a/50093399/521799
- https://stackoverflow.com/a/18200250/521799
- https://stackoverflow.com/a/20549944/521799
- https://stackoverflow.com/a/24774064/521799
- https://stackoverflow.com/a/24034609/521799
- https://dba.stackexchange.com/a/46996/2512
- https://stackoverflow.com/a/48709779/521799
- https://stackoverflow.com/a/48709779/521799
- https://stackoverflow.com/a/308500/521799
- https://stackoverflow.com/a/20549944/521799

All of these answers are now outdated with PostgreSQL 11. Not only are they
outdated, but if the answers were fixed for PostgreSQL 11, the fixed
version wouldn't work on older PostgreSQL versions, because pg_proc.prokind
didn't exist earlier.

I find that a relatively high price to pay in this case for the relatively
easy solution to keep a pg_proc.proisagg computed column around for
backwards compatibility.

I understand that backwards compatibility is quite a bit of extra work,

but

in cases like this particular one, the price to pay seems relatively low.
Perhaps a new strategy could be to break things only if there is really

no

other solution?

I mean we don't break things willy-nilly already. And there's plenty
cases where we kept things around for backward compatibility. But
usually the problem is that that means you have to keep the shim around
forever. Even if better solutions have been around for many years,
there'll be complaints about removing them. So IMO the backward compat
price for pg_catalog.* has to be low enough that there's essentially no
point in keeping it around forever, or so painful that it'd cost a lot
of people a lot.

I agree that at some point, the price to pay is too high. But in this case,
a boolean value has been replaced by an enumeration, so it would be really
easy to maintain the boolean value as a computed column, right?

Pavel Stehule

pavel.stehule@gmail.com

almost 8 years ago

In reply to: Lukas Eder (#5)

Re: BUG #15344: pg_proc.proisagg was removed incompatibly in PostgreSQL 11

Dne út 21. 8. 2018 17:04 uživatel Lukas Eder <lukas.eder@gmail.com> napsal:

On Tue, Aug 21, 2018 at 4:45 PM Andres Freund <andres@anarazel.de> wrote:

On 2018-08-21 16:39:18 +0200, Lukas Eder wrote:

On Tue, Aug 21, 2018 at 4:28 PM Andres Freund <andres@anarazel.de>

wrote:

Hi,

On 2018-08-21 14:23:45 +0000, PG Bug reporting form wrote:

When comparing the current version (10) [1] and the developer

version

(11)

[2] of the pg_proc documentation, then it can be seen that the
pg_proc.proisagg column was removed backwards incompatibly. The
documentation states for [1]:

Please note that the pg_catalog.* tables (and views) are *NOT*

intended

to backwards compatible between major versions. We change them in ways
backward incompatible all the time.

The pg_catalog tables do seem to be the only way to reverse engineer

some

more sophisticated things in the database.

Yes, there's some things that aren't represented in a standardized way
in information_schema.*. If that's the case it's good for tool vendors
to pipe up and ask for something intended to be externally visible.

I have mixed feelings about this. The information_schema is part of the
SQL standard. I reckon that this particular information would belong in
INFORMATION_SCHEMA.ROUTINES.ROUTINE_TYPE. The SQL:2011 standard (I don't
have newer versions of the SQL/Schemata document) mentions these possible
values for that column:

'PROCEDURE', 'FUNCTION', 'INSTANCE METHOD', 'STATIC METHOD', 'CONSTRUCTOR
METHOD'

So, no aggregate functions or window functions as in pg_proc.prokind. When
extending the standard, the functionality becomes a bit less standard, and
risks breaking as well in the future, e.g. when the standard *does* add
aggregate functions as a possible value, but not using the name PostgreSQL
chooses now.

Which is why the vendor specific pg_catalog is so useful. Any value is
acceptable in those tables as you do not have to coordinate their layout
with the standard committee.

In Oracle, the dictionary views aren't following the information_schema
standard, but are vastly richer than what the standard supports - just like
pg_catalog. They are definitely kept backwards compatible for the same
reason I've mentioned: Tool support.

A database product thrives on the quality of the tools supporting it.
Making it hard for the tool vendors might mean there's less support for
advanced features. From a market adoption perspective, in the long run,
there is no option but to be more backwards compatible.

I imagine that this is being
done by tool vendors like myself (jOOQ) quite a bit. And there are tons

of

Stack Overflow answers that show how to query the pg_catalog tables,

all of

them risking to be outdated between major versions.

People doing bad things on stackoverflow isn't very convincing.... ;)

Your perception of "bad" is biased of course, just like mine. People use
what's available, this has always been the case with any product. If there
were "internal" and "public" catalog tables / views, then it would be more
understandable that using the (internal) pg_catalog is being dismissed, but
given that there is no option...

Here, have a quick google search for questions on Stack Overflow involving
pg_proc.proisagg:
https://www.google.ch/search?q=site%3Astackoverflow.com+proisagg

It returns 107 results on Stack Overflow alone. Some examples:

- https://stackoverflow.com/a/50093399/521799
- https://stackoverflow.com/a/18200250/521799
- https://stackoverflow.com/a/20549944/521799
- https://stackoverflow.com/a/24774064/521799
- https://stackoverflow.com/a/24034609/521799
- https://dba.stackexchange.com/a/46996/2512
- https://stackoverflow.com/a/48709779/521799
- https://stackoverflow.com/a/48709779/521799
- https://stackoverflow.com/a/308500/521799
- https://stackoverflow.com/a/20549944/521799

All of these answers are now outdated with PostgreSQL 11. Not only are
they outdated, but if the answers were fixed for PostgreSQL 11, the fixed
version wouldn't work on older PostgreSQL versions, because pg_proc.prokind
didn't exist earlier.

I find that a relatively high price to pay in this case for the relatively
easy solution to keep a pg_proc.proisagg computed column around for
backwards compatibility.

I understand that backwards compatibility is quite a bit of extra work,

but

in cases like this particular one, the price to pay seems relatively

low.

Perhaps a new strategy could be to break things only if there is really

no

other solution?

I mean we don't break things willy-nilly already. And there's plenty
cases where we kept things around for backward compatibility. But
usually the problem is that that means you have to keep the shim around
forever. Even if better solutions have been around for many years,
there'll be complaints about removing them. So IMO the backward compat
price for pg_catalog.* has to be low enough that there's essentially no
point in keeping it around forever, or so painful that it'd cost a lot
of people a lot.

I agree that at some point, the price to pay is too high. But in this
case, a boolean value has been replaced by an enumeration, so it would be
really easy to maintain the boolean value as a computed column, right?

now, the logic of pgproc is different, and this column has not sense.
Probably nobody is happy, but I dont see any benefit holds garbage in
system catalogue. Pgproc is table, not view, so computed column is not
possible feature, and if can be, then it has not any consistency there.

Bruce Momjian

bruce@momjian.us

almost 8 years ago

In reply to: Lukas Eder (#5)

Re: BUG #15344: pg_proc.proisagg was removed incompatibly in PostgreSQL 11

On Tue, Aug 21, 2018 at 05:04:13PM +0200, Lukas Eder wrote:

On Tue, Aug 21, 2018 at 4:45 PM Andres Freund <andres@anarazel.de> wrote:

I understand that backwards compatibility is quite a bit of extra work,

but

in cases like this particular one, the price to pay seems relatively low.
Perhaps a new strategy could be to break things only if there is really

no

other solution?

I mean we don't break things willy-nilly already. And there's plenty
cases where we kept things around for backward compatibility. But
usually the problem is that that means you have to keep the shim around
forever. Even if better solutions have been around for many years,
there'll be complaints about removing them.ï¿½ So IMO the backward compat
price for pg_catalog.* has to be low enough that there's essentially no
point in keeping it around forever, or so painful that it'd cost a lot
of people a lot.

I agree that at some point, the price to pay is too high. But in this case, a
boolean value has been replaced by an enumeration, so it would be really easy
to maintain the boolean value as a computed column, right?

We realize we are breaking some things by changing any API. The
question is whether the breakage is worth the cleaner API for users
going forward.

People regularly complain when we break APIs, and ask us to justify
every decision, which we are willing to do. However, few people
complain when we keep old APIs around. The big issue is that keeping
old APIs around has a much higher cost than the API breakage we do, in
most cases. This is because API breakage is a big hurt for a few
people, but keeping around an old API is a small hurt for many, and for
a long duration. So, you hear about the big hurts, but not about the
many small ones, but that doesn't mean we should ignore the small ones.

Anyone who has used an API that allows you to do things in
several old and new ways knows the cost of such API complexity, and the
server programming complexity of keeping old APIs working properly.

Being specific, we almost never change the library API (libpq), and we
rarely change the SQL API, though there are cases listed the major
release notes. We often change the administrators API and system system
catalog contents, as you have seen. These decisions are all related to
the ability of users to detect and adjust to API changes, the ability to
keep the old API active, and the number of users effected.

Telling us we broke an API and that we were wrong really doesn't tell us
anything --- we already knew we broke it, and suspected some people
would be disappointed. You need to walk into that calculus and tell us
why our estimates were wrong. However, most users just don't have the
level of Postgres knowledge to really argue such cases, even if we are
wrong.

We have also made the decision long ago that we would rather present a
clean API to future users than avoid reasonable breakage for current
users. People who want that policy changed must get quite involved so
they can argue with the calculus we are already using. There also might
be cases where our API breakage is too extreme and another relational
database product might be a better fit.

I am thinking I should create a wiki page explaining how we make API
breakage decisions so we don't need to re-litigate this each time.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+                      Ancient Roman grave inscription +