BUG #14825: enum type: unsafe use?

Started by Nonameover 8 years ago48 messages
#1Noname
balazs@obiserver.hu

The following bug has been logged on the website:

Bug reference: 14825
Logged by: Balazs Szilfai
Email address: balazs@obiserver.hu
PostgreSQL version: 10beta4
Operating system: Debian Linux
Description:

Hi,

version: 10rc1

testdb=# begin;
BEGIN
testdb=# create type enumtype as enum ('v1', 'v2');
CREATE TYPE
testdb=# create table testtable (enumcolumn enumtype not null default
'v1');
CREATE TABLE

Everything it's OK! :)

If enum type have "owner to":

testdb=# begin;
BEGIN
testdb=# create type enumtype as enum ('v1', 'v2');
CREATE TYPE
testdb=# alter type enumtype owner to testrole;
ALTER TYPE
testdb=# create table testtable (enumcolumn enumtype not null default
'v1');
ERROR: unsafe use of new value "v1" of enum type enumtype
HINT: New enum values must be committed before they can be used.

Is this unsafe?

Balazs

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Noname (#1)
Re: [BUGS] BUG #14825: enum type: unsafe use?

balazs@obiserver.hu writes:

PostgreSQL version: 10beta4

testdb=# begin;
BEGIN
testdb=# create type enumtype as enum ('v1', 'v2');
CREATE TYPE
testdb=# alter type enumtype owner to testrole;
ALTER TYPE
testdb=# create table testtable (enumcolumn enumtype not null default 'v1');
ERROR: unsafe use of new value "v1" of enum type enumtype
HINT: New enum values must be committed before they can be used.

Hmm, that's pretty annoying. It's happening because check_safe_enum_use
insists that the enum's pg_type entry not be updated-in-transaction.
We thought that the new rules instituted by 15bc038f9 would be generally
more convenient to use than the old ones --- but this example shows
that, for example, pg_dump scripts involving enums often could not
be restored inside a single transaction.

I'm not sure if that qualifies as a stop-ship problem, but it ain't
good, for sure. We need to look at whether we should revert 15bc038f9
or somehow revise its rules.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#3Andrew Dunstan
andrew.dunstan@2ndquadrant.com
In reply to: Tom Lane (#2)
Re: BUG #14825: enum type: unsafe use?

On 09/22/2017 05:46 PM, Tom Lane wrote:

balazs@obiserver.hu writes:

PostgreSQL version: 10beta4
testdb=# begin;
BEGIN
testdb=# create type enumtype as enum ('v1', 'v2');
CREATE TYPE
testdb=# alter type enumtype owner to testrole;
ALTER TYPE
testdb=# create table testtable (enumcolumn enumtype not null default 'v1');
ERROR: unsafe use of new value "v1" of enum type enumtype
HINT: New enum values must be committed before they can be used.

Hmm, that's pretty annoying. It's happening be

cause check_safe_enum_use
insists that the enum's pg_type entry not be updated-in-transaction.
We thought that the new rules instituted by 15bc038f9 would be generally
more convenient to use than the old ones --- but this example shows
that, for example, pg_dump scripts involving enums often could not
be restored inside a single transaction.

I'm not sure if that qualifies as a stop-ship problem, but it ain't
good, for sure. We need to look at whether we should revert 15bc038f9
or somehow revise its rules.

:-(

The only real problem comes from adding a value to an enum that has been
created in an earlier transaction and then using that enum value. What
we're doing here is essentially a heuristic test for that condition, and
we're getting some false positives. I wonder if we wouldn't be better
doing this more directly, keeping a per-transaction hash of unsafe enum
values (which will almost always be empty). It might even speed up the
check.

cheers

andrew

--
Andrew Dunstan https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#3)
Re: BUG #14825: enum type: unsafe use?

Andrew Dunstan <andrew.dunstan@2ndquadrant.com> writes:

On 09/22/2017 05:46 PM, Tom Lane wrote:

I'm not sure if that qualifies as a stop-ship problem, but it ain't
good, for sure. We need to look at whether we should revert 15bc038f9
or somehow revise its rules.

I wonder if we wouldn't be better
doing this more directly, keeping a per-transaction hash of unsafe enum
values (which will almost always be empty). It might even speed up the
check.

Yeah, I was considering the same thing over dinner, though I'd phrase
it oppositely: keep a list of enum type OIDs created in the current
transaction, so that we could whitelist them. This could maybe become
a problem if someone created a zillion enums in one xact, though.

The immediate question is do we care to design/implement such a thing
post-RC1. I'd have to vote "no". I think the most prudent thing to
do is revert 15bc038f9 and then have another go at it during the v11
cycle.

regards, tom lane

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

#5Andrew Dunstan
andrew.dunstan@2ndquadrant.com
In reply to: Tom Lane (#4)
Re: [BUGS] BUG #14825: enum type: unsafe use?

On 09/22/2017 11:19 PM, Tom Lane wrote:

Andrew Dunstan <andrew.dunstan@2ndquadrant.com> writes:

On 09/22/2017 05:46 PM, Tom Lane wrote:

I'm not sure if that qualifies as a stop-ship problem, but it ain't
good, for sure. We need to look at whether we should revert 15bc038f9
or somehow revise its rules.

I wonder if we wouldn't be better
doing this more directly, keeping a per-transaction hash of unsafe enum
values (which will almost always be empty). It might even speed up the
check.

Yeah, I was considering the same thing over dinner, though I'd phrase
it oppositely: keep a list of enum type OIDs created in the current
transaction, so that we could whitelist them. This could maybe become
a problem if someone created a zillion enums in one xact, though.

I see what you're saying, but my idea was slightly different. We would
only add to the hashtable I had in mind at the bottom AddEnumLabel().
Any other value, whether added in the current transaction or not, should
be safe, AIUI. Maybe we should also keep a cache of whitelisted enums
created in the current transaction.

I'm not to worried about people creating a zillion enums (or enum labels
being added for the solution I had in mind). Even a hash of a million
Oids will only consume a few megabytes, won't it?

The immediate question is do we care to design/implement such a thing
post-RC1. I'd have to vote "no". I think the most prudent thing to
do is revert 15bc038f9 and then have another go at it during the v11
cycle.

Sadly I agree. We've made decisions like this in the past, and I have
generally been supportive of them. I think this is the first time I have
been on the receiving end of one so late in the process :-(

cheers

andrew

--
Andrew Dunstan https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#5)
Re: BUG #14825: enum type: unsafe use?

Andrew Dunstan <andrew.dunstan@2ndquadrant.com> writes:

On 09/22/2017 11:19 PM, Tom Lane wrote:

Yeah, I was considering the same thing over dinner, though I'd phrase
it oppositely: keep a list of enum type OIDs created in the current
transaction, so that we could whitelist them. This could maybe become
a problem if someone created a zillion enums in one xact, though.

I see what you're saying, but my idea was slightly different. We would
only add to the hashtable I had in mind at the bottom AddEnumLabel().
Any other value, whether added in the current transaction or not, should
be safe, AIUI.

Oh, I see: a list of enum values we need to blacklist, not whitelist.
Yes, that's a significantly better idea than mine, because in common
use-cases that would be empty or have a very small number of entries.
In particular that fixes the "pg_restore -1" scenario, because no
matter how many enums you're restoring, pg_dump doesn't use ALTER
TYPE ADD VALUE. (Well, it does in --binary-upgrade mode, but those
scripts are run in transaction-per-statement mode so it's fine.)

Maybe we should also keep a cache of whitelisted enums
created in the current transaction.

What for? Wouldn't be any faster to search, in fact likely slower
because it could get large in common use-cases.

The immediate question is do we care to design/implement such a thing
post-RC1. I'd have to vote "no". I think the most prudent thing to
do is revert 15bc038f9 and then have another go at it during the v11
cycle.

Sadly I agree. We've made decisions like this in the past, and I have
generally been supportive of them. I think this is the first time I have
been on the receiving end of one so late in the process :-(

Unless you want to try writing a patch for this in the next day or two,
I think we have to do that. But now that I see the plan clearly, maybe
we could get away with a post-RC1 fix.

regards, tom lane

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

#7Andrew Dunstan
andrew.dunstan@2ndquadrant.com
In reply to: Tom Lane (#6)
Re: [BUGS] BUG #14825: enum type: unsafe use?

On 09/23/2017 11:16 AM, Tom Lane wrote:

Andrew Dunstan <andrew.dunstan@2ndquadrant.com> writes:

The immediate question is do we care to design/implement such a thing
post-RC1. I'd have to vote "no". I think the most prudent thing to
do is revert 15bc038f9 and then have another go at it during the v11
cycle.

Sadly I agree. We've made decisions like this in the past, and I have
generally been supportive of them. I think this is the first time I have
been on the receiving end of one so late in the process :-(

Unless you want to try writing a patch for this in the next day or two,
I think we have to do that. But now that I see the plan clearly, maybe
we could get away with a post-RC1 fix.

OK, I'll give it a shot.

cheers

andrew

--
Andrew Dunstan https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#8Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#6)
Re: BUG #14825: enum type: unsafe use?

I wrote:

Andrew Dunstan <andrew.dunstan@2ndquadrant.com> writes:

I see what you're saying, but my idea was slightly different. We would
only add to the hashtable I had in mind at the bottom AddEnumLabel().
Any other value, whether added in the current transaction or not, should
be safe, AIUI.

Oh, I see: a list of enum values we need to blacklist, not whitelist.
Yes, that's a significantly better idea than mine, because in common
use-cases that would be empty or have a very small number of entries.

Oh, wait a minute ... that's not where the problem is, really. We can
already tell reliably whether an enum value was created in the current
transaction: the is-it-committed check in check_safe_enum_use is
perfectly safe for that AFAICS. The hard part of this is the part about
"was the enum type created in the current transaction?". We could make
that reliable with the table I proposed of enum types created in the
current transaction, but the possible performance drag is a concern.

What I understand you to be proposing is to blacklist individual
enum values added by ALTER ADD VALUE, but *not* values created
aboriginally by CREATE TYPE AS ENUM. (The latter are surely safe,
because the type must disappear if they do.) However, that would
require dropping the second part of the current documentation promise:

If <command>ALTER TYPE ... ADD VALUE</> (the form that adds a new value to
an enum type) is executed inside a transaction block, the new value cannot
be used until after the transaction has been committed, except in the case
that the enum type itself was created earlier in the same transaction.

We'd have to just say "it can't be used till the transaction has been
committed", full stop. Otherwise we're right back into the problem of
needing to detect whether the enum type is new in transaction.

Maybe we should also keep a cache of whitelisted enums
created in the current transaction.

What for?

I now realize that what you probably meant here was to track enum *types*,
not values, created in the current transaction. But if we're doing that
then we don't really need the per-enum-value-blacklist part of it.

So I'm back to not being sure about the path forward. Maybe it would be
all right to say "the value added by ADD VALUE can't be used in the same
transaction, period". That's still a step forward compared to the pre-v10
prohibition on doing it at all. I don't remember if there were use-cases
where we really needed the exception for new-in-transaction types.

regards, tom lane

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

#9Andrew Dunstan
andrew.dunstan@2ndquadrant.com
In reply to: Tom Lane (#8)
Re: BUG #14825: enum type: unsafe use?

On 09/23/2017 02:00 PM, Tom Lane wrote:

I wrote:

Andrew Dunstan <andrew.dunstan@2ndquadrant.com> writes:

I see what you're saying, but my idea was slightly different. We would
only add to the hashtable I had in mind at the bottom AddEnumLabel().
Any other value, whether added in the current transaction or not, should
be safe, AIUI.

Oh, I see: a list of enum values we need to blacklist, not whitelist.
Yes, that's a significantly better idea than mine, because in common
use-cases that would be empty or have a very small number of entries.

Oh, wait a minute ... that's not where the problem is, really. We can
already tell reliably whether an enum value was created in the current
transaction: the is-it-committed check in check_safe_enum_use is
perfectly safe for that AFAICS. The hard part of this is the part about
"was the enum type created in the current transaction?". We could make
that reliable with the table I proposed of enum types created in the
current transaction, but the possible performance drag is a concern.

What I understand you to be proposing is to blacklist individual
enum values added by ALTER ADD VALUE, but *not* values created
aboriginally by CREATE TYPE AS ENUM. (The latter are surely safe,
because the type must disappear if they do.) However, that would
require dropping the second part of the current documentation promise:

If <command>ALTER TYPE ... ADD VALUE</> (the form that adds a new value to
an enum type) is executed inside a transaction block, the new value cannot
be used until after the transaction has been committed, except in the case
that the enum type itself was created earlier in the same transaction.

We'd have to just say "it can't be used till the transaction has been
committed", full stop. Otherwise we're right back into the problem of
needing to detect whether the enum type is new in transaction.

Maybe we should also keep a cache of whitelisted enums
created in the current transaction.

What for?

I now realize that what you probably meant here was to track enum *types*,
not values, created in the current transaction. But if we're doing that
then we don't really need the per-enum-value-blacklist part of it.

So I'm back to not being sure about the path forward. Maybe it would be
all right to say "the value added by ADD VALUE can't be used in the same
transaction, period". That's still a step forward compared to the pre-v10
prohibition on doing it at all. I don't remember if there were use-cases
where we really needed the exception for new-in-transaction types.

Well, my idea was to have the test run like this:

* is the value an old one? Test txnid of tuple. If yes it's ok
* is the value one created by ALTER TYPE ADD VALUE? Test
blacklist. If no, it's ok.
* is the enum a new one? Test whitelist. If yes, it's ok.
* anything else is not ok.

I think that would let us keep our promise.

If we just did the blacklist and stuck with our current heuristic test
for enum being created in the current transaction, we'd still probably
avoid 99% of the problems, including specifically the one that gave rise
to the bug report.

Am I missing something?

cheers

andrew

--
Andrew Dunstan https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

#10Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#9)
Re: BUG #14825: enum type: unsafe use?

Andrew Dunstan <andrew.dunstan@2ndquadrant.com> writes:

On 09/23/2017 02:00 PM, Tom Lane wrote:

So I'm back to not being sure about the path forward. Maybe it would be
all right to say "the value added by ADD VALUE can't be used in the same
transaction, period". That's still a step forward compared to the pre-v10
prohibition on doing it at all. I don't remember if there were use-cases
where we really needed the exception for new-in-transaction types.

Well, my idea was to have the test run like this:

* is the value an old one? Test txnid of tuple. If yes it's ok
* is the value one created by ALTER TYPE ADD VALUE? Test
blacklist. If no, it's ok.
* is the enum a new one? Test whitelist. If yes, it's ok.
* anything else is not ok.

My point is that if you do 1 and 3, you don't need 2. Or if you do
2 and 3, you don't need 1. But in most cases, testing the tuple
hint bits is cheap, so you don't really want that option.

In any case, what I'm worried about is the amount of bookkeeping
overhead added by keeping a whitelist of enum-types-created-in-
current-transaction. That's less than trivial, especially since
you have to account correctly for subtransactions. And there are
common use-cases where that table will become large.

If we just did the blacklist and stuck with our current heuristic test
for enum being created in the current transaction, we'd still probably
avoid 99% of the problems, including specifically the one that gave rise
to the bug report.

True. But I'm not sure whether the heuristic test is adding anything
meaningful if we use a blacklist first. The case where it could help
is

begin;
create type t as enum();
alter type t add value 'v';
-- do something with 'v'
commit;

That perhaps is worth something, but if somebody is trying to build a new
enum type in pieces like that, doesn't it seem fairly likely that they
might throw in an ALTER OWNER or GRANT as well? My feeling is that the
lesson we need to learn is that the heuristic test isn't good enough.

regards, tom lane

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

#11Andrew Dunstan
andrew.dunstan@2ndquadrant.com
In reply to: Tom Lane (#10)
1 attachment(s)
Re: BUG #14825: enum type: unsafe use?

On 09/23/2017 03:52 PM, Tom Lane wrote:

Andrew Dunstan <andrew.dunstan@2ndquadrant.com> writes:

On 09/23/2017 02:00 PM, Tom Lane wrote:

So I'm back to not being sure about the path forward. Maybe it would be
all right to say "the value added by ADD VALUE can't be used in the same
transaction, period". That's still a step forward compared to the pre-v10
prohibition on doing it at all. I don't remember if there were use-cases
where we really needed the exception for new-in-transaction types.

Well, my idea was to have the test run like this:
* is the value an old one? Test txnid of tuple. If yes it's ok
* is the value one created by ALTER TYPE ADD VALUE? Test
blacklist. If no, it's ok.
* is the enum a new one? Test whitelist. If yes, it's ok.
* anything else is not ok.

My point is that if you do 1 and 3, you don't need 2. Or if you do
2 and 3, you don't need 1. But in most cases, testing the tuple
hint bits is cheap, so you don't really want that option.

In any case, what I'm worried about is the amount of bookkeeping
overhead added by keeping a whitelist of enum-types-created-in-
current-transaction. That's less than trivial, especially since
you have to account correctly for subtransactions. And there are
common use-cases where that table will become large.

If we just did the blacklist and stuck with our current heuristic test
for enum being created in the current transaction, we'd still probably
avoid 99% of the problems, including specifically the one that gave rise
to the bug report.

True. But I'm not sure whether the heuristic test is adding anything
meaningful if we use a blacklist first. The case where it could help
is

begin;
create type t as enum();
alter type t add value 'v';
-- do something with 'v'
commit;

That perhaps is worth something, but if somebody is trying to build a new
enum type in pieces like that, doesn't it seem fairly likely that they
might throw in an ALTER OWNER or GRANT as well? My feeling is that the
lesson we need to learn is that the heuristic test isn't good enough.

OK, I think I'm convinced. Here's is the WIP code I put together for the
blacklist. I'm was looking for a place to put the init call, but since
it's possibly not going anywhere I stopped :-) . My initial thought
about substransactions was that we should ignore them for this purpose
(That's why I used TopTransactionContext for the table).

I agree the heuristic test isn't good enough, and if we can get a 100%
accurate test for the newness of the enum type then the blacklist would
be redundant.

w.r.t. table size - how large? I confess I haven't seen any systems with
more than a few hundred enum types. But even a million or two shouldn't
consume a huge amount of memory, should it?

cheers

andrew

--

Andrew Dunstan https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachments:

fix_enum_wip.patchtext/x-patch; name=fix_enum_wip.patchDownload
diff --git a/src/backend/catalog/pg_enum.c b/src/backend/catalog/pg_enum.c
index fe61d4d..52c1271 100644
--- a/src/backend/catalog/pg_enum.c
+++ b/src/backend/catalog/pg_enum.c
@@ -28,6 +28,8 @@
 #include "utils/builtins.h"
 #include "utils/catcache.h"
 #include "utils/fmgroids.h"
+#include "utils/hsearch.h"
+#include "utils/memutils.h"
 #include "utils/syscache.h"
 #include "utils/tqual.h"
 
@@ -38,6 +40,9 @@ Oid			binary_upgrade_next_pg_enum_oid = InvalidOid;
 static void RenumberEnumType(Relation pg_enum, HeapTuple *existing, int nelems);
 static int	sort_order_cmp(const void *p1, const void *p2);
 
+/* hash table of values added in current transaction by AddEnumLabel */
+
+static HTAB *enum_blacklist = NULL;
 
 /*
  * EnumValuesCreate
@@ -460,8 +465,44 @@ restart:
 	heap_freetuple(enum_tup);
 
 	heap_close(pg_enum, RowExclusiveLock);
+
+	/* set up blacklist hash if required */
+	if (enum_blacklist == NULL)
+	{
+		HASHCTL     hash_ctl;
+		memset(&hash_ctl, 0, sizeof(hash_ctl));
+		hash_ctl.keysize = sizeof(Oid);
+		hash_ctl.entrysize = sizeof(Oid);
+		hash_ctl.hcxt = CurTransactionContext;
+		enum_blacklist = hash_create("Enum blacklist for current transaction",
+						   32,
+						   &hash_ctl,
+						   HASH_ELEM | HASH_BLOBS | HASH_CONTEXT);
+
+	}
+
+	/* and add the new value to the blacklist */
+
+	(void) hash_search(enum_blacklist, &newOid, HASH_ENTER, NULL);
 }
 
+bool
+EnumBlacklisted(Oid enum_id)
+{
+	bool found;
+
+	if (enum_blacklist == NULL)
+		return false;
+
+	(void) hash_search(enum_blacklist, &enum_id, HASH_FIND, &found);
+	return found;
+}
+
+void
+InitEnumBlacklist(void)
+{
+	enum_blacklist = NULL;
+}
 
 /*
  * RenameEnumLabel
diff --git a/src/backend/utils/adt/enum.c b/src/backend/utils/adt/enum.c
index 973397c..a7ba3d0 100644
--- a/src/backend/utils/adt/enum.c
+++ b/src/backend/utils/adt/enum.c
@@ -76,6 +76,10 @@ check_safe_enum_use(HeapTuple enumval_tup)
 		TransactionIdDidCommit(xmin))
 		return;
 
+	/* Check if the enum value is blacklisted. If not, it's safe */
+	if (! EnumBlacklisted(HeapTupleGetOid(enumval_tup)))
+		return;
+
 	/* It is a new enum value, so check to see if the whole enum is new */
 	en = (Form_pg_enum) GETSTRUCT(enumval_tup);
 	enumtyp_tup = SearchSysCache1(TYPEOID, ObjectIdGetDatum(en->enumtypid));
diff --git a/src/include/catalog/pg_enum.h b/src/include/catalog/pg_enum.h
index 5938ba5..f8a492b 100644
--- a/src/include/catalog/pg_enum.h
+++ b/src/include/catalog/pg_enum.h
@@ -69,5 +69,7 @@ extern void AddEnumLabel(Oid enumTypeOid, const char *newVal,
 			 bool skipIfExists);
 extern void RenameEnumLabel(Oid enumTypeOid,
 				const char *oldVal, const char *newVal);
+extern bool EnumBlacklisted(Oid enum_id);
+extern void InitEnumBlacklist(void);
 
 #endif							/* PG_ENUM_H */
#12Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#11)
Re: [BUGS] BUG #14825: enum type: unsafe use?

Andrew Dunstan <andrew.dunstan@2ndquadrant.com> writes:

OK, I think I'm convinced. Here's is the WIP code I put together for the
blacklist. I'm was looking for a place to put the init call, but since
it's possibly not going anywhere I stopped :-) . My initial thought
about substransactions was that we should ignore them for this purpose
(That's why I used TopTransactionContext for the table).

For the blacklist, I agree we could just ignore subtransactions: all
subtransaction levels are equally uncommitted for this purpose, and
leaving entries from failed subtransactions in place seems like a
non-issue, since they'd never be referenced again. (Well, barring OID
wraparound and an enum-value-OID collision while the transaction runs,
but I think we can ignore that as having probability epsilon.)

But you need to actually put the table in TopTransactionContext, not
CurTransactionContext ;-). Also, I don't think you need an init call
so much as an end-of-transaction cleanup call. Maybe call it
AtEOXactEnum(), for consistency with other functions called in the
same area.

w.r.t. table size - how large? I confess I haven't seen any systems with
more than a few hundred enum types. But even a million or two shouldn't
consume a huge amount of memory, should it?

Dynahash tables are self-expanding, so I don't see a need to stress about
that too much. Anything in 10-100 seems reasonable for initial size.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#13Andrew Dunstan
andrew.dunstan@2ndquadrant.com
In reply to: Tom Lane (#12)
1 attachment(s)
Re: [BUGS] BUG #14825: enum type: unsafe use?

On 09/23/2017 06:06 PM, Tom Lane wrote:

Andrew Dunstan <andrew.dunstan@2ndquadrant.com> writes:

OK, I think I'm convinced. Here's is the WIP code I put together for the
blacklist. I'm was looking for a place to put the init call, but since
it's possibly not going anywhere I stopped :-) . My initial thought
about substransactions was that we should ignore them for this purpose
(That's why I used TopTransactionContext for the table).

For the blacklist, I agree we could just ignore subtransactions: all
subtransaction levels are equally uncommitted for this purpose, and
leaving entries from failed subtransactions in place seems like a
non-issue, since they'd never be referenced again. (Well, barring OID
wraparound and an enum-value-OID collision while the transaction runs,
but I think we can ignore that as having probability epsilon.)

But you need to actually put the table in TopTransactionContext, not
CurTransactionContext ;-). Also, I don't think you need an init call
so much as an end-of-transaction cleanup call. Maybe call it
AtEOXactEnum(), for consistency with other functions called in the
same area.

w.r.t. table size - how large? I confess I haven't seen any systems with
more than a few hundred enum types. But even a million or two shouldn't
consume a huge amount of memory, should it?

Dynahash tables are self-expanding, so I don't see a need to stress about
that too much. Anything in 10-100 seems reasonable for initial size.

OK, here's the finished patch. It has a pretty small footprint all
things considered, and I think it guarantees that nothing that could be
done in this area in 9.6 will be forbidden. That's probably enough to
get us to 10 without having to revert the whole thing, ISTM, and we can
leave any further refinement to the next release.

cheers

andrew

--
Andrew Dunstan https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachments:

fix-enums-3.patchtext/x-patch; name=fix-enums-3.patchDownload
diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c
index 93dca7a..1d6f774 100644
--- a/src/backend/access/transam/xact.c
+++ b/src/backend/access/transam/xact.c
@@ -32,6 +32,7 @@
 #include "access/xlogutils.h"
 #include "catalog/catalog.h"
 #include "catalog/namespace.h"
+#include "catalog/pg_enum.h"
 #include "catalog/storage.h"
 #include "commands/async.h"
 #include "commands/tablecmds.h"
@@ -2126,6 +2127,7 @@ CommitTransaction(void)
 	smgrDoPendingDeletes(true);
 
 	AtCommit_Notify();
+	AtEOXact_Enum();
 	AtEOXact_GUC(true, 1);
 	AtEOXact_SPI(true);
 	AtEOXact_on_commit_actions(true);
@@ -2405,6 +2407,7 @@ PrepareTransaction(void)
 
 	/* PREPARE acts the same as COMMIT as far as GUC is concerned */
 	AtEOXact_GUC(true, 1);
+	AtEOXact_Enum();
 	AtEOXact_SPI(true);
 	AtEOXact_on_commit_actions(true);
 	AtEOXact_Namespace(true, false);
@@ -2606,6 +2609,7 @@ AbortTransaction(void)
 							 false, true);
 		smgrDoPendingDeletes(false);
 
+		AtEOXact_Enum();
 		AtEOXact_GUC(false, 1);
 		AtEOXact_SPI(false);
 		AtEOXact_on_commit_actions(false);
diff --git a/src/backend/catalog/pg_enum.c b/src/backend/catalog/pg_enum.c
index fe61d4d..3056f68 100644
--- a/src/backend/catalog/pg_enum.c
+++ b/src/backend/catalog/pg_enum.c
@@ -28,6 +28,8 @@
 #include "utils/builtins.h"
 #include "utils/catcache.h"
 #include "utils/fmgroids.h"
+#include "utils/hsearch.h"
+#include "utils/memutils.h"
 #include "utils/syscache.h"
 #include "utils/tqual.h"
 
@@ -38,6 +40,8 @@ Oid			binary_upgrade_next_pg_enum_oid = InvalidOid;
 static void RenumberEnumType(Relation pg_enum, HeapTuple *existing, int nelems);
 static int	sort_order_cmp(const void *p1, const void *p2);
 
+/* hash table of values added in the current transaction by AddEnumLabel */
+static HTAB *enum_blacklist = NULL;
 
 /*
  * EnumValuesCreate
@@ -460,8 +464,49 @@ restart:
 	heap_freetuple(enum_tup);
 
 	heap_close(pg_enum, RowExclusiveLock);
+
+	/* Set up the blacklist hash if required */
+	if (enum_blacklist == NULL)
+	{
+		HASHCTL     hash_ctl;
+		memset(&hash_ctl, 0, sizeof(hash_ctl));
+		hash_ctl.keysize = sizeof(Oid);
+		hash_ctl.entrysize = sizeof(Oid);
+		hash_ctl.hcxt = TopTransactionContext;
+		enum_blacklist = hash_create("Enum blacklist for current transaction",
+						   32,
+						   &hash_ctl,
+						   HASH_ELEM | HASH_BLOBS | HASH_CONTEXT);
+
+	}
+
+	/* Add the new value to the blacklist */
+	(void) hash_search(enum_blacklist, &newOid, HASH_ENTER, NULL);
 }
 
+/* Test if the enum is on the blacklist */
+bool
+EnumBlacklisted(Oid enum_id)
+{
+	bool found;
+
+	if (enum_blacklist == NULL)
+		return false;
+
+	(void) hash_search(enum_blacklist, &enum_id, HASH_FIND, &found);
+	return found;
+}
+
+/*
+ * Clean up the blacklist hash at the end of the transaction. The memory will
+ * have been deallocated, so all we need to do is set the pointer back to
+ * NULL for the next transaction.
+ */
+void
+AtEOXact_Enum(void)
+{
+	enum_blacklist = NULL;
+}
 
 /*
  * RenameEnumLabel
diff --git a/src/backend/utils/adt/enum.c b/src/backend/utils/adt/enum.c
index 973397c..a7ba3d0 100644
--- a/src/backend/utils/adt/enum.c
+++ b/src/backend/utils/adt/enum.c
@@ -76,6 +76,10 @@ check_safe_enum_use(HeapTuple enumval_tup)
 		TransactionIdDidCommit(xmin))
 		return;
 
+	/* Check if the enum value is blacklisted. If not, it's safe */
+	if (! EnumBlacklisted(HeapTupleGetOid(enumval_tup)))
+		return;
+
 	/* It is a new enum value, so check to see if the whole enum is new */
 	en = (Form_pg_enum) GETSTRUCT(enumval_tup);
 	enumtyp_tup = SearchSysCache1(TYPEOID, ObjectIdGetDatum(en->enumtypid));
diff --git a/src/include/catalog/pg_enum.h b/src/include/catalog/pg_enum.h
index 5938ba5..dff3d2f 100644
--- a/src/include/catalog/pg_enum.h
+++ b/src/include/catalog/pg_enum.h
@@ -69,5 +69,7 @@ extern void AddEnumLabel(Oid enumTypeOid, const char *newVal,
 			 bool skipIfExists);
 extern void RenameEnumLabel(Oid enumTypeOid,
 				const char *oldVal, const char *newVal);
+extern bool EnumBlacklisted(Oid enum_id);
+extern void AtEOXact_Enum(void);
 
 #endif							/* PG_ENUM_H */
diff --git a/src/test/regress/expected/enum.out b/src/test/regress/expected/enum.out
index 0e60304..96345e4 100644
--- a/src/test/regress/expected/enum.out
+++ b/src/test/regress/expected/enum.out
@@ -648,6 +648,29 @@ SELECT enum_range(null::bogus);
 (1 row)
 
 ROLLBACK;
+-- check that we don't forbid anything except use of a new value on an
+-- existing enum
+BEGIN;
+CREATE ROLE newowner;
+CREATE TYPE bogus AS ENUM('good','bad','ugly');
+ALTER TYPE bogus OWNER TO newowner;
+select enum_range(null::bogus);
+   enum_range    
+-----------------
+ {good,bad,ugly}
+(1 row)
+
+ROLLBACK;
+BEGIN;
+CREATE ROLE newowner;
+ALTER TYPE rainbow OWNER TO newowner;
+select enum_range(null::rainbow);
+                enum_range                 
+-------------------------------------------
+ {crimson,orange,yellow,green,blue,purple}
+(1 row)
+
+ROLLBACK;
 --
 -- Cleanup
 --
diff --git a/src/test/regress/sql/enum.sql b/src/test/regress/sql/enum.sql
index d7e8714..2129f64 100644
--- a/src/test/regress/sql/enum.sql
+++ b/src/test/regress/sql/enum.sql
@@ -311,6 +311,22 @@ ALTER TYPE bogus ADD VALUE 'ugly';
 SELECT enum_range(null::bogus);
 ROLLBACK;
 
+-- check that we don't forbid anything except use of a new value on an
+-- existing enum
+
+BEGIN;
+CREATE ROLE newowner;
+CREATE TYPE bogus AS ENUM('good','bad','ugly');
+ALTER TYPE bogus OWNER TO newowner;
+select enum_range(null::bogus);
+ROLLBACK;
+
+BEGIN;
+CREATE ROLE newowner;
+ALTER TYPE rainbow OWNER TO newowner;
+select enum_range(null::rainbow);
+ROLLBACK;
+
 --
 -- Cleanup
 --
#14Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#13)
Re: BUG #14825: enum type: unsafe use?

Andrew Dunstan <andrew.dunstan@2ndquadrant.com> writes:

OK, here's the finished patch. It has a pretty small footprint all
things considered, and I think it guarantees that nothing that could be
done in this area in 9.6 will be forbidden. That's probably enough to
get us to 10 without having to revert the whole thing, ISTM, and we can
leave any further refinement to the next release.

I think this could do with some more work on the comments and test cases,
but it's basically sound.

What we still need to debate is whether to remove the heuristic
type-is-from-same-transaction test, making the user-visible behavior
simply "you must commit an ALTER TYPE ADD VALUE before you can use the
new value". I'm kind of inclined to do so; the fuzzy (and inadequately
documented) behavior we'll have if we keep it doesn't seem very nice to
me.

regards, tom lane

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

#15Andrew Dunstan
andrew.dunstan@2ndquadrant.com
In reply to: Tom Lane (#14)
Re: BUG #14825: enum type: unsafe use?

On 09/24/2017 04:37 PM, Tom Lane wrote:

Andrew Dunstan <andrew.dunstan@2ndquadrant.com> writes:

OK, here's the finished patch. It has a pretty small footprint all
things considered, and I think it guarantees that nothing that could be
done in this area in 9.6 will be forbidden. That's probably enough to
get us to 10 without having to revert the whole thing, ISTM, and we can
leave any further refinement to the next release.

I think this could do with some more work on the comments and test cases,
but it's basically sound.

What we still need to debate is whether to remove the heuristic
type-is-from-same-transaction test, making the user-visible behavior
simply "you must commit an ALTER TYPE ADD VALUE before you can use the
new value". I'm kind of inclined to do so; the fuzzy (and inadequately
documented) behavior we'll have if we keep it doesn't seem very nice to
me.

I'd rather not. The failure cases are going to be vanishingly small, I
suspect, and we've already discussed how we might improve that test. If
you want to put some weasel words in the docs that might be ok.

cheers

andrew

--
Andrew Dunstan https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

#16Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#15)
Re: [BUGS] BUG #14825: enum type: unsafe use?

Andrew Dunstan <andrew.dunstan@2ndquadrant.com> writes:

On 09/24/2017 04:37 PM, Tom Lane wrote:

What we still need to debate is whether to remove the heuristic
type-is-from-same-transaction test, making the user-visible behavior
simply "you must commit an ALTER TYPE ADD VALUE before you can use the
new value". I'm kind of inclined to do so; the fuzzy (and inadequately
documented) behavior we'll have if we keep it doesn't seem very nice to
me.

I'd rather not. The failure cases are going to be vanishingly small, I
suspect, and we've already discussed how we might improve that test. If
you want to put some weasel words in the docs that might be ok.

I'm unconvinced.  We get enough complaints about heuristic behaviors
we have elsewhere.  Also, if we ship it like this, we're going to
have backward compatibility concerns if we try to change the behavior
later.  Now admittedly, the next step forward might well be an exact
solution which would necessarily take every case the heuristic allows
--- but I don't want to box us into having to support exactly the
cases the heuristic would allow.  And I don't want to have to
document which those are, either.

Basically, I don't think anyone's shown an important use case that
wouldn't be covered by "committed or not blacklisted". That fixes
the original complaint that you couldn't do ALTER ADD VALUE in a
transaction block at all, and with or without the heuristic test,
you can't use the added value without committing. The case not
covered is where an enum type is built with multiple commands in a
single transaction --- which might be of value, but since it doesn't
work for every such case, we don't know if the heuristic is really
going to provide useful value-add or not.

So I think we should just stop with the blacklist test for v10,
and then see if we still get complaints (and exactly what they're
about) so that we can judge how much more work the problem deserves.
It's still ahead of where we were in previous releases, and ahead of
where we'd be if we end up reverting the patch altogether.

Or in short: having been burned by this heuristic already, I want
it out of there.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#17Andrew Dunstan
andrew.dunstan@2ndquadrant.com
In reply to: Tom Lane (#16)
Re: BUG #14825: enum type: unsafe use?

On 09/24/2017 07:06 PM, Tom Lane wrote:

So I think we should just stop with the blacklist test for v10,
and then see if we still get complaints (and exactly what they're
about) so that we can judge how much more work the problem deserves.
It's still ahead of where we were in previous releases, and ahead of
where we'd be if we end up reverting the patch altogether.

That's pretty much what I was saying.

cheers

andrew

--
Andrew Dunstan https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

#18Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#17)
Re: BUG #14825: enum type: unsafe use?

Andrew Dunstan <andrew.dunstan@2ndquadrant.com> writes:

On 09/24/2017 07:06 PM, Tom Lane wrote:

So I think we should just stop with the blacklist test for v10,
and then see if we still get complaints (and exactly what they're
about) so that we can judge how much more work the problem deserves.
It's still ahead of where we were in previous releases, and ahead of
where we'd be if we end up reverting the patch altogether.

That's pretty much what I was saying.

Oh ... I did not think we were on the same page, because your patch
didn't include removal of the same-transaction heuristic. It'd be
sensible to do that as a separate patch, though, to make it easier
to put back if we decide we do want it.

regards, tom lane

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

#19Andrew Dunstan
andrew.dunstan@2ndquadrant.com
In reply to: Tom Lane (#18)
Re: BUG #14825: enum type: unsafe use?

On 09/25/2017 10:14 AM, Tom Lane wrote:

Andrew Dunstan <andrew.dunstan@2ndquadrant.com> writes:

On 09/24/2017 07:06 PM, Tom Lane wrote:

So I think we should just stop with the blacklist test for v10,
and then see if we still get complaints (and exactly what they're
about) so that we can judge how much more work the problem deserves.
It's still ahead of where we were in previous releases, and ahead of
where we'd be if we end up reverting the patch altogether.

That's pretty much what I was saying.

Oh ... I did not think we were on the same page, because your patch
didn't include removal of the same-transaction heuristic. It'd be
sensible to do that as a separate patch, though, to make it easier
to put back if we decide we do want it.

I understood you to say that the blacklist patch was all we needed to do
for v10. That's my position, i.e. I think we can live with the heuristic
test for now if the blacklist patch is applied. Maybe we need to
document that the heuristic test can generate some false negatives when
testing for a type that is created in the current transaction.

cheers

andrew

--

Andrew Dunstan https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

#20Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#19)
Re: BUG #14825: enum type: unsafe use?

Andrew Dunstan <andrew.dunstan@2ndquadrant.com> writes:

On 09/25/2017 10:14 AM, Tom Lane wrote:

Oh ... I did not think we were on the same page, because your patch
didn't include removal of the same-transaction heuristic. It'd be
sensible to do that as a separate patch, though, to make it easier
to put back if we decide we do want it.

I understood you to say that the blacklist patch was all we needed to do
for v10. That's my position, i.e. I think we can live with the heuristic
test for now if the blacklist patch is applied. Maybe we need to
document that the heuristic test can generate some false negatives when
testing for a type that is created in the current transaction.

No, as I said upthread, I want the heuristic out of there. I think the
blacklist idea covers enough use-cases that we possibly don't need the
same-transaction test at all. Furthermore I'm doubtful that the heuristic
form of the same-transaction test is adequate to satisfy the use-cases
that the blacklist test doesn't cover. So I think we should remove that
test and see whether we get any complaints, and if so what the details of
the real-world use-cases look like.

regards, tom lane

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

#21Andrew Dunstan
andrew.dunstan@2ndquadrant.com
In reply to: Tom Lane (#20)
Re: [BUGS] BUG #14825: enum type: unsafe use?

On 09/25/2017 10:42 AM, Tom Lane wrote:

Andrew Dunstan <andrew.dunstan@2ndquadrant.com> writes:

On 09/25/2017 10:14 AM, Tom Lane wrote:

Oh ... I did not think we were on the same page, because your patch
didn't include removal of the same-transaction heuristic. It'd be
sensible to do that as a separate patch, though, to make it easier
to put back if we decide we do want it.

I understood you to say that the blacklist patch was all we needed to do
for v10. That's my position, i.e. I think we can live with the heuristic
test for now if the blacklist patch is applied. Maybe we need to
document that the heuristic test can generate some false negatives when
testing for a type that is created in the current transaction.

No, as I said upthread, I want the heuristic out of there. I think the
blacklist idea covers enough use-cases that we possibly don't need the
same-transaction test at all. Furthermore I'm doubtful that the heuristic
form of the same-transaction test is adequate to satisfy the use-cases
that the blacklist test doesn't cover. So I think we should remove that
test and see whether we get any complaints, and if so what the details of
the real-world use-cases look like.

Let's ask a couple of users who I think are or have been actually
hurting on this point. Christophe and David, any opinions?

cheers

andrew

--
Andrew Dunstan https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#22Christophe Pettus
christophe.pettus@pgexperts.com
In reply to: Andrew Dunstan (#21)
Re: BUG #14825: enum type: unsafe use?

On Sep 25, 2017, at 07:55, Andrew Dunstan <andrew.dunstan@2ndquadrant.com> wrote:
Let's ask a couple of users who I think are or have been actually
hurting on this point. Christophe and David, any opinions?

Since about 90% of what I encounter in this area are automatically-generated migrations, having a clear set of (perhaps restrictive) rules which never fail is the most important. It's easy to split the CREATE or ALTERs out into their own transaction, and leave usage (such as populating a table from a migration) to a second transaction.

It's not clear to me that this is a vote either way, but I think the easiest thing to explain ("you cannot use a new enum value in the same transaction that created it") is the best in this situation.

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

#23David E. Wheeler
david@justatheory.com
In reply to: Andrew Dunstan (#21)
Re: BUG #14825: enum type: unsafe use?

On Sep 25, 2017, at 10:55, Andrew Dunstan <andrew.dunstan@2ndquadrant.com> wrote:

Let's ask a couple of users who I think are or have been actually
hurting on this point. Christophe and David, any opinions?

If I understand the issue correctly, I think I’d be fine with requiring ALTER TYPE ADD LABEL to be disallowed in a transaction that also CREATEs the type if it’s not currently possible to reliably tell when an enum was created in a transaction. Once you can do that, then by all means allow it!

My $2.

Best,

David

#24Andrew Dunstan
andrew.dunstan@2ndquadrant.com
In reply to: David E. Wheeler (#23)
Re: BUG #14825: enum type: unsafe use?

On 09/25/2017 01:34 PM, David E. Wheeler wrote:

On Sep 25, 2017, at 10:55, Andrew Dunstan <andrew.dunstan@2ndquadrant.com> wrote:

Let's ask a couple of users who I think are or have been actually
hurting on this point. Christophe and David, any opinions?

If I understand the issue correctly, I think I’d be fine with requiring ALTER TYPE ADD LABEL to be disallowed in a transaction that also CREATEs the type if it’s not currently possible to reliably tell when an enum was created in a transaction. Once you can do that, then by all means allow it!

OK, that seems to be the consensus. So let's apply the blacklist patch
and then separately remove the 'created in the same transaction' test.
We'll need to adjust the regression tests and docs accordingly.

cheers

andrew

--
Andrew Dunstan https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

#25Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#24)
Re: [BUGS] BUG #14825: enum type: unsafe use?

Andrew Dunstan <andrew.dunstan@2ndquadrant.com> writes:

OK, that seems to be the consensus. So let's apply the blacklist patch
and then separately remove the 'created in the same transaction' test.
We'll need to adjust the regression tests and docs accordingly.

Agreed. I'll work on that in a little bit.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#26Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#25)
Re: BUG #14825: enum type: unsafe use?

I wrote:

Andrew Dunstan <andrew.dunstan@2ndquadrant.com> writes:

OK, that seems to be the consensus. So let's apply the blacklist patch
and then separately remove the 'created in the same transaction' test.
We'll need to adjust the regression tests and docs accordingly.

Agreed. I'll work on that in a little bit.

Pushed; sorry for the delay.

I noticed that the blacklist mechanism effectively removed the prohibition
against using a renamed enum value later in the same transaction, so I
added a regression test for that. Also, as committed, I used RENAME TYPE
rather than ALTER OWNER in the test cases requiring an updated pg_type
row. That way we don't need to create a role, even a transient one, which
is a good thing in terms of not risking collisions with other sessions.

regards, tom lane

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

#27Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#26)
Re: BUG #14825: enum type: unsafe use?

I wrote:

Pushed; sorry for the delay.

... and the buildfarm's not too happy. It looks like force_parallel_mode
breaks all the regression test cases around unsafe enums; which on
reflection is unsurprising, because parallel workers will not have access
to the parent's blacklist hash, so they will think unsafe values are safe.

Now, as long as parallel workers are read-only, perhaps this matters
little; they would not be allowed to write unsafe values into tables
anyway. I'm concerned though about whether it might be possible for a
parallel worker to return an unsafe value to the parent (in OID form)
and then the parent writes it into a table. If we can convince ourselves
that's not possible, it might be okay to just turn off force_parallel_mode
for these test cases.

A safer answer would be to mark enum_in() and other callers of
check_safe_enum_use() as parallel-restricted. That'd require a
post-RC1 catversion bump, which seems pretty unpleasant, but
none of the other answers are nice either.

Transmitting the blacklist hash to workers would be a good long-term
answer, but I don't want to try to shoehorn it in for v10.

Another idea is that maybe the existence of a blacklist hash should
be enough to turn off parallel mode altogether ... but ugh.

Or maybe we're back to "revert the whole feature, go back to 9.6
behavior".

Thoughts?

regards, tom lane

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

#28Andrew Dunstan
andrew.dunstan@2ndquadrant.com
In reply to: Tom Lane (#27)
Re: BUG #14825: enum type: unsafe use?

On 09/26/2017 02:37 PM, Tom Lane wrote:

I wrote:

Pushed; sorry for the delay.

... and the buildfarm's not too happy. It looks like force_parallel_mode
breaks all the regression test cases around unsafe enums; which on
reflection is unsurprising, because parallel workers will not have access
to the parent's blacklist hash, so they will think unsafe values are safe.

Now, as long as parallel workers are read-only, perhaps this matters
little; they would not be allowed to write unsafe values into tables
anyway. I'm concerned though about whether it might be possible for a
parallel worker to return an unsafe value to the parent (in OID form)
and then the parent writes it into a table. If we can convince ourselves
that's not possible, it might be okay to just turn off force_parallel_mode
for these test cases.

A safer answer would be to mark enum_in() and other callers of
check_safe_enum_use() as parallel-restricted. That'd require a
post-RC1 catversion bump, which seems pretty unpleasant, but
none of the other answers are nice either.

Transmitting the blacklist hash to workers would be a good long-term
answer, but I don't want to try to shoehorn it in for v10.

Another idea is that maybe the existence of a blacklist hash should
be enough to turn off parallel mode altogether ... but ugh.

Or maybe we're back to "revert the whole feature, go back to 9.6
behavior".

Thoughts?

I think I would mark enum_in and friends as parallel-restricted. Yes I
know it would involve a cat version bump, so I'll understand if that's
not acceptable, but it seems to me the best of a bad bunch of choices.
Second choice might be turning off parallel mode if the hash exists, but
I'm unclear how that would work.

cheers

andrew

--

Andrew Dunstan https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

#29Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#28)
Re: [BUGS] BUG #14825: enum type: unsafe use?

Andrew Dunstan <andrew.dunstan@2ndquadrant.com> writes:

On 09/26/2017 02:37 PM, Tom Lane wrote:

... and the buildfarm's not too happy. It looks like force_parallel_mode
breaks all the regression test cases around unsafe enums; which on
reflection is unsurprising, because parallel workers will not have access
to the parent's blacklist hash, so they will think unsafe values are safe.

I think I would mark enum_in and friends as parallel-restricted. Yes I
know it would involve a cat version bump, so I'll understand if that's
not acceptable, but it seems to me the best of a bad bunch of choices.
Second choice might be turning off parallel mode if the hash exists, but
I'm unclear how that would work.

Meh. I'm starting to slide back to my original opinion that we should
revert back to 9.6 behavior. Even if a post-RC1 catversion bump is OK,
making these sorts of changes a week before GA is not comfort inducing.
I'm losing faith that we've thought through the issue thoroughly, and
there's no longer time to catch any remaining oversights through testing.

Any other votes out there?

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#30Bruce Momjian
bruce@momjian.us
In reply to: Tom Lane (#29)
Re: [BUGS] BUG #14825: enum type: unsafe use?

On Tue, Sep 26, 2017 at 04:07:02PM -0400, Tom Lane wrote:

Andrew Dunstan <andrew.dunstan@2ndquadrant.com> writes:

On 09/26/2017 02:37 PM, Tom Lane wrote:

... and the buildfarm's not too happy. It looks like force_parallel_mode
breaks all the regression test cases around unsafe enums; which on
reflection is unsurprising, because parallel workers will not have access
to the parent's blacklist hash, so they will think unsafe values are safe.

I think I would mark enum_in and friends as parallel-restricted. Yes I
know it would involve a cat version bump, so I'll understand if that's
not acceptable, but it seems to me the best of a bad bunch of choices.
Second choice might be turning off parallel mode if the hash exists, but
I'm unclear how that would work.

Meh. I'm starting to slide back to my original opinion that we should
revert back to 9.6 behavior. Even if a post-RC1 catversion bump is OK,
making these sorts of changes a week before GA is not comfort inducing.
I'm losing faith that we've thought through the issue thoroughly, and
there's no longer time to catch any remaining oversights through testing.

Any other votes out there?

Well, I was concerned yesterday that we had a broken build farm so close
to release. (I got consistent regression failures.) I think PG 11 would
be better for this feature change, so I support reverting this.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+                      Ancient Roman grave inscription +

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#31Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#30)
Re: [HACKERS] BUG #14825: enum type: unsafe use?

Bruce Momjian <bruce@momjian.us> writes:

On Tue, Sep 26, 2017 at 04:07:02PM -0400, Tom Lane wrote:

Any other votes out there?

Well, I was concerned yesterday that we had a broken build farm so close
to release. (I got consistent regression failures.) I think PG 11 would
be better for this feature change, so I support reverting this.

I'll take the blame for (most of) yesterday's failures in the v10
branch, but they were unrelated to this patch --- they were because
of that SIGBUS patch I messed up. So that doesn't seem like a very
applicable argument. Still, it's true that this seems like the most
consequential patch that's gone into v10 post-RC1, certainly so if
you discount stuff that was back-patched further than v10.

regards, tom lane

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

#32Stephen Frost
sfrost@snowman.net
In reply to: Tom Lane (#31)
Re: [BUGS] BUG #14825: enum type: unsafe use?

Tom, all,

* Tom Lane (tgl@sss.pgh.pa.us) wrote:

Bruce Momjian <bruce@momjian.us> writes:

On Tue, Sep 26, 2017 at 04:07:02PM -0400, Tom Lane wrote:

Any other votes out there?

Well, I was concerned yesterday that we had a broken build farm so close
to release. (I got consistent regression failures.) I think PG 11 would
be better for this feature change, so I support reverting this.

I'll take the blame for (most of) yesterday's failures in the v10
branch, but they were unrelated to this patch --- they were because
of that SIGBUS patch I messed up. So that doesn't seem like a very
applicable argument. Still, it's true that this seems like the most
consequential patch that's gone into v10 post-RC1, certainly so if
you discount stuff that was back-patched further than v10.

I've not been following along very closely- are we sure that ripping
this out won't be worse than dealing with it in-place? Will pulling it
out also require a post-RC1 catversion bump?

If we can pull it out without bumping catversion and with confidence
that it won't cause more problems then, as much as I hate it, I'm
inclined to say we pull it out and come back to it in v11. I really
don't like the idea of a post-rc1 catversion bump and it doesn't seem
like there's a good solution here that doesn't involve more changes and
most likely a catversion bump. If it was reasonably fixable with only
small/local changes and without a catversion bump then I'd be more
inclined to keep it, but I gather from the discussion that's not the
case.

Thanks!

Stephen

#33Bruce Momjian
bruce@momjian.us
In reply to: Tom Lane (#31)
Re: [HACKERS] BUG #14825: enum type: unsafe use?

On Tue, Sep 26, 2017 at 05:32:15PM -0400, Tom Lane wrote:

Bruce Momjian <bruce@momjian.us> writes:

On Tue, Sep 26, 2017 at 04:07:02PM -0400, Tom Lane wrote:

Any other votes out there?

Well, I was concerned yesterday that we had a broken build farm so close
to release. (I got consistent regression failures.) I think PG 11 would
be better for this feature change, so I support reverting this.

I'll take the blame for (most of) yesterday's failures in the v10
branch, but they were unrelated to this patch --- they were because
of that SIGBUS patch I messed up. So that doesn't seem like a very
applicable argument. Still, it's true that this seems like the most
consequential patch that's gone into v10 post-RC1, certainly so if
you discount stuff that was back-patched further than v10.

Oh, I couldn't untangle that the regression failures were unrelated to
enums, so please ignore my opinion.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+                      Ancient Roman grave inscription +

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

#34Andrew Dunstan
andrew.dunstan@2ndquadrant.com
In reply to: Stephen Frost (#32)
Re: [BUGS] BUG #14825: enum type: unsafe use?

On 09/26/2017 05:45 PM, Stephen Frost wrote:

I've not been following along very closely- are we sure that ripping
this out won't be worse than dealing with it in-place? Will pulling it
out also require a post-RC1 catversion bump?

It shouldn't do AFAIK - the function signatures weren't changed.

cheers

andrew

--
Andrew Dunstan https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#35Andrew Dunstan
andrew.dunstan@2ndquadrant.com
In reply to: Andrew Dunstan (#34)
Re: [HACKERS] BUG #14825: enum type: unsafe use?

On 09/26/2017 06:04 PM, Andrew Dunstan wrote:

On 09/26/2017 05:45 PM, Stephen Frost wrote:

I've not been following along very closely- are we sure that ripping
this out won't be worse than dealing with it in-place? Will pulling it
out also require a post-RC1 catversion bump?

It shouldn't do AFAIK - the function signatures weren't changed.

At this stage on reflection I agree it should be pulled :-(

I'm not happy about the idea of marking an input function as not
parallel safe, certainly not without a good deal of thought and
discussion that we don't have time for this cycle.

cheers

andrew

--
Andrew Dunstan https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

#36Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#35)
Re: [HACKERS] BUG #14825: enum type: unsafe use?

Andrew Dunstan <andrew.dunstan@2ndquadrant.com> writes:

I'm not happy about the idea of marking an input function as not
parallel safe, certainly not without a good deal of thought and
discussion that we don't have time for this cycle.

Yeah, that aspect of it was bothering me too: it's easy to say
"mark the function unsafe", but that only helps to the extent that
the function is used in queries where the planner has control of
whether to parallelize or not. There's an awful lot of hard-wired
calls to I/O functions in our code, and I would not want to promise
that none of those are reachable in a parallel worker.

As for Stephen's concern, I had already looked at reverting 15bc038f9
earlier, and concluded that none of that code had changed significantly
since then. There's some conflicts due to pgindent activity but I think
pulling it out will be a straightforward thing to do.

regards, tom lane

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

#37Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#35)
Re: [HACKERS] BUG #14825: enum type: unsafe use?

Andrew Dunstan <andrew.dunstan@2ndquadrant.com> writes:

At this stage on reflection I agree it should be pulled :-(

That seems to be the consensus, so I'll go make it happen.

I'm not happy about the idea of marking an input function as not
parallel safe, certainly not without a good deal of thought and
discussion that we don't have time for this cycle.

I think the way forward is to do what we had as of HEAD (984c92074),
but add the ability to transmit the blacklist table to parallel
workers. Since we expect the blacklist table would be empty most of
the time, this should be close to no overhead in practice. I concur
that the idea of marking the relevant functions parallel-restricted is
probably not as safe a fix as I originally thought, and it's not a
very desirable restriction even if it did fix the problem.

regards, tom lane

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

#38Andrew Dunstan
andrew.dunstan@2ndquadrant.com
In reply to: Tom Lane (#37)
Re: [HACKERS] BUG #14825: enum type: unsafe use?

On 09/27/2017 02:52 PM, Tom Lane wrote:

Andrew Dunstan <andrew.dunstan@2ndquadrant.com> writes:

At this stage on reflection I agree it should be pulled :-(

That seems to be the consensus, so I'll go make it happen.

I'm not happy about the idea of marking an input function as not
parallel safe, certainly not without a good deal of thought and
discussion that we don't have time for this cycle.

I think the way forward is to do what we had as of HEAD (984c92074),
but add the ability to transmit the blacklist table to parallel
workers. Since we expect the blacklist table would be empty most of
the time, this should be close to no overhead in practice. I concur
that the idea of marking the relevant functions parallel-restricted is
probably not as safe a fix as I originally thought, and it's not a
very desirable restriction even if it did fix the problem.

Do you have any suggestion as to how we should transmit the blacklist to
parallel workers?

cheers

andrew

--
Andrew Dunstan https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

#39Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#38)
Re: [HACKERS] BUG #14825: enum type: unsafe use?

Andrew Dunstan <andrew.dunstan@2ndquadrant.com> writes:

Do you have any suggestion as to how we should transmit the blacklist to
parallel workers?

Perhaps serialize the contents into an array in DSM, then rebuild a hash
table from that in the worker. Robert might have a better idea though.

regards, tom lane

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

#40Andres Freund
andres@anarazel.de
In reply to: Andrew Dunstan (#38)
Re: [BUGS] BUG #14825: enum type: unsafe use?

On 2017-10-03 19:53:41 -0400, Andrew Dunstan wrote:

On 09/27/2017 02:52 PM, Tom Lane wrote:

Andrew Dunstan <andrew.dunstan@2ndquadrant.com> writes:

At this stage on reflection I agree it should be pulled :-(

That seems to be the consensus, so I'll go make it happen.

I'm not happy about the idea of marking an input function as not
parallel safe, certainly not without a good deal of thought and
discussion that we don't have time for this cycle.

I think the way forward is to do what we had as of HEAD (984c92074),
but add the ability to transmit the blacklist table to parallel
workers. Since we expect the blacklist table would be empty most of
the time, this should be close to no overhead in practice. I concur
that the idea of marking the relevant functions parallel-restricted is
probably not as safe a fix as I originally thought, and it's not a
very desirable restriction even if it did fix the problem.

Do you have any suggestion as to how we should transmit the blacklist to
parallel workers?

How about storing them in the a dshash table instead of dynahash?
Similar to how we're now dealing with the shared typmod registry stuff?
It should be fairly simple to now simply add a new struct Session member
shared_enum_whatevs_table.

Greetings,

Andres Freund

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#41Robert Haas
robertmhaas@gmail.com
In reply to: Andres Freund (#40)
Re: [HACKERS] BUG #14825: enum type: unsafe use?

On Tue, Oct 3, 2017 at 9:38 PM, Andres Freund <andres@anarazel.de> wrote:

Do you have any suggestion as to how we should transmit the blacklist to
parallel workers?

How about storing them in the a dshash table instead of dynahash?
Similar to how we're now dealing with the shared typmod registry stuff?
It should be fairly simple to now simply add a new struct Session member
shared_enum_whatevs_table.

Yeah, that approach seems worth exploring.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

#42Thomas Munro
thomas.munro@enterprisedb.com
In reply to: Robert Haas (#41)
Re: [HACKERS] [BUGS] BUG #14825: enum type: unsafe use?

On Fri, Oct 6, 2017 at 2:45 AM, Robert Haas <robertmhaas@gmail.com> wrote:

On Tue, Oct 3, 2017 at 9:38 PM, Andres Freund <andres@anarazel.de> wrote:

Do you have any suggestion as to how we should transmit the blacklist to
parallel workers?

How about storing them in the a dshash table instead of dynahash?
Similar to how we're now dealing with the shared typmod registry stuff?
It should be fairly simple to now simply add a new struct Session member
shared_enum_whatevs_table.

Yeah, that approach seems worth exploring.

Andrew Dunstan asked me off-list which README covers that stuff. Erm,
there isn't one, so I was going to write some explanation here to see
if that could help... but after looking at this I'm not sure I agree
it's the right approach anyway.

The reason commit cc5f8136 introduced Session and
SharedRecordTypmodRegistry was that we have some state that is
session-scoped and writable by any worker. In contrast:

1. The enum OID blacklist has transaction scope. If you wanted to
put it into the Session you'd have to explicitly zap it at end of
transaction. Incidentally dshash has no fast reset, though that could
be added, but I'd probably want fast per-transaction cleanup to skip
retail destruction entirely and simply give back all the memory, just
like MemoryContext does. There are other transaction-scoped things
we'll eventually want to share, like ComboCIDs, so I think we'll need
something like that, but no one has been brave enough to propose the
machinery yet.

2. The enum OID blacklist is read-only for workers. They don't
create new blacklisted enums and don't see that they ever will.

So I agree with Tom's suggestion:

On Wed, Oct 4, 2017 at 2:29 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Perhaps serialize the contents into an array in DSM, then rebuild a hash
table from that in the worker. Robert might have a better idea though.

I'd happily volunteer to write or review a patch to do that. Is there
a rebase of the stuff that got reverted, to build on?

--
Thomas Munro
http://www.enterprisedb.com

#43Robert Haas
robertmhaas@gmail.com
In reply to: Thomas Munro (#42)
Re: [HACKERS] [BUGS] BUG #14825: enum type: unsafe use?

On Thu, Jan 11, 2018 at 6:01 PM, Thomas Munro
<thomas.munro@enterprisedb.com> wrote:

[ the data isn't session lifetime ]

So I agree with Tom's suggestion:

On Wed, Oct 4, 2017 at 2:29 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Perhaps serialize the contents into an array in DSM, then rebuild a hash
table from that in the worker. Robert might have a better idea though.

I'd happily volunteer to write or review a patch to do that. Is there
a rebase of the stuff that got reverted, to build on?

Those seem like reasons not to use Session, but not necessarily
reasons not to have the leader directly build the dshash that the
workers access rather than building a separate hash table in every
worker.

Maybe having every worker build a separate hash table is a good idea
for some reason, but it's not clear to me that you've stated such a
reason.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#44Thomas Munro
thomas.munro@enterprisedb.com
In reply to: Robert Haas (#43)
Re: [HACKERS] [BUGS] BUG #14825: enum type: unsafe use?

On Fri, Jan 12, 2018 at 4:19 PM, Robert Haas <robertmhaas@gmail.com> wrote:

On Thu, Jan 11, 2018 at 6:01 PM, Thomas Munro
<thomas.munro@enterprisedb.com> wrote:

[ the data isn't session lifetime ]

So I agree with Tom's suggestion:

On Wed, Oct 4, 2017 at 2:29 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Perhaps serialize the contents into an array in DSM, then rebuild a hash
table from that in the worker. Robert might have a better idea though.

I'd happily volunteer to write or review a patch to do that. Is there
a rebase of the stuff that got reverted, to build on?

Those seem like reasons not to use Session, but not necessarily
reasons not to have the leader directly build the dshash that the
workers access rather than building a separate hash table in every
worker.

Are you saying we should do the work now to create a per-transaction
DSM segment + DSA area + thing that every backend attaches to? I
guess that would be much like the per-session one, except that we'd
reset it and end of xact (a DSA facility we don't have yet but which
would amount to zapping all superblocks, freeing all but one segment
or something). Like the per-session one, I suppose it would be
created on demand when you run your first parallel query, and then
survive as long as the leader backend. I assumed that we'd do that
work when we really need it for writable parallel query, but that it'd
be overkill for this.

Note that the shared record thing still has the backend local cache in
front of the shared registry, to avoid acquiring locks, so doesn't
actually avoid creating a per-backend hash table, though its entries
point directly to TupleDesc objects in shm.

Maybe having every worker build a separate hash table is a good idea
for some reason, but it's not clear to me that you've stated such a
reason.

I didn't think creating backend local hash tables would be a problem
because it's a vanishingly rare occurrence for the hash table to be
created at all (ie when you've altered an enum), and if created, to
have more than a couple of entries in it. But here's another idea, at
the small end of the how-much-work spectrum:

1. Put the OIDs into a sorted array in the DSM segment.
2. Arrange for EnumBlacklisted() (assuming we're talking about the
infrastructure from commit 1635e80d that was later reverted) to get
its hands on a pointer to that + size and binary search it, instead of
looking in the hash table (enum_blacklist), when running in a worker.

--
Thomas Munro
http://www.enterprisedb.com

#45Robert Haas
robertmhaas@gmail.com
In reply to: Thomas Munro (#44)
Re: [HACKERS] [BUGS] BUG #14825: enum type: unsafe use?

On Thu, Jan 11, 2018 at 11:01 PM, Thomas Munro
<thomas.munro@enterprisedb.com> wrote:

Are you saying we should do the work now to create a per-transaction
DSM segment + DSA area + thing that every backend attaches to?

No, I was just thinking you could stuff it into the per-parallel-query
DSM/DSA. But...

I didn't think creating backend local hash tables would be a problem
because it's a vanishingly rare occurrence for the hash table to be
created at all (ie when you've altered an enum), and if created, to
have more than a couple of entries in it.

...this is also a fair point.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#46Andres Freund
andres@anarazel.de
In reply to: Robert Haas (#45)
Re: [HACKERS] [BUGS] BUG #14825: enum type: unsafe use?

On 2018-01-12 07:51:34 -0500, Robert Haas wrote:

On Thu, Jan 11, 2018 at 11:01 PM, Thomas Munro
<thomas.munro@enterprisedb.com> wrote:

Are you saying we should do the work now to create a per-transaction
DSM segment + DSA area + thing that every backend attaches to?

No, I was just thinking you could stuff it into the per-parallel-query
DSM/DSA. But...

I didn't think creating backend local hash tables would be a problem
because it's a vanishingly rare occurrence for the hash table to be
created at all (ie when you've altered an enum), and if created, to
have more than a couple of entries in it.

...this is also a fair point.

OTOH, it seems quite likely that we'll add more transaction-lifetime
shared data (e.g. combocid), so building per-xact infrastructure
actually seems like a good idea.

Greetings,

Andres Freund

#47Robert Haas
robertmhaas@gmail.com
In reply to: Andres Freund (#46)
Re: [HACKERS] [BUGS] BUG #14825: enum type: unsafe use?

On Fri, Jan 12, 2018 at 5:06 PM, Andres Freund <andres@anarazel.de> wrote:

OTOH, it seems quite likely that we'll add more transaction-lifetime
shared data (e.g. combocid), so building per-xact infrastructure
actually seems like a good idea.

Sure, but there's no urgency about it. Particularly if the first cut
at this is implemented using dshash, moving the dshash to a different
DSA with a different lifespan later should be easy.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#48Thomas Munro
thomas.munro@enterprisedb.com
In reply to: Thomas Munro (#42)
1 attachment(s)
Re: [HACKERS] [BUGS] BUG #14825: enum type: unsafe use?

On Thu, Jan 11, 2018 at 3:01 PM Thomas Munro
<thomas.munro@enterprisedb.com> wrote:

So I agree with Tom's suggestion:

On Wed, Oct 4, 2017 at 2:29 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Perhaps serialize the contents into an array in DSM, then rebuild a hash
table from that in the worker. Robert might have a better idea though.

I'd happily volunteer to write or review a patch to do that. Is there
a rebase of the stuff that got reverted, to build on?

Here is a draft patch showing the approach discussed for transmitting
enum_blacklist in parallel workers. This should be useful for
reviving the code reverted by 93a1af0b.

--
Thomas Munro
http://www.enterprisedb.com

Attachments:

serialize-enum-blacklist.patchapplication/x-patch; name=serialize-enum-blacklist.patchDownload
diff --git a/src/backend/access/transam/parallel.c b/src/backend/access/transam/parallel.c
index c168118467..c95caee017 100644
--- a/src/backend/access/transam/parallel.c
+++ b/src/backend/access/transam/parallel.c
@@ -19,6 +19,7 @@
 #include "access/session.h"
 #include "access/xact.h"
 #include "access/xlog.h"
+#include "catalog/pg_enum.h"
 #include "catalog/index.h"
 #include "catalog/namespace.h"
 #include "commands/async.h"
@@ -71,6 +72,7 @@
 #define PARALLEL_KEY_SESSION_DSM			UINT64CONST(0xFFFFFFFFFFFF000A)
 #define PARALLEL_KEY_REINDEX_STATE			UINT64CONST(0xFFFFFFFFFFFF000B)
 #define PARALLEL_KEY_RELMAPPER_STATE		UINT64CONST(0xFFFFFFFFFFFF000C)
+#define PARALLEL_KEY_ENUMBLACKLIST			UINT64CONST(0xFFFFFFFFFFFF000D)
 
 /* Fixed-size parallel state. */
 typedef struct FixedParallelState
@@ -208,6 +210,7 @@ InitializeParallelDSM(ParallelContext *pcxt)
 	Size		tstatelen = 0;
 	Size		reindexlen = 0;
 	Size		relmapperlen = 0;
+	Size		enumblacklistlen = 0;
 	Size		segsize = 0;
 	int			i;
 	FixedParallelState *fps;
@@ -261,8 +264,10 @@ InitializeParallelDSM(ParallelContext *pcxt)
 		shm_toc_estimate_chunk(&pcxt->estimator, reindexlen);
 		relmapperlen = EstimateRelationMapSpace();
 		shm_toc_estimate_chunk(&pcxt->estimator, relmapperlen);
+		enumblacklistlen = EstimateEnumBlacklistSpace();
+		shm_toc_estimate_chunk(&pcxt->estimator, enumblacklistlen);
 		/* If you add more chunks here, you probably need to add keys. */
-		shm_toc_estimate_keys(&pcxt->estimator, 9);
+		shm_toc_estimate_keys(&pcxt->estimator, 10);
 
 		/* Estimate space need for error queues. */
 		StaticAssertStmt(BUFFERALIGN(PARALLEL_ERROR_QUEUE_SIZE) ==
@@ -336,6 +341,7 @@ InitializeParallelDSM(ParallelContext *pcxt)
 		char	   *error_queue_space;
 		char	   *session_dsm_handle_space;
 		char	   *entrypointstate;
+		char	   *enumblacklistspace;
 		Size		lnamelen;
 
 		/* Serialize shared libraries we have loaded. */
@@ -385,6 +391,12 @@ InitializeParallelDSM(ParallelContext *pcxt)
 		shm_toc_insert(pcxt->toc, PARALLEL_KEY_RELMAPPER_STATE,
 					   relmapperspace);
 
+		/* Serialize enum blacklist state. */
+		enumblacklistspace = shm_toc_allocate(pcxt->toc, enumblacklistlen);
+		SerializeEnumBlacklist(enumblacklistspace);
+		shm_toc_insert(pcxt->toc, PARALLEL_KEY_ENUMBLACKLIST,
+					   enumblacklistspace);
+
 		/* Allocate space for worker information. */
 		pcxt->worker = palloc0(sizeof(ParallelWorkerInfo) * pcxt->nworkers);
 
@@ -1218,6 +1230,7 @@ ParallelWorkerMain(Datum main_arg)
 	char	   *tstatespace;
 	char	   *reindexspace;
 	char	   *relmapperspace;
+	char	   *enumblacklistspace;
 	StringInfoData msgbuf;
 	char	   *session_dsm_handle_space;
 
@@ -1397,6 +1410,11 @@ ParallelWorkerMain(Datum main_arg)
 	relmapperspace = shm_toc_lookup(toc, PARALLEL_KEY_RELMAPPER_STATE, false);
 	RestoreRelationMap(relmapperspace);
 
+	/* Restore enum blacklist. */
+	enumblacklistspace = shm_toc_lookup(toc, PARALLEL_KEY_ENUMBLACKLIST,
+										false);
+	RestoreEnumBlacklist(enumblacklistspace);
+
 	/*
 	 * We've initialized all of our state now; nothing should change
 	 * hereafter.
diff --git a/src/backend/catalog/pg_enum.c b/src/backend/catalog/pg_enum.c
index fde361f367..87a12996c7 100644
--- a/src/backend/catalog/pg_enum.c
+++ b/src/backend/catalog/pg_enum.c
@@ -28,6 +28,7 @@
 #include "utils/builtins.h"
 #include "utils/catcache.h"
 #include "utils/fmgroids.h"
+#include "utils/memutils.h"
 #include "utils/syscache.h"
 #include "utils/tqual.h"
 
@@ -38,6 +39,7 @@ Oid			binary_upgrade_next_pg_enum_oid = InvalidOid;
 static void RenumberEnumType(Relation pg_enum, HeapTuple *existing, int nelems);
 static int	sort_order_cmp(const void *p1, const void *p2);
 
+static HTAB *enum_blacklist = NULL;
 
 /*
  * EnumValuesCreate
@@ -620,3 +622,56 @@ sort_order_cmp(const void *p1, const void *p2)
 	else
 		return 0;
 }
+
+Size
+EstimateEnumBlacklistSpace(void)
+{
+	if (!enum_blacklist)
+		return sizeof(Oid);
+	return sizeof(Oid) + sizeof(Oid) * hash_get_num_entries(enum_blacklist);
+}
+
+void
+SerializeEnumBlacklist(void *space)
+{
+	Oid *serialized = (Oid *) space;
+	HASH_SEQ_STATUS status;
+	Oid *value;
+
+	if (!enum_blacklist)
+	{
+		*serialized = 0;
+		return;
+	}
+
+	*serialized = hash_get_num_entries(enum_blacklist);
+	hash_seq_init(&status, enum_blacklist);
+	while ((value = (Oid *) hash_seq_search(&status)))
+		*serialized++ = *value;
+}
+
+void
+RestoreEnumBlacklist(void *space)
+{
+	Oid *serialized = (Oid *) space;
+	HASHCTL     hash_ctl;
+	int			num_elements;
+
+	Assert(!enum_blacklist);
+
+	num_elements = *serialized++;
+	if (num_elements == 0)
+		return;
+
+	memset(&hash_ctl, 0, sizeof(hash_ctl));
+	hash_ctl.keysize = sizeof(Oid);
+	hash_ctl.entrysize = sizeof(Oid);
+	hash_ctl.hcxt = TopTransactionContext;
+	enum_blacklist = hash_create("Enum value blacklist",
+								 32,
+								 &hash_ctl,
+								 HASH_ELEM | HASH_BLOBS | HASH_CONTEXT);
+
+	while (num_elements-- > 0)
+		hash_search(enum_blacklist, serialized++, HASH_ENTER, NULL);
+}
diff --git a/src/include/catalog/pg_enum.h b/src/include/catalog/pg_enum.h
index e6fd41e5b2..f5a71921a3 100644
--- a/src/include/catalog/pg_enum.h
+++ b/src/include/catalog/pg_enum.h
@@ -52,5 +52,8 @@ extern void AddEnumLabel(Oid enumTypeOid, const char *newVal,
 			 bool skipIfExists);
 extern void RenameEnumLabel(Oid enumTypeOid,
 				const char *oldVal, const char *newVal);
+extern Size EstimateEnumBlacklistSpace(void);
+extern void SerializeEnumBlacklist(void *space);
+extern void RestoreEnumBlacklist(void *space);
 
 #endif							/* PG_ENUM_H */