XX000: enum value 117721 not found in cache for enum enumcrash

Started by Andres Freundalmost 14 years ago8 messageshackers
Jump to latest
#1Andres Freund
andres@anarazel.de

Hi,

Currently its possible to cause transactions to fail with ALTER ENUM ADD
AFTER/BEFORE:

psql 1:

CREATE TYPE enumcrash AS ENUM('a', 'b');
CREATE FUNCTION randenum() RETURNS enumcrash LANGUAGE sql AS $$SELECT * FROM
unnest(enum_range('a'::enumcrash)) ORDER BY random() LIMIT 1$$;
CREATE TABLE enumcrash_table(id serial primary key, val enumcrash);
CREATE INDEX enumcrash_table__val__id ON enumcrash_table (val, id);
INSERT INTO enumcrash_table (val) SELECT randenum() FROM generate_series(1,
10000);INSERT 0 10000

psql 2:
BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ;
INSERT INTO enumcrash_table (val) SELECT randenum() FROM generate_series(1,
10000);

psql 1:
ALTER TYPE enumcrash ADD VALUE 'a_1' AFTER 'a';
INSERT INTO enumcrash_table (val) SELECT randenum() FROM generate_series(1,
10000);

psql 2:
INSERT INTO enumcrash_table (val) SELECT randenum() FROM generate_series(1,
10000);
ERROR: XX000: enum value 117745 not found in cache for enum enumcrash
LOCATION: compare_values_of_enum, typcache.c:957

This is not surprising. psql 2's backend finds rows in the index with enum
values that are not visible in its mvcc snapshot. That mvcc snapshot is needed
because a_1 is an enum value with an uneven numbered oid because its inserted
after the initial creation.

Do we consider that something that needs to be fixed or just something to
document? I can't think of a non-intrusive fix right now.

Greetings,

Andres
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#2Robert Haas
robertmhaas@gmail.com
In reply to: Andres Freund (#1)
Re: XX000: enum value 117721 not found in cache for enum enumcrash

On Sat, Jun 30, 2012 at 5:51 AM, Andres Freund <andres@2ndquadrant.com> wrote:

Currently its possible to cause transactions to fail with ALTER ENUM ADD
AFTER/BEFORE:

psql 1:

CREATE TYPE enumcrash AS ENUM('a', 'b');
CREATE FUNCTION randenum() RETURNS enumcrash LANGUAGE sql AS $$SELECT * FROM
unnest(enum_range('a'::enumcrash)) ORDER BY random() LIMIT 1$$;
CREATE TABLE enumcrash_table(id serial primary key, val enumcrash);
CREATE INDEX enumcrash_table__val__id ON enumcrash_table (val, id);
INSERT INTO enumcrash_table (val) SELECT randenum() FROM generate_series(1,
10000);INSERT 0 10000

psql 2:
BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ;
INSERT INTO enumcrash_table (val) SELECT randenum() FROM generate_series(1,
10000);

psql 1:
ALTER TYPE enumcrash ADD VALUE 'a_1' AFTER 'a';
INSERT INTO enumcrash_table (val) SELECT randenum() FROM generate_series(1,
10000);

psql 2:
INSERT INTO enumcrash_table (val) SELECT randenum() FROM generate_series(1,
10000);
ERROR: XX000: enum value 117745 not found in cache for enum enumcrash
LOCATION: compare_values_of_enum, typcache.c:957

This is not surprising. psql 2's backend finds rows in the index with enum
values that are not visible in its mvcc snapshot. That mvcc snapshot is needed
because a_1 is an enum value with an uneven numbered oid because its inserted
after the initial creation.

I think the problem is that load_enum_cache_data() uses
GetTransactionSnapshot() rather than GetLatestSnapshot().

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#2)
Re: XX000: enum value 117721 not found in cache for enum enumcrash

Robert Haas <robertmhaas@gmail.com> writes:

On Sat, Jun 30, 2012 at 5:51 AM, Andres Freund <andres@2ndquadrant.com> wrote:

This is not surprising. psql 2's backend finds rows in the index with enum
values that are not visible in its mvcc snapshot.

I think the problem is that load_enum_cache_data() uses
GetTransactionSnapshot() rather than GetLatestSnapshot().

That would only make the race condition window smaller (ie, hard
to reproduce manually like this, but not gone).

regards, tom lane

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#3)
Re: XX000: enum value 117721 not found in cache for enum enumcrash

I wrote:

Robert Haas <robertmhaas@gmail.com> writes:

I think the problem is that load_enum_cache_data() uses
GetTransactionSnapshot() rather than GetLatestSnapshot().

That would only make the race condition window smaller (ie, hard
to reproduce manually like this, but not gone).

No, wait, we made ALTER TYPE ADD VALUE PreventTransactionChain so that
uncommitted enum OIDs could never get into tables or indexes. So I
think you're right, forcing a new snapshot to be used would fix this.

However, I'm a bit worried by the "if (!FirstSnapshotSet)" restriction
in GetLatestSnapshot. Are we sure that enum comparisons could never
happen without a snapshot already being set? What's the point of
throwing an error there anyway, as opposed to letting it redirect to
GetTransactionSnapshot?

regards, tom lane

#5Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#4)
Re: XX000: enum value 117721 not found in cache for enum enumcrash

On Jul 1, 2012, at 4:18 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

However, I'm a bit worried by the "if (!FirstSnapshotSet)" restriction
in GetLatestSnapshot. Are we sure that enum comparisons could never
happen without a snapshot already being set? What's the point of
throwing an error there anyway, as opposed to letting it redirect to
GetTransactionSnapshot?

I don't know whether it should set the transaction snapshot or just r

#6Robert Haas
robertmhaas@gmail.com
In reply to: Robert Haas (#5)
Re: XX000: enum value 117721 not found in cache for enum enumcrash

On Jul 2, 2012, at 12:04 AM, Robert Haas <robertmhaas@gmail.com> wrote:

On Jul 1, 2012, at 4:18 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

However, I'm a bit worried by the "if (!FirstSnapshotSet)" restriction
in GetLatestSnapshot. Are we sure that enum comparisons could never
happen without a snapshot already being set? What's the point of
throwing an error there anyway, as opposed to letting it redirect to
GetTransactionSnapshot?

I don't know whether it should set the transaction snapshot or just r

Argh, sorry.

...or just return a current snapshot, and I also don't know whether it needs to be changed because of this; but I agree with changing it. Erroring out always seemed kind of pointless to me...

...Robert

#7Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#6)
Re: XX000: enum value 117721 not found in cache for enum enumcrash

Robert Haas <robertmhaas@gmail.com> writes:

On Jul 2, 2012, at 12:04 AM, Robert Haas <robertmhaas@gmail.com> wrote:

On Jul 1, 2012, at 4:18 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

However, I'm a bit worried by the "if (!FirstSnapshotSet)" restriction
in GetLatestSnapshot.

I don't know whether it should set the transaction snapshot or just r

Argh, sorry.
...or just return a current snapshot, and I also don't know whether it needs to be changed because of this; but I agree with changing it. Erroring out always seemed kind of pointless to me...

I think it was coded like that because the sole original use was in
ri_triggers.c, in which it would have been a red flag if no transaction
snapshot already existed. However, the restriction clearly doesn't fit
with GetLatestSnapshot's API spec, so it seems to me to be sensible
to change it (as opposed to, say, inventing a new snapshot function
with a subtly different API spec).

As for creating an MVCC snapshot without causing a transaction snapshot
to be established, no thanks --- that would create a path of control
that exists nowhere today and has gotten no testing at all. I suspect
that it might actively break some assumptions in snapshot management.

regards, tom lane

#8Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Tom Lane (#7)
Re: XX000: enum value 117721 not found in cache for enum enumcrash

Excerpts from Tom Lane's message of lun jul 02 00:24:06 -0400 2012:

Robert Haas <robertmhaas@gmail.com> writes:

On Jul 2, 2012, at 12:04 AM, Robert Haas <robertmhaas@gmail.com> wrote:

On Jul 1, 2012, at 4:18 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

However, I'm a bit worried by the "if (!FirstSnapshotSet)" restriction
in GetLatestSnapshot.

I don't know whether it should set the transaction snapshot or just r

Argh, sorry.
...or just return a current snapshot, and I also don't know whether it needs to be changed because of this; but I agree with changing it. Erroring out always seemed kind of pointless to me...

I think it was coded like that because the sole original use was in
ri_triggers.c, in which it would have been a red flag if no transaction
snapshot already existed.

Yeah, that's correct. I guess I could have made the check an Assert()
instead of elog().

However, the restriction clearly doesn't fit
with GetLatestSnapshot's API spec, so it seems to me to be sensible
to change it (as opposed to, say, inventing a new snapshot function
with a subtly different API spec).

Sounds sensible.

As for creating an MVCC snapshot without causing a transaction snapshot
to be established, no thanks --- that would create a path of control
that exists nowhere today and has gotten no testing at all. I suspect
that it might actively break some assumptions in snapshot management.

I think it should work fine as long as the snapshot is registered, but
as you say it is untested. Maybe it'd have interactions with the way
snapshots connect with resource owners.

--
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support