ERROR: cache lookup failed for collation 0 on DELETE query after upgrading from 9.X to 12.3

Started by cenover 5 years ago5 messagesbugs
Jump to latest
#1cen
cen.is.imba@gmail.com

Hi

we are upgrading to 12.3 from 9.X and encountered an error with delete
statements.

2020-07-14 15:26:20.728 CEST [67736] ERROR:  cache lookup failed for
collation 0
2020-07-14 15:26:20.728 CEST [67736] STATEMENT:  delete from my-table
where id='004b8d45-d205-4036-9e70-c8340a015674'

ID column is of type UUID. Additionally, we have an implicit cast
function from varchar to UUID due to a JPA implementation which is
having problems handling of UUID types.

CREATE FUNCTION varchar_to_uuid(VARCHAR) RETURNS uuid AS $$
SELECT uuid_in($1::cstring);
$$ LANGUAGE sql immutable;
CREATE CAST (VARCHAR AS UUID) WITH FUNCTION varchar_to_uuid(VARCHAR) AS IMPLICIT;

I suspected the problem would be connected to this cast but even after dropping the cast and the function I get the same error so it probably has nothing to do with it.
Other than that there is nothing particularly interesting with this table or the setup.
We encountered the same problem on 12.3 debian docker aswell as MacOS brew package which I use locally for development.

Best regards, cen

#2cen
cen.is.imba@gmail.com
In reply to: cen (#1)
Re: ERROR: cache lookup failed for collation 0 on DELETE query after upgrading from 9.X to 12.3

Additional log line with verbose error logging:

2020-07-14 15:52:46.195 CEST [69575] LOCATION:
get_collation_isdeterministic, lsyscache.c:950

Show quoted text

On 14. 07. 20 15:41, cen wrote:

Hi

we are upgrading to 12.3 from 9.X and encountered an error with delete
statements.

2020-07-14 15:26:20.728 CEST [67736] ERROR:  cache lookup failed for
collation 0
2020-07-14 15:26:20.728 CEST [67736] STATEMENT:  delete from my-table
where id='004b8d45-d205-4036-9e70-c8340a015674'

ID column is of type UUID. Additionally, we have an implicit cast
function from varchar to UUID due to a JPA implementation which is
having problems handling of UUID types.

CREATE FUNCTION varchar_to_uuid(VARCHAR) RETURNS uuid AS $$
SELECT uuid_in($1::cstring);
$$ LANGUAGE sql immutable;
CREATE CAST (VARCHAR AS UUID) WITH FUNCTION varchar_to_uuid(VARCHAR) AS IMPLICIT;

I suspected the problem would be connected to this cast but even after dropping the cast and the function I get the same error so it probably has nothing to do with it.
Other than that there is nothing particularly interesting with this table or the setup.
We encountered the same problem on 12.3 debian docker aswell as MacOS brew package which I use locally for development.

Best regards, cen

#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: cen (#1)
Re: ERROR: cache lookup failed for collation 0 on DELETE query after upgrading from 9.X to 12.3

cen <cen.is.imba@gmail.com> writes:

we are upgrading to 12.3 from 9.X and encountered an error with delete
statements.
2020-07-14 15:26:20.728 CEST [67736] ERROR: cache lookup failed for
collation 0

Please supply a self-contained example. (Mostly, I'm not interested
in trying to guess at your table schema.)

regards, tom lane

#4cen
cen.is.imba@gmail.com
In reply to: Tom Lane (#3)
Re: ERROR: cache lookup failed for collation 0 on DELETE query after upgrading from 9.X to 12.3

Found a repro after some trial and error.

The bug appears when you specify a foreign key of type varchar to point
to primary key of type uuid. This is obviously a developer error for
specifying the wrong type

but somehow this used to work in 9.X but fails with internal error on 12.3.

CREATE TABLE public.revisions
(
    id uuid NOT NULL,
    revisions_previous_id character varying COLLATE
pg_catalog."default", --oops, should have used uuid here
    revisions_next_id character varying COLLATE pg_catalog."default",
--same here..
    customer_notice character varying COLLATE pg_catalog."default",
    CONSTRAINT pk_revisions PRIMARY KEY (id),
    CONSTRAINT fk_revisions_next FOREIGN KEY (revisions_next_id)
        REFERENCES public.revisions (id) MATCH SIMPLE
        ON UPDATE CASCADE
        ON DELETE CASCADE,
    CONSTRAINT fk_revisions_previous FOREIGN KEY (revisions_previous_id)
        REFERENCES public.revisions (id) MATCH SIMPLE
        ON UPDATE CASCADE
        ON DELETE CASCADE
)

TABLESPACE pg_default;

CREATE INDEX idx_fk_revisions_cart_revisions_next_id
    ON public.revisions USING btree
    (revisions_next_id COLLATE pg_catalog."default" ASC NULLS LAST) 
--no collate needed here..
    TABLESPACE pg_default;

CREATE INDEX idx_fk_revisions_cart_revisions_previous_id
    ON public.revisions USING btree
    (revisions_previous_id COLLATE pg_catalog."default" ASC NULLS LAST)
--and here also..
    TABLESPACE pg_default;

INSERT INTO public.revisions(
    id, revisions_previous_id, revisions_next_id, customer_notice)
    VALUES ('5c617ce7-688d-4bea-9d66-c0f0ebc635da', null, null, 'hi');

delete from revisions where id='5c617ce7-688d-4bea-9d66-c0f0ebc635da' --
produces error

I will probably be able to fix our db simply by changing the fk columns
to uuid and redefine the indexes.

I am leaving it to the postgres team to evaluate this bug further
whether it works as expected or whether it is a regression and should be
fixed.

Best regards, cen

Show quoted text

On 14. 07. 20 16:11, Tom Lane wrote:

cen <cen.is.imba@gmail.com> writes:

we are upgrading to 12.3 from 9.X and encountered an error with delete
statements.
2020-07-14 15:26:20.728 CEST [67736] ERROR: cache lookup failed for
collation 0

Please supply a self-contained example. (Mostly, I'm not interested
in trying to guess at your table schema.)

regards, tom lane

#5Tom Lane
tgl@sss.pgh.pa.us
In reply to: cen (#4)
Re: ERROR: cache lookup failed for collation 0 on DELETE query after upgrading from 9.X to 12.3

cen <cen.is.imba@gmail.com> writes:

The bug appears when you specify a foreign key of type varchar to point
to primary key of type uuid.

Ah-hah. For the record, attached is an actually-self-contained test
case.

The problem comes from this bit in RI_FKey_cascade_del, which evidently
was added as part of the nondeterministic-collations patch (5e1963fb7):

if (pk_coll != fk_coll && !get_collation_isdeterministic(pk_coll))
ri_GenerateQualCollation(&querybuf, pk_coll);

which is failing to guard against the possibility that pk_coll is
InvalidOid, i.e. the referenced column is non-collatable.
ri_GenerateQualCollation would correctly do nothing in that case,
but we don't get there because get_collation_isdeterministic has
already fallen over.

While the fix seems relatively straightforward --- probably we just
need to add an OidIsValid(pk_coll) clause here and in the similar tests
elsewhere in ri_triggers.c --- I'm still going to hold Peter's feet
to the fire about this code, because I consider the state of its
documentation to be absolutely unforgivable. There is NO comment
explaining why it'd be appropriate to do this for a nondeterministic
PK collation (and not otherwise). Nor has anything been done to fix the
multiple ways in which this addition falsified ri_GenerateQualCollation's
header comment.

regards, tom lane

Attachments:

ri-collation-bug-example.sqltext/plain; charset=us-ascii; name=ri-collation-bug-example.sqlDownload