CHECK that involves a function call behaves differently during bulk load?

Started by Kirk Parkerabout 1 year ago5 messagesbugs
Jump to latest
#1Kirk Parker
khp@equatoria.us

I have run into a problem when using dump/restore to move an existing
system to the current version 17. The restore is failing a column
constraint that works perfectly on the existing system, and also on the new
system once the data is loaded, but fails during the "restore" loading via
psql. I first did the dump in COPY from STDIN mode, then when I
encountered the issue switched to --inserts with no change.

First, here is the function defined for use in column constraints, defined
thus:

-- function is implicitly 'CALLED ON NULL INPUT'
-- since a NULL return value won't work as this function is applied

CREATE FUNCTION public.valid_upc_ean(target character varying) RETURNS
boolean
LANGUAGE plpgsql
AS $$
DECLARE
len integer;
holder varchar;

begin
if target is null then
return true;
end if;
len = length(trim(target));
if len = 12 then
holder := target::upc;
return true;
elsif len = 13 then
holder := target::ean13;
return true;
elsif len = 0 then
return true;
else
return false;
end if;
exception when others then
return false;
end;
$$;

It's used like this:

CREATE TABLE public.inv_variant
(
id integer NOT NULL,
prod_id integer NOT NULL,
sku character varying(126) NOT NULL,
upc character varying(14),
...
CONSTRAINT var_upc_check CHECK (public.valid_upc_ean(upc))
);

As mentioned about, the function fires and works perfectly once the data is
loaded, but the bulk load gives this error:

ERROR: new row for relation "inv_variant" violates check constraint
"var_upc_check"
DETAIL: Failing row contains (2264, 2170, BOX-1, 012345678912, ...

Every row with NULL in this column inserts successfully, every one with
data fails in this way. The data itself is valid -- I can copy the text
starting with the '(' from the DETAIL: Failing row... message, paste it
after "insert into inv_variant", and it executes perfectly--this in the
very same psql where that exact set of insert data is failing on the
redirected dump-file input.

I am quite at a loss about how to even go about troubleshooting this.
Since valid_upc_ean() is a pure function with no side effects or references
to anything other than the table in question, and since it DOES get called
on NULL input... is there a problem with what I'm expecting the tools to
produce as far as dump/restore between versions, or is this exposing an
actual problem or bug somewhere? Note the entire database does have quite
a few other CHECK constraints, but the others all involve only calculations
on the row data, with no calls to user-defined functions.

Thanks!

--Kirk

#2David G. Johnston
david.g.johnston@gmail.com
In reply to: Kirk Parker (#1)
Re: CHECK that involves a function call behaves differently during bulk load?

On Thursday, March 13, 2025, Kirk Parker <khp@equatoria.us> wrote:

CREATE FUNCTION public.valid_upc_ean(target character varying) RETURNS
boolean
LANGUAGE plpgsql
AS $$
DECLARE
len integer;
holder varchar;

begin
if target is null then
return true;
end if;
len = length(trim(target));
if len = 12 then
holder := target::upc;
return true;
elsif len = 13 then
holder := target::ean13;
return true;
elsif len = 0 then
return true;
else
return false;
end if;
exception when others then
return false;
end;
$$;

It's used like this:

CREATE TABLE public.inv_variant
(
id integer NOT NULL,
prod_id integer NOT NULL,
sku character varying(126) NOT NULL,
upc character varying(14),
...
CONSTRAINT var_upc_check CHECK (public.valid_upc_ean(upc))
);

As mentioned about, the function fires and works perfectly once the data
is loaded, but the bulk load gives this error:

ERROR: new row for relation "inv_variant" violates check constraint
"var_upc_check"
DETAIL: Failing row contains (2264, 2170, BOX-1, 012345678912, ...

Every row with NULL in this column inserts successfully, every one with
data fails in this way. The data itself is valid -- I can copy the text
starting with the '(' from the DETAIL: Failing row... message, paste it
after "insert into inv_variant", and it executes perfectly--this in the
very same psql where that exact set of insert data is failing on the
redirected dump-file input.

I am quite at a loss about how to even go about troubleshooting this.

Not discarding useful error messages as you do in the exception block is a
good start.

Since valid_upc_ean() is a pure function with no side effects or
references to anything other than the table in question,

And two non-core data types that lack schema qualifications.

David J.

#3Kirk Parker
khp@equatoria.us
In reply to: David G. Johnston (#2)
Re: CHECK that involves a function call behaves differently during bulk load?

On Thu, Mar 13, 2025 at 2:05 PM David G. Johnston <
david.g.johnston@gmail.com> wrote:

On Thursday, March 13, 2025, Kirk Parker <khp@equatoria.us> wrote:

CREATE FUNCTION public.valid_upc_ean(target character varying)
RETURNS boolean
LANGUAGE plpgsql
AS $$
DECLARE
len integer;
holder varchar;

begin
if target is null then
return true;
end if;
len = length(trim(target));
if len = 12 then
holder := target::upc;
return true;
elsif len = 13 then
holder := target::ean13;
return true;
elsif len = 0 then
return true;
else
return false;
end if;
exception when others then
return false;
end;
$$;

It's used like this:

CREATE TABLE public.inv_variant
(
id integer NOT NULL,
prod_id integer NOT NULL,
sku character varying(126) NOT NULL,
upc character varying(14),
...
CONSTRAINT var_upc_check CHECK (public.valid_upc_ean(upc))
);

As mentioned about, the function fires and works perfectly once the data
is loaded, but the bulk load gives this error:

ERROR: new row for relation "inv_variant" violates check constraint
"var_upc_check"
DETAIL: Failing row contains (2264, 2170, BOX-1, 012345678912, ...

Every row with NULL in this column inserts successfully, every one with
data fails in this way. The data itself is valid -- I can copy the text
starting with the '(' from the DETAIL: Failing row... message, paste it
after "insert into inv_variant", and it executes perfectly--this in the
very same psql where that exact set of insert data is failing on the
redirected dump-file input.

I am quite at a loss about how to even go about troubleshooting this.

Not discarding useful error messages as you do in the exception block is a
good start.

Since valid_upc_ean() is a pure function with no side effects or
references to anything other than the table in question,

And two non-core data types that lack schema qualifications.

David J.

Schema-qualifying the casts to non-core types did the trick, thanks!

But it does raise two questions:

(1) why does it work with individual inserts sent via psql, but not with
the redirected input from the dump?
(2) everything is in the public schema so why wouldn't it find the type
anyway?

Finally, is this maybe something to take up on the documentation list?
There are quite a few example code snippets in the PL/pgSQL chapter, and
not a single instance of a schema-qualifying name among them.

#4David G. Johnston
david.g.johnston@gmail.com
In reply to: Kirk Parker (#3)
Re: CHECK that involves a function call behaves differently during bulk load?

On Thu, Mar 13, 2025 at 3:33 PM Kirk Parker <khp@equatoria.us> wrote:

(1) why does it work with individual inserts sent via psql, but not with
the redirected input from the dump?
(2) everything is in the public schema so why wouldn't it find the type
anyway?

https://www.postgresql.org/docs/current/ddl-schemas.html#DDL-SCHEMAS-PATTERNS

System maintenance and actions use a secure search_path.

David J.

#5Tom Lane
tgl@sss.pgh.pa.us
In reply to: David G. Johnston (#4)
Re: CHECK that involves a function call behaves differently during bulk load?

"David G. Johnston" <david.g.johnston@gmail.com> writes:

On Thu, Mar 13, 2025 at 3:33 PM Kirk Parker <khp@equatoria.us> wrote:

(1) why does it work with individual inserts sent via psql, but not with
the redirected input from the dump?

https://www.postgresql.org/docs/current/ddl-schemas.html#DDL-SCHEMAS-PATTERNS
System maintenance and actions use a secure search_path.

Even more to the point, see the first compatibility entry at [1]https://www.postgresql.org/docs/17/release-17.html#RELEASE-17-MIGRATION:

* Change functions to use a safe search_path during maintenance
operations (Jeff Davis)

This prevents maintenance operations (ANALYZE, CLUSTER, CREATE
INDEX, CREATE MATERIALIZED VIEW, REFRESH MATERIALIZED VIEW,
REINDEX, or VACUUM) from performing unsafe access. Functions
used by expression indexes and materialized views that need to
reference non-default schemas must specify a search path during
function creation.

regards, tom lane

[1]: https://www.postgresql.org/docs/17/release-17.html#RELEASE-17-MIGRATION