MERGE and parsing with prepared statements
We've used INSERT ON CONFLICT for a few years (with partitions as the target).
That's also combined with prepared statements, for bulk loading.
I was looking to see if we should use MERGE (probably not, but looking anyway).
And came across this behavior. I'm not sure if it's any issue.
CREATE TABLE CustomerAccount (CustomerId int, Balance float);
PREPARE p AS
MERGE INTO CustomerAccount CA
USING (SELECT $1 AS CustomerId, $2 AS TransactionValue) AS T
ON CA.CustomerId = T.CustomerId
WHEN NOT MATCHED THEN
INSERT (CustomerId, Balance)
VALUES (T.CustomerId, T.TransactionValue)
WHEN MATCHED THEN
UPDATE SET Balance = Balance + TransactionValue;
ERROR: operator does not exist: integer = text
LINE 3: ON CA.CustomerId = T.CustomerId
postgres: pryzbyj postgres [local] PREPARE(+0x2337be) [0x56108322e7be]
postgres: pryzbyj postgres [local] PREPARE(oper+0x198) [0x56108322f1fb]
postgres: pryzbyj postgres [local] PREPARE(make_op+0x7e) [0x56108322f55a]
postgres: pryzbyj postgres [local] PREPARE(+0x228f2b) [0x561083223f2b]
postgres: pryzbyj postgres [local] PREPARE(+0x227aa9) [0x561083222aa9]
postgres: pryzbyj postgres [local] PREPARE(transformExpr+0x1c) [0x5610832227f9]
postgres: pryzbyj postgres [local] PREPARE(transformMergeStmt+0x339) [0x56108322d988]
postgres: pryzbyj postgres [local] PREPARE(transformStmt+0x70) [0x5610831f4071]
postgres: pryzbyj postgres [local] PREPARE(+0x1fa350) [0x5610831f5350]
postgres: pryzbyj postgres [local] PREPARE(transformTopLevelStmt+0x11) [0x5610831f5385]
postgres: pryzbyj postgres [local] PREPARE(parse_analyze_varparams+0x5b) [0x5610831f54f4]
postgres: pryzbyj postgres [local] PREPARE(pg_analyze_and_rewrite_varparams+0x38) [0x5610834bcdfe]
postgres: pryzbyj postgres [local] PREPARE(PrepareQuery+0xcc) [0x561083292155]
postgres: pryzbyj postgres [local] PREPARE(standard_ProcessUtility+0x4ea) [0x5610834c31a0]
Why is $1 construed to be of type text ?
On Thu, 14 Jul 2022, 18:26 Justin Pryzby, <pryzby@telsasoft.com> wrote:
We've used INSERT ON CONFLICT for a few years (with partitions as the target).
That's also combined with prepared statements, for bulk loading.I was looking to see if we should use MERGE (probably not, but looking anyway).
And came across this behavior. I'm not sure if it's any issue.CREATE TABLE CustomerAccount (CustomerId int, Balance float);
PREPARE p AS
MERGE INTO CustomerAccount CA
USING (SELECT $1 AS CustomerId, $2 AS TransactionValue) AS T
ON CA.CustomerId = T.CustomerId
WHEN NOT MATCHED THEN
INSERT (CustomerId, Balance)
VALUES (T.CustomerId, T.TransactionValue)
WHEN MATCHED THEN
UPDATE SET Balance = Balance + TransactionValue;ERROR: operator does not exist: integer = text
LINE 3: ON CA.CustomerId = T.CustomerIdWhy is $1 construed to be of type text ?
The select statement that generates the row type of T `(select $1 CID,
$2 TxV) AS T` does not put type bounds on the input parameters, so it
remains `unknown` for the scope of that subselect. Once stored into
the row type, the type of that column defaults to text, as a row type
should not have 'unknown'-typed columns.
You'll see the same issue with other subselects that select input
parameters without casts, such as `select a from (select $1 a) A where
A.a = 1;`. It's a pre-existing issue that has popped up earlier, and I
think it's not something we've planned to fix in backbranches.
Kind regards,
Matthias van de Meent
On Fri, Jul 15, 2022 at 11:25:35AM +0200, Matthias van de Meent wrote:
On Thu, 14 Jul 2022, 18:26 Justin Pryzby, <pryzby@telsasoft.com> wrote:
Why is $1 construed to be of type text ?
The select statement that generates the row type of T `(select $1 CID,
$2 TxV) AS T` does not put type bounds on the input parameters, so it
remains `unknown` for the scope of that subselect. Once stored into
the row type, the type of that column defaults to text, as a row type
should not have 'unknown'-typed columns.
Thanks for looking into it.
I see now that the same thing can happen with "ON CONFLICT" if used with a
subselect.
PREPARE p AS INSERT INTO t SELECT a FROM (SELECT $1 AS a)a
ON CONFLICT (i) DO UPDATE SET i=excluded.i;
ERROR: column "i" is of type integer but expression is of type text
It seems a bit odd that it's impossible to use merge with prepared statements
without specifically casting the source types (which I did now to continue my
experiment).
--
Justin
On 2022-Jul-15, Justin Pryzby wrote:
On Fri, Jul 15, 2022 at 11:25:35AM +0200, Matthias van de Meent wrote:
Thanks for looking into it.
Definitely! Thanks, Matthias.
I see now that the same thing can happen with "ON CONFLICT" if used with a
subselect.PREPARE p AS INSERT INTO t SELECT a FROM (SELECT $1 AS a)a
ON CONFLICT (i) DO UPDATE SET i=excluded.i;
ERROR: column "i" is of type integer but expression is of type text
Right, I didn't think that MERGE was doing anything peculiar in this
respect.
It seems a bit odd that it's impossible to use merge with prepared statements
without specifically casting the source types (which I did now to continue my
experiment).
I have no comments on this. Maybe it can be improved, but I don't know
how.
--
Álvaro Herrera Breisgau, Deutschland — https://www.EnterpriseDB.com/
Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
On 2022-Jul-15, Justin Pryzby wrote:
I see now that the same thing can happen with "ON CONFLICT" if used with a
subselect.PREPARE p AS INSERT INTO t SELECT a FROM (SELECT $1 AS a)a
ON CONFLICT (i) DO UPDATE SET i=excluded.i;
ERROR: column "i" is of type integer but expression is of type text
Right, I didn't think that MERGE was doing anything peculiar in this
respect.
Yeah. The current theory about this is that if we haven't assigned a
type to an unknown-type parameter (or literal) that is an output
column of a sub-SELECT, we will as a rule force it to text.
That MERGE USING clause is a sub-SELECT, so that rule applies.
There is a hoary old exception to that rule, which is that if you
write INSERT INTO tab SELECT ..., $1, ...
we will figure out the type of the column of "tab" that $1 is going
into, and force $1 to that type instead of text. It looks like this
also works in INSERT ... VALUES. You could make a case that MERGE
should be equally smart, but it's not clear to me that the info is
available sufficiently close by to make it reasonable to do that.
It looks like the MERGE syntax has a couple of levels of indirection,
which'd probably be enough to put the kibosh on that idea -- in
particular, AFAICS there might not be a unique target column
corresponding to a given data_source column.
regards, tom lane
On Fri, Jul 15, 2022 at 11:40 AM Alvaro Herrera <alvherre@alvh.no-ip.org>
wrote:
On 2022-Jul-15, Justin Pryzby wrote:
It seems a bit odd that it's impossible to use merge with prepared
statements
without specifically casting the source types (which I did now to
continue my
experiment).
I have no comments on this. Maybe it can be improved, but I don't know
how.
Not tested, but the example prepare command fails to make use of the
optional data types specification. Using that should remove the need to
cast the parameter placeholder within the query itself.
That said, in theory the INSERT specification of the MERGE could be used to
either resolve unknowns or even forcibly convert the data types of the
relation produced by the USING clause to match the actual types required
for the INSERT (since that will happen at insert time anyway). This would
make the UPDATE behavior slightly different than a top-level UPDATE command
though, since that does not have the same context information.
David J.
On Fri, Jul 15, 2022 at 12:17:51PM -0700, David G. Johnston wrote:
On Fri, Jul 15, 2022 at 11:40 AM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
On 2022-Jul-15, Justin Pryzby wrote:
It seems a bit odd that it's impossible to use merge with prepared statements
without specifically casting the source types (which I did now to continue my
experiment).I have no comments on this. Maybe it can be improved, but I don't know
how.Not tested, but the example prepare command fails to make use of the
optional data types specification. Using that should remove the need to
cast the parameter placeholder within the query itself.
What optional data type specification ?
That said, in theory the INSERT specification of the MERGE could be used to
either resolve unknowns or even forcibly convert the data types of the
relation produced by the USING clause to match the actual types required
for the INSERT (since that will happen at insert time anyway).
Yeah. I hadn't looked before, but just noticed this:
https://www.postgresql.org/docs/devel/sql-merge.html
| If the expression for any column is not of the correct data type, automatic type conversion will be attempted.
That appears to be copied from the INSERT page.
What does that mean, if not that data types will be resolved as needed ?
Note that if I add casts to the "ON" condition, MERGE complains about the
INSERT VALUES.
PREPARE p AS
MERGE INTO CustomerAccount CA
USING (SELECT $1 AS CustomerId, $2 AS TransactionValue) AS T
ON CA.CustomerId = T.CustomerId::int
WHEN NOT MATCHED THEN
INSERT (CustomerId, Balance)
VALUES (T.CustomerId, T.TransactionValue)
WHEN MATCHED THEN
UPDATE SET Balance = Balance + TransactionValue;
ERROR: column "customerid" is of type integer but expression is of type text
LINE 7: VALUES (T.CustomerId, T.TransactionValue)
postgres: pryzbyj postgres [local] PREPARE(transformAssignedExpr+0x3b6) [0x5605f9699e8e]
postgres: pryzbyj postgres [local] PREPARE(transformInsertRow+0x2ce) [0x5605f9653a47]
postgres: pryzbyj postgres [local] PREPARE(transformMergeStmt+0x7ec) [0x5605f968fe3b]
postgres: pryzbyj postgres [local] PREPARE(transformStmt+0x70) [0x5605f9656071]
postgres: pryzbyj postgres [local] PREPARE(+0x1fa350) [0x5605f9657350]
postgres: pryzbyj postgres [local] PREPARE(transformTopLevelStmt+0x11) [0x5605f9657385]
postgres: pryzbyj postgres [local] PREPARE(parse_analyze_varparams+0x5b) [0x5605f96574f4]
postgres: pryzbyj postgres [local] PREPARE(pg_analyze_and_rewrite_varparams+0x38) [0x5605f991edfe]
postgres: pryzbyj postgres [local] PREPARE(PrepareQuery+0xcc) [0x5605f96f4155]
postgres: pryzbyj postgres [local] PREPARE(standard_ProcessUtility+0x4ea) [0x5605f99251a0]
postgres: pryzbyj postgres [local] PREPARE(ProcessUtility+0xdb) [0x5605f992587e]
--
Justin
On Fri, Jul 15, 2022 at 12:40 PM Justin Pryzby <pryzby@telsasoft.com> wrote:
On Fri, Jul 15, 2022 at 12:17:51PM -0700, David G. Johnston wrote:
On Fri, Jul 15, 2022 at 11:40 AM Alvaro Herrera <alvherre@alvh.no-ip.org>
wrote:
On 2022-Jul-15, Justin Pryzby wrote:
It seems a bit odd that it's impossible to use merge with prepared
statements
without specifically casting the source types (which I did now to
continue my
experiment).
I have no comments on this. Maybe it can be improved, but I don't know
how.Not tested, but the example prepare command fails to make use of the
optional data types specification. Using that should remove the need to
cast the parameter placeholder within the query itself.What optional data type specification ?
The one documented here:
https://www.postgresql.org/docs/current/sql-prepare.html
PREPARE name [ ( data_type [, ...] ) ] AS statement
David J.
On Fri, Jul 15, 2022 at 12:40 PM Justin Pryzby <pryzby@telsasoft.com> wrote:
That appears to be copied from the INSERT page.
What does that mean, if not that data types will be resolved as needed ?
Yep, and the system needs to resolve the type at a point where there is no
contextual information and so it chooses text.
Note that if I add casts to the "ON" condition, MERGE complains about the
INSERT VALUES.PREPARE p AS
MERGE INTO CustomerAccount CA
USING (SELECT $1 AS CustomerId, $2 AS TransactionValue) AS T
ON CA.CustomerId = T.CustomerId::int
WHEN NOT MATCHED THEN
INSERT (CustomerId, Balance)
VALUES (T.CustomerId, T.TransactionValue)
WHEN MATCHED THEN
UPDATE SET Balance = Balance + TransactionValue;ERROR: column "customerid" is of type integer but expression is of type
text
LINE 7: VALUES (T.CustomerId, T.TransactionValue)
Noted. Not surprised. That error was always present, it's just that the
join happens first. Since your fix narrowly targeted the join this error
remained to be discovered.
David J.
On Fri, Jul 15, 2022 at 12:59:34PM -0700, David G. Johnston wrote:
On Fri, Jul 15, 2022 at 12:40 PM Justin Pryzby <pryzby@telsasoft.com> wrote:
| If the expression for any column is not of the correct data type, automatic type conversion will be attempted.
That appears to be copied from the INSERT page.
What does that mean, if not that data types will be resolved as needed ?Yep, and the system needs to resolve the type at a point where there is no
contextual information and so it chooses text.
I don't know if that's a viable interpretation. The parser "resolved the type"
by assuming a default, which caused an error before finishing parsing.
In the case of INSERT, "conversion will be attempted" means it looks for a cast
from the source type to the target type, and its "automatic type conversion"
will fail if (for example) you try to insert a timestamp into an int.
In the case of MERGE, that same sentence evidently means that it assumes a
default source type (rather than looking for a cast) and then fails if it
doesn't exactly match the target type.
Should that sentence be removed from MERGE ?
--
Justin
On Fri, Jul 15, 2022 at 03:43:41PM -0500, Justin Pryzby wrote:
Should that sentence be removed from MERGE ?
Also, I think these examples should be more similar.
doc/src/sgml/ref/merge.sgml
<programlisting>
MERGE INTO CustomerAccount CA
USING RecentTransactions T
ON T.CustomerId = CA.CustomerId
WHEN MATCHED THEN
UPDATE SET Balance = Balance + TransactionValue
WHEN NOT MATCHED THEN
INSERT (CustomerId, Balance)
VALUES (T.CustomerId, T.TransactionValue);
</programlisting>
</para><para>
Notice that this would be exactly equivalent to the following
statement because the <literal>MATCHED</literal> result does not change
during execution.<programlisting>
MERGE INTO CustomerAccount CA
USING (Select CustomerId, TransactionValue From RecentTransactions) AS T
ON CA.CustomerId = T.CustomerId
WHEN NOT MATCHED THEN
INSERT (CustomerId, Balance)
VALUES (T.CustomerId, T.TransactionValue)
WHEN MATCHED THEN
UPDATE SET Balance = Balance + TransactionValue;
</programlisting>
</para>
The "ON" lines can be the same.
The "MATCHED" can be in the same order.
--
Justin