new patch of MERGE (merge_204) & a question about duplicated ctid

Started by Boxuan Zhai · over 15 years ago · 47 messages · pgsql-hackers
#1 Boxuan Zhai
bxzhai2010@gmail.com

Dear Greg,

I have updated the MERGE patch for two main problems.

1. Support of system attributes in MERGE action expressions.
In the old version of MERGE, I used the top join as the only RTE in the var
name space during the transformation process in the parser. It contains all
the common attributes of the source and target tables, but not system
attributes such as oid.
Thus, if the user specified system attributes in MERGE actions, an error was
raised.
In the current version, the var name space is forced to be the target table
RTE and the source table RTE, so system attributes are now transformed
correctly.

2. Fix the problem of dropped columns.
In the planner, we need to map the Var nodes in MERGE actions to their
corresponding TEs in the main plan's TL. I used to do this under the
assumption that all the TEs are ordered by their attribute number (the value
of the field "Var.varattno"). However, this is not the case when the table
contains dropped attributes: a dropped attribute still takes up an attribute
number, but does not appear in the TL.
A new mapping algorithm now avoids this bug.
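As a minimal standalone sketch of the pitfall (invented names; this is not
the patch's actual code): once a column has been dropped, attribute numbers
are no longer dense, so Var.varattno cannot be used directly as an index
into the target list. The mapping has to skip dropped attributes:

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-in for a table's attribute metadata. */
typedef struct
{
    int  attno;       /* attribute number, counted from 1 */
    bool isdropped;   /* dropped columns keep their attno */
} AttrInfo;

/*
 * Map an attribute number to its 1-based position in a target list
 * that omits dropped columns.  Returns -1 for a dropped or unknown
 * attno.  Indexing the target list with attno itself would be wrong
 * as soon as any earlier column has been dropped.
 */
static int
tl_position(const AttrInfo *attrs, int natts, int attno)
{
    int pos = 0;

    for (int i = 0; i < natts; i++)
    {
        if (attrs[i].isdropped)
            continue;           /* consumes an attno, but has no TL entry */
        pos++;
        if (attrs[i].attno == attno)
            return pos;
    }
    return -1;
}
```

For example, with columns numbered (1, 2-dropped, 3), attribute 3 sits at
target-list position 2, not 3.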

Please help me to review this new patch. Thank you!

PS:

In the new pgsql 9.1, I find that the junk attribute ctid is added twice
for UPDATE and DELETE commands.

In the planner, the function preprocess_targetlist() invokes a subroutine,
expand_targetlist(), which adds missing TEs for INSERT and UPDATE commands.
For UPDATE and DELETE, it also creates a TE for ctid (a junk attr), which
is used in the executor. This is the process in old pgsql versions.

However, in pgsql 9.1, while the above is kept, a new function in the
rewriter, rewriteTargetListUD(), does the same thing. So a plain
UPDATE/DELETE command ends up with two ctid junk attrs in its final TL.

Is there any particular reason for this duplicated ctid?

Normally, this does not cause much of a problem. However, in the MERGE
command, the TL of MERGE actions should contain no junk attributes. So I
added blocking code in these two places to prevent changes to the TL of
MERGE actions. I don't know whether this will cause any problems.
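To make the duplication concrete, here is a minimal standalone sketch
(invented names, not backend code) of the guard one of the two code paths
could apply: before appending a ctid junk entry, scan the target list for
an existing junk entry of the same name:

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_TLES 16

/* Toy stand-ins for a target entry and a target list. */
typedef struct
{
    const char *resname;
    bool        resjunk;
} ToyTargetEntry;

typedef struct
{
    ToyTargetEntry entries[MAX_TLES];
    int            len;
} ToyTargetList;

/*
 * Append a junk column only if one with the same name is not already
 * present.  If both the rewriter path and the planner path call this,
 * the second call is a no-op, so the final TL carries a single ctid.
 */
static void
add_junk_once(ToyTargetList *tl, const char *name)
{
    for (int i = 0; i < tl->len; i++)
        if (tl->entries[i].resjunk &&
            strcmp(tl->entries[i].resname, name) == 0)
            return;             /* already added by the other code path */

    tl->entries[tl->len].resname = name;
    tl->entries[tl->len].resjunk = true;
    tl->len++;
}
```

Calling it once from a "rewriter" path and once from a "planner" path
leaves exactly one ctid entry instead of two.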

Regards,
Yours, Boxuan

Attachments:

merge204.tar.gz (application/x-gzip)
#2 David Fetter
david@fetter.org
In reply to: Boxuan Zhai (#1)
Re: new patch of MERGE (merge_204) & a question about duplicated ctid

On Sat, Dec 04, 2010 at 09:27:52PM +0800, Boxuan Zhai wrote:

Dear Greg,

I have updated the MERGE patch for two main problems.

Please attach the actual patch :)

Cheers,
David.
--
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david.fetter@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

#3 Greg Smith
gsmith@gregsmith.com
In reply to: Boxuan Zhai (#1)
Re: new patch of MERGE (merge_204) & a question about duplicated ctid

Boxuan Zhai wrote:

I have updated the MERGE patch for two main problems.

The patch inside the .tar.gz file you attached isn't right; that
extracts to a tiny file of junk characters.

--
Greg Smith 2ndQuadrant US greg@2ndQuadrant.com Baltimore, MD
PostgreSQL Training, Services and Support www.2ndQuadrant.us

#4 Boxuan Zhai
bxzhai2010@gmail.com
In reply to: Boxuan Zhai (#1)
Fwd: new patch of MERGE (merge_204) & a question about duplicated ctid

---------- Forwarded message ----------
From: Boxuan Zhai <bxzhai2010@gmail.com>
Date: Mon, Dec 6, 2010 at 9:17 PM
Subject: Re: new patch of MERGE (merge_204) & a question about duplicated
ctid
To: Greg Smith <greg@2ndquadrant.com>

On Mon, Dec 6, 2010 at 2:12 AM, Greg Smith <greg@2ndquadrant.com> wrote:

Boxuan Zhai wrote:

I have updated the MERGE patch for two main problems.

The patch inside the .tar.gz file you attached isn't right; that extracts
to a tiny file of junk characters.

Sorry for that. The original file is correct on my machine, but the gz file
is broken.

I send the original file directly this time.

Sorry again.

Regards,


Attachments:

merge_204_2010DEC06.patch (application/octet-stream, +2641 -44)
#5 Erik Rijkers
er@xs4all.nl
In reply to: Boxuan Zhai (#4)
Re: Fwd: new patch of MERGE (merge_204) & a question about duplicated ctid

On Mon, December 6, 2010 14:17, Boxuan Zhai wrote:

I send the original file directly this time.

I get some whitespace-warnings, followed by error:

$ git apply /home/rijkers/download/pgpatches/0091/merge/20101206/merge_204_2010DEC06.patch
/home/rijkers/download/pgpatches/0091/merge/20101206/merge_204_2010DEC06.patch:481: trailing
whitespace.

/home/rijkers/download/pgpatches/0091/merge/20101206/merge_204_2010DEC06.patch:482: trailing
whitespace.
if (IsA(plan, ModifyTable) &&
/home/rijkers/download/pgpatches/0091/merge/20101206/merge_204_2010DEC06.patch:550: trailing
whitespace.
/*print the action qual*/
/home/rijkers/download/pgpatches/0091/merge/20101206/merge_204_2010DEC06.patch:556: trailing
whitespace.
(act_plan->operation == CMD_INSERT ||
/home/rijkers/download/pgpatches/0091/merge/20101206/merge_204_2010DEC06.patch:560: trailing
whitespace.

error: patch failed: src/backend/optimizer/plan/planner.c:739
error: src/backend/optimizer/plan/planner.c: patch does not apply

Erik Rijkers

#6 Greg Smith
gsmith@gregsmith.com
In reply to: Erik Rijkers (#5)
Re: Fwd: new patch of MERGE (merge_204) & a question about duplicated ctid

Erik Rijkers wrote:

I get some whitespace-warnings, followed by error:

$ git apply /home/rijkers/download/pgpatches/0091/merge/20101206/merge_204_2010DEC06.patch
/home/rijkers/download/pgpatches/0091/merge/20101206/merge_204_2010DEC06.patch:481: trailing
whitespace.

/home/rijkers/download/pgpatches/0091/merge/20101206/merge_204_2010DEC06.patch:482: trailing
whitespace.
if (IsA(plan, ModifyTable) &&
/home/rijkers/download/pgpatches/0091/merge/20101206/merge_204_2010DEC06.patch:550: trailing
whitespace.
/*print the action qual*/
/home/rijkers/download/pgpatches/0091/merge/20101206/merge_204_2010DEC06.patch:556: trailing
whitespace.
(act_plan->operation == CMD_INSERT ||
/home/rijkers/download/pgpatches/0091/merge/20101206/merge_204_2010DEC06.patch:560: trailing
whitespace.

error: patch failed: src/backend/optimizer/plan/planner.c:739
error: src/backend/optimizer/plan/planner.c: patch does not apply

Maybe I'm doing something wrong, but I've never had good luck with git
apply. I took this patch and applied it to the 12/15 copy of HEAD I had
checked out (trying to minimize drift in there since the patch was
created) using:

patch -p 1 < merge_204_2010DEC06.patch

There was one trivial conflict, which produced
src/backend/optimizer/plan/planner.c.rej, and that fix was straightforward
to apply by hand.

The result is now sitting as the merge204 branch in my github repo:
https://github.com/greg2ndQuadrant/postgres/tree/merge204 if you did
want to try this out.

--
Greg Smith 2ndQuadrant US greg@2ndQuadrant.com Baltimore, MD
PostgreSQL Training, Services and Support www.2ndQuadrant.us

#7 Greg Smith
gsmith@gregsmith.com
In reply to: Boxuan Zhai (#1)
Re: new patch of MERGE (merge_204) & a question about duplicated ctid

I did some basic testing of the latest update here, but quickly hit a
problem that wasn't in the previous version. Attached is the standalone
test script that used to work, but now fails like this:

psql:simple.sql:12: ERROR: the vars in merge action tlist of qual
should only belongs to the source table or target table

This test case is intended to implement the common UPSERT situation that
is one of the main requests that MERGE is intended to satisfy, using
this syntax:

MERGE INTO Stock t
USING (VALUES(10,100)) AS s(item_id,balance)
ON s.item_id=t.item_id
WHEN MATCHED THEN UPDATE SET balance=t.balance + s.balance
WHEN NOT MATCHED THEN INSERT VALUES(s.item_id,s.balance)
;

If you can suggest an alternate way to express this that works with the
new patch, I might switch to that and retry. I was never 100% sure this
was the right way to write this, and I don't have another database with
MERGE support here to try against. (Aside: if someone else does, I'd
be really curious to see if the attached test case works or not on
another database system. I think we need to include compatibility
testing with other MERGE implementations into the test mix here soon.)

Regardless, this failure suggests that you need to add this sort of test
to the regression test set. We need to have an example of an UPSERT
using constant data in there to make sure this continues to work in the
future.

This is a good week for me in terms of having time for PostgreSQL
hacking, so if you can suggest something here or update the patch I'll
try it soon afterwards.

--
Greg Smith 2ndQuadrant US greg@2ndQuadrant.com Baltimore, MD
PostgreSQL Training, Services and Support www.2ndQuadrant.us

Attachments:

simple-merge-test.sql (text/x-sql)
#8 Marko Tiikkaja
marko@joh.to
In reply to: Greg Smith (#7)
Re: Re: new patch of MERGE (merge_204) & a question about duplicated ctid

On 2010-12-29 2:14 PM, Greg Smith wrote:

MERGE INTO Stock t
USING (VALUES(10,100)) AS s(item_id,balance)
ON s.item_id=t.item_id
WHEN MATCHED THEN UPDATE SET balance=t.balance + s.balance
WHEN NOT MATCHED THEN INSERT VALUES(s.item_id,s.balance)
;

If you can suggest an alternate way to express this that works with the
new patch, I might switch to that and retry. I was never 100% sure this
was the right way to write this, and I don't have another database with
MERGE support here to try against.

As far as I can tell, this should work. I played around with the patch
and the problem seems to be the VALUES:

MERGE INTO Stock t
USING (SELECT 30, 2000) AS s(item_id,balance)
ON s.item_id=t.item_id
WHEN MATCHED THEN UPDATE SET balance=t.balance + s.balance
WHEN NOT MATCHED THEN INSERT VALUES(s.item_id,s.balance)
;
MERGE 1

Regards,
Marko Tiikkaja

#9 Greg Smith
gsmith@gregsmith.com
In reply to: Marko Tiikkaja (#8)
Re: Re: new patch of MERGE (merge_204) & a question about duplicated ctid

Marko Tiikkaja wrote:

As far as I can tell, this should work. I played around with the
patch and the problem seems to be the VALUES:

MERGE INTO Stock t
USING (SELECT 30, 2000) AS s(item_id,balance)
ON s.item_id=t.item_id
WHEN MATCHED THEN UPDATE SET balance=t.balance + s.balance
WHEN NOT MATCHED THEN INSERT VALUES(s.item_id,s.balance)
;
MERGE 1

Good catch...while I think the VALUES syntax should work, that is a
useful workaround so I could keep testing. I rewrote like this
(original syntax commented out):

MERGE INTO Stock t
-- USING (VALUES(10,100)) AS s(item_id,balance)
USING (SELECT 10,100) AS s(item_id,balance)
ON s.item_id=t.item_id
WHEN MATCHED THEN UPDATE SET balance=t.balance + s.balance
WHEN NOT MATCHED THEN INSERT VALUES(s.item_id,s.balance)
;

And that got me back again to concurrent testing.

Moving on to the next two problems...the basic MERGE feature seems to have
stepped backwards a bit too. I'm now seeing these quite often:

ERROR: duplicate key value violates unique constraint
"pgbench_accounts_pkey"
DETAIL: Key (aid)=(176641) already exists.
STATEMENT: MERGE INTO pgbench_accounts t USING (SELECT 176641,1+(176641
/ 1000000)::integer,168,'') AS s(aid,bid,balance,filler) ON s.aid=t.aid
WHEN MATCHED THEN UPDATE SET abalance=abalance + s.balance WHEN NOT
MATCHED THEN INSERT VALUES(s.aid,s.bid,s.balance,s.filler);

On my concurrent pgbench test, which had been working before. Possibly
causing that, the following assertion is tripping:

TRAP: FailedAssertion("!(epqstate->origslot != ((void *)0))", File:
"execMain.c", Line: 1762)

That's coming from the following code:

void
EvalPlanQualFetchRowMarks(EPQState *epqstate)
{
ListCell *l;

Assert(epqstate->origslot != NULL);

foreach(l, epqstate->rowMarks)

Stepping back to summarize...here's a list of issues I know about with
the current v204 code:

1) VALUES syntax doesn't work anymore
2) Assertion failure in EvalPlanQualFetchRowMarks
3) Duplicate key bug (possibly a direct result of #2)
4) Attempts to use MERGE in a function spit back "ERROR: <table> is not
a known function"
5) The ctid junk attr handling needs to be reviewed more carefully,
based on author request.

I've attached the current revisions of all my testing code in hopes that
Boxuan might try and replicate these (this makes it simple to replicate
#1 through #3), and therefore confirm whether changes made do better.

--
Greg Smith 2ndQuadrant US greg@2ndQuadrant.com Baltimore, MD
PostgreSQL Training, Services and Support www.2ndQuadrant.us
"PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books

Attachments:

simple-merge-test.sql (text/x-sql)
update-merge.sql (text/x-sql)
test-merge.sh (application/x-sh)
#10 Marko Tiikkaja
marko@joh.to
In reply to: Greg Smith (#9)
Re: Re: new patch of MERGE (merge_204) & a question about duplicated ctid

On 2010-12-30 4:39 AM +0200, Greg Smith wrote:

And that got me back again to concurrent testing.

Moving onto next two problems...the basic MERGE feature seems to have
stepped backwards a bit too. I'm now seeing these quite often:

ERROR: duplicate key value violates unique constraint
"pgbench_accounts_pkey"
DETAIL: Key (aid)=(176641) already exists.
STATEMENT: MERGE INTO pgbench_accounts t USING (SELECT 176641,1+(176641
/ 1000000)::integer,168,'') AS s(aid,bid,balance,filler) ON s.aid=t.aid
WHEN MATCHED THEN UPDATE SET abalance=abalance + s.balance WHEN NOT
MATCHED THEN INSERT VALUES(s.aid,s.bid,s.balance,s.filler);

On my concurrent pgbench test, which had been working before.

I have no idea why it worked in the past, but the patch was never
designed to work for UPSERT. This has been discussed in the past and
some people thought that that's not a huge deal.

Regards,
Marko Tiikkaja

#11 Robert Haas
robertmhaas@gmail.com
In reply to: Marko Tiikkaja (#10)
Re: Re: new patch of MERGE (merge_204) & a question about duplicated ctid

On Wed, Dec 29, 2010 at 9:45 PM, Marko Tiikkaja
<marko.tiikkaja@cs.helsinki.fi> wrote:

I have no idea why it worked in the past, but the patch was never designed
to work for UPSERT.  This has been discussed in the past and some people
thought that that's not a huge deal.

I think it's expected to fail in some *concurrent* UPSERT cases. It
should work if it's the only game in town.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#12 Greg Smith
gsmith@gregsmith.com
In reply to: Marko Tiikkaja (#10)
Re: Re: new patch of MERGE (merge_204) & a question about duplicated ctid

Marko Tiikkaja wrote:

I have no idea why it worked in the past, but the patch was never
designed to work for UPSERT. This has been discussed in the past and
some people thought that that's not a huge deal.

It takes an excessively large lock when doing UPSERT, which means its
performance under a heavy concurrent load can't be good. The idea is
that if the syntax and general implementation issues can get sorted out,
fixing the locking can be a separate performance improvement to be
implemented later. Using MERGE for UPSERT is the #1 use case for this
feature by a gigantic margin. If that doesn't do what's expected, the
whole implementation doesn't provide the community anything really worth
talking about. That's why I keep hammering on this particular area in
all my testing.

One of the reflexive "I can't switch to PostgreSQL easily" stopping
points for MySQL users is "I can't convert my ON DUPLICATE KEY UPDATE
code". Every other use for MERGE is a helpful side-effect of adding the
implementation in my mind, but not the primary driver of why this is
important. My hints in this direction before didn't get adopted, so I'm
saying it outright now: this patch must have an UPSERT implementation
in its regression tests. And the first thing I'm going to do every time
a new rev comes in is try and break it with the pgbench test I
attached. If Boxuan can start doing that as part of his own testing, I
think development here might start moving forward faster. I don't care
so much about the rate at which concurrent UPSERT-style MERGE happens,
so long as it doesn't crash. But that's where this has been stuck at
for a while now.

--
Greg Smith 2ndQuadrant US greg@2ndQuadrant.com Baltimore, MD
PostgreSQL Training, Services and Support www.2ndQuadrant.us
"PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books

#13 Marko Tiikkaja
marko@joh.to
In reply to: Greg Smith (#12)
Re: Re: new patch of MERGE (merge_204) & a question about duplicated ctid

On 2010-12-30 9:02 AM +0200, Greg Smith wrote:

Marko Tiikkaja wrote:

I have no idea why it worked in the past, but the patch was never
designed to work for UPSERT. This has been discussed in the past and
some people thought that that's not a huge deal.

It takes an excessively large lock when doing UPSERT, which means its
performance under a heavy concurrent load can't be good. The idea is
that if the syntax and general implementation issues can get sorted out,
fixing the locking can be a separate performance improvement to be
implemented later. Using MERGE for UPSERT is the #1 use case for this
feature by a gigantic margin. If that doesn't do what's expected, the
whole implementation doesn't provide the community anything really worth
talking about. That's why I keep hammering on this particular area in
all my testing.

I'm confused. Are you saying that the patch is supposed to lock the
table against concurrent INSERT/UPDATE/DELETE/MERGE? Because I don't
see it in the patch, and the symptoms you're having are a clear
indication of the fact that it's not happening. I also seem to recall
that people thought locking the table would be excessive.

Regards,
Marko Tiikkaja

#14 Andrew Dunstan
andrew@dunslane.net
In reply to: Greg Smith (#12)
Re: Re: new patch of MERGE (merge_204) & a question about duplicated ctid

On 12/30/2010 02:02 AM, Greg Smith wrote:

Marko Tiikkaja wrote:

I have no idea why it worked in the past, but the patch was never
designed to work for UPSERT. This has been discussed in the past and
some people thought that that's not a huge deal.

It takes an excessively large lock when doing UPSERT, which means its
performance under a heavy concurrent load can't be good. The idea is
that if the syntax and general implementation issues can get sorted
out, fixing the locking can be a separate performance improvement to
be implemented later. Using MERGE for UPSERT is the #1 use case for
this feature by a gigantic margin. If that doesn't do what's
expected, the whole implementation doesn't provide the community
anything really worth talking about. That's why I keep hammering on
this particular area in all my testing.

One of the reflexive "I can't switch to PostgreSQL easily" stopping
points for MySQL users is "I can't convert my ON DUPLICATE KEY UPDATE
code". Every other use for MERGE is a helpful side-effect of adding
the implementation in my mind, but not the primary driver of why this
is important. My hints in this direction before didn't get adopted,
so I'm saying it outright now: this patch must have an UPSERT
implementation in its regression tests. And the first thing I'm going
to do every time a new rev comes in is try and break it with the
pgbench test I attached. If Boxuan can start doing that as part of
his own testing, I think development here might start moving forward
faster. I don't care so much about the rate at which concurrent
UPSERT-style MERGE happens, so long as it doesn't crash. But that's
where this has been stuck at for a while now.

I strongly agree. It *is* a huge deal.

cheers

andrew

#15 Greg Smith
gsmith@gregsmith.com
In reply to: Marko Tiikkaja (#13)
Re: Re: new patch of MERGE (merge_204) & a question about duplicated ctid

Marko Tiikkaja wrote:

I'm confused. Are you saying that the patch is supposed to lock the
table against concurrent INSERT/UPDATE/DELETE/MERGE? Because I don't
see it in the patch, and the symptoms you're having are a clear
indication of the fact that it's not happening. I also seem to recall
that people thought locking the table would be excessive.

That's exactly what it should be doing. I thought I'd seen just that in
one of the versions of this patch, but maybe that's a mistaken memory on
my part. In advance of the planned but not available yet ability to
lock individual index key values, locking the whole table is the only
possible implementation that can work correctly here I'm aware of. In
earlier versions, I think this code was running into issues before it
even got to there. If you're right that things like the duplicate key
error in the current version are caused exclusively by not locking
enough, that may be the next necessary step here.
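For illustration only, a minimal sketch in plain C with a pthread mutex
(invented names, not backend code): the duplicate-key symptom is the
classic check-then-insert race, and a single lock held across both the
match check and the insert, the moral equivalent of a table-level lock, is
what serializes concurrent upserts:

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

#define NKEYS 64

static bool present[NKEYS];     /* toy "table": does key k exist?  */
static int  balance[NKEYS];
static pthread_mutex_t table_lock = PTHREAD_MUTEX_INITIALIZER;

/*
 * UPSERT one key.  The lock must span both the existence check and
 * the insert: if it only covered each step separately, two sessions
 * could both see "not matched" and both insert, which is exactly the
 * duplicate-key failure seen under the concurrent pgbench test.
 */
static void
upsert(int key, int amount)
{
    pthread_mutex_lock(&table_lock);
    if (present[key])
        balance[key] += amount;     /* WHEN MATCHED THEN UPDATE */
    else
    {
        present[key] = true;        /* WHEN NOT MATCHED THEN INSERT */
        balance[key] = amount;
    }
    pthread_mutex_unlock(&table_lock);
}

/* One session's workload: 1000 upserts spread over NKEYS keys. */
static void *
worker(void *arg)
{
    (void) arg;
    for (int i = 0; i < 1000; i++)
        upsert(i % NKEYS, 1);
    return NULL;
}
```

Running two such workers concurrently never produces a "duplicate insert"
for a key, at the obvious cost of serializing all the upserts, which is the
performance trade-off being discussed here.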

--
Greg Smith 2ndQuadrant US greg@2ndQuadrant.com Baltimore, MD
PostgreSQL Training, Services and Support www.2ndQuadrant.us
"PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books

#16 Simon Riggs
simon@2ndQuadrant.com
In reply to: Greg Smith (#15)
Re: Re: new patch of MERGE (merge_204) & a question about duplicated ctid

On Mon, 2011-01-03 at 01:53 -0500, Greg Smith wrote:

In advance of the planned but not available yet ability to
lock individual index key values, locking the whole table is the only
possible implementation that can work correctly here I'm aware of.

This was discussed here
http://archives.postgresql.org/pgsql-hackers/2010-10/msg01903.php
with suggested resolutions for this release here
http://archives.postgresql.org/pgsql-hackers/2010-10/msg01907.php

In summary, that means we can either

1. Lock the table for ShareRowExclusiveLock

2. Throw a SERIALIZABLE error if we come up against a row that can be
neither MATCHED nor NOT MATCHED.

3. Bounce the patch to 9.2, commit early and then work on a full
concurrency solution before commit. The solution strawman is something
like EvalPlanQual with a new snapshot for each re-checked row, emulating
the pattern of snapshots/rechecks that would happen in a PL/pgSQL
version of an UPSERT.

Either way, we're saying that MERGE will not support concurrent
operations safely, in this release.

Given the continued lack of test cases for this patch, and the possible
embarrassment over not doing concurrent actions, do we think (3) is the
right road?

--
Simon Riggs http://www.2ndQuadrant.com/books/
PostgreSQL Development, 24x7 Support, Training and Services

#17 Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Simon Riggs (#16)
Re: Re: new patch of MERGE (merge_204) & a question about duplicated ctid

On 03.01.2011 11:37, Simon Riggs wrote:

On Mon, 2011-01-03 at 01:53 -0500, Greg Smith wrote:

In advance of the planned but not available yet ability to
lock individual index key values, locking the whole table is the only
possible implementation that can work correctly here I'm aware of.

This was discussed here
http://archives.postgresql.org/pgsql-hackers/2010-10/msg01903.php
with suggested resolutions for this release here
http://archives.postgresql.org/pgsql-hackers/2010-10/msg01907.php

In summary, that means we can either

1. Lock the table for ShareRowExclusiveLock

2. Throw a SERIALIZABLE error if we come up against a row that can be
neither MATCHED nor NOT MATCHED.

3. Bounce the patch to 9.2, commit early and then work on a full
concurrency solution before commit. The solution strawman is something
like EvalPlanQual with a new snapshot for each re-checked row, emulating
the pattern of snapshots/rechecks that would happen in a PL/pgSQL
version of an UPSERT.

Either way, we're saying that MERGE will not support concurrent
operations safely, in this release.

Given the continued lack of test cases for this patch, and the possible
embarrassment over not doing concurrent actions, do we think (3) is the
right road?

This patch has never tried to implement concurrency-safe upsert. It
implements the MERGE command as specified by the SQL standard, nothing
more, nothing less. Let's not move the goalposts. Googling around, at
least MS SQL Server's MERGE command is the same
(http://weblogs.sqlteam.com/dang/archive/2009/01/31/UPSERT-Race-Condition-With-MERGE.aspx).
There is nothing embarrassing about it, we just have to document it clearly.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

#18 Simon Riggs
simon@2ndQuadrant.com
In reply to: Heikki Linnakangas (#17)
Re: Re: new patch of MERGE (merge_204) & a question about duplicated ctid

On Mon, 2011-01-03 at 15:12 +0200, Heikki Linnakangas wrote:

This patch has never tried to implement concurrency-safe upsert. It
implements the MERGE command as specified by the SQL standard, nothing
more, nothing less. Let's not move the goalposts. Googling around, at
least MS SQL Server's MERGE command is the same
(http://weblogs.sqlteam.com/dang/archive/2009/01/31/UPSERT-Race-Condition-With-MERGE.aspx).
There is nothing embarrassing about it, we just have to document it clearly.

That article says that SQLServer supplies a locking hint that completely
removes the issue. Because they use locking, they are able to update in
place, so there is no need for them to use snapshots.

Our version won't allow a workaround yet, just for the record.

--
Simon Riggs http://www.2ndQuadrant.com/books/
PostgreSQL Development, 24x7 Support, Training and Services

#19 Robert Haas
robertmhaas@gmail.com
In reply to: Simon Riggs (#18)
Re: Re: new patch of MERGE (merge_204) & a question about duplicated ctid

On Mon, Jan 3, 2011 at 8:35 AM, Simon Riggs <simon@2ndquadrant.com> wrote:

On Mon, 2011-01-03 at 15:12 +0200, Heikki Linnakangas wrote:

This patch has never tried to implement concurrency-safe upsert. It
implements the MERGE command as specified by the SQL standard, nothing
more, nothing less. Let's not move the goalposts. Googling around, at
least MS SQL Server's MERGE command is the same
(http://weblogs.sqlteam.com/dang/archive/2009/01/31/UPSERT-Race-Condition-With-MERGE.aspx).
There is nothing embarrassing about it, we just have to document it clearly.

That article says that SQLServer supplies a locking hint that completely
removes the issue. Because they use locking, they are able to update in
place, so there is no need for them to use snapshots.

Our version won't allow a workaround yet, just for the record.

Like Heikki, I'd rather have the feature without a workaround for the
concurrency issues than no feature. But I have to admit that the
discussion we've had thus far gives me very little confidence that
this code is anywhere close to bug-free. So I think we're going to
end up punting it to 9.2 not so much because it's not concurrency-safe
as because it doesn't work.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#20 Stephen Frost
sfrost@snowman.net
In reply to: Robert Haas (#19)
Re: Re: new patch of MERGE (merge_204) & a question about duplicated ctid

* Robert Haas (robertmhaas@gmail.com) wrote:

Like Heikki, I'd rather have the feature without a workaround for the
concurrency issues than no feature.

I'm still trying to figure out the problem with having the table-level
lock, unless we really think people will be doing concurrent MERGEs
that won't overlap..? I'm also a bit nervous about whether the result
of concurrent MERGEs would actually be correct if we're not taking a
bigger lock than row-level (I assume we're taking row-level locks as it
goes through..).

In general, I also thought/expected to have some kind of UPSERT type
capability with our initial MERGE support, even if it requires a big
lock and won't operate concurrently, etc.

But I have to admit that the
discussion we've had thus far gives me very little confidence that
this code is anywhere close to bug-free. So I think we're going to
end up punting it to 9.2 not so much because it's not concurrency-safe
as because it doesn't work.

That's certainly a concern. :/

Stephen

#21 Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Stephen Frost (#20)
#22 Robert Haas
robertmhaas@gmail.com
In reply to: Heikki Linnakangas (#21)
#23 Andrew Dunstan
andrew@dunslane.net
In reply to: Heikki Linnakangas (#21)
#24 Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Robert Haas (#22)
#25 Robert Haas
robertmhaas@gmail.com
In reply to: Heikki Linnakangas (#24)
#26 Simon Riggs
simon@2ndQuadrant.com
In reply to: Heikki Linnakangas (#24)
#27 Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Simon Riggs (#26)
#28 Florian Pflug
fgp@phlo.org
In reply to: Robert Haas (#25)
#29 Simon Riggs
simon@2ndQuadrant.com
In reply to: Heikki Linnakangas (#27)
#30 Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Simon Riggs (#29)
#31 Robert Haas
robertmhaas@gmail.com
In reply to: Florian Pflug (#28)
#32 Simon Riggs
simon@2ndQuadrant.com
In reply to: Heikki Linnakangas (#30)
#33 Robert Haas
robertmhaas@gmail.com
In reply to: Heikki Linnakangas (#30)
#34 Kevin Grittner
Kevin.Grittner@wicourts.gov
In reply to: Robert Haas (#33)
#35 Robert Haas
robertmhaas@gmail.com
In reply to: Kevin Grittner (#34)
#36 Kevin Grittner
Kevin.Grittner@wicourts.gov
In reply to: Robert Haas (#35)
#37 Robert Haas
robertmhaas@gmail.com
In reply to: Kevin Grittner (#36)
#38 Greg Smith
gsmith@gregsmith.com
In reply to: Heikki Linnakangas (#21)
#39 Greg Smith
gsmith@gregsmith.com
In reply to: Robert Haas (#37)
#40 Kevin Grittner
Kevin.Grittner@wicourts.gov
In reply to: Kevin Grittner (#36)
#41 David Fetter
david@fetter.org
In reply to: Greg Smith (#38)
#42 Marko Tiikkaja
marko@joh.to
In reply to: David Fetter (#41)
#43 David Fetter
david@fetter.org
In reply to: Marko Tiikkaja (#42)
#44 Greg Smith
gsmith@gregsmith.com
In reply to: Kevin Grittner (#40)
#45 Greg Smith
gsmith@gregsmith.com
In reply to: David Fetter (#41)
#46 David Fetter
david@fetter.org
In reply to: Greg Smith (#45)
#47 Greg Smith
gsmith@gregsmith.com
In reply to: David Fetter (#46)