remove upsert example from docs

Started by Merlin Moncureover 15 years ago11 messages
#1Merlin Moncure
mmoncure@gmail.com
1 attachment(s)

Attached is a patch to remove the upsert example from the pl/pgsql
documentation. It has a serious bug (see:
http://www.spinics.net/lists/pgsql/msg112560.html) which is nontrivial
to fix. IMNSHO, our code examples should encourage good practices and
style.

The 'correct' way to do race free upsert is to take a table lock first
-- you don't have to loop or open a subtransaction. A high
concurrency version is nice but is more of a special case solution (it
looks like concurrent MERGE might render the issue moot anyways).

merlin

Attachments:

rm_upsert.patchtext/x-patch; charset=US-ASCII; name=rm_upsert.patchDownload
Index: doc/src/sgml/plpgsql.sgml
===================================================================
RCS file: /projects/cvsroot/pgsql/doc/src/sgml/plpgsql.sgml,v
retrieving revision 1.156
diff -c -6 -r1.156 plpgsql.sgml
*** doc/src/sgml/plpgsql.sgml	29 Jul 2010 19:34:40 -0000	1.156
--- doc/src/sgml/plpgsql.sgml	5 Aug 2010 17:34:54 -0000
***************
*** 2332,2382 ****
       linkend="errcodes-table"> for a list of possible error
       codes). The <varname>SQLERRM</varname> variable contains the
       error message associated with the exception. These variables are
       undefined outside exception handlers.
      </para>
  
-     <example id="plpgsql-upsert-example">
-     <title>Exceptions with <command>UPDATE</>/<command>INSERT</></title>
-     <para>
- 
-     This example uses exception handling to perform either
-     <command>UPDATE</> or <command>INSERT</>, as appropriate:
- 
- <programlisting>
- CREATE TABLE db (a INT PRIMARY KEY, b TEXT);
- 
- CREATE FUNCTION merge_db(key INT, data TEXT) RETURNS VOID AS
- $$
- BEGIN
-     LOOP
-         -- first try to update the key
-         UPDATE db SET b = data WHERE a = key;
-         IF found THEN
-             RETURN;
-         END IF;
-         -- not there, so try to insert the key
-         -- if someone else inserts the same key concurrently,
-         -- we could get a unique-key failure
-         BEGIN
-             INSERT INTO db(a,b) VALUES (key, data);
-             RETURN;
-         EXCEPTION WHEN unique_violation THEN
-             -- do nothing, and loop to try the UPDATE again
-         END;
-     END LOOP;
- END;
- $$
- LANGUAGE plpgsql;
- 
- SELECT merge_db(1, 'david');
- SELECT merge_db(1, 'dennis');
- </programlisting>
- 
-     </para>
-     </example>
    </sect2>
    </sect1>
  
    <sect1 id="plpgsql-cursors">
     <title>Cursors</title>
  
--- 2332,2343 ----
#2Kevin Grittner
Kevin.Grittner@wicourts.gov
In reply to: Merlin Moncure (#1)
Re: remove upsert example from docs

Merlin Moncure <mmoncure@gmail.com> wrote:

Attached is a patch to remove the upsert example from the pl/pgsql
documentation. It has a serious bug (see:
http://www.spinics.net/lists/pgsql/msg112560.html) which is
nontrivial to fix. IMNSHO, our code examples should encourage
good practices and style.

The 'correct' way to do race free upsert is to take a table lock
first -- you don't have to loop or open a subtransaction. A high
concurrency version is nice but is more of a special case solution
(it looks like concurrent MERGE might render the issue moot
anyways).

Of course, this can be done safely without a table lock if either or
both of the concurrency patches (one by Florian, one by Dan and
myself) get committed, so maybe we should wait to see whether either
of them makes it before adjusting the docs on this point -- at least
for 9.1. Taking a broken example out of 9.0 and back branches might
make sense....

-Kevin

#3Greg Sabino Mullane
greg@turnstep.com
In reply to: Merlin Moncure (#1)
Re: remove upsert example from docs

-----BEGIN PGP SIGNED MESSAGE-----
Hash: RIPEMD160

Attached is a patch to remove the upsert example from the pl/pgsql
documentation. It has a serious bug (see:
http://www.spinics.net/lists/pgsql/msg112560.html) which is nontrivial
to fix. IMNSHO, our code examples should encourage good practices and
style.

No, removing is a bad idea, as it's referenced from here to the North
Pole and back. Better would simply be a warning about the non uniqueness
of the unique constraint message.

The 'correct' way to do race free upsert is to take a table lock first
-- you don't have to loop or open a subtransaction. A high
concurrency version is nice but is more of a special case solution (it
looks like concurrent MERGE might render the issue moot anyways).

I think anything doing table locks should be the "special case solution"
as production systems generally avoid full table locks like the plague.
The existing solution works fine as long as we explain that caveat (which
is a little bit of a corner case, else we'd have heard more complaints
before now).

- --
Greg Sabino Mullane greg@turnstep.com
End Point Corporation http://www.endpoint.com/
PGP Key: 0x14964AC8 201008051402
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8
-----BEGIN PGP SIGNATURE-----

iEYEAREDAAYFAkxa/XgACgkQvJuQZxSWSsjTbACfcjrsBVXCOGUb6foARfNIztSo
AswAn0bNttP8XOs/2tw6jFsSa0cZkq7e
=HUcq
-----END PGP SIGNATURE-----

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Merlin Moncure (#1)
Re: remove upsert example from docs

Merlin Moncure <mmoncure@gmail.com> writes:

Attached is a patch to remove the upsert example from the pl/pgsql
documentation. It has a serious bug (see:
http://www.spinics.net/lists/pgsql/msg112560.html) which is nontrivial
to fix. IMNSHO, our code examples should encourage good practices and
style.

I was not persuaded that there's a real bug in practice. IMO, his
problem was a broken trigger not broken upsert logic. Even if we
conclude this is unsafe, simply removing the example is of no help to
anyone. A more useful response would be to supply a correct example.

regards, tom lane

#5Andrew Dunstan
andrew@dunslane.net
In reply to: Tom Lane (#4)
Re: remove upsert example from docs

On 08/05/2010 02:09 PM, Tom Lane wrote:

Merlin Moncure<mmoncure@gmail.com> writes:

Attached is a patch to remove the upsert example from the pl/pgsql
documentation. It has a serious bug (see:
http://www.spinics.net/lists/pgsql/msg112560.html) which is nontrivial
to fix. IMNSHO, our code examples should encourage good practices and
style.

I was not persuaded that there's a real bug in practice. IMO, his
problem was a broken trigger not broken upsert logic. Even if we
conclude this is unsafe, simply removing the example is of no help to
anyone. A more useful response would be to supply a correct example.

Yeah, that's how it struck me just now. Maybe we should document that
the inserts had better not fire a trigger that could cause an uncaught
uniqueness violation exception. You could also possibly usefully prevent
infinite looping in such cases by using a limited loop rather an
unlimited loop.

cheers

andrew

#6Merlin Moncure
mmoncure@gmail.com
In reply to: Tom Lane (#4)
Re: remove upsert example from docs

On Thu, Aug 5, 2010 at 2:09 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Merlin Moncure <mmoncure@gmail.com> writes:

Attached is a patch to remove the upsert example from the pl/pgsql
documentation.  It has a serious bug (see:
http://www.spinics.net/lists/pgsql/msg112560.html) which is nontrivial
to fix.  IMNSHO, our code examples should encourage good practices and
style.

I was not persuaded that there's a real bug in practice.  IMO, his
problem was a broken trigger not broken upsert logic.  Even if we
conclude this is unsafe, simply removing the example is of no help to
anyone.

Well, the error handler is assuming that the unique_volation is coming
from the insert made within the loop. This is obviously not a safe
assumption in an infinite loop context. It should be double checking
where the error was being thrown from -- but the only way I can think
of to do that is to check sqlerrm. Or you arguing that if you're
doing this, all dependent triggers must not throw unique violations up
the exception chain?

Looping N times and punting is meh: since you have to now check in the
app, why have this mechanism at all?

A more useful response would be to supply a correct example.

Agree: I'd go further I would argue to supply both the 'safe' and
'high concurrency (with caveat)' way. I'm not saying the example is
necessarily bad, just that it's maybe not a good thing to be pointing
as a learning example without qualifications. Then you get a lesson
both on upsert methods and defensive error handling (barring
objection, I'll provide that).

merlin

#7Tom Lane
tgl@sss.pgh.pa.us
In reply to: Merlin Moncure (#6)
Re: remove upsert example from docs

Merlin Moncure <mmoncure@gmail.com> writes:

On Thu, Aug 5, 2010 at 2:09 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I was not persuaded that there's a real bug in practice. �IMO, his
problem was a broken trigger not broken upsert logic. �Even if we
conclude this is unsafe, simply removing the example is of no help to
anyone.

Well, the error handler is assuming that the unique_volation is coming
from the insert made within the loop. This is obviously not a safe
assumption in an infinite loop context.

Well, that's a fair point. Perhaps we should just add a note that if
there are any triggers that do additional inserts/updates, the exception
catcher had better check which table the unique_violation is being
reported for.

A more useful response would be to supply a correct example.

Agree: I'd go further I would argue to supply both the 'safe' and
'high concurrency (with caveat)' way. I'm not saying the example is
necessarily bad, just that it's maybe not a good thing to be pointing
as a learning example without qualifications. Then you get a lesson
both on upsert methods and defensive error handling (barring
objection, I'll provide that).

Have at it.

regards, tom lane

#8Marko Tiikkaja
marko.tiikkaja@cs.helsinki.fi
In reply to: Merlin Moncure (#6)
Re: remove upsert example from docs

On 8/5/2010 9:44 PM, Merlin Moncure wrote:

On Thu, Aug 5, 2010 at 2:09 PM, Tom Lane<tgl@sss.pgh.pa.us> wrote:

I was not persuaded that there's a real bug in practice. IMO, his
problem was a broken trigger not broken upsert logic. Even if we
conclude this is unsafe, simply removing the example is of no help to
anyone.

Well, the error handler is assuming that the unique_volation is coming
from the insert made within the loop. This is obviously not a safe
assumption in an infinite loop context. It should be double checking
where the error was being thrown from -- but the only way I can think
of to do that is to check sqlerrm.

Yeah, this is a known problem with our exception system. If there was
an easy and reliable way of knowing where the exception came from, I'm
sure the example would include that.

Or you arguing that if you're
doing this, all dependent triggers must not throw unique violations up
the exception chain?

If he isn't, I am. I'm pretty sure you can break every example in the
docs with a trigger (or a rule) you haven't thought through.

A more useful response would be to supply a correct example.

Agree: I'd go further I would argue to supply both the 'safe' and
'high concurrency (with caveat)' way. I'm not saying the example is
necessarily bad, just that it's maybe not a good thing to be pointing
as a learning example without qualifications. Then you get a lesson
both on upsert methods and defensive error handling (barring
objection, I'll provide that).

The problem with the "safe" way is that it's not safe if called in a
transaction with isolation level set to SERIALIZABLE.

Regards,
Marko Tiikkaja

#9Bruce Momjian
bruce@momjian.us
In reply to: Marko Tiikkaja (#8)
1 attachment(s)
Re: remove upsert example from docs

Marko Tiikkaja wrote:

On 8/5/2010 9:44 PM, Merlin Moncure wrote:

On Thu, Aug 5, 2010 at 2:09 PM, Tom Lane<tgl@sss.pgh.pa.us> wrote:

I was not persuaded that there's a real bug in practice. IMO, his
problem was a broken trigger not broken upsert logic. Even if we
conclude this is unsafe, simply removing the example is of no help to
anyone.

Well, the error handler is assuming that the unique_volation is coming
from the insert made within the loop. This is obviously not a safe
assumption in an infinite loop context. It should be double checking
where the error was being thrown from -- but the only way I can think
of to do that is to check sqlerrm.

Yeah, this is a known problem with our exception system. If there was
an easy and reliable way of knowing where the exception came from, I'm
sure the example would include that.

Or you arguing that if you're
doing this, all dependent triggers must not throw unique violations up
the exception chain?

If he isn't, I am. I'm pretty sure you can break every example in the
docs with a trigger (or a rule) you haven't thought through.

A more useful response would be to supply a correct example.

Agree: I'd go further I would argue to supply both the 'safe' and
'high concurrency (with caveat)' way. I'm not saying the example is
necessarily bad, just that it's maybe not a good thing to be pointing
as a learning example without qualifications. Then you get a lesson
both on upsert methods and defensive error handling (barring
objection, I'll provide that).

The problem with the "safe" way is that it's not safe if called in a
transaction with isolation level set to SERIALIZABLE.

Good analysis. Documentation patch attached and applied.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +

Attachments:

/rtmp/upsert.difftext/x-diffDownload
diff --git a/doc/src/sgml/plpgsql.sgml b/doc/src/sgml/plpgsql.sgml
index c342916..d2e74dc 100644
*** a/doc/src/sgml/plpgsql.sgml
--- b/doc/src/sgml/plpgsql.sgml
*************** BEGIN
*** 2464,2470 ****
              INSERT INTO db(a,b) VALUES (key, data);
              RETURN;
          EXCEPTION WHEN unique_violation THEN
!             -- do nothing, and loop to try the UPDATE again
          END;
      END LOOP;
  END;
--- 2464,2470 ----
              INSERT INTO db(a,b) VALUES (key, data);
              RETURN;
          EXCEPTION WHEN unique_violation THEN
!             -- Do nothing, and loop to try the UPDATE again.
          END;
      END LOOP;
  END;
*************** LANGUAGE plpgsql;
*** 2474,2480 ****
  SELECT merge_db(1, 'david');
  SELECT merge_db(1, 'dennis');
  </programlisting>
! 
      </para>
      </example>
    </sect2>
--- 2474,2483 ----
  SELECT merge_db(1, 'david');
  SELECT merge_db(1, 'dennis');
  </programlisting>
!      This example assumes the <literal>unique_violation</> error is caused by
!      the <command>INSERT</>, and not by an <command>INSERT</> trigger function
!      on the table.  Also, this example only works in the default Read
!      Committed transaction mode.
      </para>
      </example>
    </sect2>
#10Marko Tiikkaja
marko.tiikkaja@cs.helsinki.fi
In reply to: Bruce Momjian (#9)
Re: remove upsert example from docs

On 2011-02-17 8:37 PM +0200, Bruce Momjian wrote:

Marko Tiikkaja wrote:

The problem with the "safe" way is that it's not safe if called in a
transaction with isolation level set to SERIALIZABLE.

Good analysis. Documentation patch attached and applied.

The "safe way" I was referring to above was the LOCK TABLE method, not
the one described in the documentation, so the remark about READ
COMMITTED in the patch should be removed. The first part looks fine though.

Regards,
Marko Tiikkaja

#11Bruce Momjian
bruce@momjian.us
In reply to: Marko Tiikkaja (#10)
1 attachment(s)
Re: remove upsert example from docs

Marko Tiikkaja wrote:

On 2011-02-17 8:37 PM +0200, Bruce Momjian wrote:

Marko Tiikkaja wrote:

The problem with the "safe" way is that it's not safe if called in a
transaction with isolation level set to SERIALIZABLE.

Good analysis. Documentation patch attached and applied.

The "safe way" I was referring to above was the LOCK TABLE method, not
the one described in the documentation, so the remark about READ
COMMITTED in the patch should be removed. The first part looks fine though.

OK, sentence removed. Thanks.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +

Attachments:

/rtmp/difftext/x-diffDownload
commit 3472a2b0565ad0302e5ea47e49a95305c2b07f64
Author: Bruce Momjian <bruce@momjian.us>
Date:   Thu Feb 17 14:24:14 2011 -0500

    Remove doc mention about read committed in upsert example.

diff --git a/doc/src/sgml/plpgsql.sgml b/doc/src/sgml/plpgsql.sgml
index d2e74dc..d0672bb 100644
--- a/doc/src/sgml/plpgsql.sgml
+++ b/doc/src/sgml/plpgsql.sgml
@@ -2476,8 +2476,7 @@ SELECT merge_db(1, 'dennis');
 </programlisting>
      This example assumes the <literal>unique_violation</> error is caused by
      the <command>INSERT</>, and not by an <command>INSERT</> trigger function
-     on the table.  Also, this example only works in the default Read
-     Committed transaction mode.
+     on the table.
     </para>
     </example>
   </sect2>