possible race condition in trigger functions on insert operations?

Started by Kenneth Tiltonover 14 years ago13 messagesgeneral
Jump to latest
#1Kenneth Tilton
ktilton@mcna.net

Bit of a trigger NOOB Q:

I am trying to use a trigger function to automatically populate new
rows in a table with a public ID of the form YYYY-NNN such that the
42nd row created in 2011 would get the ID "2011-042". Each row is
associated via an iasid column with a row in an audit table that has a
timestamp column called created. This works OK, but I am worried about
two rows getting the same case_no if they come in at the same time
(whatever that means):

declare
case_yr integer;
yr_case_count bigint;
begin
select date_part('year', created) into case_yr
from audit
where audit.sid = NEW.iasid;

select count(*) into yr_case_count
from fwa_case, audit
where fwa_case.iasid=audit.sid
and date_part('year', created) = case_yr;

NEW.case_no = to_char( case_yr, '9999' ) || '-' ||
to_char(1+yr_case_count, 'FM000');
return NEW;
end;

Do I have to worry about this, or does ACID bail me out? If the
former, what do I do? I am thinking first put a uniqueness constraint
on the column and then figure out how to do retries in a trigger
function.

kenneth

#2David G. Johnston
david.g.johnston@gmail.com
In reply to: Kenneth Tilton (#1)
Re: possible race condition in trigger functions on insert operations?

-----Original Message-----
From: pgsql-general-owner@postgresql.org
[mailto:pgsql-general-owner@postgresql.org] On Behalf Of Kenneth Tilton
Sent: Tuesday, November 22, 2011 12:26 PM
To: pgsql-general@postgresql.org
Subject: [GENERAL] possible race condition in trigger functions on insert
operations?

Bit of a trigger NOOB Q:

I am trying to use a trigger function to automatically populate new rows in
a table with a public ID of the form YYYY-NNN such that the 42nd row
created in 2011 would get the ID "2011-042". Each row is associated via an
iasid column with a row in an audit table that has a timestamp column called
created. This works OK, but I am worried about two rows getting the same
case_no if they come in at the same time (whatever that means):

declare
case_yr integer;
yr_case_count bigint;
begin
select date_part('year', created) into case_yr
from audit
where audit.sid = NEW.iasid;

select count(*) into yr_case_count
from fwa_case, audit
where fwa_case.iasid=audit.sid
and date_part('year', created) = case_yr;

NEW.case_no = to_char( case_yr, '9999' ) || '-' ||
to_char(1+yr_case_count, 'FM000');
return NEW;
end;

Do I have to worry about this, or does ACID bail me out? If the former, what
do I do? I am thinking first put a uniqueness constraint on the column and
then figure out how to do retries in a trigger function.

kenneth
--------------------------------------------------------

Why can't you just use a sequence? Yes, gaps end up being possible but that
usually is not a big issue and you will avoid any possibility of collision.

David J.

#3Andreas Kretschmer
akretschmer@spamfence.net
In reply to: Kenneth Tilton (#1)
Re: possible race condition in trigger functions on insert operations?

Kenneth Tilton <ktilton@mcna.net> wrote:

Bit of a trigger NOOB Q:

I am trying to use a trigger function to automatically populate new
rows in a table with a public ID of the form YYYY-NNN such that the
42nd row created in 2011 would get the ID "2011-042". Each row is
associated via an iasid column with a row in an audit table that has a
timestamp column called created. This works OK, but I am worried about
two rows getting the same case_no if they come in at the same time
(whatever that means):

declare
case_yr integer;
yr_case_count bigint;
begin
select date_part('year', created) into case_yr
from audit
where audit.sid = NEW.iasid;

select count(*) into yr_case_count
from fwa_case, audit
where fwa_case.iasid=audit.sid
and date_part('year', created) = case_yr;

NEW.case_no = to_char( case_yr, '9999' ) || '-' ||
to_char(1+yr_case_count, 'FM000');
return NEW;
end;

If i where you, i would not use such a column. What happens if you
insert/delete a record?

I would use something like to_char(row_number() over (...),'FM000') to
count while select.

Andreas
--
Really, I'm not out to destroy Microsoft. That will just be a completely
unintentional side effect. (Linus Torvalds)
"If I was god, I would recompile penguin with --enable-fly." (unknown)
Kaufbach, Saxony, Germany, Europe. N 51.05082�, E 13.56889�

#4Merlin Moncure
mmoncure@gmail.com
In reply to: Kenneth Tilton (#1)
Re: possible race condition in trigger functions on insert operations?

On Tue, Nov 22, 2011 at 11:25 AM, Kenneth Tilton <ktilton@mcna.net> wrote:

Bit of a trigger NOOB Q:

I am trying to use a trigger function to automatically populate new
rows in a table with  a public ID of the form YYYY-NNN such that the
42nd row created in 2011 would get the ID "2011-042". Each row is
associated via an iasid column with a row in an audit table that has a
timestamp column called created. This works OK, but I am worried about
two rows getting the same case_no if they come in at the same time
(whatever that means):

declare
 case_yr integer;
 yr_case_count bigint;
begin
 select date_part('year', created) into case_yr
        from audit
        where audit.sid = NEW.iasid;

 select count(*) into yr_case_count
   from fwa_case, audit
   where fwa_case.iasid=audit.sid
     and date_part('year', created) = case_yr;

 NEW.case_no = to_char( case_yr, '9999' ) || '-' ||
to_char(1+yr_case_count, 'FM000');
 return NEW;
end;

Do I have to worry about this, or does ACID bail me out? If the
former, what do I do? I am thinking first put a uniqueness constraint
on the column and then figure out how to do retries in a trigger
function.

ACID does not bail you out -- you've put no synchonization in to
prevent to concurrent counts coming at roughly the same time and
getting the same answer. A uniqueness constraint is definitely a good
idea. In terms of doing a gapless sequence generally, see here:
http://www.varlena.com/GeneralBits/130.php. Basically the general
idea is to keep a counter field somewhere that you lock and update.

merlin

#5Kenneth Tilton
ktilton@mcna.net
In reply to: David G. Johnston (#2)
Re: possible race condition in trigger functions on insert operations?

On Tue, Nov 22, 2011 at 12:48 PM, David Johnston <polobo@yahoo.com> wrote:

-----Original Message-----
From: pgsql-general-owner@postgresql.org
[mailto:pgsql-general-owner@postgresql.org] On Behalf Of Kenneth Tilton
Sent: Tuesday, November 22, 2011 12:26 PM
To: pgsql-general@postgresql.org
Subject: [GENERAL] possible race condition in trigger functions on insert
operations?

Bit of a trigger NOOB Q:

I am trying to use a trigger function to automatically populate new rows in
a table with  a public ID of the form YYYY-NNN such that the 42nd row
created in 2011 would get the ID "2011-042". Each row is associated via an
iasid column with a row in an audit table that has a timestamp column called
created. This works OK, but I am worried about two rows getting the same
case_no if they come in at the same time (whatever that means):

declare
 case_yr integer;
 yr_case_count bigint;
begin
 select date_part('year', created) into case_yr
        from audit
        where audit.sid = NEW.iasid;

 select count(*) into yr_case_count
   from fwa_case, audit
   where fwa_case.iasid=audit.sid
     and date_part('year', created) = case_yr;

 NEW.case_no = to_char( case_yr, '9999' ) || '-' ||
to_char(1+yr_case_count, 'FM000');
 return NEW;
end;

Do I have to worry about this, or does ACID bail me out? If the former, what
do I do? I am thinking first put a uniqueness constraint on the column and
then figure out how to do retries in a trigger function.

kenneth
--------------------------------------------------------

Why can't you just use a sequence?

The sequence has to be within the year. Someone suggested a cron job
to reset the sequence at the beginning of the year but I find that
alternative unappealing for some reason.

-kt

#6Kenneth Tilton
ktilton@mcna.net
In reply to: Andreas Kretschmer (#3)
Re: possible race condition in trigger functions on insert operations?

On Tue, Nov 22, 2011 at 12:53 PM, Andreas Kretschmer
<akretschmer@spamfence.net> wrote:

Kenneth Tilton <ktilton@mcna.net> wrote:

Bit of a trigger NOOB Q:

I am trying to use a trigger function to automatically populate new
rows in a table with  a public ID of the form YYYY-NNN such that the
42nd row created in 2011 would get the ID "2011-042". Each row is
associated via an iasid column with a row in an audit table that has a
timestamp column called created. This works OK, but I am worried about
two rows getting the same case_no if they come in at the same time
(whatever that means):

declare
  case_yr integer;
  yr_case_count bigint;
begin
  select date_part('year', created) into case_yr
         from audit
         where audit.sid = NEW.iasid;

  select count(*) into yr_case_count
    from fwa_case, audit
    where fwa_case.iasid=audit.sid
      and date_part('year', created) = case_yr;

  NEW.case_no = to_char( case_yr, '9999' ) || '-' ||
to_char(1+yr_case_count, 'FM000');
  return NEW;
end;

If i where you, i would not use such a column. What happens if you
insert/delete a record?

We only do logical deletes. Not sure what you mean about inserts --
that is what I am working on, and they always should get the next
highest sequence number in a year.

I would use something like to_char(row_number() over (...),'FM000') to
count while select.

Unfortunately it must be within the year, not overall.

-kt

#7Kenneth Tilton
ktilton@mcna.net
In reply to: Merlin Moncure (#4)
Re: possible race condition in trigger functions on insert operations?

Thanks, that's perfect.

-kenneth

Show quoted text

On Tue, Nov 22, 2011 at 12:53 PM, Merlin Moncure <mmoncure@gmail.com> wrote:

On Tue, Nov 22, 2011 at 11:25 AM, Kenneth Tilton <ktilton@mcna.net> wrote:

Bit of a trigger NOOB Q:

I am trying to use a trigger function to automatically populate new
rows in a table with  a public ID of the form YYYY-NNN such that the
42nd row created in 2011 would get the ID "2011-042". Each row is
associated via an iasid column with a row in an audit table that has a
timestamp column called created. This works OK, but I am worried about
two rows getting the same case_no if they come in at the same time
(whatever that means):

declare
 case_yr integer;
 yr_case_count bigint;
begin
 select date_part('year', created) into case_yr
        from audit
        where audit.sid = NEW.iasid;

 select count(*) into yr_case_count
   from fwa_case, audit
   where fwa_case.iasid=audit.sid
     and date_part('year', created) = case_yr;

 NEW.case_no = to_char( case_yr, '9999' ) || '-' ||
to_char(1+yr_case_count, 'FM000');
 return NEW;
end;

Do I have to worry about this, or does ACID bail me out? If the
former, what do I do? I am thinking first put a uniqueness constraint
on the column and then figure out how to do retries in a trigger
function.

ACID does not bail you out -- you've put no synchonization in to
prevent to concurrent counts coming at roughly the same time and
getting the same answer.  A uniqueness constraint is definitely a good
idea.  In terms of doing a gapless sequence generally, see here:
http://www.varlena.com/GeneralBits/130.php.  Basically the general
idea is to keep a counter field somewhere that you lock and update.

merlin

#8Andreas Kretschmer
akretschmer@spamfence.net
In reply to: Kenneth Tilton (#6)
Re: possible race condition in trigger functions on insert operations?

Kenneth Tilton <ktilton@mcna.net> wrote:

On Tue, Nov 22, 2011 at 12:53 PM, Andreas Kretschmer
<akretschmer@spamfence.net> wrote:

Kenneth Tilton <ktilton@mcna.net> wrote:

Bit of a trigger NOOB Q:

I am trying to use a trigger function to automatically populate new
rows in a table with �a public ID of the form YYYY-NNN such that the
42nd row created in 2011 would get the ID "2011-042". Each row is
associated via an iasid column with a row in an audit table that has a
timestamp column called created. This works OK, but I am worried about
two rows getting the same case_no if they come in at the same time
(whatever that means):

declare
� case_yr integer;
� yr_case_count bigint;
begin
� select date_part('year', created) into case_yr
� � � � �from audit
� � � � �where audit.sid = NEW.iasid;

� select count(*) into yr_case_count
� � from fwa_case, audit
� � where fwa_case.iasid=audit.sid
� � � and date_part('year', created) = case_yr;

� NEW.case_no = to_char( case_yr, '9999' ) || '-' ||
to_char(1+yr_case_count, 'FM000');
� return NEW;
end;

If i where you, i would not use such a column. What happens if you
insert/delete a record?

We only do logical deletes. Not sure what you mean about inserts --

inserts that change the numbering...

that is what I am working on, and they always should get the next
highest sequence number in a year.

I would use something like to_char(row_number() over (...),'FM000') to
count while select.

Unfortunately it must be within the year, not overall.

That is not a problem - over (...) with, for instance, PARTITION BY ...

Andreas
--
Really, I'm not out to destroy Microsoft. That will just be a completely
unintentional side effect. (Linus Torvalds)
"If I was god, I would recompile penguin with --enable-fly." (unknown)
Kaufbach, Saxony, Germany, Europe. N 51.05082�, E 13.56889�

#9Kenneth Tilton
ktilton@mcna.net
In reply to: Andreas Kretschmer (#8)
Re: possible race condition in trigger functions on insert operations?

On Tue, Nov 22, 2011 at 2:05 PM, Andreas Kretschmer
<akretschmer@spamfence.net> wrote:

Kenneth Tilton <ktilton@mcna.net> wrote:

On Tue, Nov 22, 2011 at 12:53 PM, Andreas Kretschmer
<akretschmer@spamfence.net> wrote:

Kenneth Tilton <ktilton@mcna.net> wrote:

Bit of a trigger NOOB Q:

I am trying to use a trigger function to automatically populate new
rows in a table with  a public ID of the form YYYY-NNN such that the
42nd row created in 2011 would get the ID "2011-042". Each row is
associated via an iasid column with a row in an audit table that has a
timestamp column called created. This works OK, but I am worried about
two rows getting the same case_no if they come in at the same time
(whatever that means):

declare
  case_yr integer;
  yr_case_count bigint;
begin
  select date_part('year', created) into case_yr
         from audit
         where audit.sid = NEW.iasid;

  select count(*) into yr_case_count
    from fwa_case, audit
    where fwa_case.iasid=audit.sid
      and date_part('year', created) = case_yr;

  NEW.case_no = to_char( case_yr, '9999' ) || '-' ||
to_char(1+yr_case_count, 'FM000');
  return NEW;
end;

If i where you, i would not use such a column. What happens if you
insert/delete a record?

We only do logical deletes. Not sure what you mean about inserts --

inserts that change the numbering...

that is what I am working on, and they always should get the next
highest sequence number in a year.

I would use something like to_char(row_number() over (...),'FM000') to
count while select.

Unfortunately it must be within the year, not overall.

That is not a problem - over (...) with, for instance, PARTITION BY ...

Ah, OK, I did not understand what they meant by "partition" when I
checked out row_number. I'll give that a try.

First Q: does a row have a row_number at the time of insert/before?
Or will I be updating the row in the insert/after? I should know
soon... :)

-kenneth

#10David G. Johnston
david.g.johnston@gmail.com
In reply to: Kenneth Tilton (#5)
Re: possible race condition in trigger functions on insert operations?

-----Original Message-----
From: Kenneth Tilton [mailto:ktilton@mcna.net]
Sent: Tuesday, November 22, 2011 1:52 PM
To: David Johnston
Cc: pgsql-general@postgresql.org
Subject: Re: [GENERAL] possible race condition in trigger functions on
insert operations?

On Tue, Nov 22, 2011 at 12:48 PM, David Johnston <polobo@yahoo.com> wrote:

-----Original Message-----
From: pgsql-general-owner@postgresql.org
[mailto:pgsql-general-owner@postgresql.org] On Behalf Of Kenneth
Tilton
Sent: Tuesday, November 22, 2011 12:26 PM
To: pgsql-general@postgresql.org
Subject: [GENERAL] possible race condition in trigger functions on
insert operations?

Bit of a trigger NOOB Q:

I am trying to use a trigger function to automatically populate new
rows in a table with  a public ID of the form YYYY-NNN such that the
42nd row created in 2011 would get the ID "2011-042". Each row is
associated via an iasid column with a row in an audit table that has a
timestamp column called created. This works OK, but I am worried about
two rows getting the same case_no if they come in at the same time

(whatever that means):

declare
 case_yr integer;
 yr_case_count bigint;
begin
 select date_part('year', created) into case_yr
        from audit
        where audit.sid = NEW.iasid;

 select count(*) into yr_case_count
   from fwa_case, audit
   where fwa_case.iasid=audit.sid
     and date_part('year', created) = case_yr;

 NEW.case_no = to_char( case_yr, '9999' ) || '-' ||
to_char(1+yr_case_count, 'FM000');
 return NEW;
end;

Do I have to worry about this, or does ACID bail me out? If the
former, what do I do? I am thinking first put a uniqueness constraint
on the column and then figure out how to do retries in a trigger function.

kenneth
--------------------------------------------------------

Why can't you just use a sequence?

The sequence has to be within the year. Someone suggested a cron job to
reset the sequence at the beginning of the year but I find that alternative
unappealing for some reason.

----------------------------------------------

Just create a single sequence for each year and then call the proper one
on-the-fly. You can create multiple sequences in advance and possible even
auto-create the sequence the first time one is attempted to be used in a
given year. If you can live with possible (but probably unlikely) gaps in
the sequence then all the concurrency will be handled for you and you can
focus on writing a function that, given a year, will return the proper
value.

David J.

#11Merlin Moncure
mmoncure@gmail.com
In reply to: David G. Johnston (#10)
Re: possible race condition in trigger functions on insert operations?

On Tue, Nov 22, 2011 at 2:43 PM, David Johnston <polobo@yahoo.com> wrote:

Just create a single sequence for each year and then call the proper one
on-the-fly.  You can create multiple sequences in advance and possible even
auto-create the sequence the first time one is attempted to be used in a
given year.  If you can live with possible (but probably unlikely) gaps in
the sequence then all the concurrency will be handled for you and you can
focus on writing a function that, given a year, will return the proper
value.

I personally think the 'record the next to be inserted value' in a
table somewhere is better unless you are trying to support a lot of
concurrent operations. Also the gap issue is more likely to come up
than you're letting on -- a rolled back transaction is all it takes.

merlin

#12Kenneth Tilton
ktilton@mcna.net
In reply to: David G. Johnston (#10)
Re: possible race condition in trigger functions on insert operations?

Pre-created sequences per year. Not a bad idea at all. I had not been
worrying about gaps but I checked with the user and they definitely do
not want gaps. But as long as we do not delete, is there still a risk
of gaps?

I am leaning towards a serial column maintained by postgres, using
row_number partitioned by year ordered by the serial column. No gaps
since we never delete, and no concurrency issues. I think. :)

Might also go for allocating the IDs from a separate table.

-kenneth

Show quoted text

On Tue, Nov 22, 2011 at 3:43 PM, David Johnston <polobo@yahoo.com> wrote:

-----Original Message-----
From: Kenneth Tilton [mailto:ktilton@mcna.net]
Sent: Tuesday, November 22, 2011 1:52 PM
To: David Johnston
Cc: pgsql-general@postgresql.org
Subject: Re: [GENERAL] possible race condition in trigger functions on
insert operations?

On Tue, Nov 22, 2011 at 12:48 PM, David Johnston <polobo@yahoo.com> wrote:

-----Original Message-----
From: pgsql-general-owner@postgresql.org
[mailto:pgsql-general-owner@postgresql.org] On Behalf Of Kenneth
Tilton
Sent: Tuesday, November 22, 2011 12:26 PM
To: pgsql-general@postgresql.org
Subject: [GENERAL] possible race condition in trigger functions on
insert operations?

Bit of a trigger NOOB Q:

I am trying to use a trigger function to automatically populate new
rows in a table with  a public ID of the form YYYY-NNN such that the
42nd row created in 2011 would get the ID "2011-042". Each row is
associated via an iasid column with a row in an audit table that has a
timestamp column called created. This works OK, but I am worried about
two rows getting the same case_no if they come in at the same time

(whatever that means):

declare
 case_yr integer;
 yr_case_count bigint;
begin
 select date_part('year', created) into case_yr
        from audit
        where audit.sid = NEW.iasid;

 select count(*) into yr_case_count
   from fwa_case, audit
   where fwa_case.iasid=audit.sid
     and date_part('year', created) = case_yr;

 NEW.case_no = to_char( case_yr, '9999' ) || '-' ||
to_char(1+yr_case_count, 'FM000');
 return NEW;
end;

Do I have to worry about this, or does ACID bail me out? If the
former, what do I do? I am thinking first put a uniqueness constraint
on the column and then figure out how to do retries in a trigger function.

kenneth
--------------------------------------------------------

Why can't you just use a sequence?

The sequence has to be within the year. Someone suggested a cron job to
reset the sequence at the beginning of the year but I find that alternative
unappealing for some reason.

----------------------------------------------

Just create a single sequence for each year and then call the proper one
on-the-fly.  You can create multiple sequences in advance and possible even
auto-create the sequence the first time one is attempted to be used in a
given year.  If you can live with possible (but probably unlikely) gaps in
the sequence then all the concurrency will be handled for you and you can
focus on writing a function that, given a year, will return the proper
value.

David J.

#13Kenneth Tilton
ktilton@mcna.net
In reply to: Merlin Moncure (#11)
Re: possible race condition in trigger functions on insert operations?

On Tue, Nov 22, 2011 at 3:52 PM, Merlin Moncure <mmoncure@gmail.com> wrote:

On Tue, Nov 22, 2011 at 2:43 PM, David Johnston <polobo@yahoo.com> wrote:

Just create a single sequence for each year and then call the proper one
on-the-fly.  You can create multiple sequences in advance and possible even
auto-create the sequence the first time one is attempted to be used in a
given year.  If you can live with possible (but probably unlikely) gaps in
the sequence then all the concurrency will be handled for you and you can
focus on writing a function that, given a year, will return the proper
value.

I personally think the 'record the next to be inserted value' in a
table somewhere is better unless you are trying to support a lot of
concurrent operations.  Also the gap issue is more likely to come up
than you're letting on -- a rolled back transaction is all it takes.

Yeah, using a table seems to have the advantage of being about the
only thing that would work, though I think row_number properly used
(with an invariant ordering and no deletes) also works.

-kenneth