Thread: possible race condition in trigger functions on insert operations?

possible race condition in trigger functions on insert operations?

From

Kenneth Tilton

Date:

22 November 2011, 13:37:17

Bit of a trigger NOOB Q:

I am trying to use a trigger function to automatically populate new
rows in a table with  a public ID of the form YYYY-NNN such that the
42nd row created in 2011 would get the ID "2011-042". Each row is
associated via an iasid column with a row in an audit table that has a
timestamp column called created. This works OK, but I am worried about
two rows getting the same case_no if they come in at the same time
(whatever that means):

declare
  case_yr integer;
  yr_case_count bigint;
begin
  select date_part('year', created) into case_yr
         from audit
         where audit.sid = NEW.iasid;

  select count(*) into yr_case_count
    from fwa_case, audit
    where fwa_case.iasid=audit.sid
      and date_part('year', created) = case_yr;

  NEW.case_no = to_char( case_yr, '9999' ) || '-' ||
to_char(1+yr_case_count, 'FM000');
  return NEW;
end;

Do I have to worry about this, or does ACID bail me out? If the
former, what do I do? I am thinking first put a uniqueness constraint
on the column and then figure out how to do retries in a trigger
function.

kenneth

Re: possible race condition in trigger functions on insert operations?

From

"David Johnston"

Date:

22 November 2011, 13:48:42

-----Original Message-----
From: pgsql-general-owner@postgresql.org
[mailto:pgsql-general-owner@postgresql.org] On Behalf Of Kenneth Tilton
Sent: Tuesday, November 22, 2011 12:26 PM
To: pgsql-general@postgresql.org
Subject: [GENERAL] possible race condition in trigger functions on insert
operations?

Bit of a trigger NOOB Q:

I am trying to use a trigger function to automatically populate new rows in
a table with  a public ID of the form YYYY-NNN such that the 42nd row
created in 2011 would get the ID "2011-042". Each row is associated via an
iasid column with a row in an audit table that has a timestamp column called
created. This works OK, but I am worried about two rows getting the same
case_no if they come in at the same time (whatever that means):

declare
  case_yr integer;
  yr_case_count bigint;
begin
  select date_part('year', created) into case_yr
         from audit
         where audit.sid = NEW.iasid;

  select count(*) into yr_case_count
    from fwa_case, audit
    where fwa_case.iasid=audit.sid
      and date_part('year', created) = case_yr;

  NEW.case_no = to_char( case_yr, '9999' ) || '-' ||
to_char(1+yr_case_count, 'FM000');
  return NEW;
end;

Do I have to worry about this, or does ACID bail me out? If the former, what
do I do? I am thinking first put a uniqueness constraint on the column and
then figure out how to do retries in a trigger function.

kenneth
--------------------------------------------------------

Why can't you just use a sequence?  Yes, gaps end up being possible but that
usually is not a big issue and you will avoid any possibility of collision.

David J.

Re: possible race condition in trigger functions on insert operations?

From

Andreas Kretschmer

Date:

22 November 2011, 13:54:01

Kenneth Tilton <ktilton@mcna.net> wrote:

> Bit of a trigger NOOB Q:
>
> I am trying to use a trigger function to automatically populate new
> rows in a table with  a public ID of the form YYYY-NNN such that the
> 42nd row created in 2011 would get the ID "2011-042". Each row is
> associated via an iasid column with a row in an audit table that has a
> timestamp column called created. This works OK, but I am worried about
> two rows getting the same case_no if they come in at the same time
> (whatever that means):
>
> declare
>   case_yr integer;
>   yr_case_count bigint;
> begin
>   select date_part('year', created) into case_yr
>          from audit
>          where audit.sid = NEW.iasid;
>
>   select count(*) into yr_case_count
>     from fwa_case, audit
>     where fwa_case.iasid=audit.sid
>       and date_part('year', created) = case_yr;
>
>   NEW.case_no = to_char( case_yr, '9999' ) || '-' ||
> to_char(1+yr_case_count, 'FM000');
>   return NEW;
> end;

If i where you, i would not use such a column. What happens if you
insert/delete a record?

I would use something like to_char(row_number() over (...),'FM000') to
count while select.


Andreas
--
Really, I'm not out to destroy Microsoft. That will just be a completely
unintentional side effect.                              (Linus Torvalds)
"If I was god, I would recompile penguin with --enable-fly."   (unknown)
Kaufbach, Saxony, Germany, Europe.              N 51.05082°, E 13.56889°

Re: possible race condition in trigger functions on insert operations?

From

Merlin Moncure

Date:

22 November 2011, 13:54:13

On Tue, Nov 22, 2011 at 11:25 AM, Kenneth Tilton <ktilton@mcna.net> wrote:
> Bit of a trigger NOOB Q:
>
> I am trying to use a trigger function to automatically populate new
> rows in a table with  a public ID of the form YYYY-NNN such that the
> 42nd row created in 2011 would get the ID "2011-042". Each row is
> associated via an iasid column with a row in an audit table that has a
> timestamp column called created. This works OK, but I am worried about
> two rows getting the same case_no if they come in at the same time
> (whatever that means):
>
> declare
>  case_yr integer;
>  yr_case_count bigint;
> begin
>  select date_part('year', created) into case_yr
>         from audit
>         where audit.sid = NEW.iasid;
>
>  select count(*) into yr_case_count
>    from fwa_case, audit
>    where fwa_case.iasid=audit.sid
>      and date_part('year', created) = case_yr;
>
>  NEW.case_no = to_char( case_yr, '9999' ) || '-' ||
> to_char(1+yr_case_count, 'FM000');
>  return NEW;
> end;
>
> Do I have to worry about this, or does ACID bail me out? If the
> former, what do I do? I am thinking first put a uniqueness constraint
> on the column and then figure out how to do retries in a trigger
> function.

ACID does not bail you out -- you've put no synchonization in to
prevent to concurrent counts coming at roughly the same time and
getting the same answer.  A uniqueness constraint is definitely a good
idea.  In terms of doing a gapless sequence generally, see here:
http://www.varlena.com/GeneralBits/130.php.  Basically the general
idea is to keep a counter field somewhere that you lock and update.

merlin

Re: possible race condition in trigger functions on insert operations?

From

Kenneth Tilton

Date:

22 November 2011, 14:54:59

On Tue, Nov 22, 2011 at 12:48 PM, David Johnston <polobo@yahoo.com> wrote:
> -----Original Message-----
> From: pgsql-general-owner@postgresql.org
> [mailto:pgsql-general-owner@postgresql.org] On Behalf Of Kenneth Tilton
> Sent: Tuesday, November 22, 2011 12:26 PM
> To: pgsql-general@postgresql.org
> Subject: [GENERAL] possible race condition in trigger functions on insert
> operations?
>
> Bit of a trigger NOOB Q:
>
> I am trying to use a trigger function to automatically populate new rows in
> a table with  a public ID of the form YYYY-NNN such that the 42nd row
> created in 2011 would get the ID "2011-042". Each row is associated via an
> iasid column with a row in an audit table that has a timestamp column called
> created. This works OK, but I am worried about two rows getting the same
> case_no if they come in at the same time (whatever that means):
>
> declare
>  case_yr integer;
>  yr_case_count bigint;
> begin
>  select date_part('year', created) into case_yr
>         from audit
>         where audit.sid = NEW.iasid;
>
>  select count(*) into yr_case_count
>    from fwa_case, audit
>    where fwa_case.iasid=audit.sid
>      and date_part('year', created) = case_yr;
>
>  NEW.case_no = to_char( case_yr, '9999' ) || '-' ||
> to_char(1+yr_case_count, 'FM000');
>  return NEW;
> end;
>
> Do I have to worry about this, or does ACID bail me out? If the former, what
> do I do? I am thinking first put a uniqueness constraint on the column and
> then figure out how to do retries in a trigger function.
>
> kenneth
> --------------------------------------------------------
>
> Why can't you just use a sequence?

The sequence has to be within the year. Someone suggested a cron job
to reset the sequence at the beginning of the year but I find that
alternative unappealing for some reason.

-kt

Re: possible race condition in trigger functions on insert operations?

From

Kenneth Tilton

Date:

22 November 2011, 14:56:22

On Tue, Nov 22, 2011 at 12:53 PM, Andreas Kretschmer
<akretschmer@spamfence.net> wrote:
> Kenneth Tilton <ktilton@mcna.net> wrote:
>
>> Bit of a trigger NOOB Q:
>>
>> I am trying to use a trigger function to automatically populate new
>> rows in a table with  a public ID of the form YYYY-NNN such that the
>> 42nd row created in 2011 would get the ID "2011-042". Each row is
>> associated via an iasid column with a row in an audit table that has a
>> timestamp column called created. This works OK, but I am worried about
>> two rows getting the same case_no if they come in at the same time
>> (whatever that means):
>>
>> declare
>>   case_yr integer;
>>   yr_case_count bigint;
>> begin
>>   select date_part('year', created) into case_yr
>>          from audit
>>          where audit.sid = NEW.iasid;
>>
>>   select count(*) into yr_case_count
>>     from fwa_case, audit
>>     where fwa_case.iasid=audit.sid
>>       and date_part('year', created) = case_yr;
>>
>>   NEW.case_no = to_char( case_yr, '9999' ) || '-' ||
>> to_char(1+yr_case_count, 'FM000');
>>   return NEW;
>> end;
>
> If i where you, i would not use such a column. What happens if you
> insert/delete a record?

We only do logical deletes. Not sure what you mean about inserts --
that is what I am working on, and they always should get the next
highest sequence number in a year.

>
> I would use something like to_char(row_number() over (...),'FM000') to
> count while select.

Unfortunately it must be within the year, not overall.

-kt

Re: possible race condition in trigger functions on insert operations?

From

Kenneth Tilton

Date:

22 November 2011, 15:03:01

Thanks, that's perfect.

-kenneth

On Tue, Nov 22, 2011 at 12:53 PM, Merlin Moncure <mmoncure@gmail.com> wrote:
> On Tue, Nov 22, 2011 at 11:25 AM, Kenneth Tilton <ktilton@mcna.net> wrote:
>> Bit of a trigger NOOB Q:
>>
>> I am trying to use a trigger function to automatically populate new
>> rows in a table with  a public ID of the form YYYY-NNN such that the
>> 42nd row created in 2011 would get the ID "2011-042". Each row is
>> associated via an iasid column with a row in an audit table that has a
>> timestamp column called created. This works OK, but I am worried about
>> two rows getting the same case_no if they come in at the same time
>> (whatever that means):
>>
>> declare
>>  case_yr integer;
>>  yr_case_count bigint;
>> begin
>>  select date_part('year', created) into case_yr
>>         from audit
>>         where audit.sid = NEW.iasid;
>>
>>  select count(*) into yr_case_count
>>    from fwa_case, audit
>>    where fwa_case.iasid=audit.sid
>>      and date_part('year', created) = case_yr;
>>
>>  NEW.case_no = to_char( case_yr, '9999' ) || '-' ||
>> to_char(1+yr_case_count, 'FM000');
>>  return NEW;
>> end;
>>
>> Do I have to worry about this, or does ACID bail me out? If the
>> former, what do I do? I am thinking first put a uniqueness constraint
>> on the column and then figure out how to do retries in a trigger
>> function.
>
> ACID does not bail you out -- you've put no synchonization in to
> prevent to concurrent counts coming at roughly the same time and
> getting the same answer.  A uniqueness constraint is definitely a good
> idea.  In terms of doing a gapless sequence generally, see here:
> http://www.varlena.com/GeneralBits/130.php.  Basically the general
> idea is to keep a counter field somewhere that you lock and update.
>
> merlin
>

Re: possible race condition in trigger functions on insert operations?

From

Andreas Kretschmer

Date:

22 November 2011, 15:06:11

Kenneth Tilton <ktilton@mcna.net> wrote:

> On Tue, Nov 22, 2011 at 12:53 PM, Andreas Kretschmer
> <akretschmer@spamfence.net> wrote:
> > Kenneth Tilton <ktilton@mcna.net> wrote:
> >
> >> Bit of a trigger NOOB Q:
> >>
> >> I am trying to use a trigger function to automatically populate new
> >> rows in a table with  a public ID of the form YYYY-NNN such that the
> >> 42nd row created in 2011 would get the ID "2011-042". Each row is
> >> associated via an iasid column with a row in an audit table that has a
> >> timestamp column called created. This works OK, but I am worried about
> >> two rows getting the same case_no if they come in at the same time
> >> (whatever that means):
> >>
> >> declare
> >>   case_yr integer;
> >>   yr_case_count bigint;
> >> begin
> >>   select date_part('year', created) into case_yr
> >>          from audit
> >>          where audit.sid = NEW.iasid;
> >>
> >>   select count(*) into yr_case_count
> >>     from fwa_case, audit
> >>     where fwa_case.iasid=audit.sid
> >>       and date_part('year', created) = case_yr;
> >>
> >>   NEW.case_no = to_char( case_yr, '9999' ) || '-' ||
> >> to_char(1+yr_case_count, 'FM000');
> >>   return NEW;
> >> end;
> >
> > If i where you, i would not use such a column. What happens if you
> > insert/delete a record?
>
> We only do logical deletes. Not sure what you mean about inserts --

inserts that change the numbering...


> that is what I am working on, and they always should get the next
> highest sequence number in a year.
>
> >
> > I would use something like to_char(row_number() over (...),'FM000') to
> > count while select.
>
> Unfortunately it must be within the year, not overall.

That is not a problem - over (...) with, for instance, PARTITION BY ...



Andreas
--
Really, I'm not out to destroy Microsoft. That will just be a completely
unintentional side effect.                              (Linus Torvalds)
"If I was god, I would recompile penguin with --enable-fly."   (unknown)
Kaufbach, Saxony, Germany, Europe.              N 51.05082°, E 13.56889°

Re: possible race condition in trigger functions on insert operations?

From

Kenneth Tilton

Date:

22 November 2011, 15:13:17

On Tue, Nov 22, 2011 at 2:05 PM, Andreas Kretschmer
<akretschmer@spamfence.net> wrote:
> Kenneth Tilton <ktilton@mcna.net> wrote:
>
>> On Tue, Nov 22, 2011 at 12:53 PM, Andreas Kretschmer
>> <akretschmer@spamfence.net> wrote:
>> > Kenneth Tilton <ktilton@mcna.net> wrote:
>> >
>> >> Bit of a trigger NOOB Q:
>> >>
>> >> I am trying to use a trigger function to automatically populate new
>> >> rows in a table with  a public ID of the form YYYY-NNN such that the
>> >> 42nd row created in 2011 would get the ID "2011-042". Each row is
>> >> associated via an iasid column with a row in an audit table that has a
>> >> timestamp column called created. This works OK, but I am worried about
>> >> two rows getting the same case_no if they come in at the same time
>> >> (whatever that means):
>> >>
>> >> declare
>> >>   case_yr integer;
>> >>   yr_case_count bigint;
>> >> begin
>> >>   select date_part('year', created) into case_yr
>> >>          from audit
>> >>          where audit.sid = NEW.iasid;
>> >>
>> >>   select count(*) into yr_case_count
>> >>     from fwa_case, audit
>> >>     where fwa_case.iasid=audit.sid
>> >>       and date_part('year', created) = case_yr;
>> >>
>> >>   NEW.case_no = to_char( case_yr, '9999' ) || '-' ||
>> >> to_char(1+yr_case_count, 'FM000');
>> >>   return NEW;
>> >> end;
>> >
>> > If i where you, i would not use such a column. What happens if you
>> > insert/delete a record?
>>
>> We only do logical deletes. Not sure what you mean about inserts --
>
> inserts that change the numbering...
>
>
>> that is what I am working on, and they always should get the next
>> highest sequence number in a year.
>>
>> >
>> > I would use something like to_char(row_number() over (...),'FM000') to
>> > count while select.
>>
>> Unfortunately it must be within the year, not overall.
>
> That is not a problem - over (...) with, for instance, PARTITION BY ...

Ah, OK, I did not understand what they meant by "partition" when I
checked out row_number. I'll give that a try.

First Q: does  a row have a row_number at the time of insert/before?
Or will I be updating the row in the insert/after? I should know
soon... :)

-kenneth

Re: possible race condition in trigger functions on insert operations?

From

"David Johnston"

Date:

22 November 2011, 16:43:37

-----Original Message-----
From: Kenneth Tilton [mailto:ktilton@mcna.net]
Sent: Tuesday, November 22, 2011 1:52 PM
To: David Johnston
Cc: pgsql-general@postgresql.org
Subject: Re: [GENERAL] possible race condition in trigger functions on
insert operations?

On Tue, Nov 22, 2011 at 12:48 PM, David Johnston <polobo@yahoo.com> wrote:
> -----Original Message-----
> From: pgsql-general-owner@postgresql.org
> [mailto:pgsql-general-owner@postgresql.org] On Behalf Of Kenneth
> Tilton
> Sent: Tuesday, November 22, 2011 12:26 PM
> To: pgsql-general@postgresql.org
> Subject: [GENERAL] possible race condition in trigger functions on
> insert operations?
>
> Bit of a trigger NOOB Q:
>
> I am trying to use a trigger function to automatically populate new
> rows in a table with  a public ID of the form YYYY-NNN such that the
> 42nd row created in 2011 would get the ID "2011-042". Each row is
> associated via an iasid column with a row in an audit table that has a
> timestamp column called created. This works OK, but I am worried about
> two rows getting the same case_no if they come in at the same time
(whatever that means):
>
> declare
>  case_yr integer;
>  yr_case_count bigint;
> begin
>  select date_part('year', created) into case_yr
>         from audit
>         where audit.sid = NEW.iasid;
>
>  select count(*) into yr_case_count
>    from fwa_case, audit
>    where fwa_case.iasid=audit.sid
>      and date_part('year', created) = case_yr;
>
>  NEW.case_no = to_char( case_yr, '9999' ) || '-' ||
> to_char(1+yr_case_count, 'FM000');
>  return NEW;
> end;
>
> Do I have to worry about this, or does ACID bail me out? If the
> former, what do I do? I am thinking first put a uniqueness constraint
> on the column and then figure out how to do retries in a trigger function.
>
> kenneth
> --------------------------------------------------------
>
> Why can't you just use a sequence?

The sequence has to be within the year. Someone suggested a cron job to
reset the sequence at the beginning of the year but I find that alternative
unappealing for some reason.

----------------------------------------------

Just create a single sequence for each year and then call the proper one
on-the-fly.  You can create multiple sequences in advance and possible even
auto-create the sequence the first time one is attempted to be used in a
given year.  If you can live with possible (but probably unlikely) gaps in
the sequence then all the concurrency will be handled for you and you can
focus on writing a function that, given a year, will return the proper
value.

David J.

Re: possible race condition in trigger functions on insert operations?

From

Merlin Moncure

Date:

22 November 2011, 16:52:42

On Tue, Nov 22, 2011 at 2:43 PM, David Johnston <polobo@yahoo.com> wrote:
> Just create a single sequence for each year and then call the proper one
> on-the-fly.  You can create multiple sequences in advance and possible even
> auto-create the sequence the first time one is attempted to be used in a
> given year.  If you can live with possible (but probably unlikely) gaps in
> the sequence then all the concurrency will be handled for you and you can
> focus on writing a function that, given a year, will return the proper
> value.

I personally think the 'record the next to be inserted value' in a
table somewhere is better unless you are trying to support a lot of
concurrent operations.  Also the gap issue is more likely to come up
than you're letting on -- a rolled back transaction is all it takes.

merlin

Re: possible race condition in trigger functions on insert operations?

From

Kenneth Tilton

Date:

22 November 2011, 16:54:30

Pre-created sequences per year. Not a bad idea at all. I had not been
worrying about gaps but I checked with the user and they definitely do
not want gaps. But as long as we do not delete, is there still a risk
of gaps?

I am leaning towards  a serial column maintained by postgres, using
row_number partitioned by year ordered by the serial column. No gaps
since we never delete, and no concurrency issues. I think. :)

Might also go for allocating the IDs from a separate table.

-kenneth

On Tue, Nov 22, 2011 at 3:43 PM, David Johnston <polobo@yahoo.com> wrote:
> -----Original Message-----
> From: Kenneth Tilton [mailto:ktilton@mcna.net]
> Sent: Tuesday, November 22, 2011 1:52 PM
> To: David Johnston
> Cc: pgsql-general@postgresql.org
> Subject: Re: [GENERAL] possible race condition in trigger functions on
> insert operations?
>
> On Tue, Nov 22, 2011 at 12:48 PM, David Johnston <polobo@yahoo.com> wrote:
>> -----Original Message-----
>> From: pgsql-general-owner@postgresql.org
>> [mailto:pgsql-general-owner@postgresql.org] On Behalf Of Kenneth
>> Tilton
>> Sent: Tuesday, November 22, 2011 12:26 PM
>> To: pgsql-general@postgresql.org
>> Subject: [GENERAL] possible race condition in trigger functions on
>> insert operations?
>>
>> Bit of a trigger NOOB Q:
>>
>> I am trying to use a trigger function to automatically populate new
>> rows in a table with  a public ID of the form YYYY-NNN such that the
>> 42nd row created in 2011 would get the ID "2011-042". Each row is
>> associated via an iasid column with a row in an audit table that has a
>> timestamp column called created. This works OK, but I am worried about
>> two rows getting the same case_no if they come in at the same time
> (whatever that means):
>>
>> declare
>>  case_yr integer;
>>  yr_case_count bigint;
>> begin
>>  select date_part('year', created) into case_yr
>>         from audit
>>         where audit.sid = NEW.iasid;
>>
>>  select count(*) into yr_case_count
>>    from fwa_case, audit
>>    where fwa_case.iasid=audit.sid
>>      and date_part('year', created) = case_yr;
>>
>>  NEW.case_no = to_char( case_yr, '9999' ) || '-' ||
>> to_char(1+yr_case_count, 'FM000');
>>  return NEW;
>> end;
>>
>> Do I have to worry about this, or does ACID bail me out? If the
>> former, what do I do? I am thinking first put a uniqueness constraint
>> on the column and then figure out how to do retries in a trigger function.
>>
>> kenneth
>> --------------------------------------------------------
>>
>> Why can't you just use a sequence?
>
> The sequence has to be within the year. Someone suggested a cron job to
> reset the sequence at the beginning of the year but I find that alternative
> unappealing for some reason.
>
> ----------------------------------------------
>
> Just create a single sequence for each year and then call the proper one
> on-the-fly.  You can create multiple sequences in advance and possible even
> auto-create the sequence the first time one is attempted to be used in a
> given year.  If you can live with possible (but probably unlikely) gaps in
> the sequence then all the concurrency will be handled for you and you can
> focus on writing a function that, given a year, will return the proper
> value.
>
> David J.
>
>
>

Re: possible race condition in trigger functions on insert operations?

From

Kenneth Tilton

Date:

22 November 2011, 16:57:00

On Tue, Nov 22, 2011 at 3:52 PM, Merlin Moncure <mmoncure@gmail.com> wrote:
> On Tue, Nov 22, 2011 at 2:43 PM, David Johnston <polobo@yahoo.com> wrote:
>> Just create a single sequence for each year and then call the proper one
>> on-the-fly.  You can create multiple sequences in advance and possible even
>> auto-create the sequence the first time one is attempted to be used in a
>> given year.  If you can live with possible (but probably unlikely) gaps in
>> the sequence then all the concurrency will be handled for you and you can
>> focus on writing a function that, given a year, will return the proper
>> value.
>
> I personally think the 'record the next to be inserted value' in a
> table somewhere is better unless you are trying to support a lot of
> concurrent operations.  Also the gap issue is more likely to come up
> than you're letting on -- a rolled back transaction is all it takes.

Yeah, using a table seems to have the advantage of being about the
only thing that would work, though I think row_number properly used
(with an invariant ordering and no deletes) also works.

-kenneth