Abort transaction on duplicate key error

Started by Haller Christoph over 24 years ago · 2 messages
#1 Haller Christoph
ch@rodos.fzk.de

Hi all,
Sorry for bothering you with this for the second time,
but I haven't received an answer within two days and the problem
appears fundamental, at least to me.
I have a C application that handles meteorological data
such as temperature, precipitation, wind speed, wind direction, and so on.
And I mean loads of data: several thousand sets every ten minutes.

From time to time the transmitters deliver wrong data,
so they send the sets again to be taken as corrections.
The idea is to create a unique index on the timestamp, the location id
and the measurement id; then, on receiving a duplicate key error,
move on to an update command on that specific row.
But within PostgreSQL this strategy does not work any longer inside
a chained transaction, because the duplicate key error
aborts the whole transaction.
What I could do is switch from chained to unchained transactions,
but from what I have read on the mailing list so far, the commit
operation requires a lot of CPU time, and I cannot afford that when
processing thousands of sets.
I am now wondering whether there is a fundamental design error in
my strategy.
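
For concreteness, the pattern I have in mind looks roughly like the
libpq sketch below. All table, column and index names are placeholders,
not the real schema, and the values are assumed to be already
properly quoted:

/*
 * Assumes a unique index such as:
 *   CREATE UNIQUE INDEX obs_uniq ON obs (ts, location_id, measurement_id);
 */
#include <stdio.h>
#include <string.h>
#include <libpq-fe.h>

static void store_obs(PGconn *conn, const char *ts, const char *loc,
                      const char *meas, const char *val)
{
    char sql[512];
    PGresult *res;

    /* Try the plain insert first. */
    snprintf(sql, sizeof sql,
             "INSERT INTO obs (ts, location_id, measurement_id, val) "
             "VALUES ('%s', %s, %s, %s)", ts, loc, meas, val);
    res = PQexec(conn, sql);

    if (PQresultStatus(res) != PGRES_COMMAND_OK &&
        strstr(PQerrorMessage(conn), "duplicate key") != NULL)
    {
        /* The set already exists, so treat the new one as a correction.
         * Inside a chained transaction, however, the error above has
         * already aborted the whole transaction; that is exactly the
         * problem described in this mail. */
        PQclear(res);
        snprintf(sql, sizeof sql,
                 "UPDATE obs SET val = %s WHERE ts = '%s' "
                 "AND location_id = %s AND measurement_id = %s",
                 val, ts, loc, meas);
        res = PQexec(conn, sql);
    }
    PQclear(res);
}
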
Any ideas or suggestions are highly appreciated, and thanks for reading this far.
Regards, Christoph

My first message:
In a C application I want to run several
insert commands within a chained transaction
(for faster execution).
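
By a chained transaction I mean one explicit transaction around a whole
batch of inserts, roughly like the libpq sketch below (the statement
list is just a placeholder):

#include <libpq-fe.h>

/* One BEGIN/COMMIT around a whole batch, instead of a commit per
 * statement; a single failing insert aborts the whole batch. */
static void insert_batch(PGconn *conn, const char **inserts, int n)
{
    int i;

    PQclear(PQexec(conn, "BEGIN"));
    for (i = 0; i < n; i++)
        PQclear(PQexec(conn, inserts[i]));
    PQclear(PQexec(conn, "COMMIT"));
}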

From time to time there will be an insert command causing an
ERROR: Cannot insert a duplicate key into a unique index

As a result, the whole transaction is aborted and all
the previous inserts are lost.
Is there any way to preserve the data
except working with "autocommit"?
What I have in mind particularly is something like
"Do not abort on duplicate key error".

#2 Barry Lind
barry@xythos.com
In reply to: Haller Christoph (#1)
Re: Abort transaction on duplicate key error

Haller,

The way I have handled this in the past is to attempt the following
insert, followed by an update if the insert doesn't insert any rows:

insert into foo (fooPK, foo2)
select 'valuePK', 'value2'
where not exists
    (select 'x' from foo
     where fooPK = 'valuePK')

If the number of rows inserted is 0, then the row already exists, so do an update:

update foo set foo2 = 'value2'
where fooPK = 'valuePK'

Since I don't know what client interface you are using (java, perl, C),
I can't give you exact code for this, but the above should be easily
implemented in any language.
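
For instance, in C with libpq, a rough sketch might look like the
following; it reuses the placeholder names foo, fooPK and foo2 from the
SQL above and assumes the values are already properly quoted:

#include <stdio.h>
#include <string.h>
#include <libpq-fe.h>

static void upsert_foo(PGconn *conn, const char *pk, const char *value)
{
    char sql[512];
    PGresult *res;

    /* Conditional insert: inserts nothing if the key already exists,
     * so no duplicate key error can abort the transaction. */
    snprintf(sql, sizeof sql,
             "INSERT INTO foo (fooPK, foo2) "
             "SELECT '%s', '%s' "
             "WHERE NOT EXISTS (SELECT 'x' FROM foo WHERE fooPK = '%s')",
             pk, value, pk);
    res = PQexec(conn, sql);

    if (PQresultStatus(res) == PGRES_COMMAND_OK &&
        strcmp(PQcmdTuples(res), "0") == 0)
    {
        /* Zero rows inserted: the row already exists, so update it. */
        PQclear(res);
        snprintf(sql, sizeof sql,
                 "UPDATE foo SET foo2 = '%s' WHERE fooPK = '%s'",
                 value, pk);
        res = PQexec(conn, sql);
    }
    PQclear(res);
}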

thanks,
--Barry

