How to create "auto-increment" field WITHOUT a sequence object?
Hello.
I need to create an auto-increment field on a table WITHOUT using sequences:
CREATE TABLE tbl(
name TEXT,
uniq_id INTEGER
);
Each INSERT to this table must generate a new uniq_id which is distinct from
all others.
The problem is that these INSERTs are often rolled back (they are
executed within a transaction block which is rolled back from time to time);
this is an existing design of the current architecture, and unfortunately we
have to live with it. And I need the uniq_id generation to be as compact as
possible (with minimal "holes") - this is a VERY important requirement (the
values are exported into external systems which accept only IDs in the range
1 to 100000).
So I cannot use sequences: a sequence value is obviously not rolled back, so
if I insert nextval(...) as uniq_id, I will get large holes (because of the
frequent transaction rollbacks) and exhaust the 100000 uniq_ids very fast.
How can I deal with all this without sequences?
I tried
BEGIN;
LOCK TABLE tbl;
INSERT INTO tbl(uniq_id) VALUES((SELECT max(uniq_id) FROM tbl) + 1);
COMMIT;
but it seems this locking is too heavy - from time to time the query times
out (or sometimes deadlocks with other queries).
Is there any other, lighter lock which would guarantee that no INSERTs are
performed into tbl between the max() calculation and the INSERT itself, but
does not lock the whole table?
Hello,
This is an ugly hack, but you could try a temporary holder that carries the
maximum value: for example, create a table with one column and exactly one
row, and synchronize this value with your sequence.
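A minimal sketch of that idea (the table and column names here are just
placeholders, not anything from the original post):

```sql
-- One-row holder for the last issued id.
CREATE TABLE uniq_id_holder(last_id INTEGER NOT NULL);
INSERT INTO uniq_id_holder VALUES (0);

-- Inside the inserting transaction: the UPDATE takes a row lock, which
-- serializes concurrent callers; a rollback also undoes the increment,
-- so no value is lost.
BEGIN;
UPDATE uniq_id_holder SET last_id = last_id + 1;
INSERT INTO tbl(name, uniq_id)
  SELECT 'some name', last_id FROM uniq_id_holder;
COMMIT;
```

Because the row lock is held until commit, writers serialize on the holder
row, but readers of tbl are unaffected.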
Regards
________________________________
From: Dmitry Koterov <dmitry@koterov.ru>
To: Postgres General <pgsql-general@postgresql.org>
Sent: Thu, June 30, 2011 8:40:39 PM
Subject: [GENERAL] How to create "auto-increment" field WITHOUT a sequence
object?
On Jun 30, 2011, at 2:40 PM, Dmitry Koterov wrote:
Hello.
I need to create an auto-increment field on a table WITHOUT using sequences:
This problem pops up a lot for invoice sequence numbers for the tax office and related cases. (Search for "gapless sequence".)
Since the numbers are really only needed for an external system (as you mention), it may make sense to generate the gapless IDs only when necessary and map them to the rows later. The drawback is that some rows in the table will not have a gapless ID until the batch job has run, but all rows will still be addressable by the real sequence ID.
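One possible sketch of that mapping, assuming tbl also carries an ordinary
serial primary key "id" (the mapping table and its columns are illustrative;
row_number() requires PostgreSQL 8.4 or later):

```sql
-- Hypothetical mapping table: assigns compact external ids to committed
-- rows in insertion order.
CREATE TABLE gapless_map(
  tbl_id      INTEGER PRIMARY KEY,
  external_id INTEGER UNIQUE
);

-- Batch job: number all not-yet-mapped committed rows consecutively,
-- continuing from the largest id handed out so far.
INSERT INTO gapless_map(tbl_id, external_id)
SELECT t.id,
       COALESCE((SELECT max(external_id) FROM gapless_map), 0)
         + row_number() OVER (ORDER BY t.id)
FROM tbl t
WHERE NOT EXISTS (SELECT 1 FROM gapless_map m WHERE m.tbl_id = t.id);
```

Only committed rows are visible to the batch job, so rolled-back INSERTs
never consume an external id.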
Cheers,
M
Thank you.
It may not be "fully" gapless. The main goal is to keep uniq_id as low as
possible, so as not to exhaust the 100000 values too fast.
I think the solutions with additional tables look too complicated for this
case - is there a possibility to avoid an additional table?
On Thu, Jun 30, 2011 at 10:55 PM, A.M. <agentm@themactionfaction.com> wrote:
Original Message From: Dmitry Koterov
I need to create an auto-increment field on a table WITHOUT using
sequences:
You may get some ideas at http://www.varlena.com/GeneralBits/130.php
George
Why not have an internal and an external id? The internal one would use the
sequence and wouldn't care about being gap-less. The external one would be
assigned post-Insert and thus, ignoring deletes, can be gap-less. Depending
upon how frequently/quickly the external identifier needs to be present you
have various options to actually assign the external identifier value.
CREATE TABLE tbl(
name TEXT,
uniq_id serial,
external_id integer NULL
);
Upon creating a new record you have a record with a NULL external_id. At
some point in the future, prior to export, you can replace all the NULLs
with actual values. If the numbering transaction will not be rolled back
once it succeeds, you can also issue the "UPDATE" as the last statement of
the inserting transaction, so that it only fires if the transaction is
otherwise going to complete successfully.
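One way to sketch that backfill, using row_number() (PostgreSQL 8.4+)
instead of nextval() so that a rolled-back batch leaves no gap; the subquery
alias is illustrative:

```sql
-- Assign compact external ids to all unexported rows, in internal-id
-- order, continuing from the largest external_id handed out so far.
UPDATE tbl
SET external_id = sub.new_id
FROM (SELECT uniq_id,
             COALESCE((SELECT max(external_id) FROM tbl), 0)
               + row_number() OVER (ORDER BY uniq_id) AS new_id
      FROM tbl
      WHERE external_id IS NULL) AS sub
WHERE tbl.uniq_id = sub.uniq_id;
```

If this statement rolls back, no external_id is consumed, which is exactly
the property a plain sequence cannot give you.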
Without more detail about the how and why of your restrictions it is
difficult to provide solutions.
David J.
On Thu, Jun 30, 2011 at 12:40 PM, Dmitry Koterov <dmitry@koterov.ru> wrote:
Each INSERT to this table must generate a new uniq_id which is distinct from
all others.
Do they need to be in order? It looks like you only need a few, since the
range you mention is 1 to 100000. You could put those numbers in another
table. Then, in a transaction, you grab a number from that table and try to
delete it. If it's not there, you lost the race: exit and try again. If you
can delete it, you run the insert transaction. If the insert fails, things
roll back and the row lock is released. If the transaction works, you commit
the whole thing.
Are the transactions really long running?
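A sketch of that pool approach (the table name is illustrative, and the
single-statement form relies on the writable CTEs of PostgreSQL 9.1; on
older versions the DELETE and INSERT would be separate statements):

```sql
-- Pre-populated pool of free ids.
CREATE TABLE id_pool(uniq_id INTEGER PRIMARY KEY);
INSERT INTO id_pool SELECT generate_series(1, 100000);

-- Claim the lowest free id and insert with it, all in one transaction.
-- If another session claimed the same id first, the DELETE returns no
-- row and nothing is inserted: that is the "lost the race, try again"
-- case. On rollback the DELETE is undone and the id returns to the
-- pool automatically.
BEGIN;
WITH claimed AS (
  DELETE FROM id_pool
  WHERE uniq_id = (SELECT min(uniq_id) FROM id_pool)
  RETURNING uniq_id
)
INSERT INTO tbl(name, uniq_id)
SELECT 'some name', uniq_id FROM claimed;
COMMIT;
```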
Thanks to all, I'll try to live with this solution.
(It is not ideal, but pretty good for my case.)
A better way may be to create an additional table with any structure and no
data, and LOCK that table instead of "tbl". That theoretically decreases
contention - the only thing I need is a mutex around a single update
statement.
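That dedicated-mutex idea might be sketched like this (the mutex table name
is illustrative):

```sql
-- Empty table used purely as a mutex; it never holds data.
CREATE TABLE tbl_uniq_id_mutex();

BEGIN;
-- Serializes only the writers that take this lock; readers of tbl and
-- pg_dump are unaffected, because tbl itself is never locked.
LOCK TABLE tbl_uniq_id_mutex IN EXCLUSIVE MODE;
INSERT INTO tbl(name, uniq_id)
  VALUES ('some name', (SELECT COALESCE(max(uniq_id), 0) + 1 FROM tbl));
COMMIT;
```

The lock is released at COMMIT or ROLLBACK, so a failed transaction both
frees the mutex and returns its candidate id.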
On Fri, Jul 1, 2011 at 12:01 AM, Dmitry Koterov <dmitry@koterov.ru> wrote:
OK.
A possible next solution is an AFTER INSERT trigger:
BEGIN
LOCK TABLE tbl IN SHARE UPDATE EXCLUSIVE MODE;
UPDATE tbl SET uniq_id = (SELECT MAX(uniq_id) + 1 FROM tbl) WHERE id =
NEW.id;
END;
It seems SHARE UPDATE EXCLUSIVE is the minimum lock level which conflicts
with itself and does not conflict with pg_dump.
(Unfortunately, it conflicts with the VACUUM performed by the autovacuum
process.)
SHARE UPDATE EXCLUSIVE is better than the default LOCK mode (which also
blocks reading).
On Thu, Jun 30, 2011 at 11:38 PM, A.M. <agentm@themactionfaction.com> wrote:
On Jun 30, 2011, at 3:36 PM, Dmitry Koterov wrote:
...possibly within an AFTER INSERT trigger:
BEGIN
pg_advisory_lock(0xDEADBEEF);
UPDATE tbl SET uniq_id = (SELECT MAX(uniq_id) + 1 FROM tbl) WHERE id =
NEW.id;
END;
Would it work without an explicit pg_advisory_unlock() - would the lock be
released on COMMIT/ROLLBACK?
No - advisory locks are managed by the application, so that is exactly what
you don't want. The exclusive table lock is still exactly what you need,
unless you can postpone the generation of the secondary IDs.
Cheers,
M
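For what it's worth, a transaction-scoped variant does address this:
pg_advisory_xact_lock(), added in PostgreSQL 9.1, is released automatically
at COMMIT or ROLLBACK. A sketch of the trigger along those lines, assuming
tbl has a primary key "id" as in the snippets above (function and trigger
names are illustrative):

```sql
-- Serialize uniq_id assignment with a transaction-scoped advisory lock;
-- the lock is released automatically at COMMIT/ROLLBACK.
CREATE OR REPLACE FUNCTION assign_uniq_id() RETURNS trigger AS $$
BEGIN
  PERFORM pg_advisory_xact_lock(3735928559);  -- 0xDEADBEEF
  UPDATE tbl
  SET uniq_id = COALESCE((SELECT max(uniq_id) FROM tbl), 0) + 1
  WHERE id = NEW.id;
  RETURN NULL;  -- return value is ignored for AFTER triggers
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER tbl_assign_uniq_id
AFTER INSERT ON tbl
FOR EACH ROW EXECUTE PROCEDURE assign_uniq_id();
```

A rolled-back transaction releases the lock and discards its row, so no
uniq_id is wasted.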
On 06/30/2011 03:01 PM, Dmitry Koterov wrote:
It may not be "fully" gapless. The main goal is to keep uniq_id as
low as possible, so as not to exhaust the 100000 values too fast.
I think the solutions with additional tables look too complicated for this
case - is there a possibility to avoid an additional table?
You may think it's possible to build a gapless design that is less
complicated by writing some application or server code to enforce it.
You've already tried this and learned that it's much harder than it
seems. Doing this correctly without causing timeout and deadlock issues
is a hard problem.
Meanwhile, generating a gapless translation table that only includes
things that have been committed is easy, and you're not likely to run
into really strange and unexpected bugs in that implementation later.
Given those are the two situations you're comparing here, I would say
using the extra table is less complicated in every way. Sure, you're
adding another table, but the process happening against it is really
easy. The alternative doesn't have the extra table, but that doesn't
make it less complicated. Complexity needs to consider how difficult a
program is going to be to debug and maintain. And in those areas,
making a single table gapless is quite complicated.
--
Greg Smith 2ndQuadrant US greg@2ndQuadrant.com Baltimore, MD
Comprehensive and Customized PostgreSQL Training Classes:
http://www.2ndquadrant.us/postgresql-training/