Best approach for a "gap-less" sequence
Hi!
I was trying to solve a problem on an old system and realized that there might
be some better approach for doing what I need.
We have some documents that need to be ordered sequentially and without gaps.
I could use a sequence, but if the transaction fails and rolls back, the
sequence will already have been incremented.
So, today I have a control table: I acquire a SHARE ROW EXCLUSIVE lock on
it, read the value, increment it, do what I need, and then COMMIT the
transaction, ensuring that the sequence has no gaps.
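A minimal sketch of that control-table transaction (the table and column names here are only illustrative):

```sql
-- one row holding the last assigned number
CREATE TABLE doc_counter (last_value integer NOT NULL);
INSERT INTO doc_counter VALUES (0);

BEGIN;
LOCK TABLE doc_counter IN SHARE ROW EXCLUSIVE MODE;
UPDATE doc_counter SET last_value = last_value + 1;
INSERT INTO documents (doc_no, body)
    SELECT last_value, 'some document' FROM doc_counter;
COMMIT;  -- a ROLLBACK undoes the counter update too, so no gap appears
```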
Is there a better way to guarantee that there will be no gaps in my sequence
if something goes wrong with my transaction?
--
Jorge Godoy <jgodoy@gmail.com>
On 8/12/06, Jorge Godoy <jgodoy@gmail.com> wrote:
Hi!
I was trying to solve a problem on an old system and realized that there might
be some better approach for doing what I need.
We have some documents that need to be ordered sequentially and without gaps.
I could use a sequence, but if the transaction fails then when I rollback the
sequence will already have been incremented.
So, today I have a control table and I acquire a SHARE ROW EXCLUSIVE lock to
it, read the value, increase it, do what I need and then I COMMIT the
transaction, ensuring that the sequence has no gaps.
Is there a better way to guarantee that there will be no gaps in my sequence
if something goes wrong with my transaction?
Why does it matter?
I assume there is a reason you need it like this..
--
Postgresql & php tutorials
http://www.designmagick.com/
Jorge Godoy wrote on 12.08.2006 01:33:
I was trying to solve a problem on an old system and realized that there might
be some better approach for doing what I need.
We have some documents that need to be ordered sequentially and without gaps.
I could use a sequence, but if the transaction fails then when I rollback the
sequence will already have been incremented.
So, today I have a control table and I acquire a SHARE ROW EXCLUSIVE lock to
it, read the value, increase it, do what I need and then I COMMIT the
transaction, ensuring that the sequence has no gaps.
Is there a better way to guarantee that there will be no gaps in my sequence
if something goes wrong with my transaction?
What do you do if a document gets deleted? Renumber the "following" documents so
that no gaps are present in the already used ids?
Thomas
"chris smith" <dmagick@gmail.com> writes:
Why does it matter?
I assume there is a reason you need it like this..
Of course there is. It is both a project requirement and a legal requirement
that there be no unused numbers and that they be chronologically ordered as
well. This is also part of the documented procedure that existed on paper and
that got ISO 9001 certified (so a lot of money was already spent here). The
legal requirement is the strongest reason, though.
After a number is assigned, it can't be changed or reused, and nothing
"newer" can appear in a numerically earlier entry.
Concurrency is a problem, since there might be a lot of people using it. I
wanted to see if there was something that could improve performance here or
solve the problem in a better way without locking the table.
Thanks,
--
Jorge Godoy <jgodoy@gmail.com>
Thomas Kellerer <spam_eater@gmx.net> writes:
What do you do if a document gets deleted? Renumber the "following" documents
so that no gaps are present in the already used ids?
There's no deletion possibility. A RULE sets a column named "active" to
FALSE instead (I can set it manually or let the RULE do that for me...).
--
Jorge Godoy <jgodoy@gmail.com>
Jorge Godoy <jgodoy@gmail.com> writes:
Is there a better way to guarantee that there will be no gaps in my sequence
if something goes wrong with my transaction?
From the overwhelming feedback I assume there isn't a better way yet...
Thanks. I'll see how I can improve the model, then, to separate these
sequences into different tables.
--
Jorge Godoy <jgodoy@gmail.com>
Hi,
On Sat, 12 Aug 2006, chris smith wrote:
On 8/12/06, Jorge Godoy <jgodoy@gmail.com> wrote:
<snipp/>
Is there a better way to guarantee that there will be no gaps in my sequence
if something goes wrong with my transaction?
Why does it matter?
I assume there is a reason you need it like this..
For example, German tax law requires invoices to be numbered
sequentially without gaps. This is supposed to make it harder
to cheat on VAT.
You cannot just drop an invoice, as that would leave a gap. Tax
inspectors will search for gaps and ask about whatever invoice
is missing from the records.
I could not care less about gaps in surrogate keys but this
kind of stuff is an external requirement.
There's probably not much choice in how to implement something
like this other than storing the last assigned number in some row.
I would at least try to assign multiple such numbers in batches
to minimize contention on the row you store the counter in.
Greetings
Christian
--
Christian Kratzer ck@cksoft.de
CK Software GmbH http://www.cksoft.de/
Phone: +49 7452 889 135 Fax: +49 7452 889 136
Christian Kratzer <ck-lists@cksoft.de> writes:
I would at least try to assign multiple such numbers in batches to minimize
contention on the row you store the counter in.
What do you mean here? How would you guarantee that one of the receiving
transactions didn't roll back and leave a gap in the "sequence"?
I believe that for invoices it is less problematic. At least here I don't
need the chronological control, so if I leave one blank I can fill it later in
the same day without problems (except, of course, if the sequence number is
tied to some other physical evidence, such as the paper counterpart of the
invoice, which is also chronologically assigned).
The whole problem appears because no matter how much we validate input and
relationships on the input interface, something might happen and make the
"INSERT" transaction fail. Theoretically, all should go fine, but... :-)
--
Jorge Godoy <jgodoy@gmail.com>
Hi,
On Sun, 13 Aug 2006, Jorge Godoy wrote:
Christian Kratzer <ck-lists@cksoft.de> writes:
I would at least try to assign multiple such numbers in batches to minimize
contention on the row you store the counter in.
What do you mean here? How would you guarantee that one of the receiving
transactions didn't roll back and leave a gap in the "sequence"?
You would need to serialize the transactions assigning the
numbers, and you would need to update the counter in
the same transaction that assigns your numbers to your
documents or whatever.
Assigning a batch of 1000 numbers in one transaction would
probably be more efficient than assigning 1000 numbers in
1000 separate transactions that all need to be serialized.
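A rough sketch of that batching idea (the table and column names are hypothetical):

```sql
-- reserve 1000 numbers and assign them in the same transaction;
-- if anything fails, the rollback undoes the counter bump as well
BEGIN;
LOCK TABLE doc_counter IN SHARE ROW EXCLUSIVE MODE;
SELECT last_value FROM doc_counter;                 -- say this returns 5000
UPDATE doc_counter SET last_value = last_value + 1000;
-- the application now assigns numbers 5001..6000 to its batch of documents
INSERT INTO documents (doc_no, body) VALUES (5001, 'first document of batch');
-- ... 999 more inserts ...
COMMIT;
```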
I believe that for invoices it is less problematic. At least here I don't
need the "time" part control, so if I leave one blank I can fill it later in
the same day without problems (except, of course, if the sequence number is
tied to some other physical evidence such as the paper counterpart of the
invoice and that is also chronologically assigned).
That's of course the idea. The numbers on the paper invoices
have to be gapless. The tax people want to have a warm
fuzzy feeling that they are seeing all your invoices, or they
will begin to speculate about how much VAT they have not
received from you.
The whole problem appears because no matter how much we validate input and
relationships on the input interface, something might happen and make the
"INSERT" transaction fail. Theoretically, all should go fine, but... :-)
Increment the counter in the same transaction that assigns
your values.
Of course, I know little or nothing about your application
and what you need gapless sequences for.
I just pulled the invoice example out of my hat to show
that there are legitimate use cases for gapless sequences
of numbers.
Greetings
Christian
--
Christian Kratzer ck@cksoft.de
CK Software GmbH http://www.cksoft.de/
Phone: +49 7452 889 135 Fax: +49 7452 889 136
Jorge Godoy wrote:
Jorge Godoy <jgodoy@gmail.com> writes:
Is there a better way to guarantee that there will be no gaps in my sequence
if something goes wrong with my transaction?
From the overwhelming feedback I assume there isn't a better way yet...
Thanks. I'll see how I can improve the model then to separate these sequences
into different tables.
Pre-allocate records. The (primary key?) field would have the
numbers already filled in, but all the rest of the fields in each
record would be NULL, blanks, zeros, or indicator values ("~~~~~~~~~~",
-999999999, etc.).
Then create a single-field table called, for example, CUR_MAX_VALUE
that gets incremented as part of each transaction. To serialize
access, transactions would need an EXCLUSIVE lock on the table.
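A sketch of the pre-allocation idea; all names and the 100000-row size are assumptions:

```sql
-- pre-allocate rows 1..100000 with placeholder data
CREATE TABLE documents (
    seq_no integer PRIMARY KEY,
    body   text
);
INSERT INTO documents SELECT g, NULL FROM generate_series(1, 100000) AS g;

CREATE TABLE cur_max_value (cur_max_value integer NOT NULL);
INSERT INTO cur_max_value VALUES (0);

-- claim the next number: fill in a pre-allocated row, bump the counter
BEGIN;
LOCK TABLE cur_max_value IN EXCLUSIVE MODE;
UPDATE cur_max_value SET cur_max_value = cur_max_value + 1;
UPDATE documents
   SET body = 'real content'
 WHERE seq_no = (SELECT cur_max_value FROM cur_max_value);
COMMIT;  -- a rollback or crash here leaves both tables unchanged
```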
--
Ron Johnson, Jr.
Jefferson LA USA
Is "common sense" really valid?
For example, it is "common sense" to white-power racists that
whites are superior to blacks, and that those with brown skins
are mud people.
However, that "common sense" is obviously wrong.
Jorge Godoy wrote:
Ron Johnson <ron.l.johnson@cox.net> writes:
Pre-allocate records. The (primary key?) field would have the
numbers already filled in, but all the rest of the fields in each
record be NULL, blanks, zeros or indicator values ("~~~~~~~~~~",
-999999999, etc.).
Then create a single-field table called, for example, CUR_MAX_VALUE
that gets incremented as part of each transaction. To serialize
access, transactions would need an EXCLUSIVE lock on the table.
What's the difference from having just the table with the sequence, where I
take an exclusive lock to get the value while inside the transaction? This
approach seems more complicated, since I'd have to exclude records that match
the "not-used" pattern.
The use of CUR_MAX_VALUE "should" ensure that you never have gaps,
since a rollback or process death would not update CUR_MAX_VALUE.
Your WHERE clauses would *not* have
AND NAME <> "~~~~~~~~~~".
They would say
AND SEQ_NO <= (SELECT CUR_MAX_VALUE FROM CUR_MAX_VALUE).
--
Ron Johnson, Jr.
Jefferson LA USA
Jorge Godoy wrote:
Jorge Godoy <jgodoy@gmail.com> writes:
Is there a better way to guarantee that there will be no gaps in my sequence
if something goes wrong with my transaction?
From the overwhelming feedback I assume there isn't a better way yet...
Thanks. I'll see how I can improve the model then to separate these sequences
into different tables.
I'm not sure what type of lock you'd need to make sure no other
transactions updated the table (see
http://www.postgresql.org/docs/8.1/interactive/sql-lock.html) but "in
theory" something like this should work:
begin;
select id from table order by id desc limit 1;       -- remember this id in the client
insert into table (id, blah) values (id+1, 'blah');  -- id+1 computed from the value just read
commit;
P.S. I'm sure that in older versions this query wouldn't use an index:
select max(id) from table;
I'm not sure about 8.0+, hence doing an "order by id desc limit 1".
--
Postgresql & php tutorials
http://www.designmagick.com/
Chris <dmagick@gmail.com> writes:
I'm not sure what type of lock you'd need to make sure no other transactions
updated the table (see
http://www.postgresql.org/docs/8.1/interactive/sql-lock.html) but "in theory"
something like this should work:
begin;
select id from table order by id desc limit 1;
insert into table (id, blah) values (id+1, 'blah');
commit;
This is part of the solution, yes. But I would still need to lock this table
so that no other concurrent transaction gets another "id". I don't want to
lock the main table -- as I believe you're suggesting -- because I want it to
remain searchable and updatable while I'm inserting new data. I just can't
have gaps in the sequence, but I don't want to restrict everything else here.
P.S. I'm sure in older versions this query wouldn't use an index:
select max(id) from table;
It doesn't. You'd have to do what you did -- "order by <x> desc limit 1" --
to have it use an index...
I'm not sure about 8.0+.. hence doing an order by the id desc limit 1.
I still have to test it... But I keep using the "order by ... desc" syntax
:-)
Thanks for your answer,
--
Jorge Godoy <jgodoy@gmail.com>
On Mon, Aug 14, 2006 at 09:09:51AM -0300, Jorge Godoy wrote:
Chris <dmagick@gmail.com> writes:
P.S. I'm sure in older versions this query wouldn't use an index:
select max(id) from table;
It doesn't. You'd have to do what you did: "order by <x> desc limit 1" to
have it using indexes...
I'm not sure about 8.0+.. hence doing an order by the id desc limit 1.
I also have to test it... But I still keep using the "order by desc" syntax
Excerpt from the 8.1 Release Notes:
Automatically use indexes for MIN() and MAX() (Tom)
In previous releases, the only way to use an index for MIN()
or MAX() was to rewrite the query as SELECT col FROM tab ORDER
BY col LIMIT 1. Index usage now happens automatically.
--
Michael Fuhr
Michael Fuhr <mike@fuhr.org> writes:
Automatically use indexes for MIN() and MAX() (Tom)
In previous releases, the only way to use an index for MIN()
or MAX() was to rewrite the query as SELECT col FROM tab ORDER
BY col LIMIT 1. Index usage now happens automatically.
Thanks Michael! This is great.
--
Jorge Godoy <jgodoy@gmail.com>
Jorge Godoy wrote:
Chris <dmagick@gmail.com> writes:
I'm not sure what type of lock you'd need to make sure no other transactions
updated the table (see
http://www.postgresql.org/docs/8.1/interactive/sql-lock.html) but "in theory"
something like this should work:
begin;
select id from table order by id desc limit 1;
insert into table (id, blah) values (id+1, 'blah');
commit;
This is part of the solution, yes. But I would still need locking this table
so that no other concurrent transaction gets another "id". I don't want to
lock the main table -- as I believe you're suggesting -- because I want it to
be searchable and updatable while I'm inserting new data.
So you have to hold a lock that conflicts with itself, but not with
ACCESS SHARE which is the lock acquired by SELECT. I think the first
one on the list with these two properties is SHARE UPDATE EXCLUSIVE.
Have a look at the list yourself:
http://www.postgresql.org/docs/8.1/static/explicit-locking.html
Note the tip at the end of the table:
Tip: Only an ACCESS EXCLUSIVE lock blocks a SELECT (without FOR
UPDATE/SHARE) statement.
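Following that suggestion, the counter-table transaction might look like this (the table and column names are only illustrative):

```sql
-- SHARE UPDATE EXCLUSIVE conflicts with itself, so concurrent number
-- assignments are serialized, but plain SELECTs are not blocked
BEGIN;
LOCK TABLE doc_counter IN SHARE UPDATE EXCLUSIVE MODE;
UPDATE doc_counter SET last_value = last_value + 1;
INSERT INTO documents (doc_no, body)
    SELECT last_value, 'new document' FROM doc_counter;
COMMIT;
```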
--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Since the gapless numbers are purely for the benefit of the tax
people, you could build your db with regular sequences as primary
keys and then regularly (or just before tax-time) insert into a table
which maps the gapless sequence to the real primary key.
-M
AgentM <agentm@themactionfaction.com> writes:
Since the gapless numbers are purely for the benefit of the tax people, you
could build your db with regular sequences as primary keys and then regularly
(or just before tax-time) insert into a table which maps the gapless sequence
to the real primary key.
That's also an interesting approach. An auxiliary table like
transaction     integer   FK to the transactions table
transaction_nb  integer   gapless sequence
should do it. A trigger inserting into this auxiliary table would also take
care of everything... If I use an AFTER trigger, I believe I wouldn't need
any locking... I have to think about this...
As simple as this is, I hadn't thought about it :-) Thanks for your
suggestion.
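That auxiliary table might be sketched like this (the column names follow the message above; the sample transaction id 12345 and the locking choice are assumptions):

```sql
CREATE TABLE transaction_numbers (
    transaction    integer PRIMARY KEY,    -- FK to the transactions table
    transaction_nb integer UNIQUE NOT NULL -- the gapless number
);

-- assign the next free number to one committed transaction;
-- the table lock serializes concurrent assignments
BEGIN;
LOCK TABLE transaction_numbers IN SHARE ROW EXCLUSIVE MODE;
INSERT INTO transaction_numbers (transaction, transaction_nb)
SELECT 12345, coalesce(max(transaction_nb), 0) + 1
  FROM transaction_numbers;
COMMIT;
```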
--
Jorge Godoy <jgodoy@gmail.com>
In article <878xlrwcxm.fsf@gmail.com>,
Jorge Godoy <jgodoy@gmail.com> writes:
AgentM <agentm@themactionfaction.com> writes:
Since the gapless numbers are purely for the benefit of the tax people, you
could build your db with regular sequences as primary keys and then regularly
(or just before tax-time) insert into a table which maps the gapless sequence
to the real primary key.
That's also an interesting approach. An auxiliary table like
transaction integer FK to the transactions table
transaction_nb integer gapless sequence
should do it. A trigger inserting on this auxiliary table would also take
care of everything... If I have an after trigger I believe I wouldn't need
any locking... I have to think about this...
Why put gapless numbers into the database at all? Just calculate them at
query time.
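On the 8.1-era servers discussed in this thread, that could be done with a correlated subselect; since rows here are never deleted (only flagged inactive), insertion order yields a gapless numbering. The table and column names are assumptions:

```sql
-- derive a gapless number at query time from the insertion order
SELECT t.id,
       (SELECT count(*) FROM transactions x WHERE x.id <= t.id) AS gapless_nb
  FROM transactions t
 ORDER BY t.id;
```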