Best approach for a "gap-less" sequence
Hi!
I was trying to solve a problem on an old system and realized that there might
be some better approach for doing what I need.
We have some documents that need to be ordered sequentially and without gaps.
I could use a sequence, but if the transaction fails and rolls back, the
sequence will already have been incremented.
So, today I have a control table: I acquire a SHARE ROW EXCLUSIVE lock on
it, read the value, increment it, do what I need, and then COMMIT the
transaction, ensuring that the sequence has no gaps.
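A minimal sketch of that control-table transaction (the table and column names here are only illustrative):

```sql
-- one row holding the last assigned number
CREATE TABLE doc_counter (last_value integer NOT NULL);
INSERT INTO doc_counter VALUES (0);

BEGIN;
LOCK TABLE doc_counter IN SHARE ROW EXCLUSIVE MODE;
UPDATE doc_counter SET last_value = last_value + 1;
INSERT INTO documents (doc_no, body)
    SELECT last_value, 'some document' FROM doc_counter;
COMMIT;  -- a ROLLBACK undoes the counter update too, so no gap appears
```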
Is there a better way to guarantee that there will be no gaps in my sequence
if something goes wrong with my transaction?
--
Jorge Godoy <jgodoy@gmail.com>
On 8/12/06, Jorge Godoy <jgodoy@gmail.com> wrote:
Hi!
I was trying to solve a problem on an old system and realized that there might
be some better approach for doing what I need.
We have some documents that need to be ordered sequentially and without gaps.
I could use a sequence, but if the transaction fails then when I rollback the
sequence will already have been incremented.
So, today I have a control table and I acquire a SHARE ROW EXCLUSIVE lock to
it, read the value, increase it, do what I need and then I COMMIT the
transaction, ensuring that the sequence has no gaps.
Is there a better way to guarantee that there will be no gaps in my sequence
if something goes wrong with my transaction?
Why does it matter?
I assume there is a reason you need it like this..
--
Postgresql & php tutorials
http://www.designmagick.com/
Jorge Godoy wrote on 12.08.2006 01:33:
I was trying to solve a problem on an old system and realized that there might
be some better approach for doing what I need.
We have some documents that need to be ordered sequentially and without gaps.
I could use a sequence, but if the transaction fails then when I rollback the
sequence will already have been incremented.
So, today I have a control table and I acquire a SHARE ROW EXCLUSIVE lock to
it, read the value, increase it, do what I need and then I COMMIT the
transaction, ensuring that the sequence has no gaps.
Is there a better way to guarantee that there will be no gaps in my sequence
if something goes wrong with my transaction?
What do you do if a document gets deleted? Renumber the "following" documents so
that no gaps are present in the already used ids?
Thomas
"chris smith" <dmagick@gmail.com> writes:
Why does it matter?
I assume there is a reason you need it like this..
Of course there is. It is both a project requirement and a legal requirement
that there be no unused numbers and that they be chronologically ordered as
well. This is also part of the documented procedure that existed on paper and
that got ISO 9001 certified (so a lot of money was already spent here). The
legal requirement is the strongest reason, though.
After a number is assigned, it can't be changed or reused, and nothing
"newer" can appear in a numerically earlier entry.
Concurrency is a problem, since there might be a lot of people using it. I
wanted to see if there was something that could improve performance here or
solve the problem in a better way without locking the table.
Thanks,
--
Jorge Godoy <jgodoy@gmail.com>
Thomas Kellerer <spam_eater@gmx.net> writes:
What do you do if a document gets deleted? Renumber the "following" documents
so that no gaps are present in the already used ids?
There's no deletion possibility. A RULE sets a column named "active" to
FALSE instead (I can set it manually or let the RULE do that for me...).
--
Jorge Godoy <jgodoy@gmail.com>
Jorge Godoy <jgodoy@gmail.com> writes:
Is there a better way to guarantee that there will be no gaps in my sequence
if something goes wrong with my transaction?
From the overwhelming feedback I assume there isn't a better way yet...
Thanks. I'll see how I can improve the model, then, to separate these
sequences into different tables.
--
Jorge Godoy <jgodoy@gmail.com>
Hi,
On Sat, 12 Aug 2006, chris smith wrote:
On 8/12/06, Jorge Godoy <jgodoy@gmail.com> wrote:
<snipp/>
Is there a better way to guarantee that there will be no gaps in my sequence
if something goes wrong with my transaction?
Why does it matter?
I assume there is a reason you need it like this..
For example, German tax law requires invoices to be numbered
sequentially without gaps. This is supposed to make it harder
to cheat on VAT.
You cannot just drop an invoice, as that would leave a gap. Tax
inspectors will search for gaps and ask about whatever invoice
is missing from the records.
I could not care less about gaps in surrogate keys but this
kind of stuff is an external requirement.
There's probably not much choice in how to implement something
like this other than storing the last assigned number in some row.
I would at least try to assign multiple such numbers in batches
to minimize contention on the row you store the counter in.
Greetings
Christian
--
Christian Kratzer ck@cksoft.de
CK Software GmbH http://www.cksoft.de/
Phone: +49 7452 889 135 Fax: +49 7452 889 136
Christian Kratzer <ck-lists@cksoft.de> writes:
I would at least try to assign multiple such numbers in batches to minimize
contention on the row you store the counter in.
What do you mean here? How would you guarantee that one of the receiving
transactions didn't roll back and leave a gap in the "sequence"?
I believe that for invoices it is less problematic. At least here I don't
need the chronological control, so if I leave one blank I can fill it later in
the same day without problems (except, of course, if the sequence number is
tied to some other physical evidence, such as the paper counterpart of the
invoice, which is also chronologically assigned).
The whole problem appears because no matter how much we validate input and
relationships on the input interface, something might happen and make the
"INSERT" transaction fail. Theoretically, all should go fine, but... :-)
--
Jorge Godoy <jgodoy@gmail.com>
Hi,
On Sun, 13 Aug 2006, Jorge Godoy wrote:
Christian Kratzer <ck-lists@cksoft.de> writes:
I would at least try to assign multiple such numbers in batches to minimize
contention on the row you store the counter in.
What do you mean here? How would you guarantee that one of the receiving
transactions didn't roll back and leave a gap in the "sequence"?
You would need to serialize the transactions assigning the
numbers, and you would need to update the counter in
the same transaction that assigns your numbers to your
documents or whatever.
Assigning a batch of 1000 numbers in one transaction would
probably be more efficient than assigning 1000 numbers in
1000 separate transactions that all need to be serialized.
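A rough sketch of that batching idea (the table and column names are hypothetical):

```sql
-- reserve 1000 numbers and assign them in the same transaction;
-- if anything fails, the rollback undoes the counter bump as well
BEGIN;
LOCK TABLE doc_counter IN SHARE ROW EXCLUSIVE MODE;
SELECT last_value FROM doc_counter;                 -- say this returns 5000
UPDATE doc_counter SET last_value = last_value + 1000;
-- the application now assigns numbers 5001..6000 to its batch of documents
INSERT INTO documents (doc_no, body) VALUES (5001, 'first document of batch');
-- ... 999 more inserts ...
COMMIT;
```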
I believe that for invoices it is less problematic. At least here I don't
need the "time" part control, so if I leave one blank I can fill it later in
the same day without problems (except, of course, if the sequence number is
tied to some other physical evidence such as the paper counterpart of the
invoice and that is also chronologically assigned).
That's of course the idea. The numbers on the paper invoices
have to be gapless. The tax people want to have a warm
fuzzy feeling that they are seeing all your invoices, or they
will begin to speculate about how much VAT they have not
received from you.
The whole problem appears because no matter how much we validate input and
relationships on the input interface, something might happen and make the
"INSERT" transaction fail. Theoretically, all should go fine, but... :-)
Increment the counter in the same transaction that assigns
your values.
Of course, I know little or nothing about your application
and what you need gapless sequences for.
I just pulled the invoice example out of my hat to show
that there are legitimate use cases for gapless sequences
of numbers.
Greetings
Christian
--
Christian Kratzer ck@cksoft.de
CK Software GmbH http://www.cksoft.de/
Phone: +49 7452 889 135 Fax: +49 7452 889 136
Jorge Godoy wrote:
Jorge Godoy <jgodoy@gmail.com> writes:
Is there a better way to guarantee that there will be no gaps in my sequence
if something goes wrong with my transaction?
From the overwhelming feedback I assume there isn't a better way yet...
Thanks. I'll see how I can improve the model then to separate these sequences
into different tables.
Pre-allocate records. The (primary key?) field would have the
numbers already filled in, but all the rest of the fields in each
record would be NULL, blanks, zeros, or indicator values ("~~~~~~~~~~",
-999999999, etc.).
Then create a single-field table called, for example, CUR_MAX_VALUE
that gets incremented as part of each transaction. To serialize
access, transactions would need an EXCLUSIVE lock on the table.
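A sketch of the pre-allocation idea; all names and the 100000-row size are assumptions:

```sql
-- pre-allocate rows 1..100000 with placeholder data
CREATE TABLE documents (
    seq_no integer PRIMARY KEY,
    body   text
);
INSERT INTO documents SELECT g, NULL FROM generate_series(1, 100000) AS g;

CREATE TABLE cur_max_value (cur_max_value integer NOT NULL);
INSERT INTO cur_max_value VALUES (0);

-- claim the next number: fill in a pre-allocated row, bump the counter
BEGIN;
LOCK TABLE cur_max_value IN EXCLUSIVE MODE;
UPDATE cur_max_value SET cur_max_value = cur_max_value + 1;
UPDATE documents
   SET body = 'real content'
 WHERE seq_no = (SELECT cur_max_value FROM cur_max_value);
COMMIT;  -- a rollback or crash here leaves both tables unchanged
```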
--
Ron Johnson, Jr.
Jefferson LA USA
Is "common sense" really valid?
For example, it is "common sense" to white-power racists that
whites are superior to blacks, and that those with brown skins
are mud people.
However, that "common sense" is obviously wrong.
Jorge Godoy wrote:
Ron Johnson <ron.l.johnson@cox.net> writes:
Pre-allocate records. The (primary key?) field would have the
numbers already filled in, but all the rest of the fields in each
record be NULL, blanks, zeros or indicator values ("~~~~~~~~~~",
-999999999, etc.).
Then create a single-field table called, for example, CUR_MAX_VALUE
that gets incremented as part of each transaction. To serialize
access, transactions would need an EXCLUSIVE lock on the table.
What's the difference from having just the table with the sequence, where I
take an exclusive lock to get the value while inside the transaction? This
approach seems more complicated, since I'd have to exclude records that match
the "not-used" pattern.
The use of CUR_MAX_VALUE "should" ensure that you never have gaps,
since a rollback or process death would not update CUR_MAX_VALUE.
Your WHERE clauses would *not* have
AND NAME <> "~~~~~~~~~~".
They would say
AND SEQ_NO <= (SELECT CUR_MAX_VALUE FROM CUR_MAX_VALUE).
--
Ron Johnson, Jr.
Jefferson LA USA
Jorge Godoy wrote:
Jorge Godoy <jgodoy@gmail.com> writes:
Is there a better way to guarantee that there will be no gaps in my sequence
if something goes wrong with my transaction?
From the overwhelming feedback I assume there isn't a better way yet...
Thanks. I'll see how I can improve the model then to separate these sequences
into different tables.
I'm not sure what type of lock you'd need to make sure no other
transactions updated the table (see
http://www.postgresql.org/docs/8.1/interactive/sql-lock.html) but "in
theory" something like this should work:
begin;
select id from table order by id desc limit 1;       -- remember this id in the client
insert into table (id, blah) values (id+1, 'blah');  -- id+1 computed from the value just read
commit;
P.S. I'm sure that in older versions this query wouldn't use an index:
select max(id) from table;
I'm not sure about 8.0+, hence doing an "order by id desc limit 1".
--
Postgresql & php tutorials
http://www.designmagick.com/
Chris <dmagick@gmail.com> writes:
I'm not sure what type of lock you'd need to make sure no other transactions
updated the table (see
http://www.postgresql.org/docs/8.1/interactive/sql-lock.html) but "in theory"
something like this should work:
begin;
select id from table order by id desc limit 1;
insert into table (id, blah) values (id+1, 'blah');
commit;
This is part of the solution, yes. But I would still need to lock this table
so that no other concurrent transaction gets another "id". I don't want to
lock the main table -- as I believe you're suggesting -- because I want it to
remain searchable and updatable while I'm inserting new data. I just can't
have gaps in the sequence, but I don't want to restrict everything else here.
P.S. I'm sure in older versions this query wouldn't use an index:
select max(id) from table;
It doesn't. You'd have to do what you did -- "order by <x> desc limit 1" --
to have it use an index...
I'm not sure about 8.0+.. hence doing an order by the id desc limit 1.
I still have to test it... But I keep using the "order by ... desc" syntax
:-)
Thanks for your answer,
--
Jorge Godoy <jgodoy@gmail.com>
On Mon, Aug 14, 2006 at 09:09:51AM -0300, Jorge Godoy wrote:
Chris <dmagick@gmail.com> writes:
P.S. I'm sure in older versions this query wouldn't use an index:
select max(id) from table;
It doesn't. You'd have to do what you did: "order by <x> desc limit 1" to
have it using indexes...
I'm not sure about 8.0+.. hence doing an order by the id desc limit 1.
I also have to test it... But I still keep using the "order by desc" syntax
Excerpt from the 8.1 Release Notes:
Automatically use indexes for MIN() and MAX() (Tom)
In previous releases, the only way to use an index for MIN()
or MAX() was to rewrite the query as SELECT col FROM tab ORDER
BY col LIMIT 1. Index usage now happens automatically.
--
Michael Fuhr
Michael Fuhr <mike@fuhr.org> writes:
Automatically use indexes for MIN() and MAX() (Tom)
In previous releases, the only way to use an index for MIN()
or MAX() was to rewrite the query as SELECT col FROM tab ORDER
BY col LIMIT 1. Index usage now happens automatically.
Thanks Michael! This is great.
--
Jorge Godoy <jgodoy@gmail.com>
Jorge Godoy wrote:
Chris <dmagick@gmail.com> writes:
I'm not sure what type of lock you'd need to make sure no other transactions
updated the table (see
http://www.postgresql.org/docs/8.1/interactive/sql-lock.html) but "in theory"
something like this should work:
begin;
select id from table order by id desc limit 1;
insert into table (id, blah) values (id+1, 'blah');
commit;
This is part of the solution, yes. But I would still need locking this table
so that no other concurrent transaction gets another "id". I don't want to
lock the main table -- as I believe you're suggesting -- because I want it to
be searchable and updatable while I'm inserting new data.
So you have to hold a lock that conflicts with itself, but not with
ACCESS SHARE which is the lock acquired by SELECT. I think the first
one on the list with these two properties is SHARE UPDATE EXCLUSIVE.
Have a look at the list yourself:
http://www.postgresql.org/docs/8.1/static/explicit-locking.html
Note the tip at the end of the table:
Tip: Only an ACCESS EXCLUSIVE lock blocks a SELECT (without FOR
UPDATE/SHARE) statement.
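Following that suggestion, the counter-table transaction might look like this (the table and column names are only illustrative):

```sql
-- SHARE UPDATE EXCLUSIVE conflicts with itself, so concurrent number
-- assignments are serialized, but plain SELECTs are not blocked
BEGIN;
LOCK TABLE doc_counter IN SHARE UPDATE EXCLUSIVE MODE;
UPDATE doc_counter SET last_value = last_value + 1;
INSERT INTO documents (doc_no, body)
    SELECT last_value, 'new document' FROM doc_counter;
COMMIT;
```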
--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Since the gapless numbers are purely for the benefit of the tax
people, you could build your db with regular sequences as primary
keys and then regularly (or just before tax-time) insert into a table
which maps the gapless sequence to the real primary key.
-M
AgentM <agentm@themactionfaction.com> writes:
Since the gapless numbers are purely for the benefit of the tax people, you
could build your db with regular sequences as primary keys and then regularly
(or just before tax-time) insert into a table which maps the gapless sequence
to the real primary key.
That's also an interesting approach. An auxiliary table like
transaction     integer   FK to the transactions table
transaction_nb  integer   gapless sequence
should do it. A trigger inserting into this auxiliary table would also take
care of everything... If I use an AFTER trigger, I believe I wouldn't need
any locking... I have to think about this...
As simple as this is, I hadn't thought about it :-) Thanks for your
suggestion.
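That auxiliary table might be sketched like this (the column names follow the message above; the sample transaction id 12345 and the locking choice are assumptions):

```sql
CREATE TABLE transaction_numbers (
    transaction    integer PRIMARY KEY,    -- FK to the transactions table
    transaction_nb integer UNIQUE NOT NULL -- the gapless number
);

-- assign the next free number to one committed transaction;
-- the table lock serializes concurrent assignments
BEGIN;
LOCK TABLE transaction_numbers IN SHARE ROW EXCLUSIVE MODE;
INSERT INTO transaction_numbers (transaction, transaction_nb)
SELECT 12345, coalesce(max(transaction_nb), 0) + 1
  FROM transaction_numbers;
COMMIT;
```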
--
Jorge Godoy <jgodoy@gmail.com>
In article <878xlrwcxm.fsf@gmail.com>,
Jorge Godoy <jgodoy@gmail.com> writes:
AgentM <agentm@themactionfaction.com> writes:
Since the gapless numbers are purely for the benefit of the tax people, you
could build your db with regular sequences as primary keys and then regularly
(or just before tax-time) insert into a table which maps the gapless sequence
to the real primary key.
That's also an interesting approach. An auxiliary table like
transaction integer FK to the transactions table
transaction_nb integer gapless sequence
should do it. A trigger inserting on this auxiliary table would also take
care of everything... If I have an after trigger I believe I wouldn't need
any locking... I have to think about this...
Why put gapless numbers into the database at all? Just calculate them at
query time.
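On the 8.1-era servers discussed in this thread, that could be done with a correlated subselect; since rows here are never deleted (only flagged inactive), insertion order yields a gapless numbering. The table and column names are assumptions:

```sql
-- derive a gapless number at query time from the insertion order
SELECT t.id,
       (SELECT count(*) FROM transactions x WHERE x.id <= t.id) AS gapless_nb
  FROM transactions t
 ORDER BY t.id;
```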