Could postgres12 support millions of sequences? (like 10 million)

Started by pabloa98 · about 6 years ago · 58 messages · general
#1 pabloa98
pabloa98@gmail.com

Hello,

My schema requires a counter for each combination of 2 values. Something
like:

CREATE TABLE counter(
    "group" INT NOT NULL, -- "group" is a reserved word, so it must be quoted
    element INT NOT NULL,
    seq_number INT NOT NULL DEFAULT 0,
    PRIMARY KEY ("group", element)
);

For each entry in counter, aka for each (group, element) pair, the model
requires a seq_number.

If I use a table "counter", I could still have counter collisions between
two transactions. I need true sequence behavior. Is that possible using a
table like "counter", where the counter could be incremented outside the
transaction so it performs like a sequence, without race conditions between
concurrent transactions?
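
(For illustration: if a single-statement increment were acceptable, it is
atomic on its own. A minimal sketch against the "counter" table above; note
that, unlike a real sequence, the row lock is held until the transaction
ends:)

-- Concurrent callers block on the row lock and each receive a distinct value.
UPDATE counter
SET seq_number = seq_number + 1
WHERE "group" = $1 AND element = $2
RETURNING seq_number;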

The other option is to create sequences for each new pair of (group,
element) using triggers. There are millions of pairs. So this approach will
generate millions of sequences.

How would a PostgreSQL database behave with millions of sequences in a
schema? Would performance degrade? Is there any other negative impact?

Regards

Pablo

#2 Rob Sargent
robjsargent@gmail.com
In reply to: pabloa98 (#1)
Re: Could postgres12 support millions of sequences? (like 10 million)


To clarify, are you hoping for consecutive numbers as each row is added to
the table, i.e. “serial”?

What is the intention of “seq_number”?

#3 pabloa98
pabloa98@gmail.com
In reply to: Rob Sargent (#2)
Re: Could postgres12 support millions of sequences? (like 10 million)


The idea is to have something like a serial sequence, but one for each
(group, element) pair.

So when we insert rows in another table, we could have something like:

group, element, event_id, ...
1, 1, 1
1, 1, 2
1, 1, 3
2, 1, 1
1, 1, 4
1, 3, 1
1, 1, 5
1, 3, 2
2, 1, 2
2, 1, 3

The 3rd column is the sequence number we get from the appropriate sequence
created by the trigger.

I want to implement a variation of https://stackoverflow.com/a/30204854 and
that will generate millions of sequences.

#4 Rob Sargent
robjsargent@gmail.com
In reply to: pabloa98 (#3)
Re: Could postgres12 support millions of sequences? (like 10 million)


Then I don’t think group/element can be a PRIMARY KEY.

#5 pabloa98
pabloa98@gmail.com
In reply to: Rob Sargent (#4)
Re: Could postgres12 support millions of sequences? (like 10 million)


Thank you for your answer. Let me explain:

Table "counter" will have triggers that will create a sequence with a new
row is inserted called counter_1_1_seq, counter_2_1_seq, counter_1_3_seq,
etc that will be used to insert values in **another** table. It will be
used for nothing else. When we insert millions of pairs group/element, the
trigger in that table will generate millions of sequences.

My question is how PostgreSQL will behave. Could it deal with millions of
sequences? What about system operations such as VACUUM?

Pablo

#6 David G. Johnston
david.g.johnston@gmail.com
In reply to: pabloa98 (#5)
Re: Could postgres12 support millions of sequences? (like 10 million)


First, it sounds like you care about there being no gaps in the records you
end up saving. If that is the case then sequences will not work for you.

.
.
.

If you are still reading because you can live with some gaps - then does
having one sequence per pair really matter?

Regardless, at this scale you probably should setup a performance test as
part of your continuous integration/test infrastructure, and let it answer
the "which performs better" question.

However, one other consideration with sequences: do you care that
PostgreSQL will cache/pin (i.e., no release) every single sequence you
touch for the lifetime of the session? (I do not think DISCARD matters here
but I'm just guessing)

A big determinant would seem to be how your sessions would go about using
the sequences. You've described the model requirement but haven't described
the process by which the model will be used; without that information,
useful comments about alternative implementations are difficult to
formulate.

David J.

#7 Adrian Klaver
adrian.klaver@aklaver.com
In reply to: pabloa98 (#5)
Re: Could postgres12 support millions of sequences? (like 10 million)


This has train wreck written all over it. If you are going to have a
trigger create a sequence for each combination of (group, element), I gotta
believe the table will fall over before you run into sequence issues. Not
sure why there can't be a serial column that supplies the sequence numbers,
and therefore only one sequence in play. From what I see, all you care
about is that:

group, element, event_id

has an increasing event_id for (group, element) pairs.


#8 pabloa98
pabloa98@gmail.com
In reply to: Adrian Klaver (#7)
Re: Could postgres12 support millions of sequences? (like 10 million)


I need a different sequence for each pair because it is a business
requirement. Please forget the first example; I was not clear. It will be
something like this.

I have a table called "pair":

CREATE TABLE pair(
    "group" INT NOT NULL, -- "group" is a reserved word, so it must be quoted
    element INT NOT NULL,
    PRIMARY KEY ("group", element)
);

I must add entries to the table "event". This table will have a code that
will be generated using a sequence that is a function of (group, element).

CREATE TABLE event(
    "group" INT NOT NULL,
    element INT NOT NULL,
    code INT NOT NULL,
    PRIMARY KEY (code, element, "group")
);

I could make the code column SERIAL and be done with it. HOWEVER, the
"code" column starting from 1 for each (group, element) pair is a business
requirement. The business domain dictates that for each (group, element)
pair there is a code that runs from 1 to 99999999. It is OK to have gaps in
the sequence.

So this table "event" will contain entries like:

1, 1, 1
1, 1, 2
1, 1, 3
2, 1, 1
1, 1, 4
1, 3, 1
1, 1, 5
1, 3, 2
2, 1, 2
2, 1, 3

These triplets will be inserted by several processes (i.e., several
concurrent transactions can happen at the same time). Because of that, a
trivial calculation of the "code" column (something like SELECT MAX(code) +
1 FROM event WHERE "group" = 1 AND element = 1;) will generate collisions,
because two transactions could come up with the same value of "code" at the
same time.

So I need something like a sequence for each pair of (group, element)

So I added to the table "pair" a trigger to create/drop these sequences
each time a new row is inserted or deleted.

Something like:

CREATE OR REPLACE FUNCTION make_pair_sequence() RETURNS trigger
    LANGUAGE plpgsql
AS $make_pair_sequence$
BEGIN
    IF (TG_OP = 'INSERT' OR TG_OP = 'UPDATE') THEN
        EXECUTE format('
            CREATE SEQUENCE IF NOT EXISTS code_%s_%s_seq
                INCREMENT 1
                START 1
                CACHE 1
                MINVALUE 1
                MAXVALUE 99999999 -- 8 digits
        ', NEW."group", NEW.element);
    ELSIF (TG_OP = 'DELETE') THEN
        -- Must use OLD here: NEW is not defined in a DELETE trigger.
        EXECUTE format('
            DROP SEQUENCE IF EXISTS code_%s_%s_seq
        ', OLD."group", OLD.element);
    -- ELSIF (TG_OP = 'TRUNCATE') THEN ...
    END IF;
    RETURN NULL; -- result is ignored for an AFTER trigger
END
$make_pair_sequence$;

DROP TRIGGER IF EXISTS make_pair_sequence ON pair CASCADE;

CREATE TRIGGER make_pair_sequence
    AFTER INSERT OR UPDATE OR DELETE ON pair
    FOR EACH ROW EXECUTE FUNCTION make_pair_sequence();

Each time I insert a pair with something like:

INSERT INTO pair("group", element) VALUES (1, 1); -- trigger creates code_1_1_seq
INSERT INTO pair("group", element) VALUES (1, 3); -- trigger creates code_1_3_seq
INSERT INTO pair("group", element) VALUES (2, 1); -- trigger creates code_2_1_seq

Later I will insert rows in the "event" table like:

INSERT INTO event("group", element, code) VALUES (1, 1, nextval('code_1_1_seq'));
INSERT INTO event("group", element, code) VALUES (1, 3, nextval('code_1_3_seq'));
INSERT INTO event("group", element, code) VALUES (2, 1, nextval('code_2_1_seq'));
etc.

And since there are millions of possible pairs, the database will get
millions of sequences. The "code" column value generation on table "event"
will have no race conditions and everything will work fine from the
business point of view.
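
(Since the sequence name depends on the pair, the INSERTs above would
really be built dynamically. A sketch with a hypothetical helper,
pair_code_nextval:)

CREATE OR REPLACE FUNCTION pair_code_nextval(g INT, e INT) RETURNS BIGINT
    LANGUAGE plpgsql
AS $$
BEGIN
    -- Build the per-pair sequence name and advance it.
    RETURN nextval(format('code_%s_%s_seq', g, e)::regclass);
END
$$;

INSERT INTO event("group", element, code)
VALUES (1, 1, pair_code_nextval(1, 1));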

Now the questions are:

- Could PostgreSQL deal with millions of sequences? What will happen with
maintenance commands (like VACUUM)? Will they suffer performance issues?
- Is there a better method of having independent code counters that is
transaction-safe and lock-free?

Sorry about the first email. I did not explain clearly what the problem was.


#9 Rob Sargent
robjsargent@gmail.com
In reply to: pabloa98 (#8)
Re: Could postgres12 support millions of sequences? (like 10 million)


Unless the event table is searched by code more than by group, you probably want to maintain the key order from the pair table's primary key.

If gaps are OK, do you still need multiple sequences?

#10 pabloa98
pabloa98@gmail.com
In reply to: Rob Sargent (#9)
Re: Could postgres12 support millions of sequences? (like 10 million)


Unless the event table is searched by code more than by group, you probably
want to maintain the key order from the pair table's primary key.

OK. I will do that.

If gaps are OK, do you still need multiple sequences?

I need to start each "code" value from 1 for each (group, element) pair.

This is because we only have 99999999 codes for each pair, and we do not
want to waste numbers. A gap is OK, but not a gap of millions of numbers.

#11 Rob Sargent
robjsargent@gmail.com
In reply to: pabloa98 (#10)
Re: Could postgres12 support millions of sequences? (like 10 million)

#12 Michael Lewis
mlewis@entrata.com
In reply to: David G. Johnston (#6)
Re: Could postgres12 support millions of sequences? (like 10 million)

On Thu, Mar 19, 2020, 5:48 PM David G. Johnston <david.g.johnston@gmail.com>
wrote:

However, one other consideration with sequences: do you care that
PostgreSQL will cache/pin (i.e., no release) every single sequence you
touch for the lifetime of the session? (I do not think DISCARD matters here
but I'm just guessing)

Would you expand on this point or is there someplace specific in the
documentation on this?

#13 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Michael Lewis (#12)
Re: Could postgres12 support millions of sequences? (like 10 million)

Michael Lewis <mlewis@entrata.com> writes:

Would you expand on this point or is there someplace specific in the
documentation on this?

I think what David is worried about is that a sequence object is a
one-row table in PG's implementation. Thus

(1) each sequence requires a dozen or two rows in assorted system
catalogs (not sure exactly how many offhand).

(2) each sequence uses up 8KB on disk for its table file.

(3) each sequence you actually access within a particular session
results in creation of relcache and catcache entries in that
session's local memory. I'm not sure offhand how big those are
either, but a few KB per sequence would be a reasonable guess.

(4) each sequence competes for space in the shared-buffer arena,
since its 8K block has to be swapped into there whenever you try
to access/increment the sequence's value.

This is all mighty inefficient of course, and there's been talk
of trying to reduce the per-sequence overhead; but I don't know
of anyone actively working on that. As things stand, I think
having millions of sequences would be quite painful performance-
wise, especially if your workload were actively hitting a lot
of them concurrently. It would work, for some value of "work",
but it wouldn't perform very well.

Also, as multiple people mentioned already, this still wouldn't
guarantee gap-free sequences of ID values.
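
(For the curious, points (1) and (2) are easy to inspect on a live cluster;
illustrative queries only, the second assuming one of the per-pair
sequences from earlier in the thread exists:)

-- How many sequence relations exist, and their total on-disk size.
SELECT count(*) AS sequences,
       pg_size_pretty(sum(pg_total_relation_size(oid))) AS total_size
FROM pg_class
WHERE relkind = 'S';

-- The dedicated catalog row for one sequence (PG 10+); more rows live
-- in pg_class, pg_depend, and friends.
SELECT * FROM pg_sequence WHERE seqrelid = 'code_1_1_seq'::regclass;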

regards, tom lane

#14 Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Michael Lewis (#12)
Re: Could postgres12 support millions of sequences? (like 10 million)


See the section starting here:

https://www.postgresql.org/docs/12/sql-createsequence.html

Notes

"Unexpected results might be obtained if a cache setting greater than
one is used for a sequence object that will be used concurrently by
multiple sessions. Each session will allocate and cache successive
sequence values during one access to the sequence object and increase
the sequence object's last_value accordingly. Then, the next cache-1
uses of nextval within that session simply return the preallocated
values without touching the sequence object. So, any numbers allocated
but not used within a session will be lost when that session ends,
resulting in “holes” in the sequence.

...
"


#15 pabloa98
pabloa98@gmail.com
In reply to: Adrian Klaver (#14)
Re: Could postgres12 support millions of sequences? (like 10 million)


We will use CACHE 1. This is because when nextval('seq') is invoked we are
hitting 3 or 4 more tables, so the sequence will not be a performance
blocker (compared with all the operations in the transaction).


#16 pabloa98
pabloa98@gmail.com
In reply to: Tom Lane (#13)
Re: Could postgres12 support millions of sequences? (like 10 million)

I see.

Any suggestion? It should behave like a sequence in the sense that
concurrent transactions will get different numbers from this alternative
sequence-like solution.

In our case, we will need to call nextval('some_seq') (or similar) from
different processes no more than twice every minute.

It would be nice to have a sequence data type. Correct me if I am wrong,
but it seems to me that a sequence data type would cost the same or less
than the current sequence implementation.

The schema would be more clear too. We could have a table like:

CREATE TABLE pair(
    "group" INT NOT NULL,
    element INT NOT NULL,
    seq SEQUENCE INCREMENT 1
        START 1
        CACHE 1
        MINVALUE 1
        MAXVALUE 99999999
        NOT NULL,
    PRIMARY KEY ("group", element)
);

And then:

INSERT INTO event("group", element, code)
VALUES (
    1,
    1,
    ( SELECT nextval(p.seq) FROM pair p WHERE p."group" = 1 AND p.element = 1 )
);

Or perhaps storing all the sequences in the same table as rows will have
the same behavior.

Pablo


#17 David G. Johnston
david.g.johnston@gmail.com
In reply to: Tom Lane (#13)
Re: Could postgres12 support millions of sequences? (like 10 million)

On Thursday, March 19, 2020, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I think what David is worried about is that a sequence object is a
one-row table in PG's implementation. Thus

(1) each sequence requires a dozen or two rows in assorted system
catalogs (not sure exactly how many offhand).

Actually, I seem to have missed that dynamic. I was referring to the
SeqTable hash table specified here:

https://github.com/postgres/postgres/blob/85f6b49c2c53fb1e08d918ec9305faac13cf7ad6/src/backend/commands/sequence.c#L69

I wouldn’t think there would be much spreading of data throughout the
catalog if the sequences are unowned (by tables).

David J.

#18 David G. Johnston
david.g.johnston@gmail.com
In reply to: pabloa98 (#15)
Re: Could postgres12 support millions of sequences? (like 10 million)

On Thursday, March 19, 2020, pabloa98 <pabloa98@gmail.com> wrote:

We will use CACHE 1. This is because when nextval('seq') is invoked we are
hitting 3 or 4 more tables, so the sequence will not be a performance
blocker (compared with all the operations in the transaction).

The other implementation detail to remember is:

“Because nextval and setval calls are never rolled back, sequence objects
cannot be used if “gapless” assignment of sequence numbers is needed.”
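
(Easy to demonstrate with a scratch sequence; a minimal sketch:)

CREATE SEQUENCE demo_seq;

BEGIN;
SELECT nextval('demo_seq');  -- returns 1
ROLLBACK;

SELECT nextval('demo_seq');  -- returns 2; the rollback did not give 1 back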

David J.

#19 rob stone
floriparob@gmail.com
In reply to: pabloa98 (#1)
Re: Could postgres12 support millions of sequences? (like 10 million)

Hello,


Have you tried the following:-

BEGIN;
SELECT (seq_number + 1) FROM counter WHERE "group" = $1
AND element = $2 FOR UPDATE;

If that gives a NOTFOUND exception, then you need to insert into table
counter with seq_number = 1; otherwise update it to (seq_number + 1).
Then do the insert into whatever table needs the seq_number value.

If no errors detected, then COMMIT else ROLLBACK.
It is all in one transaction block.
We use this approach to assign invoice numbers.

You could pre-seed table counter when a group/element pair is created
for the first time.
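
(The seed-or-increment step can also be collapsed into a single statement
with INSERT ... ON CONFLICT; a sketch, assuming the counter table from the
first message:)

-- Insert the pair with seq_number = 1, or atomically bump the existing row.
-- Concurrent transactions serialize on the row lock, and a ROLLBACK undoes
-- the increment, so the numbering stays gapless.
INSERT INTO counter ("group", element, seq_number)
VALUES ($1, $2, 1)
ON CONFLICT ("group", element)
DO UPDATE SET seq_number = counter.seq_number + 1
RETURNING seq_number;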

HTH,
Robert

#20 Adrian Klaver
adrian.klaver@aklaver.com
In reply to: pabloa98 (#16)
Re: Could postgres12 support millions of sequences? (like 10 million)


If code is just something that shows the order of insertion for (group,
element) combinations, then maybe something like below:

CREATE TABLE event(
    group_id INT NOT NULL, -- changed, as group is a reserved word
    element INT NOT NULL,
    insert_ts timestamptz NOT NULL DEFAULT clock_timestamp(),
    PRIMARY KEY (group_id, element, insert_ts)
);

insert into event(group_id, element) VALUES
(1, 1),
(1, 1),
(1, 1),
(2, 1),
(1, 1),
(1, 3),
(1, 1),
(1, 3),
(2, 1),
(2, 1);

select * from event;
 group_id | element |           insert_ts
----------+---------+--------------------------------
        1 |       1 | 03/20/2020 09:51:12.675926 PDT
        1 |       1 | 03/20/2020 09:51:12.675985 PDT
        1 |       1 | 03/20/2020 09:51:12.675991 PDT
        2 |       1 | 03/20/2020 09:51:12.675996 PDT
        1 |       1 | 03/20/2020 09:51:12.676 PDT
        1 |       3 | 03/20/2020 09:51:12.676004 PDT
        1 |       1 | 03/20/2020 09:51:12.676008 PDT
        1 |       3 | 03/20/2020 09:51:12.676012 PDT
        2 |       1 | 03/20/2020 09:51:12.676016 PDT
        2 |       1 | 03/20/2020 09:51:12.67602 PDT
(10 rows)

-- Order by insert_ts within each partition so the row numbers follow
-- insertion order; all rows in a partition share (group_id, element).
select group_id, element,
       row_number() OVER (PARTITION BY group_id, element
                          ORDER BY insert_ts) AS code
from event;
 group_id | element | code
----------+---------+------
        1 |       1 |    1
        1 |       1 |    2
        1 |       1 |    3
        1 |       1 |    4
        1 |       1 |    5
        1 |       3 |    1
        1 |       3 |    2
        2 |       1 |    1
        2 |       1 |    2
        2 |       1 |    3
(10 rows)


#21 Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Adrian Klaver (#20)
#22 pabloa98
pabloa98@gmail.com
In reply to: rob stone (#19)
#23 pabloa98
pabloa98@gmail.com
In reply to: Adrian Klaver (#21)
#24 Peter J. Holzer
hjp-pgsql@hjp.at
In reply to: David G. Johnston (#6)
#25 Rob Sargent
robjsargent@gmail.com
In reply to: Peter J. Holzer (#24)
#26 Peter J. Holzer
hjp-pgsql@hjp.at
In reply to: Rob Sargent (#25)
#27 Rob Sargent
robjsargent@gmail.com
In reply to: Peter J. Holzer (#26)
#28 Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Peter J. Holzer (#26)
#29 Adrian Klaver
adrian.klaver@aklaver.com
In reply to: pabloa98 (#23)
#30 pabloa98
pabloa98@gmail.com
In reply to: Peter J. Holzer (#24)
#31 pabloa98
pabloa98@gmail.com
In reply to: Adrian Klaver (#28)
#32 David G. Johnston
david.g.johnston@gmail.com
In reply to: pabloa98 (#31)
#33 John W Higgins
wishdev@gmail.com
In reply to: pabloa98 (#31)
#34 Adrian Klaver
adrian.klaver@aklaver.com
In reply to: pabloa98 (#31)
#35 Rob Sargent
robjsargent@gmail.com
In reply to: Adrian Klaver (#34)
#36 Ron
ronljohnsonjr@gmail.com
In reply to: Rob Sargent (#35)
#37 pabloa98
pabloa98@gmail.com
In reply to: Rob Sargent (#35)
#38 pabloa98
pabloa98@gmail.com
In reply to: John W Higgins (#33)
#39 pabloa98
pabloa98@gmail.com
In reply to: Ron (#36)
#40 Rob Sargent
robjsargent@gmail.com
In reply to: pabloa98 (#39)
#41 Peter J. Holzer
hjp-pgsql@hjp.at
In reply to: Adrian Klaver (#28)
#42 Peter J. Holzer
hjp-pgsql@hjp.at
In reply to: Rob Sargent (#40)
#43 pabloa98
pabloa98@gmail.com
In reply to: Peter J. Holzer (#41)
#44 Rob Sargent
robjsargent@gmail.com
In reply to: Peter J. Holzer (#42)
#45 Adrian Klaver
adrian.klaver@aklaver.com
In reply to: pabloa98 (#43)
#46 pabloa98
pabloa98@gmail.com
In reply to: Adrian Klaver (#45)
#47 Peter J. Holzer
hjp-pgsql@hjp.at
In reply to: Rob Sargent (#44)
#48 Peter J. Holzer
hjp-pgsql@hjp.at
In reply to: pabloa98 (#43)
#49 Adrian Klaver
adrian.klaver@aklaver.com
In reply to: pabloa98 (#37)
#50 pabloa98
pabloa98@gmail.com
In reply to: Adrian Klaver (#49)
#51 Adrian Klaver
adrian.klaver@aklaver.com
In reply to: pabloa98 (#50)
#52 pabloa98
pabloa98@gmail.com
In reply to: Adrian Klaver (#51)
#53 Chris Browne
cbbrowne@acm.org
In reply to: pabloa98 (#50)
#54 David G. Johnston
david.g.johnston@gmail.com
In reply to: Chris Browne (#53)
#55 pabloa98
pabloa98@gmail.com
In reply to: Chris Browne (#53)
#56 pabloa98
pabloa98@gmail.com
In reply to: David G. Johnston (#54)
#57 Daniel Verite
daniel@manitou-mail.org
In reply to: pabloa98 (#55)
#58 pabloa98
pabloa98@gmail.com
In reply to: Daniel Verite (#57)