Could postgres12 support millions of sequences? (like 10 million)

Started by pabloa98 · about 6 years ago · 58 messages · general
#1 pabloa98
pabloa98@gmail.com

Hello,

My schema requires a counter for each combination of 2 values. Something
like:

CREATE TABLE counter(
    "group" INT NOT NULL, -- "group" is a reserved word, so it must be quoted
    element INT NOT NULL,
    seq_number INT NOT NULL DEFAULT 0,
    PRIMARY KEY ("group", element)
);

For each entry in counter, aka for each (group, element) pair, the model
requires a seq_number.

If I use a table "counter", I could still have counter collisions between
two transactions. I need true sequence behavior. Is that possible using a
table like "counter", where the counter could be incremented outside the
transaction so it performs like a sequence, without race conditions between
concurrent transactions?
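
(For illustration: if a single-statement increment were acceptable, it is
atomic on its own. A minimal sketch against the "counter" table above; note
that, unlike a real sequence, the row lock is held until the transaction
ends:)

-- Concurrent callers block on the row lock and each receive a distinct value.
UPDATE counter
SET seq_number = seq_number + 1
WHERE "group" = $1 AND element = $2
RETURNING seq_number;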

The other option is to create sequences for each new pair of (group,
element) using triggers. There are millions of pairs. So this approach will
generate millions of sequences.

How would a PostgreSQL database behave with millions of sequences in a
schema? Would performance degrade? Is there any other negative impact?

Regards

Pablo

#2 Rob Sargent
robjsargent@gmail.com
In reply to: pabloa98 (#1)
Re: Could postgres12 support millions of sequences? (like 10 million)


To clarify, are you hoping for consecutive numbers as each row is added to
the table, i.e. “serial”?

What is the intention of “seq_number”?

#3 pabloa98
pabloa98@gmail.com
In reply to: Rob Sargent (#2)
Re: Could postgres12 support millions of sequences? (like 10 million)


The idea is to have something like a serial sequence, but one for each
(group, element) pair.

So when we insert rows in another table, we could have something like:

group, element, event_id, ...
1, 1, 1
1, 1, 2
1, 1, 3
2, 1, 1
1, 1, 4
1, 3, 1
1, 1, 5
1, 3, 2
2, 1, 2
2, 1, 3

The 3rd column is the sequence number we get from the appropriate sequence
created by the trigger.

I want to implement a variation of https://stackoverflow.com/a/30204854 and
that will generate millions of sequences.

#4 Rob Sargent
robjsargent@gmail.com
In reply to: pabloa98 (#3)
Re: Could postgres12 support millions of sequences? (like 10 million)


Then I don’t think group/element can be a PRIMARY KEY.

#5 pabloa98
pabloa98@gmail.com
In reply to: Rob Sargent (#4)
Re: Could postgres12 support millions of sequences? (like 10 million)


Thank you for your answer. Let me explain:

Table "counter" will have triggers that will create a sequence with a new
row is inserted called counter_1_1_seq, counter_2_1_seq, counter_1_3_seq,
etc that will be used to insert values in **another** table. It will be
used for nothing else. When we insert millions of pairs group/element, the
trigger in that table will generate millions of sequences.

My question is how PostgreSQL will behave. Could it deal with millions of
sequences? What about system operations such as VACUUM?

Pablo

#6 David G. Johnston
david.g.johnston@gmail.com
In reply to: pabloa98 (#5)
Re: Could postgres12 support millions of sequences? (like 10 million)


First, it sounds like you care about there being no gaps in the records you
end up saving. If that is the case then sequences will not work for you.

.
.
.

If you are still reading because you can live with some gaps - then does
having one sequence per pair really matter?

Regardless, at this scale you probably should setup a performance test as
part of your continuous integration/test infrastructure, and let it answer
the "which performs better" question.

However, one other consideration with sequences: do you care that
PostgreSQL will cache/pin (i.e., no release) every single sequence you
touch for the lifetime of the session? (I do not think DISCARD matters here
but I'm just guessing)

A big determinant would seem to be how your sessions would go about using
the sequences. You've described the model requirement but haven't described
the process by which the model will be used; without that information,
useful comments about alternative implementations are difficult to
formulate.

David J.

#7 Adrian Klaver
adrian.klaver@aklaver.com
In reply to: pabloa98 (#5)
Re: Could postgres12 support millions of sequences? (like 10 million)


This has train wreck written all over it. If you are going to have a
trigger create a sequence for each combination of (group, element), I gotta
believe the table will fall over before you run into sequence issues. Not
sure why there can't be a serial column that supplies the sequence numbers,
and therefore only one sequence in play. From what I see, all you care
about is that:

group, element, event_id

has an increasing event_id for (group, element) pairs.


#8 pabloa98
pabloa98@gmail.com
In reply to: Adrian Klaver (#7)
Re: Could postgres12 support millions of sequences? (like 10 million)


I need a different sequence for each pair because it is a business
requirement. Please forget the first example; I was not clear. It will be
something like this.

I have a table called "pair":

CREATE TABLE pair(
    "group" INT NOT NULL, -- "group" is a reserved word, so it must be quoted
    element INT NOT NULL,
    PRIMARY KEY ("group", element)
);

I must add entries to the table "event". This table will have a code that
will be generated using a sequence that is a function of (group, element).

CREATE TABLE event(
    "group" INT NOT NULL,
    element INT NOT NULL,
    code INT NOT NULL,
    PRIMARY KEY (code, element, "group")
);

I could make the code column SERIAL and be done with it. HOWEVER, the
"code" column starting from 1 for each (group, element) pair is a business
requirement. The business domain dictates that for each (group, element)
pair there is a code that runs from 1 to 99999999. It is OK to have gaps in
the sequence.

So this table "event" will contain entries like:

1, 1, 1
1, 1, 2
1, 1, 3
2, 1, 1
1, 1, 4
1, 3, 1
1, 1, 5
1, 3, 2
2, 1, 2
2, 1, 3

These triplets will be inserted by several processes (i.e., several
concurrent transactions can happen at the same time). Because of that, a
trivial calculation of the "code" column (something like SELECT MAX(code) +
1 FROM event WHERE "group" = 1 AND element = 1;) will generate collisions,
because two transactions could come up with the same value of "code" at the
same time.

So I need something like a sequence for each pair of (group, element)

So I added to the table "pair" a trigger to create/drop these sequences
each time a new row is inserted or deleted.

Something like:

CREATE OR REPLACE FUNCTION make_pair_sequence() RETURNS trigger
    LANGUAGE plpgsql
AS $make_pair_sequence$
BEGIN
    IF (TG_OP = 'INSERT' OR TG_OP = 'UPDATE') THEN
        EXECUTE format('
            CREATE SEQUENCE IF NOT EXISTS code_%s_%s_seq
                INCREMENT 1
                START 1
                CACHE 1
                MINVALUE 1
                MAXVALUE 99999999 -- 8 digits
        ', NEW."group", NEW.element);
    ELSIF (TG_OP = 'DELETE') THEN
        -- Must use OLD here: NEW is not defined in a DELETE trigger.
        EXECUTE format('
            DROP SEQUENCE IF EXISTS code_%s_%s_seq
        ', OLD."group", OLD.element);
    -- ELSIF (TG_OP = 'TRUNCATE') THEN ...
    END IF;
    RETURN NULL; -- result is ignored for an AFTER trigger
END
$make_pair_sequence$;

DROP TRIGGER IF EXISTS make_pair_sequence ON pair CASCADE;

CREATE TRIGGER make_pair_sequence
    AFTER INSERT OR UPDATE OR DELETE ON pair
    FOR EACH ROW EXECUTE FUNCTION make_pair_sequence();

Each time I insert a pair with something like:

INSERT INTO pair("group", element) VALUES (1, 1); -- trigger creates code_1_1_seq
INSERT INTO pair("group", element) VALUES (1, 3); -- trigger creates code_1_3_seq
INSERT INTO pair("group", element) VALUES (2, 1); -- trigger creates code_2_1_seq

Later I will insert rows in the "event" table like:

INSERT INTO event("group", element, code) VALUES (1, 1, nextval('code_1_1_seq'));
INSERT INTO event("group", element, code) VALUES (1, 3, nextval('code_1_3_seq'));
INSERT INTO event("group", element, code) VALUES (2, 1, nextval('code_2_1_seq'));
etc.

And since there are millions of possible pairs, the database will get
millions of sequences. The "code" column value generation on table "event"
will have no race conditions and everything will work fine from the
business point of view.
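
(Since the sequence name depends on the pair, the INSERTs above would
really be built dynamically. A sketch with a hypothetical helper,
pair_code_nextval:)

CREATE OR REPLACE FUNCTION pair_code_nextval(g INT, e INT) RETURNS BIGINT
    LANGUAGE plpgsql
AS $$
BEGIN
    -- Build the per-pair sequence name and advance it.
    RETURN nextval(format('code_%s_%s_seq', g, e)::regclass);
END
$$;

INSERT INTO event("group", element, code)
VALUES (1, 1, pair_code_nextval(1, 1));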

Now the questions are:

- Could PostgreSQL deal with millions of sequences? What will happen with
maintenance commands (like VACUUM)? Will they suffer performance issues?
- Is there a better method of having independent code counters that is
transaction-safe and lock-free?

Sorry about the first email. I did not explain clearly what the problem was.


#9 Rob Sargent
robjsargent@gmail.com
In reply to: pabloa98 (#8)
Re: Could postgres12 support millions of sequences? (like 10 million)


Unless the event table is searched by code more than by group, you probably want to maintain the key order from the pair table's primary key.

If gaps are OK, do you still need multiple sequences?

#10 pabloa98
pabloa98@gmail.com
In reply to: Rob Sargent (#9)
Re: Could postgres12 support millions of sequences? (like 10 million)


Unless the event table is searched by code more than by group, you probably
want to maintain the key order from the pair table's primary key.

OK. I will do that.

If gaps are OK, do you still need multiple sequences?

I need to start each "code" value from 1 for each (group, element) pair.

This is because we only have 99999999 codes for each pair, and we do not
want to waste numbers. A gap is OK, but not a gap of millions of numbers.

#11 Rob Sargent
robjsargent@gmail.com
In reply to: pabloa98 (#10)
Re: Could postgres12 support millions of sequences? (like 10 million)

#12 Michael Lewis
mlewis@entrata.com
In reply to: David G. Johnston (#6)
Re: Could postgres12 support millions of sequences? (like 10 million)

On Thu, Mar 19, 2020, 5:48 PM David G. Johnston <david.g.johnston@gmail.com>
wrote:

However, one other consideration with sequences: do you care that
PostgreSQL will cache/pin (i.e., no release) every single sequence you
touch for the lifetime of the session? (I do not think DISCARD matters here
but I'm just guessing)

Would you expand on this point or is there someplace specific in the
documentation on this?

#13 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Michael Lewis (#12)
Re: Could postgres12 support millions of sequences? (like 10 million)

Michael Lewis <mlewis@entrata.com> writes:

Would you expand on this point or is there someplace specific in the
documentation on this?

I think what David is worried about is that a sequence object is a
one-row table in PG's implementation. Thus

(1) each sequence requires a dozen or two rows in assorted system
catalogs (not sure exactly how many offhand).

(2) each sequence uses up 8KB on disk for its table file.

(3) each sequence you actually access within a particular session
results in creation of relcache and catcache entries in that
session's local memory. I'm not sure offhand how big those are
either, but a few KB per sequence would be a reasonable guess.

(4) each sequence competes for space in the shared-buffer arena,
since its 8K block has to be swapped into there whenever you try
to access/increment the sequence's value.

This is all mighty inefficient of course, and there's been talk
of trying to reduce the per-sequence overhead; but I don't know
of anyone actively working on that. As things stand, I think
having millions of sequences would be quite painful performance-
wise, especially if your workload were actively hitting a lot
of them concurrently. It would work, for some value of "work",
but it wouldn't perform very well.

Also, as multiple people mentioned already, this still wouldn't
guarantee gap-free sequences of ID values.
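
(For the curious, points (1) and (2) are easy to inspect on a live cluster;
illustrative queries only, the second assuming one of the per-pair
sequences from earlier in the thread exists:)

-- How many sequence relations exist, and their total on-disk size.
SELECT count(*) AS sequences,
       pg_size_pretty(sum(pg_total_relation_size(oid))) AS total_size
FROM pg_class
WHERE relkind = 'S';

-- The dedicated catalog row for one sequence (PG 10+); more rows live
-- in pg_class, pg_depend, and friends.
SELECT * FROM pg_sequence WHERE seqrelid = 'code_1_1_seq'::regclass;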

regards, tom lane

#14 Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Michael Lewis (#12)
Re: Could postgres12 support millions of sequences? (like 10 million)


See the section starting here:

https://www.postgresql.org/docs/12/sql-createsequence.html

Notes

"Unexpected results might be obtained if a cache setting greater than
one is used for a sequence object that will be used concurrently by
multiple sessions. Each session will allocate and cache successive
sequence values during one access to the sequence object and increase
the sequence object's last_value accordingly. Then, the next cache-1
uses of nextval within that session simply return the preallocated
values without touching the sequence object. So, any numbers allocated
but not used within a session will be lost when that session ends,
resulting in “holes” in the sequence.

...
"


#15 pabloa98
pabloa98@gmail.com
In reply to: Adrian Klaver (#14)
Re: Could postgres12 support millions of sequences? (like 10 million)


We will use CACHE 1. This is because when nextval('seq') is invoked we are
hitting 3 or 4 more tables, so the sequence will not be a performance
blocker (compared with all the operations in the transaction).


#16 pabloa98
pabloa98@gmail.com
In reply to: Tom Lane (#13)
Re: Could postgres12 support millions of sequences? (like 10 million)

I see.

Any suggestion? It should behave like a sequence in the sense that
concurrent transactions will get different numbers from this alternative
sequence-like solution.

In our case, we will need to call nextval('some_seq') (or similar) from
different processes no more than twice every minute.

It would be nice to have a sequence data type. Correct me if I am wrong,
but it seems to me that a sequence data type would cost the same or less
than the current sequence implementation.

The schema would be more clear too. We could have a table like:

CREATE TABLE pair(
    "group" INT NOT NULL,
    element INT NOT NULL,
    seq SEQUENCE INCREMENT 1
        START 1
        CACHE 1
        MINVALUE 1
        MAXVALUE 99999999
        NOT NULL,
    PRIMARY KEY ("group", element)
);

And then:

INSERT INTO event("group", element, code)
VALUES (
    1,
    1,
    ( SELECT nextval(p.seq) FROM pair p WHERE p."group" = 1 AND p.element = 1 )
);

Or perhaps storing all the sequences in the same table as rows will have
the same behavior.

Pablo


#17 David G. Johnston
david.g.johnston@gmail.com
In reply to: Tom Lane (#13)
Re: Could postgres12 support millions of sequences? (like 10 million)

On Thursday, March 19, 2020, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I think what David is worried about is that a sequence object is a
one-row table in PG's implementation. Thus

(1) each sequence requires a dozen or two rows in assorted system
catalogs (not sure exactly how many offhand).

Actually, I seem to have missed that dynamic. I was referring to the
SeqTable hash table specified here:

https://github.com/postgres/postgres/blob/85f6b49c2c53fb1e08d918ec9305faac13cf7ad6/src/backend/commands/sequence.c#L69

I wouldn’t think there would be much spreading of data throughout the
catalog if the sequences are unowned (by tables).

David J.

#18 David G. Johnston
david.g.johnston@gmail.com
In reply to: pabloa98 (#15)
Re: Could postgres12 support millions of sequences? (like 10 million)

On Thursday, March 19, 2020, pabloa98 <pabloa98@gmail.com> wrote:

We will use CACHE 1. This is because when nextval('seq') is invoked we are
hitting 3 or 4 more tables, so the sequence will not be a performance
blocker (compared with all the operations in the transaction).

The other implementation detail to remember is:

“Because nextval and setval calls are never rolled back, sequence objects
cannot be used if “gapless” assignment of sequence numbers is needed.”
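
(Easy to demonstrate with a scratch sequence; a minimal sketch:)

CREATE SEQUENCE demo_seq;

BEGIN;
SELECT nextval('demo_seq');  -- returns 1
ROLLBACK;

SELECT nextval('demo_seq');  -- returns 2; the rollback did not give 1 back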

David J.

#19 rob stone
floriparob@gmail.com
In reply to: pabloa98 (#1)
Re: Could postgres12 support millions of sequences? (like 10 million)

Hello,


Have you tried the following:-

BEGIN;
SELECT (seq_number + 1) FROM counter WHERE "group" = $1
AND element = $2 FOR UPDATE;

If that gives a NOTFOUND exception, then you need to insert into table
counter with seq_number = 1; otherwise update it to (seq_number + 1).
Then do the insert into whatever table needs the seq_number value.

If no errors detected, then COMMIT else ROLLBACK.
It is all in one transaction block.
We use this approach to assign invoice numbers.

You could pre-seed table counter when a group/element pair is created
for the first time.
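
(The seed-or-increment step can also be collapsed into a single statement
with INSERT ... ON CONFLICT; a sketch, assuming the counter table from the
first message:)

-- Insert the pair with seq_number = 1, or atomically bump the existing row.
-- Concurrent transactions serialize on the row lock, and a ROLLBACK undoes
-- the increment, so the numbering stays gapless.
INSERT INTO counter ("group", element, seq_number)
VALUES ($1, $2, 1)
ON CONFLICT ("group", element)
DO UPDATE SET seq_number = counter.seq_number + 1
RETURNING seq_number;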

HTH,
Robert

#20 Adrian Klaver
adrian.klaver@aklaver.com
In reply to: pabloa98 (#16)
Re: Could postgres12 support millions of sequences? (like 10 million)


If code is just something that shows the order of insertion for (group,
element) combinations, then maybe something like below:

CREATE TABLE event(
    group_id INT NOT NULL, -- changed, as group is a reserved word
    element INT NOT NULL,
    insert_ts timestamptz NOT NULL DEFAULT clock_timestamp(),
    PRIMARY KEY (group_id, element, insert_ts)
);

insert into event(group_id, element) VALUES
(1, 1),
(1, 1),
(1, 1),
(2, 1),
(1, 1),
(1, 3),
(1, 1),
(1, 3),
(2, 1),
(2, 1);

select * from event;
 group_id | element |           insert_ts
----------+---------+--------------------------------
        1 |       1 | 03/20/2020 09:51:12.675926 PDT
        1 |       1 | 03/20/2020 09:51:12.675985 PDT
        1 |       1 | 03/20/2020 09:51:12.675991 PDT
        2 |       1 | 03/20/2020 09:51:12.675996 PDT
        1 |       1 | 03/20/2020 09:51:12.676 PDT
        1 |       3 | 03/20/2020 09:51:12.676004 PDT
        1 |       1 | 03/20/2020 09:51:12.676008 PDT
        1 |       3 | 03/20/2020 09:51:12.676012 PDT
        2 |       1 | 03/20/2020 09:51:12.676016 PDT
        2 |       1 | 03/20/2020 09:51:12.67602 PDT
(10 rows)

-- Order by insert_ts within each partition so the row numbers follow
-- insertion order; all rows in a partition share (group_id, element).
select group_id, element,
       row_number() OVER (PARTITION BY group_id, element
                          ORDER BY insert_ts) AS code
from event;
 group_id | element | code
----------+---------+------
        1 |       1 |    1
        1 |       1 |    2
        1 |       1 |    3
        1 |       1 |    4
        1 |       1 |    5
        1 |       3 |    1
        1 |       3 |    2
        2 |       1 |    1
        2 |       1 |    2
        2 |       1 |    3
(10 rows)


#21 Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Adrian Klaver (#20)
#22 pabloa98
pabloa98@gmail.com
In reply to: rob stone (#19)
#23 pabloa98
pabloa98@gmail.com
In reply to: Adrian Klaver (#21)
#24 Peter J. Holzer
hjp-pgsql@hjp.at
In reply to: David G. Johnston (#6)
#25 Rob Sargent
robjsargent@gmail.com
In reply to: Peter J. Holzer (#24)
#26 Peter J. Holzer
hjp-pgsql@hjp.at
In reply to: Rob Sargent (#25)
#27 Rob Sargent
robjsargent@gmail.com
In reply to: Peter J. Holzer (#26)
#28 Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Peter J. Holzer (#26)
#29 Adrian Klaver
adrian.klaver@aklaver.com
In reply to: pabloa98 (#23)
#30 pabloa98
pabloa98@gmail.com
In reply to: Peter J. Holzer (#24)
#31 pabloa98
pabloa98@gmail.com
In reply to: Adrian Klaver (#28)
#32 David G. Johnston
david.g.johnston@gmail.com
In reply to: pabloa98 (#31)
#33 John W Higgins
wishdev@gmail.com
In reply to: pabloa98 (#31)
#34 Adrian Klaver
adrian.klaver@aklaver.com
In reply to: pabloa98 (#31)
#35 Rob Sargent
robjsargent@gmail.com
In reply to: Adrian Klaver (#34)
#36 Ron
ronljohnsonjr@gmail.com
In reply to: Rob Sargent (#35)
#37 pabloa98
pabloa98@gmail.com
In reply to: Rob Sargent (#35)
#38 pabloa98
pabloa98@gmail.com
In reply to: John W Higgins (#33)
#39 pabloa98
pabloa98@gmail.com
In reply to: Ron (#36)
#40 Rob Sargent
robjsargent@gmail.com
In reply to: pabloa98 (#39)
#41 Peter J. Holzer
hjp-pgsql@hjp.at
In reply to: Adrian Klaver (#28)
#42 Peter J. Holzer
hjp-pgsql@hjp.at
In reply to: Rob Sargent (#40)
#43 pabloa98
pabloa98@gmail.com
In reply to: Peter J. Holzer (#41)
#44 Rob Sargent
robjsargent@gmail.com
In reply to: Peter J. Holzer (#42)
#45 Adrian Klaver
adrian.klaver@aklaver.com
In reply to: pabloa98 (#43)
#46 pabloa98
pabloa98@gmail.com
In reply to: Adrian Klaver (#45)
#47 Peter J. Holzer
hjp-pgsql@hjp.at
In reply to: Rob Sargent (#44)
#48 Peter J. Holzer
hjp-pgsql@hjp.at
In reply to: pabloa98 (#43)
#49 Adrian Klaver
adrian.klaver@aklaver.com
In reply to: pabloa98 (#37)
#50 pabloa98
pabloa98@gmail.com
In reply to: Adrian Klaver (#49)
#51 Adrian Klaver
adrian.klaver@aklaver.com
In reply to: pabloa98 (#50)
#52 pabloa98
pabloa98@gmail.com
In reply to: Adrian Klaver (#51)
#53 Chris Browne
cbbrowne@acm.org
In reply to: pabloa98 (#50)
#54 David G. Johnston
david.g.johnston@gmail.com
In reply to: Chris Browne (#53)
#55 pabloa98
pabloa98@gmail.com
In reply to: Chris Browne (#53)
#56 pabloa98
pabloa98@gmail.com
In reply to: David G. Johnston (#54)
#57 Daniel Verite
daniel@manitou-mail.org
In reply to: pabloa98 (#55)
#58 pabloa98
pabloa98@gmail.com
In reply to: Daniel Verite (#57)