unlogged sequences

Started by Peter Eisentrautover 6 years ago40 messages
#1Peter Eisentraut
Peter Eisentraut
peter.eisentraut@2ndquadrant.com
4 attachment(s)

The discussion in bug #15631 revealed that serial/identity sequences of
temporary tables should really also be temporary (easy), and that
serial/identity sequences of unlogged tables should also be unlogged.
But there is no support for unlogged sequences, so I looked into that.

If you copy the initial sequence relation file to the init fork, then
this all seems to work out just fine. Attached is a patch. The
low-level copying seems to be handled quite inconsistently across the
code, so I'm not sure what the most appropriate way to do this would be.
I'm looking for feedback from those who have worked on tableam and
storage manager to see what the right interfaces are or whether some new
interfaces might perhaps be appropriate.

(What's still missing in this patch is ALTER SEQUENCE SET
{LOGGED|UNLOGGED} as well as propagating the analogous ALTER TABLE
command to owned sequences.)

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachments:

v1-0001-Make-command-order-in-test-more-sensible.patchtext/plain; charset=UTF-8; name=v1-0001-Make-command-order-in-test-more-sensible.patch; x-mac-creator=0; x-mac-type=0
v1-0002-Fix-comment.patchtext/plain; charset=UTF-8; name=v1-0002-Fix-comment.patch; x-mac-creator=0; x-mac-type=0
v1-0003-Unlogged-sequences.patchtext/plain; charset=UTF-8; name=v1-0003-Unlogged-sequences.patch; x-mac-creator=0; x-mac-type=0
v1-0004-Make-identity-serial-sequences-unlogged-when-thei.patchtext/plain; charset=UTF-8; name=v1-0004-Make-identity-serial-sequences-unlogged-when-thei.patch; x-mac-creator=0; x-mac-type=0
#2Michael Paquier
Michael Paquier
michael@paquier.xyz
In reply to: Peter Eisentraut (#1)
Re: unlogged sequences

On Thu, Jun 20, 2019 at 09:30:34AM +0200, Peter Eisentraut wrote:

The discussion in bug #15631 revealed that serial/identity sequences of
temporary tables should really also be temporary (easy), and that
serial/identity sequences of unlogged tables should also be unlogged.
But there is no support for unlogged sequences, so I looked into that.

Thanks for doing so.

If you copy the initial sequence relation file to the init fork, then
this all seems to work out just fine. Attached is a patch. The
low-level copying seems to be handled quite inconsistently across the
code, so I'm not sure what the most appropriate way to do this would be.
I'm looking for feedback from those who have worked on tableam and
storage manager to see what the right interfaces are or whether some new
interfaces might perhaps be appropriate.

But the actual deal is that the sequence meta-data is now in
pg_sequences and not the init forks, no? I have just done a small
test:
1) Some SQL queries:
create unlogged sequence popo;
alter sequence popo increment 2;
select nextval('popo');
select nextval('popo');
2) Then a hard crash:
pg_ctl stop -m immediate
pg_ctl start
3) Again, with a crash:
select nextval('popo');
#2 0x000055ce60f3208d in ExceptionalCondition
(conditionName=0x55ce610f0570 "!(((PageHeader) (page))->pd_special >=
(__builtin_offsetof (PageHeaderData, pd_linp)))",
errorType=0x55ce610f0507 "FailedAssertion",
fileName=0x55ce610f04e0 "../../../src/include/storage/bufpage.h",
lineNumber=317) at assert.c:54
#3 0x000055ce60b43200 in PageValidateSpecialPointer
(page=0x7ff7692b3d80 "") at
../../../src/include/storage/bufpage.h:317
#4 0x000055ce60b459d4 in read_seq_tuple (rel=0x7ff768ad27e0,
buf=0x7ffc5707f0bc, seqdatatuple=0x7ffc5707f0a0) at
sequence.c:1213
#5 0x000055ce60b447ee in nextval_internal (relid=16385,
check_permissions=true) at sequence.c:678
#6 0x000055ce60b44533 in nextval_oid (fcinfo=0x55ce62537570) at sequence.c:607
--
Michael

#3Peter Eisentraut
Peter Eisentraut
peter.eisentraut@2ndquadrant.com
In reply to: Michael Paquier (#2)
Re: unlogged sequences

On 2019-06-21 07:31, Michael Paquier wrote:

1) Some SQL queries:
create unlogged sequence popo;
alter sequence popo increment 2;

The problem is that the above command does a relation rewrite but the
code doesn't know to copy the init fork of the sequence. That will need
to be addressed.

select nextval('popo');
select nextval('popo');
2) Then a hard crash:
pg_ctl stop -m immediate
pg_ctl start
3) Again, with a crash:
select nextval('popo');
#2 0x000055ce60f3208d in ExceptionalCondition

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#4Andres Freund
Andres Freund
andres@anarazel.de
In reply to: Peter Eisentraut (#1)
Re: unlogged sequences

Hi,

On 2019-06-20 09:30:34 +0200, Peter Eisentraut wrote:

I'm looking for feedback from those who have worked on tableam and
storage manager to see what the right interfaces are or whether some new
interfaces might perhaps be appropriate.

Hm, it's not clear to me that tableam design matters much around
sequences? To me it's a historical accident that sequences kinda look
like tables, not more.

+	/*
+	 * create init fork for unlogged sequences
+	 *
+	 * The logic follows that of RelationCreateStorage() and
+	 * RelationCopyStorage().
+	 */
+	if (seq->sequence->relpersistence == RELPERSISTENCE_UNLOGGED)
+	{
+		SMgrRelation srel;
+		PGAlignedBlock buf;
+		Page		page = (Page) buf.data;
+
+		FlushRelationBuffers(rel);

That's pretty darn expensive, especially when we just need to flush out
a *single* page, as it needs to scan all of shared buffers. Seems better
to just to initialize the page from scratch? Any reason not to do that?

+		srel = smgropen(rel->rd_node, InvalidBackendId);
+		smgrcreate(srel, INIT_FORKNUM, false);
+		log_smgrcreate(&rel->rd_node, INIT_FORKNUM);
+
+		Assert(smgrnblocks(srel, MAIN_FORKNUM) == 1);
+
+		smgrread(srel, MAIN_FORKNUM, 0, buf.data);
+
+		if (!PageIsVerified(page, 0))
+			ereport(ERROR,
+					(errcode(ERRCODE_DATA_CORRUPTED),
+					 errmsg("invalid page in block %u of relation %s",
+							0,
+							relpathbackend(srel->smgr_rnode.node,
+										   srel->smgr_rnode.backend,
+										   MAIN_FORKNUM))));
+
+		log_newpage(&srel->smgr_rnode.node, INIT_FORKNUM, 0, page, false);
+		PageSetChecksumInplace(page, 0);
+		smgrextend(srel, INIT_FORKNUM, 0, buf.data, false);
+		smgrclose(srel);
+	}

I.e. I think it'd be better if we just added a fork argument to
fill_seq_with_data(), and then do something like

smgrcreate(srel, INIT_FORKNUM, false);
log_smgrcreate(&rel->rd_node, INIT_FORKNUM);
fill_seq_with_data(rel, tuple, INIT_FORKNUM);

and add a FlushBuffer() to the end of fill_seq_with_data() if writing
INIT_FORKNUM. The if (RelationNeedsWAL(rel)) would need an || forkNum ==
INIT_FORKNUM.

Alternatively you could just copy the contents from the buffer currently
filled in fill_seq_with_data() to the main fork, and do a memcpy. But
that seems unnecessarily complicated, because you'd again need to do WAL
logging etc.

Greetings,

Andres Freund

#5Thomas Munro
Thomas Munro
thomas.munro@gmail.com
In reply to: Andres Freund (#4)
Re: unlogged sequences

On Wed, Jun 26, 2019 at 6:38 AM Andres Freund <andres@anarazel.de> wrote:

On 2019-06-20 09:30:34 +0200, Peter Eisentraut wrote:

I'm looking for feedback from those who have worked on tableam and
storage manager to see what the right interfaces are or whether some new
interfaces might perhaps be appropriate.

[lots of feedback that requires making decisions]

Seems to be actively under development but no new patch yet. Moved to next CF.

--
Thomas Munro
https://enterprisedb.com

#6Alvaro Herrera from 2ndQuadrant
Alvaro Herrera from 2ndQuadrant
alvherre@alvh.no-ip.org
In reply to: Thomas Munro (#5)
Re: unlogged sequences

On 2019-Aug-01, Thomas Munro wrote:

On Wed, Jun 26, 2019 at 6:38 AM Andres Freund <andres@anarazel.de> wrote:

On 2019-06-20 09:30:34 +0200, Peter Eisentraut wrote:

I'm looking for feedback from those who have worked on tableam and
storage manager to see what the right interfaces are or whether some new
interfaces might perhaps be appropriate.

[lots of feedback that requires making decisions]

Seems to be actively under development but no new patch yet. Moved to next CF.

Marked Waiting on Author.

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#7Peter Eisentraut
Peter Eisentraut
peter.eisentraut@enterprisedb.com
In reply to: Andres Freund (#4)
1 attachment(s)
Re: unlogged sequences

On 25.06.19 20:37, Andres Freund wrote:

I.e. I think it'd be better if we just added a fork argument to
fill_seq_with_data(), and then do something like

smgrcreate(srel, INIT_FORKNUM, false);
log_smgrcreate(&rel->rd_node, INIT_FORKNUM);
fill_seq_with_data(rel, tuple, INIT_FORKNUM);

and add a FlushBuffer() to the end of fill_seq_with_data() if writing
INIT_FORKNUM. The if (RelationNeedsWAL(rel)) would need an || forkNum ==
INIT_FORKNUM.

Now that logical replication of sequences is nearing completion, I
figured it would be suitable to dust off this old discussion on unlogged
sequences, mainly so that sequences attached to unlogged tables can be
excluded from replication.

Attached is a new patch that incorporates the above suggestions, with
some slight refactoring. The only thing I didn't/couldn't do was to
call FlushBuffers(), since that is not an exported function. So this
still calls FlushRelationBuffers(), which was previously not liked.
Ideas welcome.

I have also re-tested the crash reported by Michael Paquier in the old
discussion and added test cases that catch them.

The rest of the patch is just documentation, DDL support, client
support, etc.

What is not done yet is support for ALTER SEQUENCE ... SET
LOGGED/UNLOGGED. This is a bit of a problem because:

1. The new behavior is that a serial/identity sequence of a new unlogged
table is now also unlogged.
2. There is also a new restriction that changing a table to logged is
not allowed if it is linked to an unlogged sequence. (This is IMO
similar to the existing restriction on linking mixed logged/unlogged
tables via foreign keys.)
3. Thus, currently, you can't create an unlogged table with a
serial/identity column and then change it to logged. This is reflected
in some of the test changes I had to make in alter_table.sql to work
around this. These should eventually go away.

Interestingly, there is grammar support for ALTER SEQUENCE ... SET
LOGGED/UNLOGGED because there is this:

| ALTER SEQUENCE qualified_name alter_table_cmds
{
AlterTableStmt *n = makeNode(AlterTableStmt);
n->relation = $3;
n->cmds = $4;
n->objtype = OBJECT_SEQUENCE;
n->missing_ok = false;
$$ = (Node *)n;
}

But it is rejected later in tablecmds.c. In fact, it appears that this
piece of grammar is currently useless because there are no
alter_table_cmds that actually work for sequences. (This used to be
different because things like OWNER TO also went through here.)

I tried to make tablecmds.c handle sequences as well, but that became
messy. So I'm thinking about making ALTER SEQUENCE ... SET
LOGGED/UNLOGGED an entirely separate code path and rip out the above
grammar, but that needs some further pondering.

But all that is a bit of a separate effort, so in the meantime some
review of the changes in and around fill_seq_with_data() would be useful.

Attachments:

v2-0001-Unlogged-sequences.patchtext/plain; charset=UTF-8; name=v2-0001-Unlogged-sequences.patch
#8Peter Eisentraut
Peter Eisentraut
peter.eisentraut@enterprisedb.com
In reply to: Peter Eisentraut (#7)
1 attachment(s)
Re: unlogged sequences

rebased patch, no functional changes

Show quoted text

On 11.02.22 10:12, Peter Eisentraut wrote:

On 25.06.19 20:37, Andres Freund wrote:

I.e. I think it'd be better if we just added a fork argument to
fill_seq_with_data(), and then do something like

smgrcreate(srel, INIT_FORKNUM, false);
log_smgrcreate(&rel->rd_node, INIT_FORKNUM);
fill_seq_with_data(rel, tuple, INIT_FORKNUM);

and add a FlushBuffer() to the end of fill_seq_with_data() if writing
INIT_FORKNUM. The if (RelationNeedsWAL(rel)) would need an || forkNum ==
INIT_FORKNUM.

Now that logical replication of sequences is nearing completion, I
figured it would be suitable to dust off this old discussion on unlogged
sequences, mainly so that sequences attached to unlogged tables can be
excluded from replication.

Attached is a new patch that incorporates the above suggestions, with
some slight refactoring.  The only thing I didn't/couldn't do was to
call FlushBuffers(), since that is not an exported function.  So this
still calls FlushRelationBuffers(), which was previously not liked.
Ideas welcome.

I have also re-tested the crash reported by Michael Paquier in the old
discussion and added test cases that catch them.

The rest of the patch is just documentation, DDL support, client
support, etc.

What is not done yet is support for ALTER SEQUENCE ... SET
LOGGED/UNLOGGED.  This is a bit of a problem because:

1. The new behavior is that a serial/identity sequence of a new unlogged
table is now also unlogged.
2. There is also a new restriction that changing a table to logged is
not allowed if it is linked to an unlogged sequence.  (This is IMO
similar to the existing restriction on linking mixed logged/unlogged
tables via foreign keys.)
3. Thus, currently, you can't create an unlogged table with a
serial/identity column and then change it to logged.  This is reflected
in some of the test changes I had to make in alter_table.sql to work
around this.  These should eventually go away.

Interestingly, there is grammar support for ALTER SEQUENCE ... SET
LOGGED/UNLOGGED because there is this:

        |   ALTER SEQUENCE qualified_name alter_table_cmds
                {
                    AlterTableStmt *n = makeNode(AlterTableStmt);
                    n->relation = $3;
                    n->cmds = $4;
                    n->objtype = OBJECT_SEQUENCE;
                    n->missing_ok = false;
                    $$ = (Node *)n;
                }

But it is rejected later in tablecmds.c.  In fact, it appears that this
piece of grammar is currently useless because there are no
alter_table_cmds that actually work for sequences.  (This used to be
different because things like OWNER TO also went through here.)

I tried to make tablecmds.c handle sequences as well, but that became
messy.  So I'm thinking about making ALTER SEQUENCE ... SET
LOGGED/UNLOGGED an entirely separate code path and rip out the above
grammar, but that needs some further pondering.

But all that is a bit of a separate effort, so in the meantime some
review of the changes in and around fill_seq_with_data() would be useful.

Attachments:

v3-0001-Unlogged-sequences.patchtext/plain; charset=UTF-8; name=v3-0001-Unlogged-sequences.patch
#9Peter Eisentraut
Peter Eisentraut
peter.eisentraut@enterprisedb.com
In reply to: Peter Eisentraut (#8)
1 attachment(s)
Re: unlogged sequences

Here is an updated patch that now also includes SET LOGGED/UNLOGGED
support. So this version addresses all known issues and open problems.

Show quoted text

On 28.02.22 10:56, Peter Eisentraut wrote:

rebased patch, no functional changes

On 11.02.22 10:12, Peter Eisentraut wrote:

On 25.06.19 20:37, Andres Freund wrote:

I.e. I think it'd be better if we just added a fork argument to
fill_seq_with_data(), and then do something like

smgrcreate(srel, INIT_FORKNUM, false);
log_smgrcreate(&rel->rd_node, INIT_FORKNUM);
fill_seq_with_data(rel, tuple, INIT_FORKNUM);

and add a FlushBuffer() to the end of fill_seq_with_data() if writing
INIT_FORKNUM. The if (RelationNeedsWAL(rel)) would need an || forkNum ==
INIT_FORKNUM.

Now that logical replication of sequences is nearing completion, I
figured it would be suitable to dust off this old discussion on
unlogged sequences, mainly so that sequences attached to unlogged
tables can be excluded from replication.

Attached is a new patch that incorporates the above suggestions, with
some slight refactoring.  The only thing I didn't/couldn't do was to
call FlushBuffers(), since that is not an exported function.  So this
still calls FlushRelationBuffers(), which was previously not liked.
Ideas welcome.

I have also re-tested the crash reported by Michael Paquier in the old
discussion and added test cases that catch them.

The rest of the patch is just documentation, DDL support, client
support, etc.

What is not done yet is support for ALTER SEQUENCE ... SET
LOGGED/UNLOGGED.  This is a bit of a problem because:

1. The new behavior is that a serial/identity sequence of a new
unlogged table is now also unlogged.
2. There is also a new restriction that changing a table to logged is
not allowed if it is linked to an unlogged sequence.  (This is IMO
similar to the existing restriction on linking mixed logged/unlogged
tables via foreign keys.)
3. Thus, currently, you can't create an unlogged table with a
serial/identity column and then change it to logged.  This is
reflected in some of the test changes I had to make in alter_table.sql
to work around this.  These should eventually go away.

Interestingly, there is grammar support for ALTER SEQUENCE ... SET
LOGGED/UNLOGGED because there is this:

         |   ALTER SEQUENCE qualified_name alter_table_cmds
                 {
                     AlterTableStmt *n = makeNode(AlterTableStmt);
                     n->relation = $3;
                     n->cmds = $4;
                     n->objtype = OBJECT_SEQUENCE;
                     n->missing_ok = false;
                     $$ = (Node *)n;
                 }

But it is rejected later in tablecmds.c.  In fact, it appears that
this piece of grammar is currently useless because there are no
alter_table_cmds that actually work for sequences.  (This used to be
different because things like OWNER TO also went through here.)

I tried to make tablecmds.c handle sequences as well, but that became
messy.  So I'm thinking about making ALTER SEQUENCE ... SET
LOGGED/UNLOGGED an entirely separate code path and rip out the above
grammar, but that needs some further pondering.

But all that is a bit of a separate effort, so in the meantime some
review of the changes in and around fill_seq_with_data() would be useful.

Attachments:

v4-0001-Unlogged-sequences.patchtext/plain; charset=UTF-8; name=v4-0001-Unlogged-sequences.patch
#10Peter Eisentraut
Peter Eisentraut
peter.eisentraut@enterprisedb.com
In reply to: Peter Eisentraut (#9)
1 attachment(s)
Re: unlogged sequences

Patch rebased over some conflicts, and some tests simplified.

Show quoted text

On 24.03.22 14:10, Peter Eisentraut wrote:

Here is an updated patch that now also includes SET LOGGED/UNLOGGED
support.  So this version addresses all known issues and open problems.

On 28.02.22 10:56, Peter Eisentraut wrote:

rebased patch, no functional changes

On 11.02.22 10:12, Peter Eisentraut wrote:

On 25.06.19 20:37, Andres Freund wrote:

I.e. I think it'd be better if we just added a fork argument to
fill_seq_with_data(), and then do something like

smgrcreate(srel, INIT_FORKNUM, false);
log_smgrcreate(&rel->rd_node, INIT_FORKNUM);
fill_seq_with_data(rel, tuple, INIT_FORKNUM);

and add a FlushBuffer() to the end of fill_seq_with_data() if writing
INIT_FORKNUM. The if (RelationNeedsWAL(rel)) would need an ||
forkNum ==
INIT_FORKNUM.

Now that logical replication of sequences is nearing completion, I
figured it would be suitable to dust off this old discussion on
unlogged sequences, mainly so that sequences attached to unlogged
tables can be excluded from replication.

Attached is a new patch that incorporates the above suggestions, with
some slight refactoring.  The only thing I didn't/couldn't do was to
call FlushBuffers(), since that is not an exported function.  So this
still calls FlushRelationBuffers(), which was previously not liked.
Ideas welcome.

I have also re-tested the crash reported by Michael Paquier in the
old discussion and added test cases that catch them.

The rest of the patch is just documentation, DDL support, client
support, etc.

What is not done yet is support for ALTER SEQUENCE ... SET
LOGGED/UNLOGGED.  This is a bit of a problem because:

1. The new behavior is that a serial/identity sequence of a new
unlogged table is now also unlogged.
2. There is also a new restriction that changing a table to logged is
not allowed if it is linked to an unlogged sequence.  (This is IMO
similar to the existing restriction on linking mixed logged/unlogged
tables via foreign keys.)
3. Thus, currently, you can't create an unlogged table with a
serial/identity column and then change it to logged.  This is
reflected in some of the test changes I had to make in
alter_table.sql to work around this.  These should eventually go away.

Interestingly, there is grammar support for ALTER SEQUENCE ... SET
LOGGED/UNLOGGED because there is this:

         |   ALTER SEQUENCE qualified_name alter_table_cmds
                 {
                     AlterTableStmt *n = makeNode(AlterTableStmt);
                     n->relation = $3;
                     n->cmds = $4;
                     n->objtype = OBJECT_SEQUENCE;
                     n->missing_ok = false;
                     $$ = (Node *)n;
                 }

But it is rejected later in tablecmds.c.  In fact, it appears that
this piece of grammar is currently useless because there are no
alter_table_cmds that actually work for sequences.  (This used to be
different because things like OWNER TO also went through here.)

I tried to make tablecmds.c handle sequences as well, but that became
messy.  So I'm thinking about making ALTER SEQUENCE ... SET
LOGGED/UNLOGGED an entirely separate code path and rip out the above
grammar, but that needs some further pondering.

But all that is a bit of a separate effort, so in the meantime some
review of the changes in and around fill_seq_with_data() would be
useful.

Attachments:

v5-0001-Unlogged-sequences.patchtext/plain; charset=UTF-8; name=v5-0001-Unlogged-sequences.patch
#11Tomas Vondra
Tomas Vondra
tomas.vondra@enterprisedb.com
In reply to: Peter Eisentraut (#10)
1 attachment(s)
Re: unlogged sequences

Hi,

Here's a slightly improved patch, adding a couple checks and tests for
owned sequences to ensure both objects have the same persistence. In
particular:

* When linking a sequence to a table (ALTER SEQUENCE ... OWNED BY),
there's an ereport(ERROR) if the relpersistence values do not match.

* Disallow changing persistence for owned sequences directly.

But I wonder about two things:

1) Do we need to do something about pg_upgrade? I mean, we did not have
unlogged sequences until now, so existing databases may have unlogged
tables with logged sequences. If people run pg_upgrade, what should be
the end result? Should it convert the sequences to unlogged ones, should
it fail and force the user to fix this manually, or what?

2) Does it actually make sense to force owned sequences to have the same
relpersistence as the table? I can imagine use cases where it's OK to
discard and recalculate the data, but I'd still want to ensure unique
IDs. Like some data loads, for example.

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachments:

0001-Unlogged-sequences-20220331.patchtext/x-patch; charset=UTF-8; name=0001-Unlogged-sequences-20220331.patch
#12Andres Freund
Andres Freund
andres@anarazel.de
In reply to: Tomas Vondra (#11)
Re: unlogged sequences

Hi,

On 2022-03-31 16:14:25 +0200, Tomas Vondra wrote:

1) Do we need to do something about pg_upgrade? I mean, we did not have
unlogged sequences until now, so existing databases may have unlogged
tables with logged sequences. If people run pg_upgrade, what should be
the end result? Should it convert the sequences to unlogged ones, should
it fail and force the user to fix this manually, or what?

2) Does it actually make sense to force owned sequences to have the same
relpersistence as the table? I can imagine use cases where it's OK to
discard and recalculate the data, but I'd still want to ensure unique
IDs. Like some data loads, for example.

I agree it makes sense to have logged sequences with unlogged tables. We
should call out the behavioural change somewhere prominent in the release
notes.

I don't think we should make pg_upgrade change the loggedness of sequences.

Greetings,

Andres Freund

#13David G. Johnston
David G. Johnston
david.g.johnston@gmail.com
In reply to: Andres Freund (#12)
Re: unlogged sequences

On Thu, Mar 31, 2022 at 9:28 AM Andres Freund <andres@anarazel.de> wrote:

I agree it makes sense to have logged sequences with unlogged tables. We
should call out the behavioural change somewhere prominent in the release
notes.

We can/do already support that unlikely use case by allowing one to remove
the OWNERSHIP dependency between the table and the sequence.

I'm fine with owned sequences tracking the persistence attribute of the
owning table.

I don't think we should make pg_upgrade change the loggedness of sequences.

We are willing to change the default behavior here so it is going to affect
dump/restore anyway, might as well fully commit and do the same for
pg_upgrade. The vast majority of users will benefit from the new default
behavior.

I don't actually get, though, how that would play with pg_dump since it
always emits an unowned, and thus restored as logged, sequence first and
then alters the sequence to be owned by the table. Thus restoring an old
SQL dump into the v15 is going to fail if we prohibit
unlogged-table/logged-sequence; unless we actively change the logged-ness
of the sequence when subordinating it to a table.

Thus, the choices seem to be:

1) implement forced persistence agreement for owned sequences, changing the
sequence to match the table when the alter table happens, and during
pg_upgrade.
2) do not force persistence agreement for owned sequences

If choosing option 2, are you on board with changing the behavior of CREATE
UNLOGGED TABLE with respect to any auto-generated sequences?

David J.

#14Tomas Vondra
Tomas Vondra
tomas.vondra@enterprisedb.com
In reply to: David G. Johnston (#13)
Re: unlogged sequences

On 3/31/22 19:35, David G. Johnston wrote:

On Thu, Mar 31, 2022 at 9:28 AM Andres Freund <andres@anarazel.de
<mailto:andres@anarazel.de>> wrote:

I agree it makes sense to have logged sequences with unlogged tables. We
should call out the behavioural change somewhere prominent in the
release
notes.

I'm not sure I follow. If we allow logged sequences with unlogged
tables, there's be no behavioral change, no?

We can/do already support that unlikely use case by allowing one to
remove the OWNERSHIP dependency between the table and the sequence.

I'm fine with owned sequences tracking the persistence attribute of the
owning table.

So essentially an independent sequence, used in a default value.

I don't think we should make pg_upgrade change the loggedness of
sequences.

We are willing to change the default behavior here so it is going to
affect dump/restore anyway, might as well fully commit and do the same
for pg_upgrade.  The vast majority of users will benefit from the new
default behavior.

Whatever we do, I think we should keep the pg_dump and pg_upgrade
behavior as consistent as possible.

I don't actually get, though, how that would play with pg_dump since it
always emits an unowned, and thus restored as logged, sequence first and
then alters the sequence to be owned by the table.  Thus restoring an
old SQL dump into the v15 is going to fail if we prohibit
unlogged-table/logged-sequence; unless we actively change the
logged-ness of the sequence when subordinating it to a table.

Yeah. I guess we'd need to either automatically switch the sequence to
the right persistence when linking it to the table, or maybe we could
modify pg_dump to emit UNLOGGED when the table is unlogged (but that
would work only when using the new pg_dump).

Thus, the choices seem to be:

1) implement forced persistence agreement for owned sequences, changing
the sequence to match the table when the alter table happens, and during
pg_upgrade.
2) do not force persistence agreement for owned sequences

If choosing option 2, are you on board with changing the behavior of
CREATE UNLOGGED TABLE with respect to any auto-generated sequences?

What behavior change, exactly? To create the sequences as UNLOGGED, but
we'd not update the persistence after that?

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#15David G. Johnston
David G. Johnston
david.g.johnston@gmail.com
In reply to: Tomas Vondra (#14)
Re: unlogged sequences

On Thu, Mar 31, 2022 at 12:36 PM Tomas Vondra <tomas.vondra@enterprisedb.com>
wrote:

On 3/31/22 19:35, David G. Johnston wrote:

On Thu, Mar 31, 2022 at 9:28 AM Andres Freund <andres@anarazel.de
<mailto:andres@anarazel.de>> wrote:

I agree it makes sense to have logged sequences with unlogged

tables. We

should call out the behavioural change somewhere prominent in the
release
notes.

I'm not sure I follow. If we allow logged sequences with unlogged
tables, there's be no behavioral change, no?

As noted below, the behavior change is in how CREATE TABLE behaves. Not
whether or not mixed persistence is allowed.

or maybe we could
modify pg_dump to emit UNLOGGED when the table is unlogged (but that
would work only when using the new pg_dump).

Yes, the horse has already left the barn. I don't really have an opinion
on whether to leave the barn door open or closed.

If choosing option 2, are you on board with changing the behavior of
CREATE UNLOGGED TABLE with respect to any auto-generated sequences?

What behavior change, exactly? To create the sequences as UNLOGGED, but
we'd not update the persistence after that?

Today, a newly created unlogged table with an automatically owned sequence
(serial, generated identity) has a logged sequence. This patch changes
that so the new automatically owned sequence is unlogged. This seems to be
generally agreed upon as being desirable - but given the fact that unlogged
tables will not have unlogged sequences it seems worth confirming that this
minor inconsistency is acceptable.

The first newly added behavior is just allowing sequences to be unlogged.
That is the only mandatory feature introduced by this patch and doesn't
seem contentious.

The second newly added behavior being proposed is to have the persistence
of the sequence be forcibly matched to the table. Whether this is
desirable is the main point under discussion.

David J.

#16Tomas Vondra
Tomas Vondra
tomas.vondra@enterprisedb.com
In reply to: David G. Johnston (#15)
Re: unlogged sequences

On 3/31/22 21:55, David G. Johnston wrote:

On Thu, Mar 31, 2022 at 12:36 PM Tomas Vondra
<tomas.vondra@enterprisedb.com <mailto:tomas.vondra@enterprisedb.com>>
wrote:

On 3/31/22 19:35, David G. Johnston wrote:

On Thu, Mar 31, 2022 at 9:28 AM Andres Freund <andres@anarazel.de

<mailto:andres@anarazel.de>

<mailto:andres@anarazel.de <mailto:andres@anarazel.de>>> wrote:

     I agree it makes sense to have logged sequences with unlogged

tables. We

     should call out the behavioural change somewhere prominent in the
     release
     notes.

I'm not sure I follow. If we allow logged sequences with unlogged
tables, there's be no behavioral change, no?

As noted below, the behavior change is in how CREATE TABLE behaves.  Not
whether or not mixed persistence is allowed.
 

or maybe we could
modify pg_dump to emit UNLOGGED when the table is unlogged (but that
would work only when using the new pg_dump).

Yes, the horse has already left the barn.  I don't really have an
opinion on whether to leave the barn door open or closed.
 

If choosing option 2, are you on board with changing the behavior of
CREATE UNLOGGED TABLE with respect to any auto-generated sequences?

What behavior change, exactly? To create the sequences as UNLOGGED, but
we'd not update the persistence after that?

Today, a newly created unlogged table with an automatically owned
sequence (serial, generated identity) has a logged sequence.  This patch
changes that so the new automatically owned sequence is unlogged.  This
seems to be generally agreed upon as being desirable - but given the
fact that unlogged tables will not have unlogged sequences it seems
worth confirming that this minor inconsistency is acceptable.

The first newly added behavior is just allowing sequences to be
unlogged.  That is the only mandatory feature introduced by this patch
and doesn't seem contentious.

The second newly added behavior being proposed is to have the
persistence of the sequence be forcibly matched to the table.  Whether
this is desirable is the main point under discussion.

Right. The latest version of the patch also prohibits changing
persistence of owned sequences directly. But that's probably considered
to be part of the second behavior.

I agree the first part is not contentious, so shall we extract this part
of the patch and get that committed for PG15? Or is that too late to
make such changes to the patch?

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#17David G. Johnston
David G. Johnston
david.g.johnston@gmail.com
In reply to: Tomas Vondra (#16)
Re: unlogged sequences

On Thu, Mar 31, 2022 at 1:05 PM Tomas Vondra <tomas.vondra@enterprisedb.com>
wrote:

I agree the first part is not contentious, so shall we extract this part
of the patch and get that committed for PG15? Or is that too late to
make such changes to the patch?

The minimum viable feature for me, given the written goal for the patch and
the premise of not changing any existing behavior, is:

DB State: Allow a sequence to be unlogged.
Command: ALTER SEQUENCE SET UNLOGGED
Limitation: The above command fails if the sequence is unowned, or it is
owned and the table owning it is not UNLOGGED

(optional safety) Limitation: Changing a table from unlogged to logged
while owning unlogged sequences would be prohibited
(optional safety) Compensatory Behavior: Add the ALTER SEQUENCE SET LOGGED
command for owned sequences to get them logged again in preparation for
changing the table to being logged.

In particular, I don't see CREATE UNLOGGED SEQUENCE to be all that valuable
since CREATE UNLOGGED TABLE wouldn't leverage it.

The above, possibly only half-baked, patch scope does not change any
existing behavior but allows for the stated goal: an unlogged table having
an unlogged sequence. The DBA just has to execute the ALTER SEQUENCE
command on all relevant sequences. They can't even really get it wrong
since only relevant sequences can be altered. Not having CREATE TABLE make
an unlogged sequence by default is annoying though and likely should be
changed - though it can leverage ALTER SEQUENCE too.

Anything else they wish to do can be done via a combination of ownership
manipulation and, worse case, dropping and recreating the sequence. Though
allowed for unowned unlogged sequences, while outside the explicit goal of
the patch, would be an easy add (just don't error on the ALTER SEQUENCE SET
UNLOGGED when the sequence is unowned).

David J.

#18David G. Johnston
David G. Johnston
david.g.johnston@gmail.com
In reply to: David G. Johnston (#17)
Re: unlogged sequences

On Thu, Mar 31, 2022 at 1:40 PM David G. Johnston <
david.g.johnston@gmail.com> wrote:

The DBA just has to execute the ALTER SEQUENCE command on all relevant
sequences.

Additional, if we do not implement the forced matching of persistence mode,
we should consider adding an "ALTER TABLE SET ALL SEQUENCES TO UNLOGGED"
command for convenience. Or maybe make it a function - which would allow
for SQL execution against a catalog lookup.

David J.

#19Tomas Vondra
Tomas Vondra
tomas.vondra@enterprisedb.com
In reply to: David G. Johnston (#17)
Re: unlogged sequences

On 3/31/22 22:40, David G. Johnston wrote:

On Thu, Mar 31, 2022 at 1:05 PM Tomas Vondra
<tomas.vondra@enterprisedb.com <mailto:tomas.vondra@enterprisedb.com>>
wrote:

I agree the first part is not contentious, so shall we extract this part
of the patch and get that committed for PG15? Or is that too late to
make such changes to the patch?

The minimum viable feature for me, given the written goal for the patch
and the premise of not changing any existing behavior, is:

DB State: Allow a sequence to be unlogged.
Command: ALTER SEQUENCE SET UNLOGGED
Limitation: The above command fails if the sequence is unowned, or it is
owned and the table owning it is not UNLOGGED

(optional safety) Limitation: Changing a table from unlogged to logged
while owning unlogged sequences would be prohibited
(optional safety) Compensatory Behavior: Add the ALTER SEQUENCE SET
LOGGED command for owned sequences to get them logged again in
preparation for changing the table to being logged.

In particular, I don't see CREATE UNLOGGED SEQUENCE to be all that
valuable since CREATE UNLOGGED TABLE wouldn't leverage it.

Hmm, so what about doing a little bit different thing:

1) owned sequences inherit persistence of the table by default

2) allow ALTER SEQUENCE to change persistence for all sequences (no
restriction for owned sequences)

3) ALTER TABLE ... SET [UN]LOGGED changes persistence for sequences
matching the initial table persistence

IMHO (1) would address vast majority of cases, which simply want the
same persistence for the whole table and all auxiliary objects. (2)
would address use cases requiring different persistence for sequences
(including owned ones).

I'm not sure about (3) though, maybe that's overkill.

Of course, we'll always have problems with older releases, as it's not
clear whether a logged sequence on unlogged table would be desirable or
is used just because unlogged sequences were not supported. (We do have
the same issue for logged tables too, but I doubt anyone really needs
defining unlogged sequences on logged tables.)

So no matter what we do, we'll make the wrong decision in some cases.

The above, possibly only half-baked, patch scope does not change any
existing behavior but allows for the stated goal: an unlogged table
having an unlogged sequence.  The DBA just has to execute the ALTER
SEQUENCE command on all relevant sequences.  They can't even really get
it wrong since only relevant sequences can be altered.  Not having
CREATE TABLE make an unlogged sequence by default is annoying though and
likely should be changed - though it can leverage ALTER SEQUENCE too.

Anything else they wish to do can be done via a combination of ownership
manipulation and, worse case, dropping and recreating the sequence. 
Though allowed for unowned unlogged sequences, while outside the
explicit goal of the patch, would be an easy add (just don't error on
the ALTER SEQUENCE SET UNLOGGED when the sequence is unowned).

Yeah. I think my proposal is pretty close to that, except that the
sequence would first inherit persistence from the table, and there'd be
an ALTER SEQUENCE for owned sequences where it differs. (And non-owned
sequences would be created as logged/unlogged explicitly.)

I don't think we need to worry about old pg_dump versions on new PG
versions, because that's not really supported.

And for old PG versions the behavior would differ a bit depending on the
pg_dump version used. With old pg_dump version, the ALTER SEQUENCE would
not be emitted, so all owned sequences would inherit table persistence.
With new pg_dump we'd get the expected persistence (which might differ).

That's need to be documented, of course.

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#20David G. Johnston
David G. Johnston
david.g.johnston@gmail.com
In reply to: Tomas Vondra (#19)
Re: unlogged sequences

On Thu, Mar 31, 2022 at 3:43 PM Tomas Vondra <tomas.vondra@enterprisedb.com>
wrote:

On 3/31/22 22:40, David G. Johnston wrote:

On Thu, Mar 31, 2022 at 1:05 PM Tomas Vondra
<tomas.vondra@enterprisedb.com <mailto:tomas.vondra@enterprisedb.com>>
wrote:

I agree the first part is not contentious, so shall we extract this

part

of the patch and get that committed for PG15? Or is that too late to
make such changes to the patch?

The minimum viable feature for me, given the written goal for the patch
and the premise of not changing any existing behavior, is:

DB State: Allow a sequence to be unlogged.
Command: ALTER SEQUENCE SET UNLOGGED
Limitation: The above command fails if the sequence is unowned, or it is
owned and the table owning it is not UNLOGGED

(optional safety) Limitation: Changing a table from unlogged to logged
while owning unlogged sequences would be prohibited
(optional safety) Compensatory Behavior: Add the ALTER SEQUENCE SET
LOGGED command for owned sequences to get them logged again in
preparation for changing the table to being logged.

In particular, I don't see CREATE UNLOGGED SEQUENCE to be all that
valuable since CREATE UNLOGGED TABLE wouldn't leverage it.

Hmm, so what about doing a little bit different thing:

1) owned sequences inherit persistence of the table by default

This is the contentious point. If we are going to do it by default - thus
changing existing behavior - I would rather just do it always. This is
also underspecified, there are multiple ways for a sequence to become owned.

Personally I'm for the choice to effectively remove the sequence's own
concept of logged/unlogged when it is owned by a table and to always just
use the table's value.

2) allow ALTER SEQUENCE to change persistence for all sequences (no
restriction for owned sequences)

A generalization that is largely incontrovertible.

3) ALTER TABLE ... SET [UN]LOGGED changes persistence for sequences
matching the initial table persistence

I'm leaning against this, leaving users to set each owned sequence to
logged/unlogged as they see fit if they want something other than
all-or-nothing. I would stick to only providing an easy method to get the
assumed desired all-same behavior.
ALTER TABLE SET [UN]LOGGED, SET ALL SEQUENCES TO [UN]LOGGED;

IMHO (1) would address vast majority of cases, which simply want the
same persistence for the whole table and all auxiliary objects. (2)
would address use cases requiring different persistence for sequences
(including owned ones).

I'm not sure about (3) though, maybe that's overkill.

Of course, we'll always have problems with older releases, as it's not
clear whether a logged sequence on unlogged table would be desirable or
is used just because unlogged sequences were not supported. (We do have
the same issue for logged tables too, but I doubt anyone really needs
defining unlogged sequences on logged tables.)

So no matter what we do, we'll make the wrong decision in some cases.

Again, I don't have too much concern here because you lose very little by
having an unowned sequence. Which is why I'm fine with owned sequences
becoming even moreso implementation details that adhere to the persistence
mode of the owning relation. But if the goal here is to defer such a
decision then the tradeoff is the DBA is given control and they get to
enforce consistency even if they are not benefitting from the flexibility.

Not having
CREATE TABLE make an unlogged sequence by default is annoying though and
likely should be changed - though it can leverage ALTER SEQUENCE too.

Yeah. I think my proposal is pretty close to that, except that the
sequence would first inherit persistence from the table, and there'd be
an ALTER SEQUENCE for owned sequences where it differs. (And non-owned
sequences would be created as logged/unlogged explicitly.)

I don't have any real problem with 1 or 2, they fill out the feature so it
is generally designed as opposed to solving a very specific problem.

For 1:
The "ADD COLUMN" (whether in CREATE TABLE or ALTER TABLE) pathway will
produce a new sequence whose persistence matches that of the target table.
While a behavior change it is one aligned with the goal of the patch for
typical ongoing behavior and should benefit way more people than it may
inconvenience. The "sequence not found" error that would be generated
seems minimally impactful.

The "ALTER SEQUENCE OWNED BY" pathway will not change the sequence's
persistence. This is what pg_dump will use for serial/bigserial
The "ALTER TABLE ALTER COLUMN" pathway will not change the sequence's
persistence. This is what pg_dump will use for generated always as identity

Provide a general purpose ALTER SEQUENCE SET [UN]LOGGED command

Provide an SQL Command to change all owned sequences of a table to be
UNLOGGED or LOGGED (I mentioned a function as well if someone thinks it
worth the time - in lieu of a function a psql script leveraging \gexec may
be nice to reference).

I don't think we need to worry about old pg_dump versions on new PG
versions, because that's not really supported.

Correct.

And for old PG versions the behavior would differ a bit depending on the
pg_dump version used. With old pg_dump version, the ALTER SEQUENCE would
not be emitted,

Correct, nothing else is emitted either...

That's need to be documented, of course.

It (the general promises for pg_dump) is documented.

https://www.postgresql.org/docs/current/app-pgdump.html : Notes

David J.

#21Robert Haas
Robert Haas
robertmhaas@gmail.com
In reply to: Tomas Vondra (#11)
Re: unlogged sequences

On Thu, Mar 31, 2022 at 10:14 AM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:

* When linking a sequence to a table (ALTER SEQUENCE ... OWNED BY),
there's an ereport(ERROR) if the relpersistence values do not match.

* Disallow changing persistence for owned sequences directly.

Wait, what? I don't understand why we would want to do either of these things.

It seems to me that it's totally fine to use a logged table with an
unlogged sequence, or an unlogged table with a logged sequence, or any
of the other combinations. You get what you ask for, so make sure to
ask for what you want. And that's it.

If you say something like CREATE [UNLOGGED] TABLE foo (a serial) it's
fine for serial to attribute the same persistence level to the
sequence as it does to the table. But when that's dumped, it's going
to be dumped as a CREATE TABLE command and a CREATE SEQUENCE command,
each of which has a separate persistence level. So you can recreate
whatever state you have.

--
Robert Haas
EDB: http://www.enterprisedb.com

#22Tomas Vondra
Tomas Vondra
tomas.vondra@enterprisedb.com
In reply to: Robert Haas (#21)
Re: unlogged sequences

On 4/1/22 02:25, Robert Haas wrote:

On Thu, Mar 31, 2022 at 10:14 AM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:

* When linking a sequence to a table (ALTER SEQUENCE ... OWNED BY),
there's an ereport(ERROR) if the relpersistence values do not match.

* Disallow changing persistence for owned sequences directly.

Wait, what? I don't understand why we would want to do either of these things.

It seems to me that it's totally fine to use a logged table with an
unlogged sequence, or an unlogged table with a logged sequence, or any
of the other combinations. You get what you ask for, so make sure to
ask for what you want. And that's it.

If you say something like CREATE [UNLOGGED] TABLE foo (a serial) it's
fine for serial to attribute the same persistence level to the
sequence as it does to the table. But when that's dumped, it's going
to be dumped as a CREATE TABLE command and a CREATE SEQUENCE command,
each of which has a separate persistence level. So you can recreate
whatever state you have.

Well, yeah. I did this because the patch was somewhat inconsistent when
handling owned sequences - it updated persistence for owned sequences
when persistence for the table changed, expecting to keep them in sync,
but then it also allowed operations that'd break it.

But that started a discussion about exactly this, and AFAICS there's
agreement we want to allow the table and owned sequences to have
different persistence values.

The discussion about the details is still ongoing, but I think it's
clear we'll ditch the restrictions you point out.

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#23David G. Johnston
David G. Johnston
david.g.johnston@gmail.com
In reply to: Robert Haas (#21)
Re: unlogged sequences

On Thu, Mar 31, 2022 at 5:25 PM Robert Haas <robertmhaas@gmail.com> wrote:

On Thu, Mar 31, 2022 at 10:14 AM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:

* When linking a sequence to a table (ALTER SEQUENCE ... OWNED BY),
there's an ereport(ERROR) if the relpersistence values do not match.

* Disallow changing persistence for owned sequences directly.

Wait, what? I don't understand why we would want to do either of these
things.

It seems to me that it's totally fine to use a logged table with an
unlogged sequence, or an unlogged table with a logged sequence, or any
of the other combinations. You get what you ask for, so make sure to
ask for what you want. And that's it.

It seems reasonable to extend the definition of "ownership of a sequence"
in this way. We always let you create unowned sequences with whatever
persistence you like if you need flexibility.

The "give the user power" argument is also valid. But since they already
have power through unowned sequences, having the owned sequences more
narrowly defined doesn't detract from usability, and in many ways enhances
it by further reinforcing the fact that the sequence internally used when
you say "GENERATED ALWAYS AS IDENTITY" is an implementation detail - one
that has the same persistence as the table.

David J.

#24Robert Haas
Robert Haas
robertmhaas@gmail.com
In reply to: Tomas Vondra (#22)
Re: unlogged sequences

On Thu, Mar 31, 2022 at 8:42 PM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:

Well, yeah. I did this because the patch was somewhat inconsistent when
handling owned sequences - it updated persistence for owned sequences
when persistence for the table changed, expecting to keep them in sync,
but then it also allowed operations that'd break it.

Oops.

But that started a discussion about exactly this, and AFAICS there's
agreement we want to allow the table and owned sequences to have
different persistence values.

The discussion about the details is still ongoing, but I think it's
clear we'll ditch the restrictions you point out.

Great.

--
Robert Haas
EDB: http://www.enterprisedb.com

#25Robert Haas
Robert Haas
robertmhaas@gmail.com
In reply to: David G. Johnston (#23)
Re: unlogged sequences

On Thu, Mar 31, 2022 at 8:44 PM David G. Johnston
<david.g.johnston@gmail.com> wrote:

It seems reasonable to extend the definition of "ownership of a sequence" in this way. We always let you create unowned sequences with whatever persistence you like if you need flexibility.

I'd say it doesn't seem to have any benefit, and therefore seems
unreasonable. Right now, OWNED BY is documented as a way of getting
the sequence to automatically be dropped if the table column goes
away. If it already did five things, maybe you could argue that this
thing is just like the other five and therefore changing it is the
right idea. But going from one thing to two that don't seem to have
much to do with each other seems much less reasonable, especially
since it doesn't seem to buy anything.

The "give the user power" argument is also valid. But since they already have power through unowned sequences, having the owned sequences more narrowly defined doesn't detract from usability, and in many ways enhances it by further reinforcing the fact that the sequence internally used when you say "GENERATED ALWAYS AS IDENTITY" is an implementation detail - one that has the same persistence as the table.

I think there's a question about what happens in the GENERATED ALWAYS
AS IDENTITY case. The DDL commands that create such sequences are of
the form ALTER TABLE something ALTER COLUMN somethingelse GENERATED
ALWAYS AS (sequence_parameters), and if we need to specify somewhere
in the whether the sequence should be logged or unlogged, how do we do
that? Consider:

rhaas=# create unlogged table xyz (a int generated always as identity);
CREATE TABLE
rhaas=# \d+ xyz
Unlogged table "public.xyz"
Column | Type | Collation | Nullable | Default
| Storage | Compression | Stats target | Description
--------+---------+-----------+----------+------------------------------+---------+-------------+--------------+-------------
a | integer | | not null | generated always as
identity | plain | | |
Access method: heap

rhaas=# \d+ xyz_a_seq
Sequence "public.xyz_a_seq"
Type | Start | Minimum | Maximum | Increment | Cycles? | Cache
---------+-------+---------+------------+-----------+---------+-------
integer | 1 | 1 | 2147483647 | 1 | no | 1
Sequence for identity column: public.xyz.a

In this new system, does the user still get a logged sequence? If they
get an unlogged sequence, how does dump-and-restore work? What if they
want to still have a logged sequence? But for sequences that are
simply owned, there is no problem here, and I think that inventing one
would not be a good plan.

--
Robert Haas
EDB: http://www.enterprisedb.com

#26David G. Johnston
David G. Johnston
david.g.johnston@gmail.com
In reply to: Robert Haas (#25)
Re: unlogged sequences

On Thu, Mar 31, 2022 at 6:03 PM Robert Haas <robertmhaas@gmail.com> wrote:

In this new system, does the user still get a logged sequence? If they
get an unlogged sequence, how does dump-and-restore work? What if they
want to still have a logged sequence? But for sequences that are
simply owned, there is no problem here, and I think that inventing one
would not be a good plan.

There is no design problem here, just coding (including special handling
for pg_upgrade). When making a sequence owned we can, without requiring
any syntax, choose to change its persistence to match the table owning it.
Or not. These are basically options 1 and 2 I laid out earlier:

/messages/by-id/CAKFQuwY6GsC1CvweCkgaYi-+HNF2F-fqCp8JpdFK9bk18gqzFA@mail.gmail.com

I slightly favor 1, making owned sequences implementation details having a
matched persistence mode. But we seem to be leaning toward option 2 per
subsequent emails. I'm ok with that - just give me an easy way to change
all my upgraded logged sequences to unlogged. And probably do the same if
I change my table's mode as well.

That there is less implementation complexity is nice but the end user won't
see that. I think the typical end user would appreciate having the
sequence stay in sync with the table instead of having to worry about those
kinds of details. Hence my slight favor given toward 1.

David J.

#27David G. Johnston
David G. Johnston
david.g.johnston@gmail.com
In reply to: Robert Haas (#25)
Re: unlogged sequences

On Thu, Mar 31, 2022 at 6:03 PM Robert Haas <robertmhaas@gmail.com> wrote:

On Thu, Mar 31, 2022 at 8:44 PM David G. Johnston
<david.g.johnston@gmail.com> wrote:

The "give the user power" argument is also valid. But since they

already have power through unowned sequences, having the owned sequences
more narrowly defined doesn't detract from usability, and in many ways
enhances it by further reinforcing the fact that the sequence internally
used when you say "GENERATED ALWAYS AS IDENTITY" is an implementation
detail - one that has the same persistence as the table.

I think there's a question about what happens in the GENERATED ALWAYS
AS IDENTITY case. The DDL commands that create such sequences are of
the form ALTER TABLE something ALTER COLUMN somethingelse GENERATED
ALWAYS AS (sequence_parameters), and if we need to specify somewhere
in the whether the sequence should be logged or unlogged, how do we do
that?

I give answers for the "owned sequences match their owning table's
persistence" model below:

You would not need to specify it - the table is specified and that is
sufficient to know what value to choose.

Consider:

rhaas=# create unlogged table xyz (a int generated always as identity);
CREATE TABLE
rhaas=# \d+ xyz
Unlogged table "
public.xyz"
Column | Type | Collation | Nullable | Default
| Storage | Compression | Stats target | Description

--------+---------+-----------+----------+------------------------------+---------+-------------+--------------+-------------
a | integer | | not null | generated always as
identity | plain | | |
Access method: heap

rhaas=# \d+ xyz_a_seq
Sequence "public.xyz_a_seq"
Type | Start | Minimum | Maximum | Increment | Cycles? | Cache
---------+-------+---------+------------+-----------+---------+-------
integer | 1 | 1 | 2147483647 | 1 | no | 1
Sequence for identity column: public.xyz.a

In this new system, does the user still get a logged sequence?

No

If they
get an unlogged sequence, how does dump-and-restore work?

As described in the first response, since ALTER COLUMN is used during
dump-and-restore, the sequence creation occurs in a command where we know
the owning table is unlogged so the created sequence is unlogged.

What if they
want to still have a logged sequence?

I was expecting the following to work, though it does not presently:

ALTER SEQUENCE yetanotherthing OWNED BY NONE;
ERROR: cannot change ownership of identity sequence

ALTER SEQUENCE yetanotherthing SET LOGGED;

IMO, the generated case is the stronger one for not allowing them to be
different. They can fall back onto the DEFAULT
nextval('sequence_that_is_unowned') option to get the desired behavior.

David J.

#28Peter Eisentraut
Peter Eisentraut
peter.eisentraut@enterprisedb.com
In reply to: Tomas Vondra (#19)
Re: unlogged sequences

On 01.04.22 00:43, Tomas Vondra wrote:

Hmm, so what about doing a little bit different thing:

1) owned sequences inherit persistence of the table by default

2) allow ALTER SEQUENCE to change persistence for all sequences (no
restriction for owned sequences)

3) ALTER TABLE ... SET [UN]LOGGED changes persistence for sequences
matching the initial table persistence

Consider that an identity sequence creates an "internal" dependency and
a serial sequence creates an "auto" dependency.

An "internal" dependency means that the internal object shouldn't really
be operated on directly. (In some cases it's allowed for convenience.)
So I think in that case the sequence must follow the table's persistence
in all cases. This is accomplished by setting the initial persistence
to the table's, making ALTER TABLE propagate persistence changes, and
prohibiting direct ALTER SEQUENCE SET.

An "auto" dependency is looser, so manipulating both objects
independently can be allowed. In that case, I would do (1), (2), and (3).

(I think your (3) is already the behavior in the patch, since there are
only two persistence levels in play at that point.)

I wanted to check if you can have a persistent sequence owned by a temp
table, but that is rejected because both sequence and table must be in
the same schema. So the sequence owned-by schema does insist on some
tight coupling. So for example, once a sequence is owned by a table,
you can't move it around or change the ownership.

#29Peter Eisentraut
Peter Eisentraut
peter.eisentraut@enterprisedb.com
In reply to: Peter Eisentraut (#28)
Re: unlogged sequences

On 01.04.22 18:22, Peter Eisentraut wrote:

On 01.04.22 00:43, Tomas Vondra wrote:

Hmm, so what about doing a little bit different thing:

1) owned sequences inherit persistence of the table by default

2) allow ALTER SEQUENCE to change persistence for all sequences (no
restriction for owned sequences)

3) ALTER TABLE ... SET [UN]LOGGED changes persistence for sequences
matching the initial table persistence

Consider that an identity sequence creates an "internal" dependency and
a serial sequence creates an "auto" dependency.

An "internal" dependency means that the internal object shouldn't really
be operated on directly.  (In some cases it's allowed for convenience.)
So I think in that case the sequence must follow the table's persistence
in all cases.  This is accomplished by setting the initial persistence
to the table's, making ALTER TABLE propagate persistence changes, and
prohibiting direct ALTER SEQUENCE SET.

But to make pg_upgrade work for identity sequences of unlogged tables,
we need to allow ALTER SEQUENCE ... SET LOGGED on such sequences. Which
I guess is not a real problem in the end.

#30David G. Johnston
David G. Johnston
david.g.johnston@gmail.com
In reply to: Peter Eisentraut (#28)
Re: unlogged sequences

On Fri, Apr 1, 2022 at 9:22 AM Peter Eisentraut <
peter.eisentraut@enterprisedb.com> wrote:

On 01.04.22 00:43, Tomas Vondra wrote:

Hmm, so what about doing a little bit different thing:

1) owned sequences inherit persistence of the table by default

2) allow ALTER SEQUENCE to change persistence for all sequences (no
restriction for owned sequences)

3) ALTER TABLE ... SET [UN]LOGGED changes persistence for sequences
matching the initial table persistence

Consider that an identity sequence creates an "internal" dependency and
a serial sequence creates an "auto" dependency.

An "internal" dependency means that the internal object shouldn't really
be operated on directly. (In some cases it's allowed for convenience.)
So I think in that case the sequence must follow the table's persistence
in all cases. This is accomplished by setting the initial persistence
to the table's, making ALTER TABLE propagate persistence changes, and
prohibiting direct ALTER SEQUENCE SET.

An "auto" dependency is looser, so manipulating both objects
independently can be allowed. In that case, I would do (1), (2), and (3).

(I think your (3) is already the behavior in the patch, since there are
only two persistence levels in play at that point.)

I would support having a serial sequence be allowed to be changed
independently while an identity sequence is made to match the table it is
owned by. Older version restores would produce a logged serial sequence
(since the sequence is independently created and then attached to the
table) on unlogged tables but since identity sequences are only even
implicitly created they would become unlogged as part of the restore.
Though I suspect that pg_upgrade will need to change them explicitly.

I would support all owned sequences as well, but that seems unreachable at
the moment.

David J.

#31David G. Johnston
David G. Johnston
david.g.johnston@gmail.com
In reply to: Peter Eisentraut (#29)
Re: unlogged sequences

On Fri, Apr 1, 2022 at 9:31 AM Peter Eisentraut <
peter.eisentraut@enterprisedb.com> wrote:

On 01.04.22 18:22, Peter Eisentraut wrote:

On 01.04.22 00:43, Tomas Vondra wrote:

Hmm, so what about doing a little bit different thing:

1) owned sequences inherit persistence of the table by default

2) allow ALTER SEQUENCE to change persistence for all sequences (no
restriction for owned sequences)

3) ALTER TABLE ... SET [UN]LOGGED changes persistence for sequences
matching the initial table persistence

Consider that an identity sequence creates an "internal" dependency and
a serial sequence creates an "auto" dependency.

An "internal" dependency means that the internal object shouldn't really
be operated on directly. (In some cases it's allowed for convenience.)
So I think in that case the sequence must follow the table's persistence
in all cases. This is accomplished by setting the initial persistence
to the table's, making ALTER TABLE propagate persistence changes, and
prohibiting direct ALTER SEQUENCE SET.

But to make pg_upgrade work for identity sequences of unlogged tables,
we need to allow ALTER SEQUENCE ... SET LOGGED on such sequences. Which
I guess is not a real problem in the end.

Indeed, we need the syntax anyway. We can constrain it though, and error
when trying to make them different but allow making them the same. To
change a table's persistence you have to then change the table first -
putting them back into different states - then sync up the sequence again.

David J.

#32Robert Haas
Robert Haas
robertmhaas@gmail.com
In reply to: Peter Eisentraut (#29)
Re: unlogged sequences

On Fri, Apr 1, 2022 at 12:31 PM Peter Eisentraut
<peter.eisentraut@enterprisedb.com> wrote:

An "internal" dependency means that the internal object shouldn't really
be operated on directly. (In some cases it's allowed for convenience.)
So I think in that case the sequence must follow the table's persistence
in all cases. This is accomplished by setting the initial persistence
to the table's, making ALTER TABLE propagate persistence changes, and
prohibiting direct ALTER SEQUENCE SET.

But to make pg_upgrade work for identity sequences of unlogged tables,
we need to allow ALTER SEQUENCE ... SET LOGGED on such sequences. Which
I guess is not a real problem in the end.

And I think also SET UNLOGGED, since it would be weird IMHO to make
such a change irreversible.

--
Robert Haas
EDB: http://www.enterprisedb.com

#33Peter Eisentraut
Peter Eisentraut
peter.eisentraut@enterprisedb.com
In reply to: Peter Eisentraut (#29)
1 attachment(s)
Re: unlogged sequences

On 01.04.22 18:31, Peter Eisentraut wrote:

Consider that an identity sequence creates an "internal" dependency
and a serial sequence creates an "auto" dependency.

An "internal" dependency means that the internal object shouldn't
really be operated on directly.  (In some cases it's allowed for
convenience.) So I think in that case the sequence must follow the
table's persistence in all cases.  This is accomplished by setting the
initial persistence to the table's, making ALTER TABLE propagate
persistence changes, and prohibiting direct ALTER SEQUENCE SET.

But to make pg_upgrade work for identity sequences of unlogged tables,
we need to allow ALTER SEQUENCE ... SET LOGGED on such sequences.  Which
I guess is not a real problem in the end.

Here is an updated patch that fixes this pg_dump/pg_upgrade issue and
also adds a few more comments and documentation sentences about what
happens and what is allowed. I didn't change any behaviors; it seems we
didn't have consensus to do that.

These details about how tables and sequences are linked or not are
pretty easy to adjust, if people still have some qualms.

Attachments:

v6-0001-Unlogged-sequences.patchtext/plain; charset=UTF-8; name=v6-0001-Unlogged-sequences.patch
#34David G. Johnston
David G. Johnston
david.g.johnston@gmail.com
In reply to: Peter Eisentraut (#33)
Re: unlogged sequences

On Sun, Apr 3, 2022 at 10:19 AM Peter Eisentraut <
peter.eisentraut@enterprisedb.com> wrote:

Here is an updated patch that fixes this pg_dump/pg_upgrade issue and
also adds a few more comments and documentation sentences about what
happens and what is allowed. I didn't change any behaviors; it seems we
didn't have consensus to do that.

IIUC the patch behavior with respect to migration is to have pg_upgrade
retain the current logged persistence mode for all owned sequences
regardless of the owning table's persistence. The same goes for pg_dump
for serial sequences since they will never be annotated with UNLOGGED and
simply adding an ownership link doesn't cause a table rewrite.

However, tables having an identity sequence seem to be unaddressed in this
patch. The existing (and unchanged) pg_dump.c code results in:

CREATE TABLE public.testgenid (
getid bigint NOT NULL
);

ALTER TABLE public.testgenid OWNER TO postgres;

ALTER TABLE public.testgenid ALTER COLUMN getid ADD GENERATED ALWAYS AS
IDENTITY (
SEQUENCE NAME public.testgenid_getid_seq
START WITH 1
INCREMENT BY 1
NO MINVALUE
NO MAXVALUE
CACHE 1
);

ISTM that we need to add the ability to specify [UN]LOGGED in those
sequence_options and have pg_dump.c output the choice explicitly instead of
relying upon a default.

Without that, the post-patch dump/restore cannot retain the existing
persistence mode value for the sequence. For the default we would want to
have ALTER TABLE ALTER COLUMN be LOGGED to match the claim that pg_dump
doesn't change the persistence mode. The main decision, then, is whether
CREATE TABLE and ALTER TABLE ADD COLUMN should default to UNLOGGED (this
combination preserves existing values via pg_dump while still letting the
user benefit from the new feature without having to specify UNLOGGED in
multiple places) or LOGGED (preserving existing values and consistency).
All UNLOGGED is an option but I think it would need to be considered along
with pg_upgrade changing them all as well. Again, limiting this decision
to identity sequences only.

David J.

#35Peter Eisentraut
Peter Eisentraut
peter.eisentraut@enterprisedb.com
In reply to: David G. Johnston (#34)
Re: unlogged sequences

On 03.04.22 20:50, David G. Johnston wrote:

However, tables having an identity sequence seem to be unaddressed in
this patch.  The existing (and unchanged) pg_dump.c code results in:

It is addressed. For example, run this in PG14:

create unlogged table t1 (a int generated always as identity, b text);

Then dump it with PG15 with this patch:

CREATE UNLOGGED TABLE public.t1 (
a integer NOT NULL,
b text
);

ALTER TABLE public.t1 OWNER TO peter;

--
-- Name: t1_a_seq; Type: SEQUENCE; Schema: public; Owner: peter
--

ALTER TABLE public.t1 ALTER COLUMN a ADD GENERATED ALWAYS AS IDENTITY (
SEQUENCE NAME public.t1_a_seq
START WITH 1
INCREMENT BY 1
NO MINVALUE
NO MAXVALUE
CACHE 1
);
ALTER SEQUENCE public.t1_a_seq SET LOGGED;

#36David G. Johnston
David G. Johnston
david.g.johnston@gmail.com
In reply to: Peter Eisentraut (#35)
Re: unlogged sequences

On Sun, Apr 3, 2022 at 12:36 PM Peter Eisentraut <
peter.eisentraut@enterprisedb.com> wrote:

On 03.04.22 20:50, David G. Johnston wrote:

However, tables having an identity sequence seem to be unaddressed in
this patch. The existing (and unchanged) pg_dump.c code results in:

It is addressed. For example, run this in PG14:

create unlogged table t1 (a int generated always as identity, b text);

Then dump it with PG15 with this patch:

Sorry, I wasn't being specific enough. Per our documentation (and I seem
to recall many comments from Tom):
"Because pg_dump is used to transfer data to newer versions of PostgreSQL,
the output of pg_dump can be expected to load into PostgreSQL server
versions newer than pg_dump's version." [1]https://www.postgresql.org/docs/current/app-pgdump.html

That is what I'm getting on about when talking about migrations. So a v14
SQL backup produced by a v14 pg_dump restored by a v15 psql. (custom format
and pg_restore supposedly aren't supposed to be different though, right?)

[1]: https://www.postgresql.org/docs/current/app-pgdump.html

David J.

#37David G. Johnston
David G. Johnston
david.g.johnston@gmail.com
In reply to: Peter Eisentraut (#35)
Re: unlogged sequences

On Sun, Apr 3, 2022 at 12:36 PM Peter Eisentraut <
peter.eisentraut@enterprisedb.com> wrote:

On 03.04.22 20:50, David G. Johnston wrote:

However, tables having an identity sequence seem to be unaddressed in
this patch. The existing (and unchanged) pg_dump.c code results in:

It is addressed. For example, run this in PG14:

ALTER TABLE public.t1 ALTER COLUMN a ADD GENERATED ALWAYS AS IDENTITY (
SEQUENCE NAME public.t1_a_seq
START WITH 1
INCREMENT BY 1
NO MINVALUE
NO MAXVALUE
CACHE 1
);
ALTER SEQUENCE public.t1_a_seq SET LOGGED;

OK, I do see the new code for this and see how my prior email was
confusing/wrong. I do still have the v14 dump file restoration concern but
that actually isn't something pg_dump.c has to (or even can) worry about.
Ensuring that a v15+ dump represents the existing state correctly is
basically a given which is why I wasn't seeing how my comments would be
interpreted relative to that.

For the patch I'm still thinking we want to add [UN]LOGGED to
sequence_options. Even if pg_dump doesn't utilize it, though aside from
potential code cleanliness I don't see why it wouldn't. If absent, the
default behavior shown here (sequence matches table, as per "+
seqstmt->sequence->relpersistence = cxt->relation->relpersistence;" would
take effect) applies, otherwise the newly created sequence is as requested.

From this, in the current patch, a pg_dump v14- produced dump file
restoration will change the persistence of owned sequences on an unlogged
table to unlogged from logged during restoration into v15+ (since the alter
sequence will not be present after the alter table). A v15+ pg_dump
produced dump file will retain the logged persistence mode for the
sequence. The only way to avoid this discrepancy is to have
sequence_options taken on a [UN]LOGGED option that defaults to LOGGED.
This then correctly reflects historical behavior and will produce a
consistently restored dump file.

David J.

#38Peter Eisentraut
Peter Eisentraut
peter.eisentraut@enterprisedb.com
In reply to: David G. Johnston (#36)
Re: unlogged sequences

On 04.04.22 01:58, David G. Johnston wrote:

"Because pg_dump is used to transfer data to newer versions of
PostgreSQL, the output of pg_dump can be expected to load into
PostgreSQL server versions newer than pg_dump's version." [1]

That is what I'm getting on about when talking about migrations.  So a
v14 SQL backup produced by a v14 pg_dump restored by a v15 psql.

It has always been the case that if you want the best upgrade
experience, you need to use the pg_dump that is >= server version.

The above quote is a corollary to that we don't want to gratuitously
break SQL syntax compatibility. But I don't think that implies that the
behavior of those commands cannot change at all. Otherwise we could
never add new behavior with new defaults.

#39Peter Eisentraut
Peter Eisentraut
peter.eisentraut@enterprisedb.com
In reply to: Peter Eisentraut (#33)
Re: unlogged sequences

On 03.04.22 19:19, Peter Eisentraut wrote:

On 01.04.22 18:31, Peter Eisentraut wrote:

Consider that an identity sequence creates an "internal" dependency
and a serial sequence creates an "auto" dependency.

An "internal" dependency means that the internal object shouldn't
really be operated on directly.  (In some cases it's allowed for
convenience.) So I think in that case the sequence must follow the
table's persistence in all cases.  This is accomplished by setting
the initial persistence to the table's, making ALTER TABLE propagate
persistence changes, and prohibiting direct ALTER SEQUENCE SET.

But to make pg_upgrade work for identity sequences of unlogged tables,
we need to allow ALTER SEQUENCE ... SET LOGGED on such sequences.
Which I guess is not a real problem in the end.

Here is an updated patch that fixes this pg_dump/pg_upgrade issue and
also adds a few more comments and documentation sentences about what
happens and what is allowed.  I didn't change any behaviors; it seems we
didn't have consensus to do that.

These details about how tables and sequences are linked or not are
pretty easy to adjust, if people still have some qualms.

This patch is now in limbo because it appears that the logical
replication of sequences feature might end up being reverted for PG15.
This unlogged sequences feature is really a component of that overall
feature.

If we think that logical replication of sequences might stay in, then I
would like to commit this patch as well.

If we think that it will be reverted, then this patch is probably just
going to be in the way of that.

We could also move forward with this patch independently of the other
one. If we end up reverting the other one, then this one won't be very
useful but it won't really hurt anything and it would presumably become
useful eventually. What we presumably don't want is that the sequence
replication patch gets repaired for PG15 and we didn't end up committing
this patch because of uncertainty.

Thoughts?

#40Peter Eisentraut
Peter Eisentraut
peter.eisentraut@enterprisedb.com
In reply to: Peter Eisentraut (#39)
Re: unlogged sequences

On 06.04.22 11:12, Peter Eisentraut wrote:

We could also move forward with this patch independently of the other
one.  If we end up reverting the other one, then this one won't be very
useful but it won't really hurt anything and it would presumably become
useful eventually.  What we presumably don't want is that the sequence
replication patch gets repaired for PG15 and we didn't end up committing
this patch because of uncertainty.

I have received some encouragement off-list to go ahead with this, so
it's been committed.