pgsql: Keep track of transaction commit timestamps

Started by Alvaro Herreraabout 11 years ago28 messages
#1Alvaro Herrera
alvherre@alvh.no-ip.org

Keep track of transaction commit timestamps

Transactions can now set their commit timestamp directly as they commit,
or an external transaction commit timestamp can be fed from an outside
system using the new function TransactionTreeSetCommitTsData(). This
data is crash-safe, and truncated at Xid freeze point, same as pg_clog.

This module is disabled by default because it causes a performance hit,
but can be enabled in postgresql.conf requiring only a server restart.

A new test in src/test/modules is included.

Catalog version bumped due to the new subdirectory within PGDATA and a
couple of new SQL functions.

Authors: Álvaro Herrera and Petr Jelínek

Reviewed to varying degrees by Michael Paquier, Andres Freund, Robert
Haas, Amit Kapila, Fujii Masao, Jaime Casanova, Simon Riggs, Steven
Singer, Peter Eisentraut

Branch
------
master

Details
-------
http://git.postgresql.org/pg/commitdiff/73c986adde5d73a5e2555da9b5c8facedb146dcd

Modified Files
--------------
contrib/pg_upgrade/pg_upgrade.c | 7 +
contrib/pg_xlogdump/rmgrdesc.c | 1 +
doc/src/sgml/config.sgml | 14 +
doc/src/sgml/func.sgml | 39 +
doc/src/sgml/ref/pg_resetxlog.sgml | 19 +-
doc/src/sgml/storage.sgml | 5 +
src/backend/access/rmgrdesc/Makefile | 2 +-
src/backend/access/rmgrdesc/committsdesc.c | 82 ++
src/backend/access/rmgrdesc/xlogdesc.c | 3 +
src/backend/access/transam/Makefile | 5 +-
src/backend/access/transam/README | 2 +-
src/backend/access/transam/commit_ts.c | 902 ++++++++++++++++++++
src/backend/access/transam/rmgr.c | 1 +
src/backend/access/transam/slru.c | 2 +-
src/backend/access/transam/varsup.c | 4 +-
src/backend/access/transam/xact.c | 27 +-
src/backend/access/transam/xlog.c | 43 +-
src/backend/commands/vacuum.c | 8 +-
src/backend/libpq/hba.c | 2 +-
src/backend/replication/logical/decode.c | 1 +
src/backend/storage/ipc/ipci.c | 3 +
src/backend/storage/lmgr/lwlock.c | 4 +
src/backend/utils/misc/guc.c | 10 +
src/backend/utils/misc/postgresql.conf.sample | 2 +
src/bin/initdb/initdb.c | 1 +
src/bin/pg_controldata/pg_controldata.c | 4 +
src/bin/pg_resetxlog/pg_resetxlog.c | 75 +-
src/include/access/commit_ts.h | 72 ++
src/include/access/rmgrlist.h | 1 +
src/include/access/transam.h | 6 +
src/include/access/xlog_internal.h | 1 +
src/include/catalog/catversion.h | 2 +-
src/include/catalog/pg_control.h | 3 +
src/include/catalog/pg_proc.h | 6 +
src/include/storage/lwlock.h | 5 +-
src/include/utils/builtins.h | 4 +
src/test/modules/Makefile | 1 +
src/test/modules/commit_ts/.gitignore | 4 +
src/test/modules/commit_ts/Makefile | 15 +
src/test/modules/commit_ts/commit_ts.conf | 1 +
.../commit_ts/expected/commit_timestamp.out | 39 +
.../commit_ts/expected/commit_timestamp_1.out | 34 +
.../modules/commit_ts/sql/commit_timestamp.sql | 24 +
43 files changed, 1458 insertions(+), 28 deletions(-)

--
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers

#2Stefan Kaltenbrunner
stefan@kaltenbrunner.cc
In reply to: Alvaro Herrera (#1)
Re: pgsql: Keep track of transaction commit timestamps

On 12/03/2014 03:54 PM, Alvaro Herrera wrote:

Keep track of transaction commit timestamps

Transactions can now set their commit timestamp directly as they commit,
or an external transaction commit timestamp can be fed from an outside
system using the new function TransactionTreeSetCommitTsData(). This
data is crash-safe, and truncated at Xid freeze point, same as pg_clog.

This module is disabled by default because it causes a performance hit,
but can be enabled in postgresql.conf requiring only a server restart.

A new test in src/test/modules is included.

Catalog version bumped due to the new subdirectory within PGDATA and a
couple of new SQL functions.

Authors: Álvaro Herrera and Petr Jelínek

Reviewed to varying degrees by Michael Paquier, Andres Freund, Robert
Haas, Amit Kapila, Fujii Masao, Jaime Casanova, Simon Riggs, Steven
Singer, Peter Eisentraut

this broke the docs build:

http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=guaibasaurus&dt=2014-12-03%2016%3A17%3A01

Stefan

--
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers

#3Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Stefan Kaltenbrunner (#2)
Re: pgsql: Keep track of transaction commit timestamps

Stefan Kaltenbrunner wrote:

On 12/03/2014 03:54 PM, Alvaro Herrera wrote:

Keep track of transaction commit timestamps

this broke the docs build:

http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=guaibasaurus&dt=2014-12-03%2016%3A17%3A01

Woops, thanks. Fixed.

--
�lvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers

#4Heikki Linnakangas
hlinnakangas@vmware.com
In reply to: Alvaro Herrera (#1)
Re: [COMMITTERS] pgsql: Keep track of transaction commit timestamps

On 12/03/2014 04:54 PM, Alvaro Herrera wrote:

ir commit timestamp directly as they commit,
or an external transaction c

Sorry, I'm late to the party, but here's some random comments on this
after a quick review:

* The whole concept of a node ID seems to be undocumented, and unused.
No-one calls CommitTsSetDefaultNodeId(). The whole SET_TS record type is
dead code too, when a non-default node_id is not set. Please remove, or
explain how all that's supposed to work.

* COMMIT_TS_SETTS. "SETTS" sounds like a typo (like Robert complained
about "committs" earlier). Rename to "COMMIT_TS_SET_TIMESTAMP", or just
"COMMIT_TS_SET".

* committsdesc.c -> commit_ts_desc.c, to be consistent with "commit_ts.c"

* In commit_ts_desc:

+       nsubxids = ((XLogRecGetDataLen(record) - SizeOfCommitTsSet) /
+                   sizeof(TransactionId));
+       if (nsubxids > 0)
+       {
+           int     i;
+           TransactionId *subxids;
+
+           subxids = palloc(sizeof(TransactionId) * nsubxids);
+           memcpy(subxids,
+                  XLogRecGetData(record) + SizeOfCommitTsSet,
+                  sizeof(TransactionId) * nsubxids);
+           for (i = 0; i < nsubxids; i++)
+               appendStringInfo(buf, ", %u", subxids[i]);
+           pfree(subxids);
+       }

There's no need to memcpy() here. The subxid array in the WAL record is
aligned.

* This seems to be a completely unrelated change.

--- a/src/backend/libpq/hba.c
+++ b/src/backend/libpq/hba.c
@@ -1438,7 +1438,7 @@ parse_hba_auth_opt(char *name, char *val, HbaLine *hbaline, int line_num)
ereport(LOG,
(errcode(ERRCODE_CONFIG_FILE_ERROR),
errmsg("client certificates can only be checked if a root certificate store is available"),
-                        errhint("Make sure the configuration parameter \"ssl_ca_file\" is set."),
+                        errhint("Make sure the configuration parameter \"%s\" is set.", "ssl_ca_file"),
errcontext("line %d of configuration file \"%s\"",
line_num, HbaFileName)));
return false;

* What is the definition of "latest commit", in
pg_last_committed_xact()? It doesn't necessarily line up with the order
of commit WAL records, nor with the order the proc array is updated
(i.e. the order that the effects become visible to other backends). It
seems confusing to add yet another notion of commit order. Perhaps
that's the best we can do, but it needs to be documented.

- Heikki

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#5Petr Jelinek
petr@2ndquadrant.com
In reply to: Heikki Linnakangas (#4)
Re: [COMMITTERS] pgsql: Keep track of transaction commit timestamps

On 04/12/14 10:42, Heikki Linnakangas wrote:

On 12/03/2014 04:54 PM, Alvaro Herrera wrote:

ir commit timestamp directly as they commit,
or an external transaction c

Sorry, I'm late to the party, but here's some random comments on this
after a quick review:

* The whole concept of a node ID seems to be undocumented, and unused.
No-one calls CommitTsSetDefaultNodeId(). The whole SET_TS record type is
dead code too, when a non-default node_id is not set. Please remove, or
explain how all that's supposed to work.

That's API meant to be used by extensions, maybe it would be preferable
if there was SQL API exposed also?

(It might also make more sense once replication identifiers are submitted.)

* What is the definition of "latest commit", in
pg_last_committed_xact()? It doesn't necessarily line up with the order
of commit WAL records, nor with the order the proc array is updated
(i.e. the order that the effects become visible to other backends). It
seems confusing to add yet another notion of commit order. Perhaps
that's the best we can do, but it needs to be documented.

It's updated on CommitTransaction (well RecordTransactionCommit but
that's only called from CommitTransaction) time so the definition of
latest commit is last CommitTransaction call.

--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#6Heikki Linnakangas
hlinnakangas@vmware.com
In reply to: Petr Jelinek (#5)
Re: [COMMITTERS] pgsql: Keep track of transaction commit timestamps

On 12/04/2014 01:16 PM, Petr Jelinek wrote:

On 04/12/14 10:42, Heikki Linnakangas wrote:

On 12/03/2014 04:54 PM, Alvaro Herrera wrote:

ir commit timestamp directly as they commit,
or an external transaction c

Sorry, I'm late to the party, but here's some random comments on this
after a quick review:

* The whole concept of a node ID seems to be undocumented, and unused.
No-one calls CommitTsSetDefaultNodeId(). The whole SET_TS record type is
dead code too, when a non-default node_id is not set. Please remove, or
explain how all that's supposed to work.

That's API meant to be used by extensions, maybe it would be preferable
if there was SQL API exposed also?

(It might also make more sense once replication identifiers are submitted.)

Maybe. Hard to tell without any documentation or comments on how it's
supposed to work. In unacceptable in its current state.

I would prefer to remove it now, and add it back later together with the
code that actually uses it. I don't like having unused internal
functions and APIs sitting the source tree in the hope that someone will
find them useful in the future. It's more likely that it will bitrot, or
not actually work as intended, when someone later tries to use it.

What would the SQL API look like, and what would it be used for?

* What is the definition of "latest commit", in
pg_last_committed_xact()? It doesn't necessarily line up with the order
of commit WAL records, nor with the order the proc array is updated
(i.e. the order that the effects become visible to other backends). It
seems confusing to add yet another notion of commit order. Perhaps
that's the best we can do, but it needs to be documented.

It's updated on CommitTransaction (well RecordTransactionCommit but
that's only called from CommitTransaction) time so the definition of
latest commit is last CommitTransaction call.

Right. It was a rhetorical question, actually. What I meant is that that
needs to be documented, at least. Or changed so that it matches one of
the other definitions of commit order, and then documented.

- Heikki

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#7Petr Jelinek
petr@2ndquadrant.com
In reply to: Heikki Linnakangas (#6)
Re: [COMMITTERS] pgsql: Keep track of transaction commit timestamps

On 04/12/14 12:26, Heikki Linnakangas wrote:

On 12/04/2014 01:16 PM, Petr Jelinek wrote:

On 04/12/14 10:42, Heikki Linnakangas wrote:

On 12/03/2014 04:54 PM, Alvaro Herrera wrote:

ir commit timestamp directly as they commit,
or an external transaction c

Sorry, I'm late to the party, but here's some random comments on this
after a quick review:

* The whole concept of a node ID seems to be undocumented, and unused.
No-one calls CommitTsSetDefaultNodeId(). The whole SET_TS record type is
dead code too, when a non-default node_id is not set. Please remove, or
explain how all that's supposed to work.

That's API meant to be used by extensions, maybe it would be preferable
if there was SQL API exposed also?

(It might also make more sense once replication identifiers are
submitted.)

Maybe. Hard to tell without any documentation or comments on how it's
supposed to work. In unacceptable in its current state.

I would prefer to remove it now, and add it back later together with the
code that actually uses it. I don't like having unused internal
functions and APIs sitting the source tree in the hope that someone will
find them useful in the future. It's more likely that it will bitrot, or
not actually work as intended, when someone later tries to use it.

The thing is I already have extension for 9.4 where I would use this API
for conflict detection if it was available so it's not about hoping for
somebody to find it useful. There was discussion about this during the
review process because the objection was raised already then.

What would the SQL API look like, and what would it be used for?

It would probably mirror the C one, from my POV it's not needed but
maybe some replication solution would prefer to use this without having
to write C components...

--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#8Heikki Linnakangas
hlinnakangas@vmware.com
In reply to: Petr Jelinek (#7)
Re: [COMMITTERS] pgsql: Keep track of transaction commit timestamps

On 12/04/2014 01:47 PM, Petr Jelinek wrote:

On 04/12/14 12:26, Heikki Linnakangas wrote:

On 12/04/2014 01:16 PM, Petr Jelinek wrote:

On 04/12/14 10:42, Heikki Linnakangas wrote:

On 12/03/2014 04:54 PM, Alvaro Herrera wrote:

ir commit timestamp directly as they commit,
or an external transaction c

Sorry, I'm late to the party, but here's some random comments on this
after a quick review:

* The whole concept of a node ID seems to be undocumented, and unused.
No-one calls CommitTsSetDefaultNodeId(). The whole SET_TS record type is
dead code too, when a non-default node_id is not set. Please remove, or
explain how all that's supposed to work.

That's API meant to be used by extensions, maybe it would be preferable
if there was SQL API exposed also?

(It might also make more sense once replication identifiers are
submitted.)

Maybe. Hard to tell without any documentation or comments on how it's
supposed to work. In unacceptable in its current state.

I would prefer to remove it now, and add it back later together with the
code that actually uses it. I don't like having unused internal
functions and APIs sitting the source tree in the hope that someone will
find them useful in the future. It's more likely that it will bitrot, or
not actually work as intended, when someone later tries to use it.

The thing is I already have extension for 9.4 where I would use this API
for conflict detection if it was available so it's not about hoping for
somebody to find it useful. There was discussion about this during the
review process because the objection was raised already then.

Yeah, it was raised. I don't think it was really addressed. There was
lengthy discussion on whether to include LSN, node id, and/or some other
information, but there was no discussion on:

* What is a node ID?
* How is it used?
* Who assigns it?

etc.

None of those things are explained in the documentation nor code comments.

The node ID sounds suspiciously like the replication identifiers that
have been proposed a couple of times. I don't remember if I like
replication identifiers or not, but I'd rather discuss that issue
explicitly and separately. I don't want the concept of a
replication/node ID to fly under the radar like this.

What would the SQL API look like, and what would it be used for?

It would probably mirror the C one, from my POV it's not needed but
maybe some replication solution would prefer to use this without having
to write C components...

I can't comment on that, because without any documentation, I don't even
know what the C API is.

BTW, why is it OK that the node-ID of a commit is logged as a separate
WAL record, after the commit record? That means that it's possible that
a transaction commits, but you crash just before writing the SETTS
record, so that after replay the transaction appears committed but with
the default node ID.

(Maybe that's OK given the intended use case, but it's hard to tell
without any documentation. See a pattern here? ;-) )

- Heikki

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#9Alex Shulgin
ash@commandprompt.com
In reply to: Heikki Linnakangas (#4)
Re: [COMMITTERS] pgsql: Keep track of transaction commit timestamps

Heikki Linnakangas <hlinnakangas@vmware.com> writes:

On 12/03/2014 04:54 PM, Alvaro Herrera wrote:

ir commit timestamp directly as they commit,
or an external transaction c

Sorry, I'm late to the party, but here's some random comments on this
after a quick review:

Also this commit breaks initdb of `make check' for me:

creating template1 database in /home/ash/build/postgresql/HEAD/src/test/regress/./tmp_check/data/base/1 ... TRAP: FailedAssertion("!(((xmax) >= ((TransactionId) 3)))", File: "/home/ash/src/postgresql/src/backend/storage/ipc/procarray.c", Line: 1414)
Aborted (core dumped)
child process exited with exit code 134

--
Alex

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#10Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Alex Shulgin (#9)
Re: [COMMITTERS] pgsql: Keep track of transaction commit timestamps

Alex Shulgin wrote:

Also this commit breaks initdb of `make check' for me:

creating template1 database in /home/ash/build/postgresql/HEAD/src/test/regress/./tmp_check/data/base/1 ... TRAP: FailedAssertion("!(((xmax) >= ((TransactionId) 3)))", File: "/home/ash/src/postgresql/src/backend/storage/ipc/procarray.c", Line: 1414)
Aborted (core dumped)
child process exited with exit code 134

Uh, that's odd. Can you please get a stack trace? Do you have unusual
settings or a patched build?

--
�lvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#11Alex Shulgin
ash@commandprompt.com
In reply to: Alvaro Herrera (#10)
Re: [COMMITTERS] pgsql: Keep track of transaction commit timestamps

Alvaro Herrera <alvherre@2ndquadrant.com> writes:

Alex Shulgin wrote:

Also this commit breaks initdb of `make check' for me:

creating template1 database in
/home/ash/build/postgresql/HEAD/src/test/regress/./tmp_check/data/base/1
... TRAP: FailedAssertion("!(((xmax) >= ((TransactionId) 3)))",
File:
"/home/ash/src/postgresql/src/backend/storage/ipc/procarray.c",
Line: 1414)
Aborted (core dumped)
child process exited with exit code 134

Uh, that's odd. Can you please get a stack trace? Do you have unusual
settings or a patched build?

Not really, and I would mention that. Will get a trace.

--
Alex

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#12Alex Shulgin
ash@commandprompt.com
In reply to: Alvaro Herrera (#10)
Re: [COMMITTERS] pgsql: Keep track of transaction commit timestamps

Alvaro Herrera <alvherre@2ndquadrant.com> writes:

Uh, that's odd. Can you please get a stack trace? Do you have unusual
settings or a patched build?

Is there a way to pause the bootstrap process so I can attach gdb to it?

--
Alex

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#13Craig Ringer
craig@2ndquadrant.com
In reply to: Alex Shulgin (#12)
Re: [COMMITTERS] pgsql: Keep track of transaction commit timestamps

On 12/04/2014 10:50 PM, Alex Shulgin wrote:

Is there a way to pause the bootstrap process so I can attach gdb to it?

With a newer gdb, you can instead tell gdb to follow all forks. I wrote
some notes on it recently.

http://blog.2ndquadrant.com/processes-breakpoints-watchpoints-postgresql/

I've found it really handy when debugging crashes in regression tests.

However, it's often simpler to just:

ulimit -c unlimited
make check

(or whatever your crashing test is) then look at the core files that are
produced.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#14Alex Shulgin
ash@commandprompt.com
In reply to: Craig Ringer (#13)
Re: [COMMITTERS] pgsql: Keep track of transaction commit timestamps

Craig Ringer <craig@2ndquadrant.com> writes:

On 12/04/2014 10:50 PM, Alex Shulgin wrote:

Is there a way to pause the bootstrap process so I can attach gdb to it?

With a newer gdb, you can instead tell gdb to follow all forks. I wrote
some notes on it recently.

http://blog.2ndquadrant.com/processes-breakpoints-watchpoints-postgresql/

I've found it really handy when debugging crashes in regression tests.

However, it's often simpler to just:

ulimit -c unlimited
make check

(or whatever your crashing test is) then look at the core files that are
produced.

Good one, it didn't occur to me that assert makes core files.

Figured it out with a pg_usleep in bootstrap.c anyway. Does this look sane?

DEBUG: inserting column 6 value "0"
DEBUG: inserted -> 0
DEBUG: inserting column 7 value "varchar_transform"
TRAP: FailedAssertion("!(((xmax) >= ((TransactionId) 3)))", File: "/home/ash/src/postgresql/src/backend/storage/ipc/procarray.c", Line: 1414)

Program received signal SIGABRT, Aborted.
0x00007f2757128d27 in __GI_raise (sig=sig@entry=6)
at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
56 ../nptl/sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) bt
#0 0x00007f2757128d27 in __GI_raise (sig=sig@entry=6)
at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1 0x00007f275712a418 in __GI_abort () at abort.c:89
#2 0x00000000009088b2 in ExceptionalCondition (
conditionName=0xac0710 "!(((xmax) >= ((TransactionId) 3)))",
errorType=0xac01d8 "FailedAssertion",
fileName=0xac0178 "/home/ash/src/postgresql/src/backend/storage/ipc/procarray.c", lineNumber=1414)
at /home/ash/src/postgresql/src/backend/utils/error/assert.c:54
#3 0x000000000079e125 in GetSnapshotData (
snapshot=0xdb2d60 <CatalogSnapshotData>)
at /home/ash/src/postgresql/src/backend/storage/ipc/procarray.c:1414
#4 0x000000000094e02d in GetNonHistoricCatalogSnapshot (relid=1255)
at /home/ash/src/postgresql/src/backend/utils/time/snapmgr.c:298
#5 0x000000000094dfdd in GetCatalogSnapshot (relid=1255)
at /home/ash/src/postgresql/src/backend/utils/time/snapmgr.c:272
#6 0x00000000004c8f0d in systable_beginscan (heapRelation=0x1d0e5c0,
indexId=2691, indexOK=1 '\001', snapshot=0x0, nkeys=1, key=0x7fff201bbc40)
at /home/ash/src/postgresql/src/backend/access/index/genam.c:275
#7 0x0000000000885070 in regprocin (fcinfo=0x7fff201bbce0)
at /home/ash/src/postgresql/src/backend/utils/adt/regproc.c:106
#8 0x0000000000914fe7 in InputFunctionCall (flinfo=0x7fff201bc0c0,
str=0x1d349b8 "varchar_transform", typioparam=24, typmod=-1)
---Type <return> to continue, or q <return> to quit---
at /home/ash/src/postgresql/src/backend/utils/fmgr/fmgr.c:1914
#9 0x000000000091533e in OidInputFunctionCall (functionId=44,
str=0x1d349b8 "varchar_transform", typioparam=24, typmod=-1)
at /home/ash/src/postgresql/src/backend/utils/fmgr/fmgr.c:2045
#10 0x000000000052af91 in InsertOneValue (value=0x1d349b8 "varchar_transform",
i=7) at /home/ash/src/postgresql/src/backend/bootstrap/bootstrap.c:840
#11 0x0000000000527409 in boot_yyparse ()
at /home/ash/src/postgresql/src/backend/bootstrap/bootparse.y:455
#12 0x000000000052a26b in BootstrapModeMain ()
at /home/ash/src/postgresql/src/backend/bootstrap/bootstrap.c:494
#13 0x000000000052a177 in AuxiliaryProcessMain (argc=6, argv=0x1cc8378)
at /home/ash/src/postgresql/src/backend/bootstrap/bootstrap.c:414
#14 0x00000000006a327c in main (argc=7, argv=0x1cc8370)
at /home/ash/src/postgresql/src/backend/main/main.c:212
(gdb)

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#15Alex Shulgin
ash@commandprompt.com
In reply to: Alex Shulgin (#14)
Re: [COMMITTERS] pgsql: Keep track of transaction commit timestamps

Alex Shulgin <ash@commandprompt.com> writes:

Figured it out with a pg_usleep in bootstrap.c anyway. Does this look sane?

DEBUG: inserting column 6 value "0"
DEBUG: inserted -> 0
DEBUG: inserting column 7 value "varchar_transform"
TRAP: FailedAssertion("!(((xmax) >= ((TransactionId) 3)))", File: "/home/ash/src/postgresql/src/backend/storage/ipc/procarray.c", Line: 1414)

I've tried to debug this and I feel really dumbfound...

DEBUG: inserting column 7 value "varchar_transform"

Breakpoint 1, GetSnapshotData (snapshot=0xdb2d60 <CatalogSnapshotData>)
at /home/ash/src/postgresql/src/backend/storage/ipc/procarray.c:1413
1413 xmax = ShmemVariableCache->latestCompletedXid;

(gdb) p ShmemVariableCache->latestCompletedXid
$1 = 4294967295

(gdb) p *ShmemVariableCache
$2 = {nextOid = 10000, oidCount = 0, nextXid = 3, oldestXid = 3,
xidVacLimit = 200000003, xidWarnLimit = 2136483650,
xidStopLimit = 2146483650, xidWrapLimit = 2147483650, oldestXidDB = 1,
oldestCommitTs = 1, newestCommitTs = 0, latestCompletedXid = 4294967295}

(gdb) p xmax
$3 = 0

(gdb) n
1414 Assert(TransactionIdIsNormal(xmax));

(gdb) p xmax
$4 = 1

(gdb) p *ShmemVariableCache
$5 = {nextOid = 10000, oidCount = 0, nextXid = 3, oldestXid = 3,
xidVacLimit = 200000003, xidWarnLimit = 2136483650,
xidStopLimit = 2146483650, xidWrapLimit = 2147483650, oldestXidDB = 1,
oldestCommitTs = 1, newestCommitTs = 0, latestCompletedXid = 4294967295}

(gdb) p ShmemVariableCache->latestCompletedXid
$6 = 4294967295

(gdb)

How? Is there an another concurrent process with the old view of
VariableCacheData struct where latestCompletedXid still points to
oldestCommitTs?

This only happens with the CommitTs commit in effect.

--
Alex

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#16Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Alex Shulgin (#15)
Re: [COMMITTERS] pgsql: Keep track of transaction commit timestamps

Alex Shulgin wrote:

DEBUG: inserting column 7 value "varchar_transform"

Breakpoint 1, GetSnapshotData (snapshot=0xdb2d60 <CatalogSnapshotData>)
at /home/ash/src/postgresql/src/backend/storage/ipc/procarray.c:1413
1413 xmax = ShmemVariableCache->latestCompletedXid;

(gdb) p ShmemVariableCache->latestCompletedXid
$1 = 4294967295

I think you might need to "make distclean", then recompile. If you're
not passing --enable-depend to configure, I suggest you do so; that
greatly reduces such problems.

--
�lvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#17Alex Shulgin
ash@commandprompt.com
In reply to: Alvaro Herrera (#16)
Re: [COMMITTERS] pgsql: Keep track of transaction commit timestamps

Alvaro Herrera <alvherre@2ndquadrant.com> writes:

Alex Shulgin wrote:

DEBUG: inserting column 7 value "varchar_transform"

Breakpoint 1, GetSnapshotData (snapshot=0xdb2d60 <CatalogSnapshotData>)
at /home/ash/src/postgresql/src/backend/storage/ipc/procarray.c:1413
1413 xmax = ShmemVariableCache->latestCompletedXid;

(gdb) p ShmemVariableCache->latestCompletedXid
$1 = 4294967295

I think you might need to "make distclean", then recompile. If you're
not passing --enable-depend to configure, I suggest you do so; that
greatly reduces such problems.

Well I tried that before posting the complaint, let me try again though.

--
Alex

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#18Alex Shulgin
ash@commandprompt.com
In reply to: Alvaro Herrera (#16)
Re: [COMMITTERS] pgsql: Keep track of transaction commit timestamps

Alvaro Herrera <alvherre@2ndquadrant.com> writes:

Alex Shulgin wrote:

DEBUG: inserting column 7 value "varchar_transform"

Breakpoint 1, GetSnapshotData (snapshot=0xdb2d60 <CatalogSnapshotData>)
at /home/ash/src/postgresql/src/backend/storage/ipc/procarray.c:1413
1413 xmax = ShmemVariableCache->latestCompletedXid;

(gdb) p ShmemVariableCache->latestCompletedXid
$1 = 4294967295

I think you might need to "make distclean", then recompile. If you're
not passing --enable-depend to configure, I suggest you do so; that
greatly reduces such problems.

Yeah, that did it. Sorry for the noise, I must have messed it up with
the recompilations (note to self to always use --enable-depend).

--
Alex

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#19Michael Paquier
michael.paquier@gmail.com
In reply to: Heikki Linnakangas (#8)
Re: [COMMITTERS] pgsql: Keep track of transaction commit timestamps

On Thu, Dec 4, 2014 at 9:26 PM, Heikki Linnakangas
<hlinnakangas@vmware.com> wrote:

On 12/04/2014 01:47 PM, Petr Jelinek wrote:

On 04/12/14 12:26, Heikki Linnakangas wrote:

On 12/04/2014 01:16 PM, Petr Jelinek wrote:

On 04/12/14 10:42, Heikki Linnakangas wrote:

On 12/03/2014 04:54 PM, Alvaro Herrera wrote:

ir commit timestamp directly as they commit,
or an external transaction c

Sorry, I'm late to the party, but here's some random comments on this
after a quick review:

* The whole concept of a node ID seems to be undocumented, and unused.
No-one calls CommitTsSetDefaultNodeId(). The whole SET_TS record type
is
dead code too, when a non-default node_id is not set. Please remove, or
explain how all that's supposed to work.

That's API meant to be used by extensions, maybe it would be preferable
if there was SQL API exposed also?

(It might also make more sense once replication identifiers are
submitted.)

Maybe. Hard to tell without any documentation or comments on how it's
supposed to work. In unacceptable in its current state.

I would prefer to remove it now, and add it back later together with the
code that actually uses it. I don't like having unused internal
functions and APIs sitting the source tree in the hope that someone will
find them useful in the future. It's more likely that it will bitrot, or
not actually work as intended, when someone later tries to use it.

The thing is I already have extension for 9.4 where I would use this API
for conflict detection if it was available so it's not about hoping for
somebody to find it useful. There was discussion about this during the
review process because the objection was raised already then.

Yeah, it was raised. I don't think it was really addressed. There was
lengthy discussion on whether to include LSN, node id, and/or some other
information, but there was no discussion on:

* What is a node ID?
* How is it used?
* Who assigns it?

etc.

None of those things are explained in the documentation nor code comments.

The node ID sounds suspiciously like the replication identifiers that have
been proposed a couple of times. I don't remember if I like replication
identifiers or not, but I'd rather discuss that issue explicitly and
separately. I don't want the concept of a replication/node ID to fly under
the radar like this.

What would the SQL API look like, and what would it be used for?

It would probably mirror the C one, from my POV it's not needed but
maybe some replication solution would prefer to use this without having
to write C components...

I can't comment on that, because without any documentation, I don't even
know what the C API is.

BTW, why is it OK that the node-ID of a commit is logged as a separate WAL
record, after the commit record? That means that it's possible that a
transaction commits, but you crash just before writing the SETTS record, so
that after replay the transaction appears committed but with the default
node ID.

(Maybe that's OK given the intended use case, but it's hard to tell without
any documentation. See a pattern here? ;-) )

Could it be possible to get a refresh on that before it bitrots too
much or we'll simply forget about it? In the current state, there is
no documentation able to explain what is the point of the nodeid
stuff, how to use it and what use cases it is aimed at. The API in
committs.c should as well be part of it. If that's not done, I think
that it would be fair to remove this portion and simply WAL-log the
commit timestamp its GUC is activated.
Thanks,
--
Michael

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#20Petr Jelinek
petr@2ndquadrant.com
In reply to: Michael Paquier (#19)
Re: [COMMITTERS] pgsql: Keep track of transaction commit timestamps

On 10/12/14 04:26, Michael Paquier wrote:

On Thu, Dec 4, 2014 at 9:26 PM, Heikki Linnakangas
<hlinnakangas@vmware.com> wrote:

Yeah, it was raised. I don't think it was really addressed. There was
lengthy discussion on whether to include LSN, node id, and/or some other
information, but there was no discussion on:

* What is a node ID?
* How is it used?
* Who assigns it?

etc.

None of those things are explained in the documentation nor code comments.

The node ID sounds suspiciously like the replication identifiers that have
been proposed a couple of times. I don't remember if I like replication
identifiers or not, but I'd rather discuss that issue explicitly and
separately. I don't want the concept of a replication/node ID to fly under
the radar like this.

Replication identifiers are bigger thing but yes there is overlap as the
nodeid itself makes it possible to implement replication identifiers
outside of core if needed.

What would the SQL API look like, and what would it be used for?

It would probably mirror the C one, from my POV it's not needed but
maybe some replication solution would prefer to use this without having
to write C components...

I can't comment on that, because without any documentation, I don't even
know what the C API is.

BTW, why is it OK that the node-ID of a commit is logged as a separate WAL
record, after the commit record? That means that it's possible that a
transaction commits, but you crash just before writing the SETTS record, so
that after replay the transaction appears committed but with the default
node ID.

(Maybe that's OK given the intended use case, but it's hard to tell without
any documentation. See a pattern here? ;-) )

This is actually good point, it's not OK to have just commit record but
no nodeid record if nodeid was set.

Could it be possible to get a refresh on that before it bitrots too
much or we'll simply forget about it? In the current state, there is
no documentation able to explain what is the point of the nodeid
stuff, how to use it and what use cases it is aimed at. The API in
committs.c should as well be part of it. If that's not done, I think
that it would be fair to remove this portion and simply WAL-log the
commit timestamp its GUC is activated.

I have this on my list so it's not being forgotten and I don't see it
bitrotting unless we are changing XLog api again. I have bit busy
schedule right now though, can it wait till next week? - I will post
some code documentation patches then.

--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#21Petr Jelinek
petr@2ndquadrant.com
In reply to: Petr Jelinek (#20)
1 attachment(s)
Re: [COMMITTERS] pgsql: Keep track of transaction commit timestamps

On 10/12/14 16:03, Petr Jelinek wrote:

On 10/12/14 04:26, Michael Paquier wrote:

On Thu, Dec 4, 2014 at 9:26 PM, Heikki Linnakangas
<hlinnakangas@vmware.com> wrote:

Yeah, it was raised. I don't think it was really addressed. There was
lengthy discussion on whether to include LSN, node id, and/or some other
information, but there was no discussion on:

* What is a node ID?
* How is it used?
* Who assigns it?

etc.

None of those things are explained in the documentation nor code
comments.

Could it be possible to get a refresh on that before it bitrots too
much or we'll simply forget about it? In the current state, there is
no documentation able to explain what is the point of the nodeid
stuff, how to use it and what use cases it is aimed at. The API in
committs.c should as well be part of it. If that's not done, I think
that it would be fair to remove this portion and simply WAL-log the
commit timestamp its GUC is activated.

I have this on my list so it's not being forgotten and I don't see it
bitrotting unless we are changing XLog api again. I have bit busy
schedule right now though, can it wait till next week? - I will post
some code documentation patches then.

Hi,

as promised I am sending code-comment patch that explains the use of
commit timestamps + nodeid C api for the conflict resolution, comments
welcome.

--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachments:

commit_ts_code_doc.patchtext/x-diff; name=commit_ts_code_doc.patchDownload
diff --git a/src/backend/access/transam/commit_ts.c b/src/backend/access/transam/commit_ts.c
index ca074da..fff1fdd 100644
--- a/src/backend/access/transam/commit_ts.c
+++ b/src/backend/access/transam/commit_ts.c
@@ -6,6 +6,62 @@
  * This module is a pg_clog-like system that stores the commit timestamp
  * for each transaction.
  *
+ * When track_commit_timestamp is enabled, this module will keep track of
+ * commit timestamp for each transaction. It also provides API to for
+ * optionally storing nodeid (origin) of each transaction. The main purpose of
+ * this functionality is to help with conflict detection and resolution for
+ * replication systems.
+ *
+ * The following example shows how to use the API provided by this module, to
+ * handle UPDATE conflicts coming from replication stream:
+ * void
+ * update_tuple(Relation relation, HeapTuple remote_tuple,
+ *              TimestampTz remote_commit_ts, CommitTsNodeId remote_node_id)
+ * {
+ *     bool             exists;
+ *     HeapTupleData    local_tuple;
+ *
+ *     // Find existing tuple with same PK/unique index combination.
+ *     exists = find_local_tuple(relation, remote_tuple, &local_tuple);
+ *
+ *     // The tuple was found.
+ *     if (exists)
+ *     {
+ *         TransactionId    xmin;
+ *         TimestampTz      local_commit_ts;
+ *         CommitTsNodeId   local_node_id;
+ *
+ *         xmin = HeapTupleHeaderGetXmin(local_tuple.t_data);
+ *         TransactionIdGetCommitTsData(xmin, &local_commit_ts, &nodeid);
+ *
+ *         // New tuple is coming from different node than the locally saved
+ *         // tuple and the remote commit timestamp is older than local commit
+ *         // timestamp, this is UPDATE/UPDATE conflict (node being UPDATEd on
+ *         // different nodes at the same time.
+ *         if (remote_id != local_node_id && remote_commit_ts <= local_commit_ts)
+ *         {
+ *             if (remote_commit_ts < local_commit_ts)
+ *                 return; // Keep the local tuple.
+ *
+ *             // Handle the conflict in a consistent manner.
+ *         }
+ *         else
+ *         {
+ *             // The new tuple either comes from same node as old tuple and/or
+ *             // is has newer commit timestamp than the local tuple, apply the
+ *             // UPDATE.
+ *         }
+ *      }
+ *      else
+ *      {
+ *          // Tuple not found (possibly UPDATE/DELETE conflict), handle it
+ *          // in a consistent manner.
+ *      }
+ * }
+ *
+ * See default_node_id and CommitTsSetDefaultNodeId for explanation of how to
+ * set nodeid when applying transactions.
+ *
  * XLOG interactions: this module generates an XLOG record whenever a new
  * CommitTs page is initialized to zeroes.  Also, one XLOG record is
  * generated for setting of values when the caller requests it; this allows
@@ -49,6 +105,15 @@
  */
 
 /*
+ * CommitTimestampEntry
+ *
+ * Record containing information about the transaction commit timestamp and
+ * the nodeid.
+ *
+ * The nodeid provides IO efficient way for replication systems to store
+ * information about origin of the transaction. Currently the nodeid is opaque
+ * value meaning of which is defined by the replication system itself.
+ *
  * We need 8+4 bytes per xact.  Note that enlarging this struct might mean
  * the largest possible file name is more than 5 chars long; see
  * SlruScanDirectory.
@@ -93,6 +158,21 @@ CommitTimestampShared	*commitTsShared;
 /* GUC variable */
 bool	track_commit_timestamp;
 
+/*
+ * The default_node_id will be written to commit timestamp record for
+ * every transaction committed by the current backend.
+ *
+ * The idea behind having default_node_id is that replication system
+ * will have connection (backend running) for specific source of replicated
+ * data to which it assigned the nodeid for identification purposes.
+ *
+ * The default_node_id setting stays in effect for duration of a session
+ * (unless changed again), so the replication system does not have to set
+ * the nodeid individually for each transaction.
+ *
+ * The public access to the default_node_id is provided by
+ * CommitTsSetDefaultNodeId and CommitTsGetDefaultNodeId interfaces.
+ */
 static CommitTsNodeId default_node_id = InvalidCommitTsNodeId;
 
 static void SetXidCommitTsInPage(TransactionId xid, int nsubxids,
@@ -112,7 +192,7 @@ static void WriteSetTimestampXlogRec(TransactionId mainxid, int nsubxids,
 /*
  * CommitTsSetDefaultNodeId
  *
- * Set default nodeid for current backend.
+ * Set default nodeid for current session.
  */
 void
 CommitTsSetDefaultNodeId(CommitTsNodeId nodeid)
@@ -123,7 +203,7 @@ CommitTsSetDefaultNodeId(CommitTsNodeId nodeid)
 /*
  * CommitTsGetDefaultNodeId
  *
- * Set default nodeid for current backend.
+ * Get current default nodeid.
  */
 CommitTsNodeId
 CommitTsGetDefaultNodeId(void)
#22Michael Paquier
michael.paquier@gmail.com
In reply to: Petr Jelinek (#21)
Re: [COMMITTERS] pgsql: Keep track of transaction commit timestamps

On Fri, Dec 19, 2014 at 6:30 PM, Petr Jelinek <petr@2ndquadrant.com> wrote:

On 10/12/14 16:03, Petr Jelinek wrote:

On 10/12/14 04:26, Michael Paquier wrote:

On Thu, Dec 4, 2014 at 9:26 PM, Heikki Linnakangas
<hlinnakangas@vmware.com> wrote:

Yeah, it was raised. I don't think it was really addressed. There was
lengthy discussion on whether to include LSN, node id, and/or some other
information, but there was no discussion on:

* What is a node ID?
* How is it used?
* Who assigns it?

etc.

None of those things are explained in the documentation nor code
comments.

Could it be possible to get a refresh on that before it bitrots too
much or we'll simply forget about it? In the current state, there is
no documentation able to explain what is the point of the nodeid
stuff, how to use it and what use cases it is aimed at. The API in
committs.c should as well be part of it. If that's not done, I think
that it would be fair to remove this portion and simply WAL-log the
commit timestamp its GUC is activated.

I have this on my list so it's not being forgotten and I don't see it
bitrotting unless we are changing XLog api again. I have bit busy
schedule right now though, can it wait till next week? - I will post
some code documentation patches then.

as promised I am sending code-comment patch that explains the use of commit
timestamps + nodeid C api for the conflict resolution, comments welcome.

Having some documentation with this example in doc/ would be more fruitful IMO.
--
Michael

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#23Petr Jelinek
petr@2ndquadrant.com
In reply to: Michael Paquier (#22)
Re: [COMMITTERS] pgsql: Keep track of transaction commit timestamps

On 19/12/14 13:17, Michael Paquier wrote:

On Fri, Dec 19, 2014 at 6:30 PM, Petr Jelinek <petr@2ndquadrant.com> wrote:

On 10/12/14 16:03, Petr Jelinek wrote:

On 10/12/14 04:26, Michael Paquier wrote:

On Thu, Dec 4, 2014 at 9:26 PM, Heikki Linnakangas
<hlinnakangas@vmware.com> wrote:

Yeah, it was raised. I don't think it was really addressed. There was
lengthy discussion on whether to include LSN, node id, and/or some other
information, but there was no discussion on:

* What is a node ID?
* How is it used?
* Who assigns it?

etc.

None of those things are explained in the documentation nor code
comments.

Could it be possible to get a refresh on that before it bitrots too
much or we'll simply forget about it? In the current state, there is
no documentation able to explain what is the point of the nodeid
stuff, how to use it and what use cases it is aimed at. The API in
committs.c should as well be part of it. If that's not done, I think
that it would be fair to remove this portion and simply WAL-log the
commit timestamp its GUC is activated.

I have this on my list so it's not being forgotten and I don't see it
bitrotting unless we are changing XLog api again. I have bit busy
schedule right now though, can it wait till next week? - I will post
some code documentation patches then.

as promised I am sending code-comment patch that explains the use of commit
timestamps + nodeid C api for the conflict resolution, comments welcome.

Having some documentation with this example in doc/ would be more fruitful IMO.

I am not sure I see point in that, the GUC and SQL interfaces are
documented in doc/ and we usually don't put documentation for C
interfaces there.

--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#24Heikki Linnakangas
hlinnakangas@vmware.com
In reply to: Petr Jelinek (#21)
Re: [COMMITTERS] pgsql: Keep track of transaction commit timestamps

On 12/19/2014 11:30 AM, Petr Jelinek wrote:

as promised I am sending code-comment patch that explains the use of
commit timestamps + nodeid C api for the conflict resolution, comments
welcome.

That's a little bit better, but I have to say I'm still not impressed.
There are so many implicit assumptions in the system. The first
assumption is that a 32-bit node id is sufficient. I'm sure it is for
many replication systems, but might not be for all. Then there's the
assumption that the node id should be "sticky", i.e. it's set for the
whole session. Again, I'm sure that's useful for many systems, but I
could just as easily imagine that you'd want it to reset after every commit.

To be honest, I think this patch should be reverted. Instead, we should
design a system where extensions can define their own SLRUs to store
additional per-transaction information. That way, each extension can
have as much space per transaction as needed, and support functions that
make most sense with the information. Commit timestamp tracking would be
one such extension, and for this node ID stuff, you could have another
one (or include it in the replication extension).

- Heikki

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#25Andres Freund
andres@2ndquadrant.com
In reply to: Heikki Linnakangas (#24)
Re: [COMMITTERS] pgsql: Keep track of transaction commit timestamps

On 2014-12-29 12:06:07 +0200, Heikki Linnakangas wrote:

That's a little bit better, but I have to say I'm still not impressed. There
are so many implicit assumptions in the system. The first assumption is that
a 32-bit node id is sufficient.

Seriously? Are we going to build facilities for replication systems with
that many nodes? It seems absolutely unrealistic that a) somebody does
this b) that we'll blindly meet the demands of such a super hypothetical
scenario.

Then there's the assumption that the node id should be "sticky",
i.e. it's set for the whole session. Again, I'm sure that's useful for
many systems, but I could just as easily imagine that you'd want it to
reset after every commit.

It's trivial to add that/reset it manually. So what?

To be honest, I think this patch should be reverted. Instead, we should
design a system where extensions can define their own SLRUs to store
additional per-transaction information. That way, each extension can have as
much space per transaction as needed, and support functions that make most
sense with the information. Commit timestamp tracking would be one such
extension, and for this node ID stuff, you could have another one (or
include it in the replication extension).

If somebody wants that they should develop it. But given that we, based
on previous discussions, don't want to run user defined code in the
relevant phase during transaction commit *and* replay I don't think it'd
be all that easy to do it fast and flexible.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#26Petr Jelinek
petr@2ndquadrant.com
In reply to: Andres Freund (#25)
Re: [COMMITTERS] pgsql: Keep track of transaction commit timestamps

On 29/12/14 11:16, Andres Freund wrote:

On 2014-12-29 12:06:07 +0200, Heikki Linnakangas wrote:

That's a little bit better, but I have to say I'm still not impressed. There
are so many implicit assumptions in the system. The first assumption is that
a 32-bit node id is sufficient.

Seriously? Are we going to build facilities for replication systems with
that many nodes? It seems absolutely unrealistic that a) somebody does
this b) that we'll blindly meet the demands of such a super hypothetical
scenario.

+1, Honestly, if somebody will ever have setup with more nodes than what
fits into 32bits they will run into bigger problems than nodeid being
too small.

Then there's the assumption that the node id should be "sticky",
i.e. it's set for the whole session. Again, I'm sure that's useful for
many systems, but I could just as easily imagine that you'd want it to
reset after every commit.

It's trivial to add that/reset it manually. So what?

Yes you can reset in the extension after commit, or you can actually
override both commit timestamp and nodeid after commit if you so wish.

To be honest, I think this patch should be reverted. Instead, we should
design a system where extensions can define their own SLRUs to store
additional per-transaction information. That way, each extension can have as
much space per transaction as needed, and support functions that make most
sense with the information. Commit timestamp tracking would be one such
extension, and for this node ID stuff, you could have another one (or
include it in the replication extension).

If somebody wants that they should develop it. But given that we, based
on previous discussions, don't want to run user defined code in the
relevant phase during transaction commit *and* replay I don't think it'd
be all that easy to do it fast and flexible.

Right, I would love to have custom SLRUs but I don't see it happening
given those two restrictions, otherwise I would write the CommitTs patch
that way in the first place...

--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#27Heikki Linnakangas
hlinnakangas@vmware.com
In reply to: Petr Jelinek (#26)
Re: [COMMITTERS] pgsql: Keep track of transaction commit timestamps

On 12/29/2014 12:39 PM, Petr Jelinek wrote:

On 29/12/14 11:16, Andres Freund wrote:

On 2014-12-29 12:06:07 +0200, Heikki Linnakangas wrote:

To be honest, I think this patch should be reverted. Instead, we should
design a system where extensions can define their own SLRUs to store
additional per-transaction information. That way, each extension can have as
much space per transaction as needed, and support functions that make most
sense with the information. Commit timestamp tracking would be one such
extension, and for this node ID stuff, you could have another one (or
include it in the replication extension).

If somebody wants that they should develop it. But given that we, based
on previous discussions, don't want to run user defined code in the
relevant phase during transaction commit *and* replay I don't think it'd
be all that easy to do it fast and flexible.

Right, I would love to have custom SLRUs but I don't see it happening
given those two restrictions, otherwise I would write the CommitTs patch
that way in the first place...

Transaction commit and replay can treat the per-transaction information
as an opaque blob. It just needs to be included in the commit record,
and replay needs to write it to the SLRU. That way you don't need to run
any user-defined code in those phases.

- Heikki

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#28Andres Freund
andres@2ndquadrant.com
In reply to: Heikki Linnakangas (#27)
Re: [COMMITTERS] pgsql: Keep track of transaction commit timestamps

On 2014-12-29 12:50:23 +0200, Heikki Linnakangas wrote:

On 12/29/2014 12:39 PM, Petr Jelinek wrote:

On 29/12/14 11:16, Andres Freund wrote:

On 2014-12-29 12:06:07 +0200, Heikki Linnakangas wrote:

To be honest, I think this patch should be reverted. Instead, we should
design a system where extensions can define their own SLRUs to store
additional per-transaction information. That way, each extension can have as
much space per transaction as needed, and support functions that make most
sense with the information. Commit timestamp tracking would be one such
extension, and for this node ID stuff, you could have another one (or
include it in the replication extension).

If somebody wants that they should develop it. But given that we, based
on previous discussions, don't want to run user defined code in the
relevant phase during transaction commit *and* replay I don't think it'd
be all that easy to do it fast and flexible.

Right, I would love to have custom SLRUs but I don't see it happening
given those two restrictions, otherwise I would write the CommitTs patch
that way in the first place...

Transaction commit and replay can treat the per-transaction information as
an opaque blob. It just needs to be included in the commit record, and
replay needs to write it to the SLRU. That way you don't need to run any
user-defined code in those phases.

Meh. Only if you want to duplicate the timestamps from the commits.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers