A patch to get the origin from commit_ts
Hello hackers,
I am researching 'origin' in PostgreSQL. It is mainly used in logical
decoding to filter transactions from non-local sources. I notice that the
'origin' is stored in commit_ts, so I think it should be possible to get the
'origin' of a transaction from commit_ts.
But I cannot find any code to get the 'origin' from commit_ts; it looks as if
it is producing data which nobody cares about. Can I know what the purpose
of the 'origin' in commit_ts is? Do you think we should add some support
for this neglected data?
For example, I add a function to get 'origin' from commit_ts:
=======================================
postgres=# select pg_xact_commit_origin('490');
pg_xact_commit_origin
-----------------------
test_origin
(1 row)
postgres=# select pg_xact_commit_origin('491');
pg_xact_commit_origin
-----------------------
test_origin1
(1 row)
postgres=#
=======================================
Regards,
Highgo Software (Canada/China/Pakistan)
URL : www.highgo.ca
EMAIL: mailto:movead(dot)li(at)highgo(dot)ca
Attachments:
get_origin_from_commit_ts.patch (application/octet-stream, +29 -0)
On Mon, May 11, 2020 at 04:43:11PM +0800, movead.li@highgo.ca wrote:
But I cannot find any code to get the 'origin' from commit_ts; it looks as if
it is producing data which nobody cares about. Can I know what the purpose
of the 'origin' in commit_ts is? Do you think we should add some support
for this neglected data?
I have not thought about this matter, but it seems to me that you
should add this patch to the upcoming commit fest for evaluation:
https://commitfest.postgresql.org/28/
This is going to take a couple of months though as the main focus
lately is the stability of 13.
--
Michael
I have not thought about this matter, but it seems to me that you
should add this patch to the upcoming commit fest for evaluation:
https://commitfest.postgresql.org/28/
Thanks.
I have thought about it in more detail, and find it is better to show the
'roident' rather than the 'roname', because an old 'roident' value can be
reused immediately after its origin is dropped. A new patch is attached with
a test case and documentation.
============================================
SELECT pg_xact_commit_origin('490');
pg_xact_commit_origin
-----------------------
1
(1 row)
SELECT pg_xact_commit_origin('491');
pg_xact_commit_origin
-----------------------
2
(1 row)
============================================
Regards,
Highgo Software (Canada/China/Pakistan)
URL : www.highgo.ca
EMAIL: mailto:movead(dot)li(at)highgo(dot)ca
Attachments:
get_origin_from_commit_ts_v2.patch (application/octet-stream, +240 -0)
Hello hackers,
We already have pg_xact_commit_timestamp() that returns the timestamp of
the commit. It may be better to have one single function returning both
timestamp and origin for a given transaction ID.
A second thing is that TransactionIdGetCommitTsData() was introduced in
core (73c986add). It has only one caller, pg_xact_commit_timestamp(), which
passes RepOriginId as NULL, making the last argument to
TransactionIdGetCommitTsData() dead code in core.
Quick code search shows that it is getting used by pglogical (caller:
https://sources.debian.org/src/pglogical/2.3.2-1/pglogical_conflict.c/?hl=509#L509).
CCing Craig Ringer and Petr Jelinek for the inputs.
Warm Regards,
Madan Kumar K
"There is no Elevator to Success. You have to take the Stairs"
On Mon, Jun 29, 2020 at 06:17:27PM -0700, Madan Kumar wrote:
We already have pg_xact_commit_timestamp() that returns the timestamp of
the commit. It may be better to have one single function returning both
timestamp and origin for a given transaction ID.
A second thing is that TransactionIdGetCommitTsData() was introduced in
core (73c986add). It has only one caller, pg_xact_commit_timestamp(), which
passes RepOriginId as NULL, making the last argument to
TransactionIdGetCommitTsData() dead code in core.
Quick code search shows that it is getting used by pglogical (caller:
https://sources.debian.org/src/pglogical/2.3.2-1/pglogical_conflict.c/?hl=509#L509).
CCing Craig Ringer and Petr Jelinek for the inputs.
Another question that has popped up when doing this review is what
would be the use-case of adding this information at SQL level knowing
that logical replication exists since 10? Having dead code in the
backend tree is not a good idea of course, so one could also argue for
simplifying TransactionIdGetCommitTsData(). Now, pglogical has its own
function, pglogical_xact_commit_timestamp_origin(), to get the
replication origin, so providing an extended equivalent returning one
row with two fields would be nice for pglogical, as its own function
would no longer be necessary. As mentioned by Madan, the portion of
the code using TransactionIdGetCommitTsData() relies on it for
update conflicts (first-wins/last-wins logic, at a quick glance).
I am adding Peter E in CC for an opinion, the last commits of
pglogical are from him.
--
Michael
A second thing is that TransactionIdGetCommitTsData() was introduced in
core (73c986add). It has only one caller, pg_xact_commit_timestamp(), which
passes RepOriginId as NULL, making the last argument to
TransactionIdGetCommitTsData() dead code in core.
Quick code search shows that it is getting used by pglogical (caller:
https://sources.debian.org/src/pglogical/2.3.2-1/pglogical_conflict.c/?hl=509#L509).
CCing Craig Ringer and Petr Jelinek for the inputs.
Another question that has popped up when doing this review is what
would be the use-case of adding this information at SQL level knowing
that logical replication exists since 10? Having dead code in the
backend tree is not a good idea of course, so one could also argue for
simplifying TransactionIdGetCommitTsData(). Now, pglogical has its own
function, pglogical_xact_commit_timestamp_origin(), to get the
replication origin, so providing an extended equivalent returning one
row with two fields would be nice for pglogical, as its own function
would no longer be necessary. As mentioned by Madan, the portion of
the code using TransactionIdGetCommitTsData() relies on it for
update conflicts (first-wins/last-wins logic, at a quick glance).
Thanks for the explanation. The origin in commit_ts seems unused; I just
want to know why it is there. It's OK to close this issue if we do not
want to touch it now.
I am more interested in the origin in WAL. If data comes from logical
replication or from a manually set origin, then many WAL records get a
'RepOriginId'. The 'RepOriginId' in the 'xact' WAL record may help with
filtering, but the others look like dead code as well. Can you help me
understand why that is, or the historical reason for it?
Regards,
Highgo Software (Canada/China/Pakistan)
URL : www.highgo.ca
EMAIL: mailto:movead(dot)li(at)highgo(dot)ca
On Tue, 30 Jun 2020 at 02:17, Madan Kumar <madankumar1993@gmail.com> wrote:
We already have pg_xact_commit_timestamp() that returns the timestamp of
the commit.
Yes, pg_xact_commit_origin() is a good name for an additional function. +1
for this.
It may be better to have one single function returning both
timestamp and origin for a given transaction ID.
No need to change existing APIs.
--
Simon Riggs http://www.2ndQuadrant.com/
Mission Critical Databases
On 2020-Jun-30, Michael Paquier wrote:
Another question that has popped up when doing this review is what
would be the use-case of adding this information at SQL level knowing
that logical replication exists since 10?
Logical replication in core is a far cry from a fully featured
replication solution. Kindly do not claim that we can now remove
features just because in-core logical replication does not use them;
this argument is ignoring the fact that we're still a long way from
developing actually powerful logical replication capabilities.
--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Tue, Jun 30, 2020 at 02:32:47PM -0400, Alvaro Herrera wrote:
On 2020-Jun-30, Michael Paquier wrote:
Another question that has popped up when doing this review is what
would be the use-case of adding this information at SQL level knowing
that logical replication exists since 10?
Logical replication in core is a far cry from a fully featured
replication solution. Kindly do not claim that we can now remove
features just because in-core logical replication does not use them;
this argument is ignoring the fact that we're still a long way from
developing actually powerful logical replication capabilities.
Thanks for the feedback. If that sounded aggressive in some way, this
was not my intention, so my apologies for that. Now, I have to admit
that I am worried to see code in core that stands as dead, without any
actual way to test it directly. Somebody hacking this code cannot be
sure if they are breaking it or not, except if they test it with
pglogical. So it is good to close the gap here. It also brings a
second point IMO, could the documentation be improved to describe more
use-cases where these functions would be useful? The documentation
gap is not a problem this patch has to deal with, though.
--
Michael
On Tue, Jun 30, 2020 at 01:58:17PM +0100, Simon Riggs wrote:
On Tue, 30 Jun 2020 at 02:17, Madan Kumar <madankumar1993@gmail.com> wrote:
It may be better to have one single function returning both
timestamp and origin for a given transaction ID.
No need to change existing APIs.
Adding a new function able to return both fields at the same time does
not imply that we'd remove the original one, it just implies that we
would be able to retrieve both fields with a single call of
TransactionIdGetCommitTsData(), saving an extra acquisition of
CommitTsSLRULock, etc. That's actually what pglogical does with
its pglogical_xact_commit_timestamp_origin() in
pglogical_functions.c. So adding one function able to return one
tuple with the two fields, without removing the existing
pg_xact_commit_timestamp() makes the most sense, no?
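
To illustrate the locking argument above, here is a minimal standalone sketch in C. It is not PostgreSQL code: mock_commit_ts_lookup() and the counter are hypothetical stand-ins for TransactionIdGetCommitTsData() and CommitTsSLRULock, assuming only that each lookup takes the lock once.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Counts how often the mock "lock" is taken; stands in for CommitTsSLRULock. */
int mock_lock_acquisitions = 0;

/*
 * Hypothetical stand-in for a commit_ts lookup.  Either output pointer may
 * be NULL when the caller does not need that field.
 */
bool
mock_commit_ts_lookup(uint32_t xid, int64_t *ts, uint16_t *origin)
{
    mock_lock_acquisitions++;           /* one lock acquisition per lookup */
    if (ts != NULL)
        *ts = 1000 + (int64_t) xid;     /* made-up timestamp value */
    if (origin != NULL)
        *origin = 1;                    /* made-up origin identifier */
    return true;
}

/* One function per field: two lookups, hence two lock acquisitions. */
int
locks_for_separate_calls(uint32_t xid)
{
    int64_t  ts = 0;
    uint16_t origin = 0;
    int      before = mock_lock_acquisitions;

    mock_commit_ts_lookup(xid, &ts, NULL);
    mock_commit_ts_lookup(xid, NULL, &origin);
    return mock_lock_acquisitions - before;
}

/* One function returning both fields: a single lookup. */
int
locks_for_combined_call(uint32_t xid)
{
    int64_t  ts = 0;
    uint16_t origin = 0;
    int      before = mock_lock_acquisitions;

    mock_commit_ts_lookup(xid, &ts, &origin);
    return mock_lock_acquisitions - before;
}
```

In this mock, fetching the two fields separately costs two acquisitions and the combined call costs one, which is the saving being argued for.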
--
Michael
On Thu, 2 Jul 2020 at 02:58, <michael@paquier.xyz> wrote:
On Tue, Jun 30, 2020 at 01:58:17PM +0100, Simon Riggs wrote:
On Tue, 30 Jun 2020 at 02:17, Madan Kumar <madankumar1993@gmail.com>
wrote:
It may be better to have one single function returning both
timestamp and origin for a given transaction ID.
No need to change existing APIs.
Adding a new function able to return both fields at the same time does
not imply that we'd remove the original one, it just implies that we
would be able to retrieve both fields with a single call of
TransactionIdGetCommitTsData(), saving an extra acquisition of
CommitTsSLRULock, etc. That's actually what pglogical does with
its pglogical_xact_commit_timestamp_origin() in
pglogical_functions.c. So adding one function able to return one
tuple with the two fields, without removing the existing
pg_xact_commit_timestamp() makes the most sense, no?
OK
--
Simon Riggs http://www.2ndQuadrant.com/
Mission Critical Databases
On 02/07/2020 03:58, michael@paquier.xyz wrote:
On Tue, Jun 30, 2020 at 01:58:17PM +0100, Simon Riggs wrote:
On Tue, 30 Jun 2020 at 02:17, Madan Kumar <madankumar1993@gmail.com> wrote:
It may be better to have one single function returning both
timestamp and origin for a given transaction ID.
No need to change existing APIs.
Adding a new function able to return both fields at the same time does
not imply that we'd remove the original one, it just implies that we
would be able to retrieve both fields with a single call of
TransactionIdGetCommitTsData(), saving an extra acquisition of
CommitTsSLRULock, etc. That's actually what pglogical does with
its pglogical_xact_commit_timestamp_origin() in
pglogical_functions.c. So adding one function able to return one
tuple with the two fields, without removing the existing
pg_xact_commit_timestamp() makes the most sense, no?
Agreed, sounds reasonable.
I also (I suspect like Álvaro) parsed your original message as wanting
to remove origin from the record completely.
--
Petr Jelinek
2ndQuadrant - PostgreSQL Solutions for the Enterprise
https://www.2ndQuadrant.com/
On Thu, Jul 02, 2020 at 10:12:02AM +0200, Petr Jelinek wrote:
On 02/07/2020 03:58, michael@paquier.xyz wrote:
Adding a new function able to return both fields at the same time does
not imply that we'd remove the original one, it just implies that we
would be able to retrieve both fields with a single call of
TransactionIdGetCommitTsData(), saving an extra acquisition of
CommitTsSLRULock, etc. That's actually what pglogical does with
its pglogical_xact_commit_timestamp_origin() in
pglogical_functions.c. So adding one function able to return one
tuple with the two fields, without removing the existing
pg_xact_commit_timestamp() makes the most sense, no?
Agreed, sounds reasonable.
Thanks. Movead, please note that the patch is waiting on author.
Could you send an update if you think that those changes make sense?
--
Michael
Thanks. Movead, please note that the patch is waiting on author.
Could you send an update if you think that those changes make sense?
Thanks for approving the issue; I will send a patch on Monday.
Regards,
Highgo Software (Canada/China/Pakistan)
URL : http://www.highgo.ca/
EMAIL: mailto:movead(dot)li(at)highgo(dot)ca
Thanks. Movead, please note that the patch is waiting on author.
Could you send an update if you think that those changes make sense?
I made a patch as Michael Paquier described, which uses a new function to
return the timestamp and origin, and I added an origin version of
pg_last_committed_xact() too. It now looks like this:
============================================
postgres=# SELECT txid_current() as txid \gset
postgres=# SELECT * FROM pg_xact_commit_timestamp_origin(:'txid');
           timestamp           | origin
-------------------------------+--------
 2020-07-04 17:52:10.199623+08 |      1
(1 row)
postgres=# SELECT * FROM pg_last_committed_xact_with_origin();
 xid |           timestamp           | origin
-----+-------------------------------+--------
 506 | 2020-07-04 17:52:10.199623+08 |      1
(1 row)
postgres=#
============================================
---
Regards,
Highgo Software (Canada/China/Pakistan)
URL : www.highgo.ca
EMAIL: mailto:movead(dot)li(at)highgo(dot)ca
Attachments:
get_origin_from_commit_ts_v3.patch (application/octet-stream, +365 -0)
On Sat, Jul 04, 2020 at 06:01:28PM +0800, movead.li@highgo.ca wrote:
I made a patch as Michael Paquier described, which uses a new function to
return the timestamp and origin, and I added an origin version of
pg_last_committed_xact() too. It now looks like this:
+SELECT pg_replication_origin_create('test_commit_ts: get_origin_1');
+SELECT pg_replication_origin_create('test_commit_ts: get_origin_2');
+SELECT pg_replication_origin_create('test_commit_ts: get_origin_3');
Why do you need three replication origins to test three times the same
pattern? Wouldn't one be enough, and why don't you also check the
timestamp? I would also add two extra tests: one with a NULL input and an
extra one where the data could not be found.
+ found = TransactionIdGetCommitTsData(xid, &ts, &nodeid);
+
+ if (!found)
+ PG_RETURN_NULL();
This part also looks incorrect to me. I think that you should still
return one tuple with two fields, both marked as NULL. You can do that
just by switching the nulls flags to true for the two values if nothing
can be found.
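
The nulls-flag pattern described above can be sketched in standalone C as follows. This is only an illustration under stated assumptions: MockRow, mock_get_commit_ts_data() and mock_commit_timestamp_origin() are hypothetical names, not the backend's tuple-building API, and the lookup is a mock that knows a single transaction ID.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical result row: two column values plus one NULL flag per column. */
typedef struct MockRow
{
    int64_t values[2];          /* [0] = timestamp, [1] = origin */
    bool    nulls[2];
} MockRow;

/* Mock lookup: pretends that only xid 506 has commit timestamp data. */
bool
mock_get_commit_ts_data(uint32_t xid, int64_t *ts, uint16_t *origin)
{
    if (xid != 506)
        return false;
    *ts = 42;                   /* arbitrary illustrative value */
    *origin = 1;
    return true;
}

/*
 * Always build a row.  When nothing is found, both columns are marked NULL
 * instead of returning no row (or a single NULL) to the caller.
 */
MockRow
mock_commit_timestamp_origin(uint32_t xid)
{
    MockRow  row;
    int64_t  ts = 0;
    uint16_t origin = 0;
    bool     found = mock_get_commit_ts_data(xid, &ts, &origin);

    memset(&row, 0, sizeof(row));
    if (found)
    {
        row.values[0] = ts;
        row.values[1] = origin;
    }
    else
    {
        row.nulls[0] = true;
        row.nulls[1] = true;
    }
    return row;
}
```

With this shape the caller always sees one row with two columns, so a missing transaction is reported as (NULL, NULL) rather than as a different result type.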
--
Michael
+SELECT pg_replication_origin_create('test_commit_ts: get_origin_1');
+SELECT pg_replication_origin_create('test_commit_ts: get_origin_2');
+SELECT pg_replication_origin_create('test_commit_ts: get_origin_3');
Why do you need three replication origins to test three times the same
pattern? Wouldn't one be enough, and why don't you also check the
timestamp? I would also add two extra tests: one with a NULL input and an
extra one where the data could not be found.
+ found = TransactionIdGetCommitTsData(xid, &ts, &nodeid);
+
+ if (!found)
+ PG_RETURN_NULL();
This part also looks incorrect to me. I think that you should still
return one tuple with two fields, both marked as NULL. You can do that
just by switching the nulls flags to true for the two values if nothing
can be found.
Thanks for the points; I have followed them, and a new patch is attached.
Regards,
Highgo Software (Canada/China/Pakistan)
URL : www.highgo.ca
EMAIL: mailto:movead(dot)li(at)highgo(dot)ca
Attachments:
get_origin_from_commit_ts_v4.patch (application/octet-stream, +310 -0)
On Mon, Jul 06, 2020 at 11:12:30AM +0800, movead.li@highgo.ca wrote:
Thanks for the points; I have followed them, and a new patch is attached.
That was fast, thanks. I have not tested the patch, but there are
two things I missed a couple of hours back. Why do you need
pg_last_committed_xact_with_origin() to begin with? Wouldn't it be
simpler to just add a new column to pg_last_committed_xact() for
the replication origin? Contrary to pg_xact_commit_timestamp() that
should not be broken for compatibility reasons because it returns only
one value, we don't have this problem with pg_last_committed_xact() as
it already returns one tuple with two values.
+{ oid => '4179', descr => 'get commit origin of a transaction',
A second thing is that the OID of the new function should be in the
range 8000..9999, as per the policy introduced in commit a6417078.
src/include/catalog/unused_oids can be used to pick up a value.
--
Michael
That was fast, thanks. I have not tested the patch, but there are
two things I missed a couple of hours back. Why do you need
pg_last_committed_xact_with_origin() to begin with? Wouldn't it be
simpler to just add a new column to pg_last_committed_xact() for
the replication origin? Contrary to pg_xact_commit_timestamp() that
should not be broken for compatibility reasons because it returns only
one value, we don't have this problem with pg_last_committed_xact() as
it already returns one tuple with two values.
Yes, that makes sense; changed in the new patch.
+{ oid => '4179', descr => 'get commit origin of a transaction',
A second thing is that the OID of the new function should be in the
range 8000..9999, as per the policy introduced in commit a6417078.
src/include/catalog/unused_oids can be used to pick up a value.
Thanks, very helpful information, and I have followed that.
Regards,
Highgo Software (Canada/China/Pakistan)
URL : www.highgo.ca
EMAIL: mailto:movead(dot)li(at)highgo(dot)ca
Attachments:
get_origin_from_commit_ts_v5.patch (application/octet-stream, +254 -9)
On Tue, Jul 07, 2020 at 10:02:29AM +0800, movead.li@highgo.ca wrote:
Thanks, very helpful information and I have followed that.
Cool, thanks. I have gone through your patch in detail, and updated
it as attached. Here are some comments.
'8000' as OID for the new function was not really random, so to be
fair to the other patches, I picked the first random value
unused_oids gave me instead.
There were some indentation issues, and pgindent got that fixed.
I think that it is better to use "replication origin" in the docs
instead of just origin. I have kept "origin" in the functions for
now as that sounded cleaner to me, but we may consider using something
like "reporigin" as well as attribute name.
The tests could just use tstzrange() to make sure that the timestamps
have valid values, so I have switched to that, and could not resist
doing the same in the existing tests.
+-- Test when it can not find the transaction
+SELECT * FROM pg_xact_commit_timestamp_origin((:'txid_set_origin'::text::int +
10)::text::xid) x;
This test could become unstable, particularly if it gets used in a
parallel environment, so I have removed it. Perhaps I am just
over-pessimistic here though..
As a side note, I think that we could just remove the alternate output
of commit_ts/, as it does not really get used because of the
NO_INSTALLCHECK present in the module's Makefile. That would be the
job of a different patch, so I have updated it accordingly. Glad to
see that you did not forget to adapt it in your own patch.
(The change in catversion.h is a self-reminder...)
--
Michael