Add common function ReplicationOriginName.

Started by Peter Smithover 3 years ago17 messageshackers
Jump to latest
#1Peter Smith
smithpb2250@gmail.com

Hi.

There are already multiple places that are building the subscription
origin name, and there are more coming with some new patches [1]/messages/by-id/CAA4eK1+wyN6zpaHUkCLorEWNx75MG0xhMwcFhvjqm2KURZEAGw@mail.gmail.com doing
the same:

e.g.
snprintf(originname, sizeof(originname), "pg_%u", subid);

~~

IMO it is better to encapsulate this name formatting in common code
instead of the format string being scattered around the place.

PSA a patch to add a common function ReplicationOriginName. This is
the equivalent of a similar function for tablesync
(ReplicationOriginNameForTablesync) which already existed.

------
[1]: /messages/by-id/CAA4eK1+wyN6zpaHUkCLorEWNx75MG0xhMwcFhvjqm2KURZEAGw@mail.gmail.com

Kind Regards,
Peter Smith.
Fujitsu Australia

Attachments:

v1-0001-Add-common-function-ReplicationOriginName.patchapplication/octet-stream; name=v1-0001-Add-common-function-ReplicationOriginName.patchDownload+16-5
#2Aleksander Alekseev
aleksander@timescale.com
In reply to: Peter Smith (#1)
Re: Add common function ReplicationOriginName.

Hi Peter,

PSA a patch to add a common function ReplicationOriginName

The patch looks good to me.

One nitpick I have is that the 2nd argument of snprintf is size_t
while we are passing int's. Your patch is consistent with the current
implementation of ReplicationOriginNameForTablesync() and similar
functions in tablesync.c. However I would like to mention this in case
the committer will be interested in replacing ints with Size's while
on it.

--
Best regards,
Aleksander Alekseev

#3Amit Kapila
amit.kapila16@gmail.com
In reply to: Aleksander Alekseev (#2)
Re: Add common function ReplicationOriginName.

On Mon, Sep 19, 2022 at 2:27 PM Aleksander Alekseev
<aleksander@timescale.com> wrote:

Hi Peter,

PSA a patch to add a common function ReplicationOriginName

The patch looks good to me.

One nitpick I have is that the 2nd argument of snprintf is size_t
while we are passing int's. Your patch is consistent with the current
implementation of ReplicationOriginNameForTablesync() and similar
functions in tablesync.c.

I think it is better to use Size. Even though, it may not fail now as
the size of names for origin will always be much lesser but it is
better if we are consistent. If we agree with this, then as a first
patch, we can make it to use Size in existing places and then
introduce this new function.

--
With Regards,
Amit Kapila.

#4Aleksander Alekseev
aleksander@timescale.com
In reply to: Amit Kapila (#3)
Re: Add common function ReplicationOriginName.

Hi Amit,

I think it is better to use Size. Even though, it may not fail now as
the size of names for origin will always be much lesser but it is
better if we are consistent. If we agree with this, then as a first
patch, we can make it to use Size in existing places and then
introduce this new function.

OK, here is the updated patchset.

* 0001 replaces int's with Size's in the existing code
* 0002 applies Peter's patch on top of 0001

--
Best regards,
Aleksander Alekseev

Attachments:

v2-0001-Pass-Size-as-a-2nd-argument-for-snprintf-in-table.patchapplication/octet-stream; name=v2-0001-Pass-Size-as-a-2nd-argument-for-snprintf-in-table.patchDownload+4-5
v2-0002-Add-common-function-ReplicationOriginName.patchapplication/octet-stream; name=v2-0002-Add-common-function-ReplicationOriginName.patchDownload+16-5
#5Peter Smith
smithpb2250@gmail.com
In reply to: Aleksander Alekseev (#4)
Re: Add common function ReplicationOriginName.

On Tue, Sep 20, 2022 at 6:36 PM Aleksander Alekseev
<aleksander@timescale.com> wrote:

Hi Amit,

I think it is better to use Size. Even though, it may not fail now as
the size of names for origin will always be much lesser but it is
better if we are consistent. If we agree with this, then as a first
patch, we can make it to use Size in existing places and then
introduce this new function.

OK, here is the updated patchset.

* 0001 replaces int's with Size's in the existing code
* 0002 applies Peter's patch on top of 0001

LGTM. Thanks!

-----
Kind Regards,
Peter Smith.
Fujitsu Australia

#6Amit Kapila
amit.kapila16@gmail.com
In reply to: Aleksander Alekseev (#4)
Re: Add common function ReplicationOriginName.

On Tue, Sep 20, 2022 at 2:06 PM Aleksander Alekseev
<aleksander@timescale.com> wrote:

Hi Amit,

I think it is better to use Size. Even though, it may not fail now as
the size of names for origin will always be much lesser but it is
better if we are consistent. If we agree with this, then as a first
patch, we can make it to use Size in existing places and then
introduce this new function.

OK, here is the updated patchset.

* 0001 replaces int's with Size's in the existing code

Pushed this one.

* 0002 applies Peter's patch on top of 0001

Can't we use the existing function ReplicationOriginNameForTablesync()
by passing relid as InvalidOid for this purpose? We need a check
inside to decide which name to construct, otherwise, it should be
fine. If we agree with this, then we can change the name of the
function to something like ReplicationOriginNameForLogicalRep or
ReplicationOriginNameForLogicalRepWorkers.

--
With Regards,
Amit Kapila.

#7Peter Smith
smithpb2250@gmail.com
In reply to: Amit Kapila (#6)
Re: Add common function ReplicationOriginName.

On Wed, Sep 21, 2022 at 3:23 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

...

Can't we use the existing function ReplicationOriginNameForTablesync()
by passing relid as InvalidOid for this purpose? We need a check
inside to decide which name to construct, otherwise, it should be
fine. If we agree with this, then we can change the name of the
function to something like ReplicationOriginNameForLogicalRep or
ReplicationOriginNameForLogicalRepWorkers.

This suggestion attaches special meaning to the reild param.

Won't it seem a bit strange for the non-tablesync callers (who
probably have a perfectly valid 'relid') to have to pass an InvalidOid
relid just so they can format the correct origin name?

------
Kind Regards,
Peter Smith.
Fujitsu Australia

#8Amit Kapila
amit.kapila16@gmail.com
In reply to: Peter Smith (#7)
Re: Add common function ReplicationOriginName.

On Wed, Sep 21, 2022 at 3:09 PM Peter Smith <smithpb2250@gmail.com> wrote:

On Wed, Sep 21, 2022 at 3:23 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

...

Can't we use the existing function ReplicationOriginNameForTablesync()
by passing relid as InvalidOid for this purpose? We need a check
inside to decide which name to construct, otherwise, it should be
fine. If we agree with this, then we can change the name of the
function to something like ReplicationOriginNameForLogicalRep or
ReplicationOriginNameForLogicalRepWorkers.

This suggestion attaches special meaning to the reild param.

Won't it seem a bit strange for the non-tablesync callers (who
probably have a perfectly valid 'relid') to have to pass an InvalidOid
relid just so they can format the correct origin name?

For non-tablesync workers, relid should always be InvalidOid. See, how
we launch apply workers in ApplyLauncherMain(). Do you see any case
for non-tablesync workers where relid is not InvalidOid?

--
With Regards,
Amit Kapila.

#9Peter Smith
smithpb2250@gmail.com
In reply to: Amit Kapila (#8)
Re: Add common function ReplicationOriginName.

On Wed, Sep 21, 2022 at 8:08 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Wed, Sep 21, 2022 at 3:09 PM Peter Smith <smithpb2250@gmail.com> wrote:

On Wed, Sep 21, 2022 at 3:23 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

...

Can't we use the existing function ReplicationOriginNameForTablesync()
by passing relid as InvalidOid for this purpose? We need a check
inside to decide which name to construct, otherwise, it should be
fine. If we agree with this, then we can change the name of the
function to something like ReplicationOriginNameForLogicalRep or
ReplicationOriginNameForLogicalRepWorkers.

This suggestion attaches special meaning to the reild param.

Won't it seem a bit strange for the non-tablesync callers (who
probably have a perfectly valid 'relid') to have to pass an InvalidOid
relid just so they can format the correct origin name?

For non-tablesync workers, relid should always be InvalidOid. See, how
we launch apply workers in ApplyLauncherMain(). Do you see any case
for non-tablesync workers where relid is not InvalidOid?

Hmm, my mistake. I was thinking more of all the calls coming from the
subscriptioncmds.c, but now that I look at it maybe none of those has
any relid either.

OK, I can unify the 2 functions as you suggested. I will post another
patch in a few days.

------
Kind Regards,
Peter Smith,
Fujitsu Australia.

#10Peter Smith
smithpb2250@gmail.com
In reply to: Peter Smith (#9)
Re: Add common function ReplicationOriginName.

On Wed, Sep 21, 2022 at 8:22 PM Peter Smith <smithpb2250@gmail.com> wrote:

...

On Wed, Sep 21, 2022 at 3:23 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
...

Can't we use the existing function ReplicationOriginNameForTablesync()
by passing relid as InvalidOid for this purpose? We need a check
inside to decide which name to construct, otherwise, it should be
fine. If we agree with this, then we can change the name of the
function to something like ReplicationOriginNameForLogicalRep or
ReplicationOriginNameForLogicalRepWorkers.

...

OK, I can unify the 2 functions as you suggested. I will post another
patch in a few days.

PSA patch v3 to combine the different replication origin name
formatting in a single function ReplicationOriginNameForLogicalRep as
suggested.

------
Kind Regards,
Peter Smith.
Fujitsu Australia

Attachments:

v3-0001-Add-common-function-ReplicationOriginNameForLogic.patchapplication/octet-stream; name=v3-0001-Add-common-function-ReplicationOriginNameForLogic.patchDownload+38-25
#11Peter Smith
smithpb2250@gmail.com
In reply to: Aleksander Alekseev (#4)
Re: Add common function ReplicationOriginName.

On Tue, Sep 20, 2022 at 6:36 PM Aleksander Alekseev
<aleksander@timescale.com> wrote:

Hi Amit,

I think it is better to use Size. Even though, it may not fail now as
the size of names for origin will always be much lesser but it is
better if we are consistent. If we agree with this, then as a first
patch, we can make it to use Size in existing places and then
introduce this new function.

OK, here is the updated patchset.

* 0001 replaces int's with Size's in the existing code
* 0002 applies Peter's patch on top of 0001

Hi Aleksander.

FYI - although it is outside the scope of this thread, I did notice at
least one other example where you might want to substitute Size for
int in the same way as your v2-0001 patch did.

e.g. Just searching code for 'snprintf' where there is some parameter
for the size I quickly found:

File: src/bin/pg_dump/pg_dump_sort.c:

static void
describeDumpableObject(DumpableObject *obj, char *buf, int bufsize)

caller:
describeDumpableObject(loop[i], buf, sizeof(buf));

~~

I expect you can find more like just this if you look harder than I did.

------
Kind Regards,
Peter Smith.
Fujitsu Australia.

#12Aleksander Alekseev
aleksander@timescale.com
In reply to: Peter Smith (#11)
Re: Add common function ReplicationOriginName.

Hi Peter,

PSA patch v3 to combine the different replication origin name
formatting in a single function ReplicationOriginNameForLogicalRep as
suggested.

LGTM except for minor issues with the formatting; fixed.

I expect you can find more like just this if you look harder than I did.

Thanks. You are right, there are more places that pass int as the
second argument of *nprintf(). I used a command:

$ grep -r nprintf ./ | perl -lne 'print if($_ !~
/nprintf\([^\,]+,\s*(sizeof|([0-9A-Z_ \-]+\,))/ )' > found.txt

... and then re-checked the results manually. This excludes patterns
like *nprintf(..., sizeof(...)) and *nprintf(..., MACRO) and leaves
only something like *nprintf(..., variable). The cases where we
subtract an integer from a Size, etc were ignored.

I don't have a strong opinion on whether we should be really worried
by this. But in case we do, here is the patch. The order of 0001 and
0002 doesn't matter.

As I understand, ecpg uses size_t rather than Size, so for this
library I used size_t. Not 100% sure if the changes I made to
src/backend/utils/adt/jsonb.c add much value. I leave this to the
committer to decide.

--
Best regards,
Aleksander Alekseev

Attachments:

v4-0002-Add-common-function-ReplicationOriginNameForLogic.patchapplication/octet-stream; name=v4-0002-Add-common-function-ReplicationOriginNameForLogic.patchDownload+52-39
v4-0001-Pass-Size-size_t-as-a-2nd-argument-of-snprintf.patchapplication/octet-stream; name=v4-0001-Pass-Size-size_t-as-a-2nd-argument-of-snprintf.patchDownload+19-20
#13Amit Kapila
amit.kapila16@gmail.com
In reply to: Aleksander Alekseev (#12)
Re: Add common function ReplicationOriginName.

On Tue, Sep 27, 2022 at 5:04 PM Aleksander Alekseev
<aleksander@timescale.com> wrote:

Hi Peter,

PSA patch v3 to combine the different replication origin name
formatting in a single function ReplicationOriginNameForLogicalRep as
suggested.

LGTM except for minor issues with the formatting; fixed.

LGTM as well. I'll push this tomorrow unless there are any more comments.

--
With Regards,
Amit Kapila.

#14Amit Kapila
amit.kapila16@gmail.com
In reply to: Amit Kapila (#13)
Re: Add common function ReplicationOriginName.

On Mon, Oct 10, 2022 at 7:09 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Tue, Sep 27, 2022 at 5:04 PM Aleksander Alekseev
<aleksander@timescale.com> wrote:

Hi Peter,

PSA patch v3 to combine the different replication origin name
formatting in a single function ReplicationOriginNameForLogicalRep as
suggested.

LGTM except for minor issues with the formatting; fixed.

LGTM as well. I'll push this tomorrow unless there are any more comments.

Pushed.

--
With Regards,
Amit Kapila.

#15Aleksander Alekseev
aleksander@timescale.com
In reply to: Amit Kapila (#14)
Re: Add common function ReplicationOriginName.

Hi Amit,

Pushed.

Thanks!

I don't have a strong opinion on whether we should be really worried
by this. But in case we do, here is the patch.

This leaves us one patch to deal with.

--
Best regards,
Aleksander Alekseev

Attachments:

v4-0001-Pass-Size-size_t-as-a-2nd-argument-of-snprintf.patchapplication/octet-stream; name=v4-0001-Pass-Size-size_t-as-a-2nd-argument-of-snprintf.patchDownload+19-20
#16Tom Lane
tgl@sss.pgh.pa.us
In reply to: Aleksander Alekseev (#15)
Re: Add common function ReplicationOriginName.

Aleksander Alekseev <aleksander@timescale.com> writes:

This leaves us one patch to deal with.
[ v4-0001-Pass-Size-size_t-as-a-2nd-argument-of-snprintf.patch ]

I looked at this and am inclined to reject it. None of these
places realistically need to deal with strings longer than
MAXPATHLEN or so, let alone multiple gigabytes. So it's just
code churn, creating backpatch hazards (admittedly not big ones)
for no real gain.

regards, tom lane

#17Aleksander Alekseev
aleksander@timescale.com
In reply to: Tom Lane (#16)
Re: Add common function ReplicationOriginName.

Hi Tom,

I looked at this and am inclined to reject it. [...]

OK, thanks. Then we are done with this thread. I closed the
corresponding CF entry.

--
Best regards,
Aleksander Alekseev