ArchiveEntry optional arguments refactoring
Hi,
During the discussion in [1]/messages/by-id/20180703070645.wchpu5muyto5n647@alap3.anarazel.de an idea about refactoring ArchiveEntry was
suggested. The reason is that currently this function has significant number of
arguments that are "optional", and every change that has to deal with it
introduces a lot of useless diffs. In the thread, mentioned above, such an
example is tracking current table access method, and I guess "Remove WITH OIDS"
commit 578b229718e is also similar.
Proposed idea is to refactor out all/optional arguments into a separate data
structure, so that adding/removing a new argument wouldn't change that much of
unrelated code. Then for every invocation of ArchiveEntry this structure needs
to be prepared before the call, or as Andres suggested:
ArchiveEntry((ArchiveArgs){.tablespace = 3,
.dumpFn = somefunc,
...});
Another suggestion from Amit is to have an ArchiveEntry() function with limited
number of parameters, and an ArchiveEntryEx() with those extra parameters which
are not needed in usual cases.
I want to prepare a patch for that, and I'm inclined to go with the first
option, but since there are two solutions to choose from, I would love to hear
more opinion about this topic. Any pros/cons we don't see yet?
[1]: /messages/by-id/20180703070645.wchpu5muyto5n647@alap3.anarazel.de
On Wed, Jan 16, 2019 at 1:16 PM Dmitry Dolgov <9erthalion6@gmail.com> wrote:
Hi,
During the discussion in [1] an idea about refactoring ArchiveEntry was
suggested. The reason is that currently this function has significant number of
arguments that are "optional", and every change that has to deal with it
introduces a lot of useless diffs. In the thread, mentioned above, such an
example is tracking current table access method, and I guess "Remove WITH OIDS"
commit 578b229718e is also similar.Proposed idea is to refactor out all/optional arguments into a separate data
structure, so that adding/removing a new argument wouldn't change that much of
unrelated code. Then for every invocation of ArchiveEntry this structure needs
to be prepared before the call, or as Andres suggested:ArchiveEntry((ArchiveArgs){.tablespace = 3,
.dumpFn = somefunc,
...});Another suggestion from Amit is to have an ArchiveEntry() function with limited
number of parameters, and an ArchiveEntryEx() with those extra parameters which
are not needed in usual cases.I want to prepare a patch for that, and I'm inclined to go with the first
option, but since there are two solutions to choose from, I would love to hear
more opinion about this topic. Any pros/cons we don't see yet?[1]: /messages/by-id/20180703070645.wchpu5muyto5n647@alap3.anarazel.de
[CC Andres and Amit]
On 2019-Jan-16, Dmitry Dolgov wrote:
Proposed idea is to refactor out all/optional arguments into a separate data
structure, so that adding/removing a new argument wouldn't change that much of
unrelated code. Then for every invocation of ArchiveEntry this structure needs
to be prepared before the call, or as Andres suggested:ArchiveEntry((ArchiveArgs){.tablespace = 3,
.dumpFn = somefunc,
...});
Prepping the struct before the call would be our natural style, I think.
This one where the struct is embedded in the function call does not look
*too* horrible, but I'm curious as to what does pgindent do with it.
Another suggestion from Amit is to have an ArchiveEntry() function with limited
number of parameters, and an ArchiveEntryEx() with those extra parameters which
are not needed in usual cases.
Is there real savings to be had by doing this? What would be the
arguments to each function? Off-hand, I'm not liking this idea too
much. But maybe we can combine both ideas and have one "normal"
function with only the most common args, and create ArchiveEntryExtended
to use the struct as proposed by Andres.
--
�lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Alvaro Herrera <alvherre@2ndquadrant.com> writes:
On 2019-Jan-16, Dmitry Dolgov wrote:
ArchiveEntry((ArchiveArgs){.tablespace = 3,
.dumpFn = somefunc,
...});
Is there real savings to be had by doing this? What would be the
arguments to each function? Off-hand, I'm not liking this idea too
much.
I'm not either. What this looks like it will mainly do is create
a back-patching barrier, with little if any readability improvement.
I don't buy the argument that this would move the goalposts in terms
of how much work it is to add a new argument. You'd still end up
touching every call site.
regards, tom lane
On 2019-01-17 10:23:39 -0500, Tom Lane wrote:
Alvaro Herrera <alvherre@2ndquadrant.com> writes:
On 2019-Jan-16, Dmitry Dolgov wrote:
ArchiveEntry((ArchiveArgs){.tablespace = 3,
.dumpFn = somefunc,
...});Is there real savings to be had by doing this? What would be the
arguments to each function? Off-hand, I'm not liking this idea too
much.I'm not either. What this looks like it will mainly do is create
a back-patching barrier, with little if any readability improvement.
I don't really buy this. We've whacked around the arguments to
ArchiveEntry() repeatedly over the last few releases, so there's already
a hindrance to backpatching. And given the current setup we've to whack
around all 70+ callsites whenever a single argument is added. With the
setup I'd suggested you don't, because the designated initializer syntax
allows you to omit items that ought to be zero-initialized.
And given the number of arguments to ArchiveEntry() having a name for
each argument would help for readability too. It's currently not exactly
obvious what is an argument for what:
ArchiveEntry(AH, nilCatalogId, createDumpId(),
"ENCODING", NULL, NULL, "",
"ENCODING", SECTION_PRE_DATA,
qry->data, "", NULL,
NULL, 0,
NULL, NULL);
If you compare that with
ArchiveEntry(AH,
(ArchiveEntry){.catalogId = nilCatalogId,
.dumpId = createDumpId(),
.tag = "ENCODING",
.desc = "ENCODING",
.section = SECTION_PRE_DATA,
.defn = qry->data});
it's definitely easier to see what argument is what.
I don't buy the argument that this would move the goalposts in terms
of how much work it is to add a new argument. You'd still end up
touching every call site.
Why? A lot of arguments that'd be potentially added or removed would not
be set by each callsites.
If you e.g. look at
you can see that a lot of changes where like
ArchiveEntry(fout, nilCatalogId, createDumpId(),
"pg_largeobject", NULL, NULL, "",
- false, "pg_largeobject", SECTION_PRE_DATA,
+ "pg_largeobject", SECTION_PRE_DATA,
loOutQry->data, "", NULL,
NULL, 0,
NULL, NULL);
i.e. just removing a 'false' argument. In like 70+ callsites. With the
above scheme, we'd instead just have removed a single .withoids = true,
from dumpTableSchema()'s ArchiveEntry() call.
Greetings,
Andres Freund
Greetings,
* Andres Freund (andres@anarazel.de) wrote:
On 2019-01-17 10:23:39 -0500, Tom Lane wrote:
Alvaro Herrera <alvherre@2ndquadrant.com> writes:
On 2019-Jan-16, Dmitry Dolgov wrote:
ArchiveEntry((ArchiveArgs){.tablespace = 3,
.dumpFn = somefunc,
...});Is there real savings to be had by doing this? What would be the
arguments to each function? Off-hand, I'm not liking this idea too
much.I'm not either. What this looks like it will mainly do is create
a back-patching barrier, with little if any readability improvement.I don't really buy this. We've whacked around the arguments to
ArchiveEntry() repeatedly over the last few releases, so there's already
a hindrance to backpatching. And given the current setup we've to whack
around all 70+ callsites whenever a single argument is added. With the
setup I'd suggested you don't, because the designated initializer syntax
allows you to omit items that ought to be zero-initialized.And given the number of arguments to ArchiveEntry() having a name for
each argument would help for readability too. It's currently not exactly
obvious what is an argument for what:
ArchiveEntry(AH, nilCatalogId, createDumpId(),
"ENCODING", NULL, NULL, "",
"ENCODING", SECTION_PRE_DATA,
qry->data, "", NULL,
NULL, 0,
NULL, NULL);If you compare that with
ArchiveEntry(AH,
(ArchiveEntry){.catalogId = nilCatalogId,
.dumpId = createDumpId(),
.tag = "ENCODING",
.desc = "ENCODING",
.section = SECTION_PRE_DATA,
.defn = qry->data});it's definitely easier to see what argument is what.
+1. I was on the fence about this approach when David started using it
in pgBackRest but I've come to find that it's actually pretty nice and
being able to omit things which should be zero/default is very nice. I
feel like it's quite similar to what we do in other places too- just
look for things like:
utils/adt/jsonfuncs.c:600
sem = palloc0(sizeof(JsonSemAction));
...
sem->semstate = (void *) state;
sem->array_start = okeys_array_start;
sem->scalar = okeys_scalar;
sem->object_field_start = okeys_object_field_start;
/* remainder are all NULL, courtesy of palloc0 above */
pg_parse_json(lex, sem);
...
pfree(sem);
I don't buy the argument that this would move the goalposts in terms
of how much work it is to add a new argument. You'd still end up
touching every call site.Why? A lot of arguments that'd be potentially added or removed would not
be set by each callsites.If you e.g. look at
you can see that a lot of changes where like ArchiveEntry(fout, nilCatalogId, createDumpId(), "pg_largeobject", NULL, NULL, "", - false, "pg_largeobject", SECTION_PRE_DATA, + "pg_largeobject", SECTION_PRE_DATA, loOutQry->data, "", NULL, NULL, 0, NULL, NULL);i.e. just removing a 'false' argument. In like 70+ callsites. With the
above scheme, we'd instead just have removed a single .withoids = true,
from dumpTableSchema()'s ArchiveEntry() call.
Agreed. Using this approach in more places, when appropriate and
sensible, seems like a good direction to go in. To be clear, I don't
think we should go rewrite pieces of code just for the sake of it as
that would make back-patching more difficult, but when we're making
changes anyway, or where it wouldn't really change the landscape for
back-patching, then it seems like a good change.
Thanks!
Stephen
Hi,
On 2019-01-17 09:29:04 -0800, Andres Freund wrote:
On 2019-01-17 10:23:39 -0500, Tom Lane wrote:
I don't buy the argument that this would move the goalposts in terms
of how much work it is to add a new argument. You'd still end up
touching every call site.Why? A lot of arguments that'd be potentially added or removed would not
be set by each callsites.If you e.g. look at
you can see that a lot of changes where like ArchiveEntry(fout, nilCatalogId, createDumpId(), "pg_largeobject", NULL, NULL, "", - false, "pg_largeobject", SECTION_PRE_DATA, + "pg_largeobject", SECTION_PRE_DATA, loOutQry->data, "", NULL, NULL, 0, NULL, NULL);i.e. just removing a 'false' argument. In like 70+ callsites. With the
above scheme, we'd instead just have removed a single .withoids = true,
from dumpTableSchema()'s ArchiveEntry() call.
the "at" I was trying to reference above is
578b229718e8f15fa779e20f086c4b6bb3776106 / the WITH OID removal, and
therein specifically the pg_dump changes.
Greetings,
Andres Freund
On Wed, 16 Jan 2019 at 17:45, Dmitry Dolgov <9erthalion6@gmail.com> wrote:
Hi,
During the discussion in [1] an idea about refactoring ArchiveEntry was
suggested. The reason is that currently this function has significant number of
arguments that are "optional", and every change that has to deal with it
introduces a lot of useless diffs. In the thread, mentioned above, such an
example is tracking current table access method, and I guess "Remove WITH OIDS"
commit 578b229718e is also similar.Proposed idea is to refactor out all/optional arguments into a separate data
structure, so that adding/removing a new argument wouldn't change that much of
unrelated code. Then for every invocation of ArchiveEntry this structure needs
to be prepared before the call, or as Andres suggested:ArchiveEntry((ArchiveArgs){.tablespace = 3,
.dumpFn = somefunc,
...});
I didn't know we could do it this way. I thought we would have to
declare a variable and have to initialize fields with non-const values
separately. This looks nice. We could even initialize fields with
non-const values. +1 from me.
I think, we could use the same TocEntry structure as parameter, rather
than a new structure. Most of the arguments already resemble fields of
this structure. Also, we could pass pointer to that structure :
ArchiveEntry( &(TocEntry){.tablespace = 3,
.dumpFn = somefunc,
...});
--
Thanks,
-Amit Khandekar
EnterpriseDB Corporation
The Postgres Database Company
On 2019-01-17 09:29:04 -0800, Andres Freund wrote:
On 2019-01-17 10:23:39 -0500, Tom Lane wrote:I don't buy the argument that this would move the goalposts in terms
of how much work it is to add a new argument. You'd still end up
touching every call site.Why? A lot of arguments that'd be potentially added or removed would not
be set by each callsites.If you e.g. look at
you can see that a lot of changes where like ArchiveEntry(fout, nilCatalogId, createDumpId(), "pg_largeobject", NULL, NULL, "", - false, "pg_largeobject", SECTION_PRE_DATA, + "pg_largeobject", SECTION_PRE_DATA, loOutQry->data, "", NULL, NULL, 0, NULL, NULL);i.e. just removing a 'false' argument. In like 70+ callsites. With the
above scheme, we'd instead just have removed a single .withoids = true,
from dumpTableSchema()'s ArchiveEntry() call.
To make this discussion a bit more specific, I've created a patch of how it can
look like. All the arguments, except Archive, CatalogId and DumpId I've moved
into the ArchiveOpts structure. Not all of them could be empty before, but
anyway it seems better for consistency and readability. Some of the arguments
had empty string as a default value, I haven't changed anything here yet
(although this mixture of NULL and "" in ArchiveEntry looks a bit confusing).
As Andres mentioned above, for 578b229718e / the WITH OID removal and pg_dump
modification from pluggable storage thread, this patch reduces number of
changes, related to ArchiveEntry, from 70+ to just one.
Attachments:
0001-ArchiveOpts-structure.patchapplication/octet-stream; name=0001-ArchiveOpts-structure.patchDownload+624-540
Hi,
On 2019-01-23 16:12:15 +0100, Dmitry Dolgov wrote:
To make this discussion a bit more specific, I've created a patch of how it can
look like.
Thanks.
All the arguments, except Archive, CatalogId and DumpId I've moved
into the ArchiveOpts structure. Not all of them could be empty before, but
anyway it seems better for consistency and readability. Some of the arguments
had empty string as a default value, I haven't changed anything here yet
(although this mixture of NULL and "" in ArchiveEntry looks a bit confusing).
Probably worth changing at the same time, if we decide to go for it.
To me this does look like it'd be more maintainable going forward.
Greetings,
Andres Freund
Hello
On 2019-Jan-23, Andres Freund wrote:
All the arguments, except Archive, CatalogId and DumpId I've moved
into the ArchiveOpts structure. Not all of them could be empty before, but
anyway it seems better for consistency and readability. Some of the arguments
had empty string as a default value, I haven't changed anything here yet
(although this mixture of NULL and "" in ArchiveEntry looks a bit confusing).Probably worth changing at the same time, if we decide to go for it.
To me this does look like it'd be more maintainable going forward.
It does. How does pgindent behave with it?
I'd use ArchiveEntryOpts as struct name; ArchiveOpts sounds wrong. Also,
the struct members could use better names -- "defn" for example could
perhaps be "createStmt" (to match dropStmt/copyStmt), and expand "desc"
to "description".
--
�lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 1/23/19 10:12 AM, Dmitry Dolgov wrote:
To make this discussion a bit more specific, I've created a patch of how
it can look like.
A little bit of vararg-macro action can make such a design look
even tidier, cf. [1]https://github.com/NetBSD/src/blob/trunk/sys/sys/midiio.h#L709.
Or are compilers without vararg macros still in the supported mix?
-Chap
[1]: https://github.com/NetBSD/src/blob/trunk/sys/sys/midiio.h#L709
The macros in [1]https://github.com/NetBSD/src/blob/trunk/sys/sys/midiio.h#L709 are not defined to create a function call, but only
the argument structure because there might be several functions to pass
it to, so a call would be written like func(&SEQ_MK_CHN(NOTEON, ...)).
In ArchiveEntry's case, if there's only one function involved, there'd
be no reason not to have a macro produce the whole call.
Chapman Flack <chap@anastigmatix.net> writes:
Or are compilers without vararg macros still in the supported mix?
No.
regards, tom lane
Hi,
On 2019-01-23 13:58:07 -0300, Alvaro Herrera wrote:
I'd use ArchiveEntryOpts as struct name; ArchiveOpts sounds wrong.
Brevity would be of some advantage IMO, because it'll probably determine
how pgindent indents the arguments, because the struct name will be in
the arguments.
Also, the struct members could use better names -- "defn" for example
could perhaps be "createStmt" (to match dropStmt/copyStmt), and expand
"desc" to "description".
True.
Greetings,
Andres Freund
Hi,
On 2019-01-23 12:05:10 -0500, Chapman Flack wrote:
On 1/23/19 10:12 AM, Dmitry Dolgov wrote:
To make this discussion a bit more specific, I've created a patch of how
it can look like.
A little bit of vararg-macro action can make such a design look
even tidier, cf. [1].
[1] https://github.com/NetBSD/src/blob/trunk/sys/sys/midiio.h#L709The macros in [1] are not defined to create a function call, but only
the argument structure because there might be several functions to pass
it to, so a call would be written like func(&SEQ_MK_CHN(NOTEON, ...)).In ArchiveEntry's case, if there's only one function involved, there'd
be no reason not to have a macro produce the whole call.
I'm not really seeing this being more than obfuscation in this case. The
only point of the macro is to set the .tag and .op elements to something
without adding redundancies due to the struct name. Which we'd not have.
Greetings,
Andres Freund
On 1/23/19 12:10 PM, Andres Freund wrote:
On 2019-01-23 12:05:10 -0500, Chapman Flack wrote:
[1] https://github.com/NetBSD/src/blob/trunk/sys/sys/midiio.h#L709
I'm not really seeing this being more than obfuscation in this case. The
only point of the macro is to set the .tag and .op elements to something
without adding redundancies due to the struct name. Which we'd not have.
Granted, that example is more elaborate than this case, but writing
ArchiveEntry(fout, dbCatId, dbDumpId, .tag = datname, .owner = dba,
.desc = "DATABASE", .section = SECTION_PRE_DATA,
.defn = creaQry->data, .dropStmt = delQry->data);
instead of
ArchiveEntry(fout, dbCatId, dbDumpId, &(ArchiveOpts){.tag = datname,
.owner = dba, .desc = "DATABASE",
.section = SECTION_PRE_DATA, .defn = creaQry->data,
.dropStmt = delQry->data});
would be easy, and still save a bit of visual noise.
Regards,
-Chap
On 2019-01-23 13:58:07 -0300, Alvaro Herrera wrote:
Hello
On 2019-Jan-23, Andres Freund wrote:
All the arguments, except Archive, CatalogId and DumpId I've moved
into the ArchiveOpts structure. Not all of them could be empty before, but
anyway it seems better for consistency and readability. Some of the arguments
had empty string as a default value, I haven't changed anything here yet
(although this mixture of NULL and "" in ArchiveEntry looks a bit confusing).Probably worth changing at the same time, if we decide to go for it.
To me this does look like it'd be more maintainable going forward.
It does. How does pgindent behave with it?
It craps out:
Error@3649: Unbalanced parens
Warning@3657: Extra )
But that can be worked around with something like
te = ArchiveEntry(fout, tdinfo->dobj.catId, tdinfo->dobj.dumpId,
ARCHIVE_ARGS(.tag = tbinfo->dobj.name,
.namespace = tbinfo->dobj.namespace->dobj.name,
.owner = tbinfo->rolname,
.desc = "TABLE DATA",
.section = SECTION_DATA,
.copyStmt = copyStmt,
.deps = &(tbinfo->dobj.dumpId),
.nDeps = 1,
.dumpFn = dumpFn,
.dumpArg = tdinfo,
));
which looks mildly simpler too.
Greetings,
Andres Freund
On 2019-01-23 12:22:23 -0500, Chapman Flack wrote:
On 1/23/19 12:10 PM, Andres Freund wrote:
On 2019-01-23 12:05:10 -0500, Chapman Flack wrote:
[1] https://github.com/NetBSD/src/blob/trunk/sys/sys/midiio.h#L709
I'm not really seeing this being more than obfuscation in this case. The
only point of the macro is to set the .tag and .op elements to something
without adding redundancies due to the struct name. Which we'd not have.Granted, that example is more elaborate than this case, but writing
ArchiveEntry(fout, dbCatId, dbDumpId, .tag = datname, .owner = dba,
.desc = "DATABASE", .section = SECTION_PRE_DATA,
.defn = creaQry->data, .dropStmt = delQry->data);instead of
ArchiveEntry(fout, dbCatId, dbDumpId, &(ArchiveOpts){.tag = datname,
.owner = dba, .desc = "DATABASE",
.section = SECTION_PRE_DATA, .defn = creaQry->data,
.dropStmt = delQry->data});would be easy, and still save a bit of visual noise.
IDK, it'd be harder to parse correctly as a C programmer though. I'm up
with a wrapper macro like
#define ARCHIVE_ARGS(...) &(ArchiveOpts){__VA_ARGS__}
but weirdly mixing struct arguments and normal function arguments seems
quite confusing.
Greetings,
Andres Freund
Andres Freund <andres@anarazel.de> writes:
On 2019-01-23 13:58:07 -0300, Alvaro Herrera wrote:
It does. How does pgindent behave with it?
It craps out:
Error@3649: Unbalanced parens
Warning@3657: Extra )
But that can be worked around with something like
te = ArchiveEntry(fout, tdinfo->dobj.catId, tdinfo->dobj.dumpId,
ARCHIVE_ARGS(.tag = tbinfo->dobj.name,
.namespace = tbinfo->dobj.namespace->dobj.name,
.owner = tbinfo->rolname,
.desc = "TABLE DATA",
.section = SECTION_DATA,
.copyStmt = copyStmt,
.deps = &(tbinfo->dobj.dumpId),
.nDeps = 1,
.dumpFn = dumpFn,
.dumpArg = tdinfo,
));
which looks mildly simpler too.
That looks fairly reasonable from here, but I'd suggest
ARCHIVE_OPTS rather than ARCHIVE_ARGS.
Can we omit the initial dots if we use a wrapper macro? Would it be
a good idea to do so (I'm not really sure)?
regards, tom lane
On 2019-01-23 12:32:06 -0500, Tom Lane wrote:
Andres Freund <andres@anarazel.de> writes:
On 2019-01-23 13:58:07 -0300, Alvaro Herrera wrote:
It does. How does pgindent behave with it?
It craps out:
Error@3649: Unbalanced parens
Warning@3657: Extra )But that can be worked around with something like
te = ArchiveEntry(fout, tdinfo->dobj.catId, tdinfo->dobj.dumpId,
ARCHIVE_ARGS(.tag = tbinfo->dobj.name,
.namespace = tbinfo->dobj.namespace->dobj.name,
.owner = tbinfo->rolname,
.desc = "TABLE DATA",
.section = SECTION_DATA,
.copyStmt = copyStmt,
.deps = &(tbinfo->dobj.dumpId),
.nDeps = 1,
.dumpFn = dumpFn,
.dumpArg = tdinfo,
));
which looks mildly simpler too.That looks fairly reasonable from here, but I'd suggest
ARCHIVE_OPTS rather than ARCHIVE_ARGS.
WFM. Seems quite possible that we'd grow a few more of these over time,
so establishing some common naming seems good.
Btw, do you have an opionion on keeping catId / dumpId outside/inside
the argument struct?
Can we omit the initial dots if we use a wrapper macro? Would it be
a good idea to do so (I'm not really sure)?
Not easily, if at all, I think. We'd have to do a fair bit of weird
macro magic (and then still end up with limitations) to "process" each
argument individually. And even if it were easy, I don't think it's
particularly advantageous.
Greetings,
Andres Freund