initdb's -c option behaves wrong way?

Started by Kyotaro Horiguchiover 2 years ago17 messageshackers
Jump to latest
#1Kyotaro Horiguchi
horikyota.ntt@gmail.com

Hello.

I noticed that -c option of initdb behaves in an unexpected
manner. Identical variable names with variations in letter casing are
treated as distinct variables.

$ initdb -cwork_mem=100 -cWORK_MEM=1000 -cWork_mem=2000
...
$ grep -i 'work_mem ' $PGDATA/postgresql.conf
work_mem = 100 # min 64kB
#maintenance_work_mem = 64MB # min 1MB
#autovacuum_work_mem = -1 # min 1MB, or -1 to use maintenance_work_mem
#logical_decoding_work_mem = 64MB # min 64kB
WORK_MEM = 1000
Work_mem = 2000

The original intention was apparently to overwrite the existing
line. Furthermore, I surmise that preserving the original letter
casing is preferable.

Attached is a patch to address this issue. To retrieve the variable
name from the existing line, the code is slightly restructured.
Alternatively, should we just down-case the provided variable names?

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

Attachments:

initdb-c_ignore_letter_case.patchtext/x-patch; charset=us-asciiDownload+30-16
#2Daniel Gustafsson
daniel@yesql.se
In reply to: Kyotaro Horiguchi (#1)
Re: initdb's -c option behaves wrong way?

On 28 Sep 2023, at 09:49, Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote:

I noticed that -c option of initdb behaves in an unexpected
manner. Identical variable names with variations in letter casing are
treated as distinct variables.

$ initdb -cwork_mem=100 -cWORK_MEM=1000 -cWork_mem=2000

The original intention was apparently to overwrite the existing
line. Furthermore, I surmise that preserving the original letter
casing is preferable.

Circling back to an old thread, I agree that this seems odd and the original
thread [0]/messages/by-id/2844176.1674681919@sss.pgh.pa.us makes no mention of it being intentional.

The patch seems fine to me, the attached version is rebased, pgindented and has
a test case added.

--
Daniel Gustafsson

[0]: /messages/by-id/2844176.1674681919@sss.pgh.pa.us

Attachments:

v2-0001-Make-initdb-c-option-case-insensitive.patchapplication/octet-stream; name=v2-0001-Make-initdb-c-option-case-insensitive.patch; x-unix-mode=0644Download+42-17
#3Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Daniel Gustafsson (#2)
Re: initdb's -c option behaves wrong way?

On 2024-Jan-16, Daniel Gustafsson wrote:

On 28 Sep 2023, at 09:49, Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote:

I noticed that -c option of initdb behaves in an unexpected
manner. Identical variable names with variations in letter casing are
treated as distinct variables.

$ initdb -cwork_mem=100 -cWORK_MEM=1000 -cWork_mem=2000

The original intention was apparently to overwrite the existing
line. Furthermore, I surmise that preserving the original letter
casing is preferable.

Circling back to an old thread, I agree that this seems odd and the original
thread [0] makes no mention of it being intentional.

Hmm, how about raising an error if multiple options are given targetting
the same GUC?

--
Álvaro Herrera Breisgau, Deutschland — https://www.EnterpriseDB.com/

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Alvaro Herrera (#3)
Re: initdb's -c option behaves wrong way?

Alvaro Herrera <alvherre@alvh.no-ip.org> writes:

Hmm, how about raising an error if multiple options are given targetting
the same GUC?

I don't see any reason to do that. The underlying configuration
files don't complain about duplicate entries, they just take the
last setting.

regards, tom lane

#5Daniel Gustafsson
daniel@yesql.se
In reply to: Tom Lane (#4)
Re: initdb's -c option behaves wrong way?

On 17 Jan 2024, at 18:05, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Alvaro Herrera <alvherre@alvh.no-ip.org> writes:

Hmm, how about raising an error if multiple options are given targetting
the same GUC?

I don't see any reason to do that. The underlying configuration
files don't complain about duplicate entries, they just take the
last setting.

Agreed, I think the patch as it stands now where it replaces case insensitive,
while keeping the original casing, is the best path forward. The issue exist
in 16 as well so I propose a backpatch to there.

--
Daniel Gustafsson

#6Robert Haas
robertmhaas@gmail.com
In reply to: Daniel Gustafsson (#5)
Re: initdb's -c option behaves wrong way?

On Wed, Jan 17, 2024 at 2:31 PM Daniel Gustafsson <daniel@yesql.se> wrote:

Agreed, I think the patch as it stands now where it replaces case insensitive,
while keeping the original casing, is the best path forward. The issue exist
in 16 as well so I propose a backpatch to there.

I like that approach, too. I could go either way on back-patching. It
doesn't seem important, but it probably won't break anything, either.

--
Robert Haas
EDB: http://www.enterprisedb.com

#7Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#6)
Re: initdb's -c option behaves wrong way?

Robert Haas <robertmhaas@gmail.com> writes:

On Wed, Jan 17, 2024 at 2:31 PM Daniel Gustafsson <daniel@yesql.se> wrote:

Agreed, I think the patch as it stands now where it replaces case insensitive,
while keeping the original casing, is the best path forward. The issue exist
in 16 as well so I propose a backpatch to there.

I like that approach, too. I could go either way on back-patching. It
doesn't seem important, but it probably won't break anything, either.

We just added this switch in 16, so I think backpatching to keep all
the branches consistent is a good idea.

However ... I don't like the patch much. It seems to have left
the code in a rather random state. Why, for example, didn't you
keep all the code that constructs the "newline" value together?
I'm also unconvinced by the removal of the "assume there's only
one match" break --- if we need to support multiple matches
I doubt that's sufficient.

regards, tom lane

#8Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#7)
Re: initdb's -c option behaves wrong way?

I wrote:

However ... I don't like the patch much. It seems to have left
the code in a rather random state. Why, for example, didn't you
keep all the code that constructs the "newline" value together?

After thinking about it a bit more, I don't see why you didn't just
s/strncmp/strncasecmp/ and call it good. The messiness seems to be
a result of your choice to replace the GUC's case as shown in the
file with the case used on the command line, which is not better
IMO. We don't change our mind about the canonical spelling of a
GUC because somebody varied the case in a SET command.

regards, tom lane

#9Daniel Gustafsson
daniel@yesql.se
In reply to: Tom Lane (#8)
Re: initdb's -c option behaves wrong way?

On 17 Jan 2024, at 21:33, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I wrote:

However ... I don't like the patch much. It seems to have left
the code in a rather random state. Why, for example, didn't you
keep all the code that constructs the "newline" value together?

After thinking about it a bit more, I don't see why you didn't just
s/strncmp/strncasecmp/ and call it good. The messiness seems to be
a result of your choice to replace the GUC's case as shown in the
file with the case used on the command line, which is not better
IMO. We don't change our mind about the canonical spelling of a
GUC because somebody varied the case in a SET command.

The original patch was preserving the case of the file throwing away the case
from the commandline. The attached is a slimmed down version which only moves
the assembly of newline to the different cases (replace vs. new) keeping the
rest of the code intact. Keeping it in one place gets sort of messy too since
it needs to use different values for a replacement and a new variable. Not
sure if there is a cleaner approach?

--
Daniel Gustafsson

Attachments:

v3-0001-Make-initdb-c-option-case-insensitive.patchapplication/octet-stream; name=v3-0001-Make-initdb-c-option-case-insensitive.patch; x-unix-mode=0644Download+25-8
#10Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Daniel Gustafsson (#9)
Re: initdb's -c option behaves wrong way?

Thank you for upicking this up.

At Wed, 17 Jan 2024 23:47:41 +0100, Daniel Gustafsson <daniel@yesql.se> wrote in

On 17 Jan 2024, at 21:33, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I wrote:

However ... I don't like the patch much. It seems to have left
the code in a rather random state. Why, for example, didn't you
keep all the code that constructs the "newline" value together?

After thinking about it a bit more, I don't see why you didn't just
s/strncmp/strncasecmp/ and call it good. The messiness seems to be
a result of your choice to replace the GUC's case as shown in the
file with the case used on the command line, which is not better
IMO. We don't change our mind about the canonical spelling of a
GUC because somebody varied the case in a SET command.

The original patch was preserving the case of the file throwing away the case
from the commandline. The attached is a slimmed down version which only moves
the assembly of newline to the different cases (replace vs. new) keeping the
rest of the code intact. Keeping it in one place gets sort of messy too since
it needs to use different values for a replacement and a new variable. Not
sure if there is a cleaner approach?

Just to clarify, the current code breaks out after processing the
first matching line. I haven't changed that behavior. The reason I
moved the rewrite processing code out of the loop was I wanted to
avoid adding more lines that are executed only once into the
loop. However, it is in exchange of additional complexity to remember
the original spelling of the parameter name. Personally, I believe
separating the search and rewrite processing is better, but I'm not
particularly attached to the approach, so either way is fine with me.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

#11Daniel Gustafsson
daniel@yesql.se
In reply to: Kyotaro Horiguchi (#10)
Re: initdb's -c option behaves wrong way?

On 18 Jan 2024, at 05:49, Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote:

Thank you for upicking this up.

At Wed, 17 Jan 2024 23:47:41 +0100, Daniel Gustafsson <daniel@yesql.se> wrote in

On 17 Jan 2024, at 21:33, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I wrote:

However ... I don't like the patch much. It seems to have left
the code in a rather random state. Why, for example, didn't you
keep all the code that constructs the "newline" value together?

After thinking about it a bit more, I don't see why you didn't just
s/strncmp/strncasecmp/ and call it good. The messiness seems to be
a result of your choice to replace the GUC's case as shown in the
file with the case used on the command line, which is not better
IMO. We don't change our mind about the canonical spelling of a
GUC because somebody varied the case in a SET command.

The original patch was preserving the case of the file throwing away the case
from the commandline. The attached is a slimmed down version which only moves
the assembly of newline to the different cases (replace vs. new) keeping the
rest of the code intact. Keeping it in one place gets sort of messy too since
it needs to use different values for a replacement and a new variable. Not
sure if there is a cleaner approach?

Just to clarify, the current code breaks out after processing the
first matching line. I haven't changed that behavior.

Yup.

The reason I
moved the rewrite processing code out of the loop was I wanted to
avoid adding more lines that are executed only once into the
loop. However, it is in exchange of additional complexity to remember
the original spelling of the parameter name. Personally, I believe
separating the search and rewrite processing is better, but I'm not
particularly attached to the approach, so either way is fine with me.

I'll give some more time for opinions, then I'll go ahead with one of the
patches with a backpatch to v16.

--
Daniel Gustafsson

#12Tom Lane
tgl@sss.pgh.pa.us
In reply to: Daniel Gustafsson (#11)
Re: initdb's -c option behaves wrong way?

Daniel Gustafsson <daniel@yesql.se> writes:

I'll give some more time for opinions, then I'll go ahead with one of the
patches with a backpatch to v16.

OK, I take back my previous complaint that just using strncasecmp
would be enough --- I was misremembering how the code worked, and
you're right that it would use the spelling from the command line
rather than that from the file.

However, the v3 patch is flat broken. You can't assume you have
found a match until you've verified that whitespace and '='
appear next --- otherwise, you'll be fooled by a guc_name that
is a prefix of one that appears in the file. I think the simplest
change that does it correctly is as attached.

Now, since I was the one who wrote the existing code, I freely
concede that I may have too high an opinion of its readability.
Maybe some larger refactoring is appropriate. But I didn't
find that what you'd done improved the readability. I'd still
rather keep the newline-assembly code together as much as possible.
Maybe we should do the search part before we build any of newline?

regards, tom lane

Attachments:

v4-0001-Make-initdb-c-option-case-insensitive.patchtext/x-diff; charset=us-ascii; name=v4-0001-Make-initdb-c-option-case-insensitive.patchDownload+20-2
#13Daniel Gustafsson
daniel@yesql.se
In reply to: Tom Lane (#12)
Re: initdb's -c option behaves wrong way?

On 19 Jan 2024, at 17:33, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Daniel Gustafsson <daniel@yesql.se> writes:

I'll give some more time for opinions, then I'll go ahead with one of the
patches with a backpatch to v16.

OK, I take back my previous complaint that just using strncasecmp
would be enough --- I was misremembering how the code worked, and
you're right that it would use the spelling from the command line
rather than that from the file.

However, the v3 patch is flat broken. You can't assume you have
found a match until you've verified that whitespace and '='
appear next --- otherwise, you'll be fooled by a guc_name that
is a prefix of one that appears in the file. I think the simplest
change that does it correctly is as attached.

The attached v4 looks good to me, I don't think it moves the needle wrt
readability (ie, no need to move the search). Feel free to go ahead with that
version, or let me know if you want me to deal with it.

--
Daniel Gustafsson

#14Daniel Gustafsson
daniel@yesql.se
In reply to: Daniel Gustafsson (#13)
Re: initdb's -c option behaves wrong way?

On 22 Jan 2024, at 11:09, Daniel Gustafsson <daniel@yesql.se> wrote:

Feel free to go ahead with that
version, or let me know if you want me to deal with it.

I took the liberty to add this to the upcoming CF to make sure we don't lose
track of it.

--
Daniel Gustafsson

#15Tom Lane
tgl@sss.pgh.pa.us
In reply to: Daniel Gustafsson (#14)
Re: initdb's -c option behaves wrong way?

Daniel Gustafsson <daniel@yesql.se> writes:

I took the liberty to add this to the upcoming CF to make sure we don't lose
track of it.

Thanks for doing that, because the cfbot pointed out a problem:
I should have written pg_strncasecmp not strncasecmp. If this
version tests cleanly, I'll push it.

regards, tom lane

Attachments:

v5-0001-Make-initdb-c-option-case-insensitive.patchtext/x-diff; charset=us-ascii; name=v5-0001-Make-initdb-c-option-case-insensitive.patchDownload+20-2
#16Daniel Gustafsson
daniel@yesql.se
In reply to: Tom Lane (#15)
Re: initdb's -c option behaves wrong way?

On 4 Mar 2024, at 02:01, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Daniel Gustafsson <daniel@yesql.se> writes:

I took the liberty to add this to the upcoming CF to make sure we don't lose
track of it.

Thanks for doing that, because the cfbot pointed out a problem:
I should have written pg_strncasecmp not strncasecmp. If this
version tests cleanly, I'll push it.

+1, LGTM.

--
Daniel Gustafsson

#17Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Daniel Gustafsson (#16)
Re: initdb's -c option behaves wrong way?

At Mon, 4 Mar 2024 09:39:39 +0100, Daniel Gustafsson <daniel@yesql.se> wrote in

On 4 Mar 2024, at 02:01, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Daniel Gustafsson <daniel@yesql.se> writes:

I took the liberty to add this to the upcoming CF to make sure we don't lose
track of it.

Thanks for doing that, because the cfbot pointed out a problem:
I should have written pg_strncasecmp not strncasecmp. If this
version tests cleanly, I'll push it.

+1, LGTM.

Thank you for fixing this, Tom!

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center