Proposal: Make use of C99 designated initialisers for nulls/values arrays
Dear Hackers,
I have identified some OSS code which maybe can make use of C99 designated initialisers for nulls/values arrays.
~
Background:
There are lots of tuple operations where arrays of values and flags are being passed.
Typically these arrays are being previously initialised 0/false by memset.
By modifying code to use C99 designated initialiser syntax [1]REF C99 [$6.7.8/21] If there are fewer initializers in a brace-enclosed list than there are elements or members of an aggregate, or fewer characters in a string literal used to initialize an array of known size than there are elements in the array, the remainder of the aggregate shall be initialized implicitly the same as objects that have static storage duration, most of these memsets can become redundant.
Actually, this mechanism is already being used in some of the existing OSS code. This patch/proposal just propagates the same idea to all other similar places I could find.
~
Result:
Less code. Removes ~200 unnecessary memsets.
More consistent initialisation.
~
Typical Example:
Before:
Datum values[Natts_pg_attribute];
bool nulls[Natts_pg_attribute];
...
memset(values, 0, sizeof(values));
memset(nulls, false, sizeof(nulls));
After:
Datum values[Natts_pg_attribute] = {0};
bool nulls[Natts_pg_attribute] = {0};
---
[1]: REF C99 [$6.7.8/21] If there are fewer initializers in a brace-enclosed list than there are elements or members of an aggregate, or fewer characters in a string literal used to initialize an array of known size than there are elements in the array, the remainder of the aggregate shall be initialized implicitly the same as objects that have static storage duration
or fewer characters in a string literal used to initialize an array of known size than there are elements in the array,
the remainder of the aggregate shall be initialized implicitly the same as objects that have static storage duration
~
Please refer to the attached patch.
Kind Regards,
---
Peter Smith
Fujitsu Australia
Attachments:
init_nulls.patchapplication/octet-stream; name=init_nulls.patchDownload+204-475
On Tue, Oct 1, 2019 at 1:25 PM Smith, Peter <peters@fast.au.fujitsu.com> wrote:
Dear Hackers,
I have identified some OSS code which maybe can make use of C99 designated initialisers for nulls/values arrays.
~
Background:
There are lots of tuple operations where arrays of values and flags are being passed.
Typically these arrays are being previously initialised 0/false by memset.
By modifying code to use C99 designated initialiser syntax [1], most of these memsets can become redundant.
Actually, this mechanism is already being used in some of the existing OSS code. This patch/proposal just propagates the same idea to all other similar places I could find.~
Result:
Less code. Removes ~200 unnecessary memsets.
More consistent initialisation.
+1. This seems like an improvement. I can review and take this
forward unless there are objections from others.
--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
On 10/1/19 6:12 AM, Amit Kapila wrote:
On Tue, Oct 1, 2019 at 1:25 PM Smith, Peter <peters@fast.au.fujitsu.com> wrote:
Dear Hackers,
I have identified some OSS code which maybe can make use of C99 designated initialisers for nulls/values arrays.
~
Background:
There are lots of tuple operations where arrays of values and flags are being passed.
Typically these arrays are being previously initialised 0/false by memset.
By modifying code to use C99 designated initialiser syntax [1], most of these memsets can become redundant.
Actually, this mechanism is already being used in some of the existing OSS code. This patch/proposal just propagates the same idea to all other similar places I could find.~
Result:
Less code. Removes ~200 unnecessary memsets.
More consistent initialisation.+1. This seems like an improvement. I can review and take this
forward unless there are objections from others.
+1.
cheers
andrew
--
Andrew Dunstan https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Tue, 1 Oct 2019 at 03:55, Smith, Peter <peters@fast.au.fujitsu.com>
wrote:
Typical Example:
Before:
Datum values[Natts_pg_attribute];
bool nulls[Natts_pg_attribute];
...
memset(values, 0, sizeof(values));
memset(nulls, false, sizeof(nulls));
After:
Datum values[Natts_pg_attribute] = {0};
bool nulls[Natts_pg_attribute] = {0};
I hope you'll forgive a noob question. Why does the "After" initialization
for the boolean array have {0} rather than {false}?
On Tue, Oct 1, 2019 at 08:40:26AM -0400, Andrew Dunstan wrote:
On 10/1/19 6:12 AM, Amit Kapila wrote:
On Tue, Oct 1, 2019 at 1:25 PM Smith, Peter <peters@fast.au.fujitsu.com> wrote:
Dear Hackers,
I have identified some OSS code which maybe can make use of C99 designated initialisers for nulls/values arrays.
~
Background:
There are lots of tuple operations where arrays of values and flags are being passed.
Typically these arrays are being previously initialised 0/false by memset.
By modifying code to use C99 designated initialiser syntax [1], most of these memsets can become redundant.
Actually, this mechanism is already being used in some of the existing OSS code. This patch/proposal just propagates the same idea to all other similar places I could find.~
Result:
Less code. Removes ~200 unnecessary memsets.
More consistent initialisation.+1. This seems like an improvement. I can review and take this
forward unless there are objections from others.+1.
I like it!
--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com
+ As you are, so once was I. As I am, so you will be. +
+ Ancient Roman grave inscription +
Bruce Momjian <bruce@momjian.us> writes:
On Tue, Oct 1, 2019 at 1:25 PM Smith, Peter <peters@fast.au.fujitsu.com> wrote:
There are lots of tuple operations where arrays of values and flags are being passed.
Typically these arrays are being previously initialised 0/false by memset.
By modifying code to use C99 designated initialiser syntax [1], most of these memsets can become redundant.
I like it!
FYI, I checked into whether this would result in worse generated code.
In the one place I checked (InsertPgAttributeTuple, which hopefully
is representative), I got *exactly the same* assembly code before
and after, on both a somewhat-aging gcc and fairly modern clang.
Hadn't quite expected that, but it removes any worries about whether
we might be losing anything.
Note though that InsertPgAttributeTuple uses memset(), while some of
these other places use MemSet(). The code I see being generated for
MemSet() is also the same(!) on clang, but it is different and
probably worse on gcc. I wonder if it isn't time to kick MemSet to
the curb. We have not re-evaluated that macro in more than a dozen
years, and compilers have surely changed.
regards, tom lane
Hi,
On 2019-10-01 12:17:08 -0400, Tom Lane wrote:
FYI, I checked into whether this would result in worse generated code.
In the one place I checked (InsertPgAttributeTuple, which hopefully
is representative), I got *exactly the same* assembly code before
and after, on both a somewhat-aging gcc and fairly modern clang.
Hadn't quite expected that, but it removes any worries about whether
we might be losing anything.
I think the only case where it's plausible to be really worse is where
we intentionally leave part of such allocations uninitialized - which we
can't easily do in these cases because the rest of the struct will also
get zeroed out. The compiler will probably figure it out in some cases,
but there's plenty where it can't. But I don't think there's many
places like that in our code though.
Note though that InsertPgAttributeTuple uses memset(), while some of
these other places use MemSet(). The code I see being generated for
MemSet() is also the same(!) on clang, but it is different and
probably worse on gcc. I wonder if it isn't time to kick MemSet to
the curb. We have not re-evaluated that macro in more than a dozen
years, and compilers have surely changed.
Yes, we really should!
- Andres
On Wed, Oct 2, 2019 at 5:49 AM Andres Freund <andres@anarazel.de> wrote:
On 2019-10-01 12:17:08 -0400, Tom Lane wrote:
Note though that InsertPgAttributeTuple uses memset(), while some of
these other places use MemSet(). The code I see being generated for
MemSet() is also the same(!) on clang, but it is different and
probably worse on gcc. I wonder if it isn't time to kick MemSet to
the curb. We have not re-evaluated that macro in more than a dozen
years, and compilers have surely changed.Yes, we really should!
+1
FWIW I experimented with that over here:
/messages/by-id/CA+hUKGLfa6ANa0vs7Lf0op0XBH05HE8SyX8NFhDyT7k2CHYLXw@mail.gmail.com
From: Isaac Morland <isaac.morland@gmail.com> Sent: Tuesday, 1 October 2019 11:32 PM
Typical Example:
Before:
Datum values[Natts_pg_attribute];
bool nulls[Natts_pg_attribute];
...
memset(values, 0, sizeof(values));
memset(nulls, false, sizeof(nulls));
After:
Datum values[Natts_pg_attribute] = {0};
bool nulls[Natts_pg_attribute] = {0};I hope you'll forgive a noob question. Why does the "After" initialization for the boolean array have {0} rather than {false}?
It is a valid question.
I found that the original memsets that this patch replaces were already using 0 and false interchangeably. So I just picked one.
Reasons I chose {0} over {false} are: (a) laziness, and (b) consistency with the values[] initialiser.
But it is no problem to change the bool initialisers to {false} if that becomes a committer review issue.
Kind Regards
--
Peter Smith
Fujitsu Australia
On Wed, Oct 2, 2019 at 4:53 AM Smith, Peter <peters@fast.au.fujitsu.com>
wrote:
From: Isaac Morland <isaac.morland@gmail.com> Sent: Tuesday, 1 October
2019 11:32 PMTypical Example:
Before:
Datum values[Natts_pg_attribute];
bool nulls[Natts_pg_attribute];
...
memset(values, 0, sizeof(values));
memset(nulls, false, sizeof(nulls));
After:
Datum values[Natts_pg_attribute] = {0};
bool nulls[Natts_pg_attribute] = {0};I hope you'll forgive a noob question. Why does the "After"
initialization for the boolean array have {0} rather than {false}?
It is a valid question.
I found that the original memsets that this patch replaces were already
using 0 and false interchangeably. So I just picked one.
Reasons I chose {0} over {false} are: (a) laziness, and (b) consistency
with the values[] initialiser.
In this case, I think it is better to be consistent in all the places. As
of now (without patch), we are using 'false' or '0' to initialize the
boolean array. See below two instances from the patch:
1.
@@ -607,9 +601,9 @@ UpdateStatisticsForTypeChange(Oid statsOid, Oid
relationOid, int attnum,
Relation rel;
- Datum values[Natts_pg_statistic_ext_data];
- bool nulls[Natts_pg_statistic_ext_data];
- bool replaces[Natts_pg_statistic_ext_data];
+ Datum values[Natts_pg_statistic_ext_data] = {0};
+ bool nulls[Natts_pg_statistic_ext_data] = {0};
+ bool replaces[Natts_pg_statistic_ext_data] = {0};
oldtup = SearchSysCache1(STATEXTDATASTXOID, ObjectIdGetDatum(statsOid));
if (!HeapTupleIsValid(oldtup))
@@ -630,10 +624,6 @@ UpdateStatisticsForTypeChange(Oid statsOid, Oid
relationOid, int attnum,
* OK, we need to reset some statistics. So let's build the new tuple,
* replacing the affected statistics types with NULL.
*/
- memset(nulls, 0, Natts_pg_statistic_ext_data * sizeof(bool));
- memset(replaces, 0, Natts_pg_statistic_ext_data * sizeof(bool));
- memset(values, 0, Natts_pg_statistic_ext_data * sizeof(Datum));
2.
@@ -69,10 +69,10 @@ CreateStatistics(CreateStatsStmt *stmt)
Oid namespaceId;
Oid stxowner = GetUserId();
HeapTuple htup;
- Datum values[Natts_pg_statistic_ext];
- bool nulls[Natts_pg_statistic_ext];
- Datum datavalues[Natts_pg_statistic_ext_data];
- bool datanulls[Natts_pg_statistic_ext_data];
+ Datum values[Natts_pg_statistic_ext] = {0};
+ bool nulls[Natts_pg_statistic_ext] = {0};
+ Datum datavalues[Natts_pg_statistic_ext_data] = {0};
+ bool datanulls[Natts_pg_statistic_ext_data] = {0};
int2vector *stxkeys;
Relation statrel;
Relation datarel;
@@ -330,9 +330,6 @@ CreateStatistics(CreateStatsStmt *stmt)
/*
* Everything seems fine, so let's build the pg_statistic_ext tuple.
*/
- memset(values, 0, sizeof(values));
- memset(nulls, false, sizeof(nulls));
-
statoid = GetNewOidWithIndex(statrel, StatisticExtOidIndexId,
Anum_pg_statistic_ext_oid);
values[Anum_pg_statistic_ext_oid - 1] = ObjectIdGetDatum(statoid);
@@ -357,9 +354,6 @@ CreateStatistics(CreateStatsStmt *stmt)
*/
datarel = table_open(StatisticExtDataRelationId, RowExclusiveLock);
- memset(datavalues, 0, sizeof(datavalues));
- memset(datanulls, false, sizeof(datanulls));
In the first usage, we are initializing the boolean array with 0 and in the
second case, we are using false. The patch changes it to use 0 at all the
places which I think is better.
I don't have any strong opinion on this, but I would mildly prefer to
initialize boolean array with false just for the sake of readability (we
generally initializing booleans with false).
--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
From: Amit Kapila <amit.kapila16@gmail.com> Sent: Tuesday, 1 October 2019 8:12 PM
+1. This seems like an improvement. I can review and take this forward unless there are objections from others.
FYI - I created a Commitfest entry for this here: https://commitfest.postgresql.org/25/2290/
Kind Regards
--
Peter Smith
Fujitsu Australia
From: Amit Kapila <amit.kapila16@gmail.com> Sent: Wednesday, 2 October 2019 9:42 AM
I don't have any strong opinion on this, but I would mildly prefer to initialize boolean array with false just for the sake of readability (we generally initializing booleans with false).
Done. Please see attached updated patch.
Kind Regards
--
Peter Smith
Fujitsu Australia
Attachments:
c99_init_nulls_2.patchapplication/octet-stream; name=c99_init_nulls_2.patchDownload+203-474
Isaac Morland wrote:
I hope you'll forgive a noob question. Why does the "After"
initialization for the boolean array have {0} rather than {false}?
I think using a value other than {0} potentially gives the incorrect
impression that the value is used for *all* elements of the
array/structure, whereas it is only used for the first element. "The
remainder of the aggregate shall be initialized implicitly the same as
objects that have static storage duration."
The rest of the elements are being initialized to zero as interpreted by
their types (so NULL for pointers, 0.0 for floats, even though neither
of them need be bitwise zero). Setting the first item to 0 matches that
exactly.
Using {false} may encourage the unwary to try
bool foo[2] = {true};
which will not set all elements to true.
Joe Nelson <joe@begriffs.com> writes:
Isaac Morland wrote:
I hope you'll forgive a noob question. Why does the "After"
initialization for the boolean array have {0} rather than {false}?
I think using a value other than {0} potentially gives the incorrect
impression that the value is used for *all* elements of the
array/structure, whereas it is only used for the first element.
There's been something vaguely bothering me about this proposal,
and I think you just crystallized it.
Using {false} may encourage the unwary to try
bool foo[2] = {true};
which will not set all elements to true.
Right. I think that in general it's bad practice for an initializer
to not specify all fields/elements of the target. It is okay in the
specific case that we're substituting for a memset(..., 0, ...).
Perhaps we could make this explicit by using a coding style like
/* in c.h or some such place: */
#define INIT_ALL_ZEROES {0}
/* in code: */
Datum values[N] = INIT_ALL_ZEROES;
and then decreeing that it's not project style to use a partial
initializer other than in this way.
regards, tom lane
On Wed, 2 Oct 2019 at 11:34, Joe Nelson <joe@begriffs.com> wrote:
Isaac Morland wrote:
I hope you'll forgive a noob question. Why does the "After"
initialization for the boolean array have {0} rather than {false}?I think using a value other than {0} potentially gives the incorrect
impression that the value is used for *all* elements of the
array/structure, whereas it is only used for the first element. "The
remainder of the aggregate shall be initialized implicitly the same as
objects that have static storage duration."The rest of the elements are being initialized to zero as interpreted by
their types (so NULL for pointers, 0.0 for floats, even though neither
of them need be bitwise zero). Setting the first item to 0 matches that
exactly.Using {false} may encourage the unwary to try
bool foo[2] = {true};
which will not set all elements to true.
Thanks for the explanation. So the first however many elements are in curly
braces get initialized to those values, then the rest get initialized to
blank/0/0.0/false/...?
If so, I don't suppose it's possible to give empty braces:
bool nulls[Natts_pg_attribute] = {};
If so, I don't suppose it's possible to give empty braces:
bool nulls[Natts_pg_attribute] = {};
GNU does add this capability as a nonstandard language extension, but
according to the C99 standard, no.
On 10/2/19 8:46 AM, Tom Lane wrote:
Joe Nelson <joe@begriffs.com> writes:
Isaac Morland wrote:
I hope you'll forgive a noob question. Why does the "After"
initialization for the boolean array have {0} rather than {false}?I think using a value other than {0} potentially gives the incorrect
impression that the value is used for *all* elements of the
array/structure, whereas it is only used for the first element.There's been something vaguely bothering me about this proposal,
and I think you just crystallized it.Using {false} may encourage the unwary to try
bool foo[2] = {true};
which will not set all elements to true.Right. I think that in general it's bad practice for an initializer
to not specify all fields/elements of the target. It is okay in the
specific case that we're substituting for a memset(..., 0, ...).
Perhaps we could make this explicit by using a coding style like/* in c.h or some such place: */
#define INIT_ALL_ZEROES {0}/* in code: */
Datum values[N] = INIT_ALL_ZEROES;and then decreeing that it's not project style to use a partial
initializer other than in this way.
There are numerous locations in the code that raise warnings when
-Wmissing-field-initializers is handed to gcc. See, for example,
src/backend/utils/adt/formatting.c where
static const KeyWord NUM_keywords[]
is initialized, and the code comment above that disclaims the need to
initialize is_digit and date_mode. Are you proposing cleaning up all
such incomplete initializations within the project?
I understand that your INIT_ALL_ZEROS macro does nothing to change
whether -Wmissing-field-initializers would raise a warning. I'm
just asking about the decree you propose, and I used that warning flag
to get the compiler to spit out relevant examples.
mark
Mark Dilger <hornschnorter@gmail.com> writes:
On 10/2/19 8:46 AM, Tom Lane wrote:
Right. I think that in general it's bad practice for an initializer
to not specify all fields/elements of the target.
There are numerous locations in the code that raise warnings when
-Wmissing-field-initializers is handed to gcc. See, for example,
src/backend/utils/adt/formatting.c where
static const KeyWord NUM_keywords[]
is initialized, and the code comment above that disclaims the need to
initialize is_digit and date_mode. Are you proposing cleaning up all
such incomplete initializations within the project?
Hmm. Maybe it's worth doing as a code beautification effort, but
I'm not volunteering. At the same time, I wouldn't like to make a
change like this, if it introduces dozens/hundreds of new cases.
I understand that your INIT_ALL_ZEROS macro does nothing to change
whether -Wmissing-field-initializers would raise a warning.
Not sure --- the name of that option suggests that maybe it only
complains about omitted *struct fields* not omitted *array elements*.
If it does complain, is there any way that we could extend the macro
to annotate usages of it to suppress the warning?
regards, tom lane
On 10/2/19 11:02 AM, Tom Lane wrote:
Mark Dilger <hornschnorter@gmail.com> writes:
On 10/2/19 8:46 AM, Tom Lane wrote:
Right. I think that in general it's bad practice for an initializer
to not specify all fields/elements of the target.There are numerous locations in the code that raise warnings when
-Wmissing-field-initializers is handed to gcc. See, for example,
src/backend/utils/adt/formatting.c where
static const KeyWord NUM_keywords[]
is initialized, and the code comment above that disclaims the need to
initialize is_digit and date_mode. Are you proposing cleaning up all
such incomplete initializations within the project?Hmm. Maybe it's worth doing as a code beautification effort, but
I'm not volunteering. At the same time, I wouldn't like to make a
change like this, if it introduces dozens/hundreds of new cases.I understand that your INIT_ALL_ZEROS macro does nothing to change
whether -Wmissing-field-initializers would raise a warning.Not sure --- the name of that option suggests that maybe it only
complains about omitted *struct fields* not omitted *array elements*.
With gcc (Debian 8.3.0-6) 8.3.0
int foo[6] = {0, 1, 2};
does not draw a warning when compiled with this flag.
If it does complain, is there any way that we could extend the macro
to annotate usages of it to suppress the warning?
Neither initializing a struct with {0} nor with INIT_ALL_ZEROS draws a
warning either, with my gcc. There are reports online that older
versions of the compiler did, see
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=36750
but I don't have an older version to test with just now.
Note that initializing a multi-element struct with {1} does still draw a
warning, and reading the thread above suggests that gcc made a specific
effort to allow initialization to {0} to work without warning as a
special case.
So your proposal for using INIT_ALL_ZEROS is probably good with
sufficiently new compilers, and I'm generally in favor of the proposal,
but I don't think the decree you propose can work unless somebody cleans
up all these other cases that I indicated in my prior email.
(I'm sitting on a few patches until v12 goes out the door from some
conversations with you several months ago, and perhaps I'll include a
patch for this cleanup, too, when time comes for v13 patch sets to be
submitted. My past experience submitting patches shortly before a
release was that they get ignored.)
mark
Mark Dilger <hornschnorter@gmail.com> writes:
(I'm sitting on a few patches until v12 goes out the door from some
conversations with you several months ago, and perhaps I'll include a
patch for this cleanup, too, when time comes for v13 patch sets to be
submitted.
That would be now. We already ran one CF for v13.
My past experience submitting patches shortly before a
release was that they get ignored.)
What you need to do is add 'em to the commitfest app. They might
still get ignored for awhile, but we won't forget about them.
regards, tom lane