Expanding the use of FLEXIBLE_ARRAY_MEMBER for declarations like foo[1]
Hi,
In a lot of places in the code we have many structures doing
declarations of the type foo[1] to emulate variable length arrays.
Attached are a set of patches aimed at replacing that with
FLEXIBLE_ARRAY_MEMBER to prevent potential failures that could be
caused by compiler optimizations and improve report of static code
analyzers of the type Coverity (idea by Tom, patches from me):
- 0001 uses FLEXIBLE_ARRAY_MEMBER in a bunch of trivial places (at
least check-world does not complain)
- 0002 does the same for catalog tables
- 0003 changes varlena in c.h. This is done as a separate patch
because this needs some other modifications as variable-length things
need to be placed at the end of structures, because of:
-- rolpassword that should be placed as the last field of pg_authid
(and shouldn't there be CATALOG_VARLEN here??)
-- inv_api.c uses bytea in an internal structure without putting it at
the end of the structure. For clarity I think that we should just use
a bytea pointer and do a sufficient allocation for the duration of the
lobj manipulation.
-- Similarly, tuptoaster.c needed to be patched for toast_save_datum
There are as well couple of things that are not changed on purpose:
- in namespace.h for FuncCandidateList. I tried manipulating it but it
resulted in allocation overflow in PortalHeapMemory
- I don't think that the t_bits fields in htup_details.h should be
updated either.
Regards,
--
Michael
Attachments:
0001-First-cut-with-FLEXIBLE_ARRAY_MEMBER.patchtext/x-diff; charset=US-ASCII; name=0001-First-cut-with-FLEXIBLE_ARRAY_MEMBER.patchDownload+69-65
0002-Use-FLEXIBLE_ARRAY_MEMBER-in-catalog-tables.patchtext/x-diff; charset=US-ASCII; name=0002-Use-FLEXIBLE_ARRAY_MEMBER-in-catalog-tables.patchDownload+53-42
0003-Update-varlena-in-c.h-to-use-FLEXIBLE_ARRAY_MEMBER.patchtext/x-diff; charset=US-ASCII; name=0003-Update-varlena-in-c.h-to-use-FLEXIBLE_ARRAY_MEMBER.patchDownload+31-36
On 2015-02-17 05:51:22 +0900, Michael Paquier wrote:
- 0002 does the same for catalog tables
- 0003 changes varlena in c.h. This is done as a separate patch
because this needs some other modifications as variable-length things
need to be placed at the end of structures, because of:
-- rolpassword that should be placed as the last field of pg_authid
(and shouldn't there be CATALOG_VARLEN here??)
Yes, there should.
-- inv_api.c uses bytea in an internal structure without putting it at
the end of the structure. For clarity I think that we should just use
a bytea pointer and do a sufficient allocation for the duration of the
lobj manipulation.
Hm. I don't really see the problem tbh.
There's actually is a reason the bytea is at the beginning - the other
fields are *are* the data part of the bytea (which is just the varlena
header).
-- Similarly, tuptoaster.c needed to be patched for toast_save_datum
I'm not a big fan of these two changes. This adds some not that small
memory allocations to a somewhat hot path. Without a big win in
clarity. And it doesn't seem to have anything to do with with
FLEXIBLE_ARRAY_MEMBER.
There are as well couple of things that are not changed on purpose:
- in namespace.h for FuncCandidateList. I tried manipulating it but it
resulted in allocation overflow in PortalHeapMemory
Did you change the allocation in FuncnameGetCandidates()? Note the -
sizeof(Oid) there.
- I don't think that the t_bits fields in htup_details.h should be
updated either.
Why not? Any not broken code should already use MinHeapTupleSize and
similar macros.
Generally it's worthwhile to check the code you changed for
sizeof(changed_struct) and similar things. Because this very well could
add bugs. And not all of them will result in a outright visibile error
like the FuncnameGetCandidates() one.
--- a/src/include/access/htup_details.h +++ b/src/include/access/htup_details.h @@ -150,7 +150,7 @@ struct HeapTupleHeaderData/* ^ - 23 bytes - ^ */
- bits8 t_bits[1]; /* bitmap of NULLs -- VARIABLE LENGTH */ + bits8 t_bits[1]; /* bitmap of NULLs *//* MORE DATA FOLLOWS AT END OF STRUCT */
};
@@ -579,7 +579,7 @@ struct MinimalTupleData/* ^ - 23 bytes - ^ */
- bits8 t_bits[1]; /* bitmap of NULLs -- VARIABLE LENGTH */ + bits8 t_bits[1]; /* bitmap of NULLs *//* MORE DATA FOLLOWS AT END OF STRUCT */
};
This sees like overeager search/replace.
diff --git a/src/include/nodes/params.h b/src/include/nodes/params.h index 5b096c5..0ad7ef0 100644 --- a/src/include/nodes/params.h +++ b/src/include/nodes/params.h @@ -71,7 +71,7 @@ typedef struct ParamListInfoData ParserSetupHook parserSetup; /* parser setup hook */ void *parserSetupArg; int numParams; /* number of ParamExternDatas following */ - ParamExternData params[1]; /* VARIABLE LENGTH ARRAY */ + ParamExternData params[1]; } ParamListInfoData;
Eh?
diff --git a/src/include/catalog/pg_attribute.h b/src/include/catalog/pg_attribute.h index 87a3462..73bcefe 100644 --- a/src/include/catalog/pg_attribute.h +++ b/src/include/catalog/pg_attribute.h @@ -157,13 +157,13 @@ CATALOG(pg_attribute,1249) BKI_BOOTSTRAP BKI_WITHOUT_OIDS BKI_ROWTYPE_OID(75) BK /* NOTE: The following fields are not present in tuple descriptors. *//* Column-level access permissions */ - aclitem attacl[1]; + aclitem attacl[FLEXIBLE_ARRAY_MEMBER];/* Column-level options */ - text attoptions[1]; + text attoptions[FLEXIBLE_ARRAY_MEMBER];/* Column-level FDW options */ - text attfdwoptions[1]; + text attfdwoptions[FLEXIBLE_ARRAY_MEMBER]; #endif } FormData_pg_attribute;
Ugh. This really isn't C at all anymore - these structs wouldn't compile
if you removed the #ifdef. Maybe we should instead a 'textarray' type
for genbki's benefit?
Generally the catalog changes are much less clearly a benefit imo.
From ad8f54cdb5776146f17d1038bb295b5f13b549f1 Mon Sep 17 00:00:00 2001
From: Michael Paquier <michael@otacoo.com>
Date: Mon, 16 Feb 2015 03:53:38 +0900
Subject: [PATCH 3/3] Update varlena in c.h to use FLEXIBLE_ARRAY_MEMBERPlaces using a variable-length variable not at the end of a structure
are changed with workaround, without impacting what those features do.
I vote for rejecting most of this, except a (corrected version) of the
pg_authid change. Which doesn't really belong to the rest of the patch
anyway ;)x
#define VARHDRSZ ((int32) sizeof(int32)) diff --git a/src/include/catalog/pg_authid.h b/src/include/catalog/pg_authid.h index e01e6aa..d8789a5 100644 --- a/src/include/catalog/pg_authid.h +++ b/src/include/catalog/pg_authid.h @@ -56,8 +56,10 @@ CATALOG(pg_authid,1260) BKI_SHARED_RELATION BKI_ROWTYPE_OID(2842) BKI_SCHEMA_MAC int32 rolconnlimit; /* max connections allowed (-1=no limit) *//* remaining fields may be null; use heap_getattr to read them! */ - text rolpassword; /* password, if any */ timestamptz rolvaliduntil; /* password expiration time, if any */ +#ifdef CATALOG_VARLEN + text rolpassword; /* password, if any */ +#endif } FormData_pg_authid;
That change IIRC is wrong, because it'll make rolvaliduntil until NOT
NULL (any column that's fixed width and has only fixed with columns
before it is marked as such).
Greetings,
Andres Freund
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Andres Freund <andres@2ndquadrant.com> writes:
On 2015-02-17 05:51:22 +0900, Michael Paquier wrote:
diff --git a/src/include/catalog/pg_authid.h b/src/include/catalog/pg_authid.h index e01e6aa..d8789a5 100644 --- a/src/include/catalog/pg_authid.h +++ b/src/include/catalog/pg_authid.h @@ -56,8 +56,10 @@ CATALOG(pg_authid,1260) BKI_SHARED_RELATION BKI_ROWTYPE_OID(2842) BKI_SCHEMA_MAC int32 rolconnlimit; /* max connections allowed (-1=no limit) *//* remaining fields may be null; use heap_getattr to read them! */ - text rolpassword; /* password, if any */ timestamptz rolvaliduntil; /* password expiration time, if any */ +#ifdef CATALOG_VARLEN + text rolpassword; /* password, if any */ +#endif } FormData_pg_authid;
That change IIRC is wrong, because it'll make rolvaliduntil until NOT
NULL (any column that's fixed width and has only fixed with columns
before it is marked as such).
You were muttering about a BKI_FORCE_NOT_NULL option ... for symmetry,
maybe we could add BKI_FORCE_NULL as well, and then use that for cases
like this? Also, if we want to insist that these fields be accessed
through heap_getattr, I'd be inclined to put them inside the "#ifdef
CATALOG_VARLEN" to enforce that.
I'm generally -1 on reordering any catalog columns as part of this patch.
There should be zero user-visible change from it IMO. However, if we
stick both those columns inside the ifdef, we don't need to reorder.
regards, tom lane
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Tue, Feb 17, 2015 at 9:02 AM, Andres Freund wrote:
On 2015-02-17 05:51:22 +0900, Michael Paquier wrote:
-- inv_api.c uses bytea in an internal structure without putting it at
the end of the structure. For clarity I think that we should just use
a bytea pointer and do a sufficient allocation for the duration of the
lobj manipulation.Hm. I don't really see the problem tbh.
There's actually is a reason the bytea is at the beginning - the other
fields are *are* the data part of the bytea (which is just the varlena
header).-- Similarly, tuptoaster.c needed to be patched for toast_save_datum
I'm not a big fan of these two changes. This adds some not that small
memory allocations to a somewhat hot path. Without a big win in
clarity. And it doesn't seem to have anything to do with with
FLEXIBLE_ARRAY_MEMBER.
We could replace those palloc calls with a simple buffer that has a
predefined size, but this somewhat reduces the readability of the
code.
There are as well couple of things that are not changed on purpose:
- in namespace.h for FuncCandidateList. I tried manipulating it but it
resulted in allocation overflow in PortalHeapMemoryDid you change the allocation in FuncnameGetCandidates()? Note the -
sizeof(Oid) there.
Yeah. Missed it.
- I don't think that the t_bits fields in htup_details.h should be
updated either.Why not? Any not broken code should already use MinHeapTupleSize and
similar macros.
Changing t_bits impacts HeapTupleHeaderData, ReorderBufferTupleBuf and
similarly a couple of redo routines in heapam.c using
HeapTupleHeaderData in a couple of structures not placing it at the
end (compiler complains). We could use for each of them a buffer that
has enough room with sizeof(HeapTupleHeaderData) + MaxHeapTupleSize,
but wouldn't it reduce the readability of the current code? Opinions
welcome.
diff --git a/src/include/catalog/pg_attribute.h b/src/include/catalog/pg_attribute.h[...]
Generally the catalog changes are much less clearly a benefit imo.
OK, let's drop them then.
From ad8f54cdb5776146f17d1038bb295b5f13b549f1 Mon Sep 17 00:00:00 2001
From: Michael Paquier <michael@otacoo.com>
Date: Mon, 16 Feb 2015 03:53:38 +0900
Subject: [PATCH 3/3] Update varlena in c.h to use FLEXIBLE_ARRAY_MEMBERPlaces using a variable-length variable not at the end of a structure
are changed with workaround, without impacting what those features do.I vote for rejecting most of this, except a (corrected version) of the
pg_authid change. Which doesn't really belong to the rest of the patch
anyway ;)x
Changing bytea to use FLEXIBLE_ARRAY_MEMBER requires those changes, or
at least some changes in this area as something with
FLEXIBLE_ARRAY_MEMBER can only be placed at the end of a structure.
But I guess that we can do fine as well by replacing those structures
with some buffers with a pre-defined size. I'll draft an additional
patch on top of 0001 with all those less-trivial changes implemented.
#define VARHDRSZ ((int32) sizeof(int32)) diff --git a/src/include/catalog/pg_authid.h b/src/include/catalog/pg_authid.h index e01e6aa..d8789a5 100644 --- a/src/include/catalog/pg_authid.h +++ b/src/include/catalog/pg_authid.h @@ -56,8 +56,10 @@ CATALOG(pg_authid,1260) BKI_SHARED_RELATION BKI_ROWTYPE_OID(2842) BKI_SCHEMA_MAC int32 rolconnlimit; /* max connections allowed (-1=no limit) *//* remaining fields may be null; use heap_getattr to read them! */ - text rolpassword; /* password, if any */ timestamptz rolvaliduntil; /* password expiration time, if any */ +#ifdef CATALOG_VARLEN + text rolpassword; /* password, if any */ +#endif } FormData_pg_authid;That change IIRC is wrong, because it'll make rolvaliduntil until NOT
NULL (any column that's fixed width and has only fixed with columns
before it is marked as such).
This sounds better as a separate patch...
--
Michael
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 2015-02-18 17:15:18 +0900, Michael Paquier wrote:
- I don't think that the t_bits fields in htup_details.h should be
updated either.Why not? Any not broken code should already use MinHeapTupleSize and
similar macros.Changing t_bits impacts HeapTupleHeaderData, ReorderBufferTupleBuf and
similarly a couple of redo routines in heapam.c using
HeapTupleHeaderData in a couple of structures not placing it at the
end (compiler complains).
The compiler will complain if you use a FLEXIBLE_ARRAY_MEMBER in the
middle of a struct but not when when you embed a struct that uses it
into the middle another struct. At least gcc doesn't and I think it'd be
utterly broken if another compiler did that. If there's a compiler that
does so, we need to make it define FLEXIBLE_ARRAY_MEMBER to 1.
Code embedding structs using *either* [FLEXIBLE_ARRAY_MEMBER] or [1] for
variable length obviously has to take care where the variable length
part goes. And that what the structs you removed where doing - that's
where the t_bits et al go:
{
...
HeapTupleHeaderData header;
char data[MaxHeapTupleSize];
...
}
the 'data' bit is just the t_bits + data together. And similar in
- struct
- {
- struct varlena hdr;
- char data[TOAST_MAX_CHUNK_SIZE]; /* make struct big enough */
- int32 align_it; /* ensure struct is aligned well enough */
- } chunk_data;
The 'data' is where the varlena's vl_dat stuff is stored.
Places using a variable-length variable not at the end of a structure
are changed with workaround, without impacting what those features do.I vote for rejecting most of this, except a (corrected version) of the
pg_authid change. Which doesn't really belong to the rest of the patch
anyway ;)xChanging bytea to use FLEXIBLE_ARRAY_MEMBER requires those changes, or
at least some changes in this area as something with
FLEXIBLE_ARRAY_MEMBER can only be placed at the end of a structure.
Again, I think you're confusing things that may not be be done in a
single struct, and structs that are embedded in other places.
Greetings,
Andres Freund
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 2015-02-16 21:34:57 -0500, Tom Lane wrote:
Andres Freund <andres@2ndquadrant.com> writes:
On 2015-02-17 05:51:22 +0900, Michael Paquier wrote:
diff --git a/src/include/catalog/pg_authid.h b/src/include/catalog/pg_authid.h index e01e6aa..d8789a5 100644 --- a/src/include/catalog/pg_authid.h +++ b/src/include/catalog/pg_authid.h @@ -56,8 +56,10 @@ CATALOG(pg_authid,1260) BKI_SHARED_RELATION BKI_ROWTYPE_OID(2842) BKI_SCHEMA_MAC int32 rolconnlimit; /* max connections allowed (-1=no limit) *//* remaining fields may be null; use heap_getattr to read them! */ - text rolpassword; /* password, if any */ timestamptz rolvaliduntil; /* password expiration time, if any */ +#ifdef CATALOG_VARLEN + text rolpassword; /* password, if any */ +#endif } FormData_pg_authid;That change IIRC is wrong, because it'll make rolvaliduntil until NOT
NULL (any column that's fixed width and has only fixed with columns
before it is marked as such).You were muttering about a BKI_FORCE_NOT_NULL option ... for symmetry,
maybe we could add BKI_FORCE_NULL as well, and then use that for cases
like this?
Yea, I guess it'd not be too hard.
Also, if we want to insist that these fields be accessed
through heap_getattr, I'd be inclined to put them inside the "#ifdef
CATALOG_VARLEN" to enforce that.
That we definitely should do. It's imo just a small bug that it was
omitted here. I'll fix it, but not backpatch unless you prefer?
I'm generally -1 on reordering any catalog columns as part of this patch.
There should be zero user-visible change from it IMO. However, if we
stick both those columns inside the ifdef, we don't need to reorder.
Agreed.
Greetings,
Andres Freund
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Andres Freund <andres@2ndquadrant.com> writes:
On 2015-02-16 21:34:57 -0500, Tom Lane wrote:
Also, if we want to insist that these fields be accessed
through heap_getattr, I'd be inclined to put them inside the "#ifdef
CATALOG_VARLEN" to enforce that.
That we definitely should do. It's imo just a small bug that it was
omitted here. I'll fix it, but not backpatch unless you prefer?
Agreed, that's really independent of FLEXIBLE_ARRAY_MEMBER, so fix
it as its own patch. I also agree that back-patching is probably
not appropriate.
regards, tom lane
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Wed, Feb 18, 2015 at 10:09 PM, Andres Freund <andres@2ndquadrant.com> wrote:
On 2015-02-18 17:15:18 +0900, Michael Paquier wrote:
- I don't think that the t_bits fields in htup_details.h should be
updated either.Why not? Any not broken code should already use MinHeapTupleSize and
similar macros.Changing t_bits impacts HeapTupleHeaderData, ReorderBufferTupleBuf and
similarly a couple of redo routines in heapam.c using
HeapTupleHeaderData in a couple of structures not placing it at the
end (compiler complains).The compiler will complain if you use a FLEXIBLE_ARRAY_MEMBER in the
middle of a struct but not when when you embed a struct that uses it
into the middle another struct. At least gcc doesn't and I think it'd be
utterly broken if another compiler did that. If there's a compiler that
does so, we need to make it define FLEXIBLE_ARRAY_MEMBER to 1.
clang does complain on my OSX laptop regarding that ;)
--
Michael
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 2015-02-19 07:10:19 +0900, Michael Paquier wrote:
On Wed, Feb 18, 2015 at 10:09 PM, Andres Freund <andres@2ndquadrant.com> wrote:
On 2015-02-18 17:15:18 +0900, Michael Paquier wrote:
- I don't think that the t_bits fields in htup_details.h should be
updated either.Why not? Any not broken code should already use MinHeapTupleSize and
similar macros.Changing t_bits impacts HeapTupleHeaderData, ReorderBufferTupleBuf and
similarly a couple of redo routines in heapam.c using
HeapTupleHeaderData in a couple of structures not placing it at the
end (compiler complains).The compiler will complain if you use a FLEXIBLE_ARRAY_MEMBER in the
middle of a struct but not when when you embed a struct that uses it
into the middle another struct. At least gcc doesn't and I think it'd be
utterly broken if another compiler did that. If there's a compiler that
does so, we need to make it define FLEXIBLE_ARRAY_MEMBER to 1.clang does complain on my OSX laptop regarding that ;)
I think that then puts the idea of doing it for those structs pretty
much to bed.
Greetings,
Andres Freund
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Michael Paquier <michael.paquier@gmail.com> writes:
On Wed, Feb 18, 2015 at 10:09 PM, Andres Freund <andres@2ndquadrant.com> wrote:
The compiler will complain if you use a FLEXIBLE_ARRAY_MEMBER in the
middle of a struct but not when when you embed a struct that uses it
into the middle another struct. At least gcc doesn't and I think it'd be
utterly broken if another compiler did that. If there's a compiler that
does so, we need to make it define FLEXIBLE_ARRAY_MEMBER to 1.
clang does complain on my OSX laptop regarding that ;)
I'm a bit astonished that gcc doesn't consider this an error. Sure seems
like it should. (Has anyone tried it on recent gcc?) I am entirely
opposed to Andreas' claim that we ought to consider compilers that do warn
to be broken; if anything it's the other way around.
Moreover, if we have any code that is assuming such cases are okay, it
probably needs a second look. Isn't this situation effectively assuming
that a variable-length array is fixed-length?
regards, tom lane
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Thu, Feb 19, 2015 at 7:29 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Michael Paquier <michael.paquier@gmail.com> writes:
On Wed, Feb 18, 2015 at 10:09 PM, Andres Freund <andres@2ndquadrant.com> wrote:
The compiler will complain if you use a FLEXIBLE_ARRAY_MEMBER in the
middle of a struct but not when when you embed a struct that uses it
into the middle another struct. At least gcc doesn't and I think it'd be
utterly broken if another compiler did that. If there's a compiler that
does so, we need to make it define FLEXIBLE_ARRAY_MEMBER to 1.clang does complain on my OSX laptop regarding that ;)
I'm a bit astonished that gcc doesn't consider this an error. Sure seems
like it should. (Has anyone tried it on recent gcc?)
Just tried with gcc 4.9.2 on an ArchLinux bix and it does not complain
after switching the declaration of varlena declaration from [1] to
FLEXIBLE_ARRAY_MEMBER in c.h on HEAD. But it does with clang 3.5.1 on
the same box.
I am entirely
opposed to Andreas' claim that we ought to consider compilers that do warn
to be broken; if anything it's the other way around.
I'm on board with that.
Moreover, if we have any code that is assuming such cases are okay, it
probably needs a second look. Isn't this situation effectively assuming
that a variable-length array is fixed-length?
AFAIK, switching a bunch of things to use FLEXIBLE_ARRAY_MEMBER has
put a couple of things in light that could be revisited:
1) tuptoaster.c, with this declaration of varlena:
struct
{
struct varlena hdr;
char data[TOAST_MAX_CHUNK_SIZE]; /* make
struct big enough */
int32 align_it; /* ensure struct is
aligned well enough */
} chunk_data;
2) inv_api.c with this thing:
struct
{
bytea hdr;
char data[LOBLKSIZE]; /* make struct
big enough */
int32 align_it; /* ensure struct is
aligned well enough */
} workbuf;
3) heapam.c in three places with HeapTupleHeaderData:
struct
{
HeapTupleHeaderData hdr;
char data[MaxHeapTupleSize];
} tbuf;
4) pg_authid.h with its use of relpasswd.
5) reorderbuffer.h with its use of HeapTupleHeaderData:
typedef struct ReorderBufferTupleBuf
{
/* position in preallocated list */
slist_node node;
/* tuple, stored sequentially */
HeapTupleData tuple;
HeapTupleHeaderData header;
char data[MaxHeapTupleSize];
} ReorderBufferTupleBuf;
Those issues can be grouped depending on where foo[1] is switched to
FLEXIBLE_ARRAY_MEMBER, so I will try to get a set of patches depending
on that.
Regards,
--
Michael
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Michael Paquier <michael.paquier@gmail.com> writes:
On Thu, Feb 19, 2015 at 7:29 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Moreover, if we have any code that is assuming such cases are okay, it
probably needs a second look. Isn't this situation effectively assuming
that a variable-length array is fixed-length?
AFAIK, switching a bunch of things to use FLEXIBLE_ARRAY_MEMBER has
put a couple of things in light that could be revisited:
1) tuptoaster.c, with this declaration of varlena:
struct
{
struct varlena hdr;
char data[TOAST_MAX_CHUNK_SIZE]; /* make
struct big enough */
int32 align_it; /* ensure struct is
aligned well enough */
} chunk_data;
I'm pretty sure that thing ought to be a union, not a struct.
2) inv_api.c with this thing:
struct
{
bytea hdr;
char data[LOBLKSIZE]; /* make struct
big enough */
int32 align_it; /* ensure struct is
aligned well enough */
} workbuf;
And probably this too.
3) heapam.c in three places with HeapTupleHeaderData:
struct
{
HeapTupleHeaderData hdr;
char data[MaxHeapTupleSize];
} tbuf;
And this, though I'm not sure if we'd have to change the size of the
padding data[] member.
5) reorderbuffer.h with its use of HeapTupleHeaderData:
Hmm. Andres will have to answer for that one ;-)
regards, tom lane
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 19/02/15 15:00, Tom Lane wrote:
Michael Paquier <michael.paquier@gmail.com> writes:
On Thu, Feb 19, 2015 at 7:29 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Moreover, if we have any code that is assuming such cases are okay, it
probably needs a second look. Isn't this situation effectively assuming
that a variable-length array is fixed-length?AFAIK, switching a bunch of things to use FLEXIBLE_ARRAY_MEMBER has
put a couple of things in light that could be revisited:
1) tuptoaster.c, with this declaration of varlena:
struct
{
struct varlena hdr;
char data[TOAST_MAX_CHUNK_SIZE]; /* make
struct big enough */
int32 align_it; /* ensure struct is
aligned well enough */
} chunk_data;I'm pretty sure that thing ought to be a union, not a struct.
2) inv_api.c with this thing:
struct
{
bytea hdr;
char data[LOBLKSIZE]; /* make struct
big enough */
int32 align_it; /* ensure struct is
aligned well enough */
} workbuf;And probably this too.
3) heapam.c in three places with HeapTupleHeaderData:
struct
{
HeapTupleHeaderData hdr;
char data[MaxHeapTupleSize];
} tbuf;And this, though I'm not sure if we'd have to change the size of the
padding data[] member.5) reorderbuffer.h with its use of HeapTupleHeaderData:
Hmm. Andres will have to answer for that one ;-)
regards, tom lane
Curious, has this problem been raised with the gcc maintainers?
Is this still a problem with gcc 5.0 (which is due to be released soon)?
Cheers,
Gavin
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Thu, Feb 19, 2015 at 11:00 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Michael Paquier <michael.paquier@gmail.com> writes:
On Thu, Feb 19, 2015 at 7:29 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Moreover, if we have any code that is assuming such cases are okay, it
probably needs a second look. Isn't this situation effectively assuming
that a variable-length array is fixed-length?AFAIK, switching a bunch of things to use FLEXIBLE_ARRAY_MEMBER has
put a couple of things in light that could be revisited:
1) tuptoaster.c, with this declaration of varlena:
struct
...
} chunk_data;I'm pretty sure that thing ought to be a union, not a struct.
2) inv_api.c with this thing:
...And probably this too.
Sounds good to me, but with an additional VARHDRSZ to give enough room
IMO (or toast_save_datum explodes).
3) heapam.c in three places with HeapTupleHeaderData:
struct
{
HeapTupleHeaderData hdr;
char data[MaxHeapTupleSize];
} tbuf;And this, though I'm not sure if we'd have to change the size of the
padding data[] member.
Here I think that we should add sizeof(HeapTupleHeaderData) to ensure
that there is enough room
5) reorderbuffer.h with its use of HeapTupleHeaderData:
Hmm. Andres will have to answer for that one ;-)
Surely. This impacts decode.c and reorder.c at quick glance.
So, attached are a new set of patches:
1) 0001 is more or less the same as upthread, changing trivial places
with foo[1]. I have checked as well calls to sizeof for the structures
impacted:
1-1) In dumputils.c, I guess that the call of sizeof with
SimpleStringListCell should be changed as follows:
cell = (SimpleStringListCell *)
- pg_malloc(sizeof(SimpleStringListCell) + strlen(val));
+ pg_malloc(sizeof(SimpleStringListCell));
1-2) sizeof(ParamListInfoData) is present in a couple of places,
assuming that sizeof(ParamListInfoData) has the equivalent of 1
parameter, like prepare.c, functions.c, spi.c and postgres.c:
- /* sizeof(ParamListInfoData) includes the first array element */
paramLI = (ParamListInfo)
palloc(sizeof(ParamListInfoData) +
- (num_params - 1) * sizeof(ParamExternData));
+ num_params * sizeof(ParamExternData));
1-3) FuncCandidateList in namespace.c (thanks Andres!):
newResult = (FuncCandidateList)
- palloc(sizeof(struct _FuncCandidateList) - sizeof(Oid)
- + effective_nargs * sizeof(Oid));
+ palloc(sizeof(struct _FuncCandidateList) +
+ effective_nargs * sizeof(Oid));
I imagine that we do not want for those palloc calls to use ifdef
FLEXIBLE_ARRAY_MEMBER to save some memory for code readability even if
compiler does not support flexible-array length, right?
2) 0002 fixes pg_authid to use CATALOG_VARLEN, this is needed for 0003.
3) 0003 switches varlena in c.h to use FLEXIBLE_ARRAY_MEMBER, with
necessary tweaks added for tuptoaster.c and inv_api.c
4) 0004 is some preparatory work before switching HeapTupleHeaderData
and MinimalTupleData, changing the struct declarations to union in
heapam.c with enough room ensured for processing.
Regards,
--
Michael
Attachments:
0001-First-cut-with-FLEXIBLE_ARRAY_MEMBER.patchapplication/x-patch; name=0001-First-cut-with-FLEXIBLE_ARRAY_MEMBER.patchDownload+80-81
0002-Add-forgotten-CATALOG_VARLEN-pg_authid.patchapplication/x-patch; name=0002-Add-forgotten-CATALOG_VARLEN-pg_authid.patchDownload+2-2
0003-Switch-varlena-to-use-FLEXIBLE_ARRAY_MEMBER.patchapplication/x-patch; name=0003-Switch-varlena-to-use-FLEXIBLE_ARRAY_MEMBER.patchDownload+7-8
0004-Replace-some-struct-declarations-with-union-in-heapa.patchapplication/x-patch; name=0004-Replace-some-struct-declarations-with-union-in-heapa.patchDownload+8-7
Michael Paquier <michael.paquier@gmail.com> writes:
1-2) sizeof(ParamListInfoData) is present in a couple of places, assuming that sizeof(ParamListInfoData) has the equivalent of 1 parameter, like prepare.c, functions.c, spi.c and postgres.c: - /* sizeof(ParamListInfoData) includes the first array element */ paramLI = (ParamListInfo) palloc(sizeof(ParamListInfoData) + - (num_params - 1) * sizeof(ParamExternData)); + num_params * sizeof(ParamExternData)); 1-3) FuncCandidateList in namespace.c (thanks Andres!): newResult = (FuncCandidateList) - palloc(sizeof(struct _FuncCandidateList) - sizeof(Oid) - + effective_nargs * sizeof(Oid)); + palloc(sizeof(struct _FuncCandidateList) + + effective_nargs * sizeof(Oid)); I imagine that we do not want for those palloc calls to use ifdef FLEXIBLE_ARRAY_MEMBER to save some memory for code readability even if compiler does not support flexible-array length, right?
These are just wrong. As a general rule, we do not want to *ever* take
sizeof() a struct that contains a flexible array: the results will not
be consistent across platforms. The right thing is to use offsetof()
instead. See the helpful comment autoconf provides:
/* Define to nothing if C supports flexible array members, and to 1 if it does
not. That way, with a declaration like `struct s { int n; double
d[FLEXIBLE_ARRAY_MEMBER]; };', the struct hack can be used with pre-C99
compilers. When computing the size of such an object, don't use 'sizeof
(struct s)' as it overestimates the size. Use 'offsetof (struct s, d)'
instead. Don't use 'offsetof (struct s, d[0])', as this doesn't work with
MSVC and with C++ compilers. */
#define FLEXIBLE_ARRAY_MEMBER /**/
This point is actually the main reason we've not done this change long
since. People did not feel like running around to make sure there were
no overlooked uses of sizeof().
regards, tom lane
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Thu, Feb 19, 2015 at 2:58 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Michael Paquier <michael.paquier@gmail.com> writes:
1-2) sizeof(ParamListInfoData) is present in a couple of places, assuming that sizeof(ParamListInfoData) has the equivalent of 1 parameter, like prepare.c, functions.c, spi.c and postgres.c: - /* sizeof(ParamListInfoData) includes the first array element */ paramLI = (ParamListInfo) palloc(sizeof(ParamListInfoData) + - (num_params - 1) * sizeof(ParamExternData)); + num_params * sizeof(ParamExternData)); 1-3) FuncCandidateList in namespace.c (thanks Andres!): newResult = (FuncCandidateList) - palloc(sizeof(struct _FuncCandidateList) - sizeof(Oid) - + effective_nargs * sizeof(Oid)); + palloc(sizeof(struct _FuncCandidateList) + + effective_nargs * sizeof(Oid)); I imagine that we do not want for those palloc calls to use ifdef FLEXIBLE_ARRAY_MEMBER to save some memory for code readability even if compiler does not support flexible-array length, right?These are just wrong. As a general rule, we do not want to *ever* take
sizeof() a struct that contains a flexible array: the results will not
be consistent across platforms. The right thing is to use offsetof()
instead. See the helpful comment autoconf provides:[...]
And I had this one in front of my eyes a couple of hours ago... Thanks.
This point is actually the main reason we've not done this change long
since. People did not feel like running around to make sure there were
no overlooked uses of sizeof().
Thanks for the clarifications and the review. Attached is a new set.
--
Michael
Attachments:
0001-First-cut-with-FLEXIBLE_ARRAY_MEMBER.patchapplication/x-patch; name=0001-First-cut-with-FLEXIBLE_ARRAY_MEMBER.patchDownload+86-86
0002-Add-forgotten-CATALOG_VARLEN-pg_authid.patchapplication/x-patch; name=0002-Add-forgotten-CATALOG_VARLEN-pg_authid.patchDownload+2-2
0003-Switch-varlena-to-use-FLEXIBLE_ARRAY_MEMBER.patchapplication/x-patch; name=0003-Switch-varlena-to-use-FLEXIBLE_ARRAY_MEMBER.patchDownload+7-8
0004-Replace-some-struct-declarations-with-union-in-heapa.patchapplication/x-patch; name=0004-Replace-some-struct-declarations-with-union-in-heapa.patchDownload+8-7
On Thu, Feb 19, 2015 at 3:57 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:
On Thu, Feb 19, 2015 at 2:58 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Michael Paquier <michael.paquier@gmail.com> writes:
1-2) sizeof(ParamListInfoData) is present in a couple of places, assuming that sizeof(ParamListInfoData) has the equivalent of 1 parameter, like prepare.c, functions.c, spi.c and postgres.c: - /* sizeof(ParamListInfoData) includes the first array element */ paramLI = (ParamListInfo) palloc(sizeof(ParamListInfoData) + - (num_params - 1) * sizeof(ParamExternData)); + num_params * sizeof(ParamExternData)); 1-3) FuncCandidateList in namespace.c (thanks Andres!): newResult = (FuncCandidateList) - palloc(sizeof(struct _FuncCandidateList) - sizeof(Oid) - + effective_nargs * sizeof(Oid)); + palloc(sizeof(struct _FuncCandidateList) + + effective_nargs * sizeof(Oid)); I imagine that we do not want for those palloc calls to use ifdef FLEXIBLE_ARRAY_MEMBER to save some memory for code readability even if compiler does not support flexible-array length, right?These are just wrong. As a general rule, we do not want to *ever* take
sizeof() a struct that contains a flexible array: the results will not
be consistent across platforms. The right thing is to use offsetof()
instead. See the helpful comment autoconf provides:[...]
And I had this one in front of my eyes a couple of hours ago... Thanks.
This point is actually the main reason we've not done this change long
since. People did not feel like running around to make sure there were
no overlooked uses of sizeof().Thanks for the clarifications and the review. Attached is a new set.
Grr. Completely forgot to use offsetof in dumputils.c as well. Patch
that can be applied on top of 0001 is attached.
--
Michael
Attachments:
dumputils-fix.patchapplication/x-patch; name=dumputils-fix.patchDownload+1-1
On 2015-02-18 17:29:27 -0500, Tom Lane wrote:
Michael Paquier <michael.paquier@gmail.com> writes:
On Wed, Feb 18, 2015 at 10:09 PM, Andres Freund <andres@2ndquadrant.com> wrote:
The compiler will complain if you use a FLEXIBLE_ARRAY_MEMBER in the
middle of a struct but not when when you embed a struct that uses it
into the middle another struct. At least gcc doesn't and I think it'd be
utterly broken if another compiler did that. If there's a compiler that
does so, we need to make it define FLEXIBLE_ARRAY_MEMBER to 1.clang does complain on my OSX laptop regarding that ;)
I'm a bit astonished that gcc doesn't consider this an error. Sure seems
like it should.
Why? The flexible arrary stuff tells the compiler that it doesn't have
to worry about space for the array - it seems alright that it actually
doesn't. There's pretty much no way you can do that sensibly if the
variable length array itself is somewhere in the middle of a struct -
but if you embed the whole struct somewhere you have to take care
yourself. And e.g. the varlena cases Michael has shown do just that?
(Has anyone tried it on recent gcc?)
Yes.
Moreover, if we have any code that is assuming such cases are okay, it
probably needs a second look. Isn't this situation effectively assuming
that a variable-length array is fixed-length?
Not really. If you have
struct varlena hdr;
char data[TOAST_MAX_CHUNK_SIZE]; /* make struct big enough */
the variable length part is preallocated in the data?
You're right that many of these structs could just be replaced with a
union though.
Greetings,
Andres Freund
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 2015-02-18 21:00:43 -0500, Tom Lane wrote:
Michael Paquier <michael.paquier@gmail.com> writes:
3) heapam.c in three places with HeapTupleHeaderData:
struct
{
HeapTupleHeaderData hdr;
char data[MaxHeapTupleSize];
} tbuf;And this, though I'm not sure if we'd have to change the size of the
padding data[] member.
I don't think so.
/*
* MaxHeapTupleSize is the maximum allowed size of a heap tuple, including
* header and MAXALIGN alignment padding. Basically it's BLCKSZ minus the
* other stuff that has to be on a disk page. Since heap pages use no
* "special space", there's no deduction for that.
...
#define MaxHeapTupleSize (BLCKSZ - MAXALIGN(SizeOfPageHeaderData + sizeof(ItemIdData)))
5) reorderbuffer.h with its use of HeapTupleHeaderData:
Hmm. Andres will have to answer for that one ;-)
That should be fairly uncomplicated to replace.
...
/* tuple, stored sequentially */
HeapTupleData tuple;
HeapTupleHeaderData header;
char data[MaxHeapTupleSize];
probably can just be replaced by a union of data and header part - as
quoted above MaxHeapTupleSize actually contains space for the
header. It's a bit annoying because potentially some output plugin might
reference .header - but they can just be changed to reference
tuple.t_data instead.
Greetings,
Andres Freund
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Michael Paquier <michael.paquier@gmail.com> writes:
Thanks for the clarifications and the review. Attached is a new set.
I've reviewed and pushed the 0001 patch (you missed a few things :-().
Let's see how unhappy the buildfarm is with this before we start on
the rest of them.
regards, tom lane
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers