Remove configure --disable-float4-byval and --disable-float8-byval
AFAICT, these build options were only useful to maintain compatibility
for version-0 functions, but those are no longer supported, so these
options can be removed. There is a fair amount of code all over the
place to support these options, so the cleanup is quite significant.
The current behavior became the default in PG9.3. Note that this does
not affect on-disk storage. The only upgrade issue that I can see is
that pg_upgrade refuses to upgrade incompatible clusters if you have
contrib/isn installed. But hopefully everyone who is affected by that
will have upgraded at least once since PG9.2 already.
float4 is now always pass-by-value; the pass-by-reference code path is
completely removed.
float8 and related types are now hardcoded to pass-by-value or
pass-by-reference depending on whether the build is 64- or 32-bit, as
was previously also the default.
--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachments:
0001-Remove-configure-disable-float4-byval-and-disable-fl.patchtext/plain; charset=UTF-8; name=0001-Remove-configure-disable-float4-byval-and-disable-fl.patch; x-mac-creator=0; x-mac-type=0Download+40-383
0002-Replace-USE_FLOAT8_BYVAL-by-FLOAT8PASSBYVAL.patchtext/plain; charset=UTF-8; name=0002-Replace-USE_FLOAT8_BYVAL-by-FLOAT8PASSBYVAL.patch; x-mac-creator=0; x-mac-type=0Download+18-25
0003-Convert-some-if-calls-to-compiler-if-s.patchtext/plain; charset=UTF-8; name=0003-Convert-some-if-calls-to-compiler-if-s.patch; x-mac-creator=0; x-mac-type=0Download+16-27
Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes:
float4 is now always pass-by-value; the pass-by-reference code path is
completely removed.
I think this is OK.
float8 and related types are now hardcoded to pass-by-value or
pass-by-reference depending on whether the build is 64- or 32-bit, as
was previously also the default.
I'm less happy with doing this. It makes it impossible to test the
pass-by-reference code paths without actually firing up a 32-bit
environment. It'd be fine to document --disable-float8-byval as
a developer-only option (it might be so already), but I don't want
to lose it completely. I fail to see any advantage in getting rid
of it, anyway, since we do still have to maintain both code paths.
regards, tom lane
On Thu, Oct 31, 2019 at 9:36 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
float8 and related types are now hardcoded to pass-by-value or
pass-by-reference depending on whether the build is 64- or 32-bit, as
was previously also the default.I'm less happy with doing this. It makes it impossible to test the
pass-by-reference code paths without actually firing up a 32-bit
environment. It'd be fine to document --disable-float8-byval as
a developer-only option (it might be so already), but I don't want
to lose it completely. I fail to see any advantage in getting rid
of it, anyway, since we do still have to maintain both code paths.
Could we get around this by making Datum 8 bytes everywhere?
Years ago, we got rid of INT64_IS_BUSTED, so we're relying on 64-bit
types working on all platforms. However, Datum on a system where
pointers are only 32 bits wide is also only 32 bits wide, so we can't
pass 64-bit quantities the same way we do elsewhere. But, why is the
width of a Datum equal to the width of a pointer, anyway? It would
surely cost something to widen 1, 2, and 4 byte quantities to 8 bytes
when packing them into datums on 32-bit platforms, but (1) we've long
since accepted that cost on 64-bit platforms, (2) it would save
palloc/pfree cycles when packing 8 byte quantities into 4-byte values,
(3) 32-bit platforms are now marginal and performance on those
platforms is not critical, and (4) it would simplify a lot of code and
reduce future bugs.
On a related note, why do we store typbyval in the catalog anyway
instead of inferring it from typlen and maybe typalign? It seems like
a bad idea to record on disk the way we pass around values in memory,
because it means that a change to how values are passed around in
memory has ramifications for on-disk compatibility.
rhaas=# select typname, typlen, typbyval, typalign from pg_type where
typlen in (1,2,4,8) != typbyval;
typname | typlen | typbyval | typalign
----------+--------+----------+----------
macaddr8 | 8 | f | i
(1 row)
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Fri, Nov 1, 2019 at 7:41 AM Robert Haas <robertmhaas@gmail.com> wrote:
Could we get around this by making Datum 8 bytes everywhere?
I really like that idea.
Even Raspberry Pi devices (which can cost as little as $35) use 64-bit
ARM processors. It's abundantly clear that 32-bit platforms do not
matter enough to justify keeping all the SIZEOF_DATUM crud around.
--
Peter Geoghegan
Peter Geoghegan <pg@bowt.ie> writes:
On Fri, Nov 1, 2019 at 7:41 AM Robert Haas <robertmhaas@gmail.com> wrote:
Could we get around this by making Datum 8 bytes everywhere?
I really like that idea.
Even Raspberry Pi devices (which can cost as little as $35) use 64-bit
ARM processors. It's abundantly clear that 32-bit platforms do not
matter enough to justify keeping all the SIZEOF_DATUM crud around.
This line of argument seems to me to be the moral equivalent of
"let's drop 32-bit support altogether". I'm not entirely on board
with that. Certainly, a lot of the world is 64-bit these days,
but people are still building small systems and they might want
a database; preferably one that hasn't been detuned to the extent
that it barely manages to run at all on such a platform. Making
a whole lot of internal APIs 64-bit would be a pretty big hit for
a 32-bit platform --- more instructions, more memory consumed for
things like Datum arrays, all in a memory space that's not that big.
It seems especially insane to conclude that we should pull the plug
on such use-cases just to get rid of one obscure configure option.
If we were expending any significant devel effort on supporting
32-bit platforms, I might be ready to drop support, but we're not.
(Robert's proposal looks to me like it's actually creating new work
to do, not saving work.)
regards, tom lane
On Fri, Nov 1, 2019 at 11:00 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
This line of argument seems to me to be the moral equivalent of
"let's drop 32-bit support altogether". I'm not entirely on board
with that.
I don't think that those two things are equivalent at all. There may
even be workloads that will benefit when run on 32-bit hardware.
Having to palloc() and pfree() with 8 byte integers is probably very
slow.
Certainly, a lot of the world is 64-bit these days,
but people are still building small systems and they might want
a database; preferably one that hasn't been detuned to the extent
that it barely manages to run at all on such a platform.
Even the very low end is 64-bit these days. $100 smartphones have
64-bit CPUs and 4GB of ram. AFAICT, anything still being produced that
is recognizable as a general purpose CPU (e.g. by having virtual
memory) is 64-bit. We're talking about a change that can't affect
users until late 2020 at the earliest.
I accept that there are some number of users that still have 32-bit
Postgres installations, probably because they happened to have a
32-bit machine close at hand. The economics of running a database
server on a 32-bit machine are already awful, though.
It seems especially insane to conclude that we should pull the plug
on such use-cases just to get rid of one obscure configure option.
If we were expending any significant devel effort on supporting
32-bit platforms, I might be ready to drop support, but we're not.
(Robert's proposal looks to me like it's actually creating new work
to do, not saving work.)
The mental burden of considering "SIZEOF_DATUM == 4" and
"USE_FLOAT8_BYVAL" is a real cost for us. We maintain non-trivial
32-bit only code for numeric abbreviated keys. We also have to worry
about pfree()'ing memory when USE_FLOAT8_BYVAL within
heapam_index_validate_scan(). How confident are we that there isn't
some place that leaks memory on !USE_FLOAT8_BYVAL builds because
somebody forgot to add a pfree() in an #ifdef block?
--
Peter Geoghegan
On Fri, Nov 1, 2019 at 3:15 PM Peter Geoghegan <pg@bowt.ie> wrote:
I don't think that those two things are equivalent at all. There may
even be workloads that will benefit when run on 32-bit hardware.
Having to palloc() and pfree() with 8 byte integers is probably very
slow.
Yeah! I mean, users who are using only 4-byte or smaller pass-by-value
quantities will be harmed, especially in cases where they are storing
a lot of them at the same time (e.g. sorting) and especially if they
double their space consumption and run out of their very limited
supply of memory. Those are all worthwhile considerations and perhaps
strong arguments against my proposal. But, people using 8-byte
oughta-be-pass-by-value quantities are certainly being harmed by the
present system. It's not a black-and-white thing, like, oh, 8-byte
datums on 32-bit system is awful all the time. Or at least, I don't
think it is.
The mental burden of considering "SIZEOF_DATUM == 4" and
"USE_FLOAT8_BYVAL" is a real cost for us. We maintain non-trivial
32-bit only code for numeric abbreviated keys. We also have to worry
about pfree()'ing memory when USE_FLOAT8_BYVAL within
heapam_index_validate_scan(). How confident are we that there isn't
some place that leaks memory on !USE_FLOAT8_BYVAL builds because
somebody forgot to add a pfree() in an #ifdef block?
Yeah, I've had to fight with this multiple times, and so have other
people. It's a nuisance, and it causes bugs, certainly in draft
patches, sometimes in committed ones. It's not "free." If it's a cost
worth paying, ok, but is it?
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Fri, Nov 1, 2019 at 1:19 PM Robert Haas <robertmhaas@gmail.com> wrote:
Yeah! I mean, users who are using only 4-byte or smaller pass-by-value
quantities will be harmed, especially in cases where they are storing
a lot of them at the same time (e.g. sorting) and especially if they
double their space consumption and run out of their very limited
supply of memory.
I think that you meant treble, not double. You're forgetting about the
palloc() header overhead. ;-)
Doing even slightly serious work on a 32-bit machine is penny wise and
pound foolish. I also believe that we should support minority
platforms reasonably well, including 32-bit platforms, because it's
always a good idea to try to meet people where they are. Your proposal
doesn't seem like it really gives up on that goal.
Those are all worthwhile considerations and perhaps
strong arguments against my proposal. But, people using 8-byte
oughta-be-pass-by-value quantities are certainly being harmed by the
present system. It's not a black-and-white thing, like, oh, 8-byte
datums on 32-bit system is awful all the time. Or at least, I don't
think it is.
Right.
Yeah, I've had to fight with this multiple times, and so have other
people. It's a nuisance, and it causes bugs, certainly in draft
patches, sometimes in committed ones. It's not "free." If it's a cost
worth paying, ok, but is it?
#ifdef crud is something that we should go out of our way to eliminate
on general principle. All good portable C codebases go to great
lengths to encapsulate platform differences, if necessary by adding a
compatibility layer. One of the worst things about the OpenSSL
codebase is that it makes writing portable code everybody's problem.
--
Peter Geoghegan
On Fri, Nov 01, 2019 at 02:00:10PM -0400, Tom Lane wrote:
Peter Geoghegan <pg@bowt.ie> writes:
Even Raspberry Pi devices (which can cost as little as $35) use 64-bit
ARM processors. It's abundantly clear that 32-bit platforms do not
matter enough to justify keeping all the SIZEOF_DATUM crud around.This line of argument seems to me to be the moral equivalent of
"let's drop 32-bit support altogether". I'm not entirely on board
with that. Certainly, a lot of the world is 64-bit these days,
but people are still building small systems and they might want
a database; preferably one that hasn't been detuned to the extent
that it barely manages to run at all on such a platform. Making
a whole lot of internal APIs 64-bit would be a pretty big hit for
a 32-bit platform --- more instructions, more memory consumed for
things like Datum arrays, all in a memory space that's not that big.
I don't agree as well with the line of arguments to just remove 32b
support. The newest models of PI indeed use 64b ARM processors, but
the first model, as well as the PI zero are on 32b if I recall
correctly, and I would like to believe that these are still widely
used.
--
Michael
On 2019-11-02 11:47:26 +0900, Michael Paquier wrote:
On Fri, Nov 01, 2019 at 02:00:10PM -0400, Tom Lane wrote:
Peter Geoghegan <pg@bowt.ie> writes:
Even Raspberry Pi devices (which can cost as little as $35) use 64-bit
ARM processors. It's abundantly clear that 32-bit platforms do not
matter enough to justify keeping all the SIZEOF_DATUM crud around.This line of argument seems to me to be the moral equivalent of
"let's drop 32-bit support altogether". I'm not entirely on board
with that. Certainly, a lot of the world is 64-bit these days,
but people are still building small systems and they might want
a database; preferably one that hasn't been detuned to the extent
that it barely manages to run at all on such a platform. Making
a whole lot of internal APIs 64-bit would be a pretty big hit for
a 32-bit platform --- more instructions, more memory consumed for
things like Datum arrays, all in a memory space that's not that big.I don't agree as well with the line of arguments to just remove 32b
support.
Nobody is actually proposing that, as far as I can tell? I mean, by all
means argue that the overhead is too high, but just claiming that
slowing down 32bit systems by a measurable fraction is morally
equivalent to removing 32bit support seems pointlessly facetious.
Greetings,
Andres Freund
On Fri, Nov 1, 2019 at 7:47 PM Michael Paquier <michael@paquier.xyz> wrote:
This line of argument seems to me to be the moral equivalent of
"let's drop 32-bit support altogether". I'm not entirely on board
with that. Certainly, a lot of the world is 64-bit these days,
but people are still building small systems and they might want
a database; preferably one that hasn't been detuned to the extent
that it barely manages to run at all on such a platform. Making
a whole lot of internal APIs 64-bit would be a pretty big hit for
a 32-bit platform --- more instructions, more memory consumed for
things like Datum arrays, all in a memory space that's not that big.I don't agree as well with the line of arguments to just remove 32b
support.
Clearly you didn't read what I actually wrote, Michael.
--
Peter Geoghegan
On 2019-10-31 14:36, Tom Lane wrote:
Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes:
float4 is now always pass-by-value; the pass-by-reference code path is
completely removed.I think this is OK.
OK, here is a patch for just this part, and we can continue the
discussion on the rest in the meantime.
--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachments:
v2-0001-Remove-configure-disable-float4-byval.patchtext/plain; charset=UTF-8; name=v2-0001-Remove-configure-disable-float4-byval.patch; x-mac-creator=0; x-mac-type=0Download+18-191
On 2019-11-01 15:41, Robert Haas wrote:
On a related note, why do we store typbyval in the catalog anyway
instead of inferring it from typlen and maybe typalign? It seems like
a bad idea to record on disk the way we pass around values in memory,
because it means that a change to how values are passed around in
memory has ramifications for on-disk compatibility.
This sounds interesting. It would remove a pg_upgrade hazard (in the
long run).
There is some backward compatibility to be concerned about. This change
would require extension authors to change their code to insert #ifdef
USE_FLOAT8_BYVAL or similar, where currently their code might only
support one method or the other.
rhaas=# select typname, typlen, typbyval, typalign from pg_type where
typlen in (1,2,4,8) != typbyval;
There are also typlen=6 types. Who knew. ;-)
typname | typlen | typbyval | typalign
----------+--------+----------+----------
macaddr8 | 8 | f | i
(1 row)
This might be a case of the above issue: It's easier to just make it
pass by reference always than deal with a bunch of #ifdefs.
--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes:
On 2019-11-01 15:41, Robert Haas wrote:
On a related note, why do we store typbyval in the catalog anyway
instead of inferring it from typlen and maybe typalign? It seems like
a bad idea to record on disk the way we pass around values in memory,
because it means that a change to how values are passed around in
memory has ramifications for on-disk compatibility.
This sounds interesting. It would remove a pg_upgrade hazard (in the
long run).
There is some backward compatibility to be concerned about.
Yeah. The point here is that typbyval specifies what the C functions
concerned with the datatype are expecting. We can't just up and say
"we're going to decide that for you".
I do get the point that supporting two different typbyval options
for float8 and related types is a nontrivial cost.
regards, tom lane
Hi,
On 2019-11-02 08:46:12 +0100, Peter Eisentraut wrote:
On 2019-11-01 15:41, Robert Haas wrote:
On a related note, why do we store typbyval in the catalog anyway
instead of inferring it from typlen and maybe typalign? It seems like
a bad idea to record on disk the way we pass around values in memory,
because it means that a change to how values are passed around in
memory has ramifications for on-disk compatibility.This sounds interesting. It would remove a pg_upgrade hazard (in the long
run).There is some backward compatibility to be concerned about. This change
would require extension authors to change their code to insert #ifdef
USE_FLOAT8_BYVAL or similar, where currently their code might only support
one method or the other.
I think we really ought to remove the difference behind macros. That is,
for types that could be either, provide macros that fetch function
arguments into local memory, independent of whether the argument is a
byval or byref type. I.e. instead of having separate #ifdef
USE_FLOAT8_BYVALs for DatumGetFloat8(), DatumGetInt64(), ... we should
provide that logic in one centralized set of macros.
The fact that USE_FLOAT8_BYVAL has to creep into various functions imo
is the reasons why people are unhappy about it.
One way to do this would be something roughly like this sketch:
/* allow to force types to be byref, for testing purposes */
#if USE_FLOAT8_BYVAL
#define DatumForTypeIsByval(type) (sizeof(type) <= SIZEOF_DATUM)
#else
#define DatumForTypeIsByval(type) (sizeof(type) <= sizeof(int))
#endif
/* yes, I dream of being C++ once grown up */
#define DefineSmallFixedWidthDatumTypeConversions(type, TypeToDatumName, DatumToTypeName) \
static inline type \
TypeToDatumName (type value) \
{ \
if (DatumForTypeIsByval(type)) \
{ \
Datum tmp; \
\
/* ensure correct conversion, consider e.g. float */ \
memcpy(&tmp, &value, sizeof(type)); \
\
return tmp; \
} \
else \
{ \
type *tmp = (type *) palloc(sizeof(datum)); \
*tmp = value;
return PointerGetDatum(tmp); \
} \
} \
\
static inline type \
DatumToTypeName (Datum datum) \
{ \
if (DatumForTypeIsByval(type)) \
{ \
type tmp; \
\
/* ensure correct conversion */ \
memcpy(&tmp, &datum, sizeof(type)); \
\
return tmp; \
} \
else \
return *(type *) DatumGetPointer(type); \
} \
And then have
DefineSmallFixedWidthDatumTypeConversions(
float8,
Float8GetDatum,
DatumGetFloat8);
DefineSmallFixedWidthDatumTypeConversions(
int64,
Int64GetDatum,
DatumGetInt64);
And now also
DefineSmallFixedWidthDatumTypeConversions(
macaddr,
Macaddr8GetDatum,
DatumGetMacaddr8);
(there's probably plenty of bugs in the above, it's just a sketch)
We don't have to break types / extensions with inferring byval for fixed
width types. Instead we can change CREATE TYPE's PASSEDBYVALUE to accept
an optional parameter 'auto', allowing to opt in.
rhaas=# select typname, typlen, typbyval, typalign from pg_type where
typlen in (1,2,4,8) != typbyval;There are also typlen=6 types. Who knew. ;-)
There's a recent thread about them :)
typname | typlen | typbyval | typalign
----------+--------+----------+----------
macaddr8 | 8 | f | i
(1 row)This might be a case of the above issue: It's easier to just make it pass by
reference always than deal with a bunch of #ifdefs.
Indeed. And that's a bad sign imo.
Greetings,
Andres Freund
On Fri, Nov 01, 2019 at 09:41:58PM -0700, Peter Geoghegan wrote:
On Fri, Nov 1, 2019 at 7:47 PM Michael Paquier <michael@paquier.xyz> wrote:
I don't agree as well with the line of arguments to just remove 32b
support.Clearly you didn't read what I actually wrote, Michael.
Sorry, that was an misunderstanding from my side.
--
Michael
On Sat, Nov 2, 2019 at 8:00 PM Andres Freund <andres@anarazel.de> wrote:
I think we really ought to remove the difference behind macros. That is,
for types that could be either, provide macros that fetch function
arguments into local memory, independent of whether the argument is a
byval or byref type. I.e. instead of having separate #ifdef
USE_FLOAT8_BYVALs for DatumGetFloat8(), DatumGetInt64(), ... we should
provide that logic in one centralized set of macros.The fact that USE_FLOAT8_BYVAL has to creep into various functions imo
is the reasons why people are unhappy about it.
I think I'm *more* unhappy about the fact that it affects a bunch of
things that are not about whether float8 is passed byval. I mean, you
mention DatumGetInt64() above, but why in the world does a setting
with "float8" in the name affect how we pass int64? The thing is like
kudzu, getting into all sorts of things that it has no business
affecting - at least if you judge by the name - and for no really
clear reason.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 2019-11-02 08:39, Peter Eisentraut wrote:
On 2019-10-31 14:36, Tom Lane wrote:
Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes:
float4 is now always pass-by-value; the pass-by-reference code path is
completely removed.I think this is OK.
OK, here is a patch for just this part, and we can continue the
discussion on the rest in the meantime.
I have committed this part.
I will rebase and continue developing the rest of the patches based on
the discussion so far.
--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Thu, Nov 21, 2019 at 07:20:28PM +0100, Peter Eisentraut wrote:
I have committed this part.
I will rebase and continue developing the rest of the patches based on the
discussion so far.
Based on that I am marking the patch as committed in the CF. The rest
of the patch set could have a new entry.
--
Michael
My revised proposal is to remove --disable-float8-byval as a configure
option but keep it as an option in pg_config_manual.h. It is no longer
useful as a user-facing option, but as was pointed out, it is somewhat
useful for developers, so pg_config_manual.h seems like the right place.
--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services