WIP: Relaxing the constraints on numeric scale
When specifying NUMERIC(precision, scale) the scale is constrained to
the range [0, precision], which is per SQL spec. However, at least one
other major database vendor intentionally does not impose this
restriction, since allowing scales outside this range can be useful.
A negative scale implies rounding before the decimal point. For
example, a column declared as NUMERIC(3,-3) rounds values to the
nearest thousand, and can hold values up to 999000.
(Note that the display scale remains non-negative, so all digits
before the decimal point are displayed, and none of the internals of
numeric.c need to worry about negative dscale values. Only the scale
in the typemod is negative.)
A scale greater than the precision constrains the value to be less
than 0.1. For example, a column declared as NUMERIC(3,6) can hold
"micro" quantities up to 0.000999.
Attached is a WIP patch supporting this.
Regards,
Dean
Attachments:
numeric-scale.patchtext/x-patch; charset=US-ASCII; name=numeric-scale.patchDownload+76-32
On Tue, Jun 29, 2021 at 3:58 PM Dean Rasheed <dean.a.rasheed@gmail.com> wrote:
When specifying NUMERIC(precision, scale) the scale is constrained to
the range [0, precision], which is per SQL spec. However, at least one
other major database vendor intentionally does not impose this
restriction, since allowing scales outside this range can be useful.
I thought about this too, but
/messages/by-id/774767.1591985683@sss.pgh.pa.us made me think that
it would be an on-disk format break. Maybe it's not, though?
--
Robert Haas
EDB: http://www.enterprisedb.com
On Tue, 29 Jun 2021 at 21:34, Robert Haas <robertmhaas@gmail.com> wrote:
I thought about this too, but
/messages/by-id/774767.1591985683@sss.pgh.pa.us made me think that
it would be an on-disk format break. Maybe it's not, though?
No, because the numeric dscale remains non-negative, so there's no
change to the way numeric values are stored. The only change is to
extend the allowed scale in the numeric typemod.
Regards,
Dean
Robert Haas <robertmhaas@gmail.com> writes:
On Tue, Jun 29, 2021 at 3:58 PM Dean Rasheed <dean.a.rasheed@gmail.com> wrote:
When specifying NUMERIC(precision, scale) the scale is constrained to
the range [0, precision], which is per SQL spec. However, at least one
other major database vendor intentionally does not impose this
restriction, since allowing scales outside this range can be useful.
I thought about this too, but
/messages/by-id/774767.1591985683@sss.pgh.pa.us made me think that
it would be an on-disk format break. Maybe it's not, though?
See further down in that thread --- I don't think there's actually
a need for negative dscale on-disk. However, there remains the question
of whether any external code knows enough about numeric typmods to become
confused by a negative scale field within those.
After reflecting for a bit, I suspect the answer is "probably", but
it seems like it wouldn't be much worse of an update than any number
of other catalog changes we make every release.
regards, tom lane
On Tue, Jun 29, 2021 at 4:46 PM Dean Rasheed <dean.a.rasheed@gmail.com> wrote:
On Tue, 29 Jun 2021 at 21:34, Robert Haas <robertmhaas@gmail.com> wrote:
I thought about this too, but
/messages/by-id/774767.1591985683@sss.pgh.pa.us made me think that
it would be an on-disk format break. Maybe it's not, though?No, because the numeric dscale remains non-negative, so there's no
change to the way numeric values are stored. The only change is to
extend the allowed scale in the numeric typemod.
Ah! Well, in that case, this sounds great.
(I haven't looked at the patch, so this is just an endorsement of the concept.)
--
Robert Haas
EDB: http://www.enterprisedb.com
Attached is a more complete patch, with updated docs and tests.
I chose to allow the scale to be in the range -1000 to 1000, which, to
some extent, is quite arbitrary. The upper limit of 1000 makes sense,
because nearly all numeric computations (other than multiply, add and
subtract) have that as their upper scale limit (that's the maximum
display scale). It also has to be at least 1000 for SQL compliance,
since the precision can be up to 1000.
The lower limit, on the other hand, really is quite arbitrary. -1000
is a nice round number, giving it a certain symmetry, and is almost
certainly sufficient for any realistic use case (-1000 means numbers
are rounded to the nearest multiple of 10^1000).
Also, keeping some spare bits in the typemod might come in handy one
day for something else (e.g., rounding mode choice).
Regards,
Dean
Attachments:
numeric-scale-v2.patchtext/x-patch; charset=US-ASCII; name=numeric-scale-v2.patchDownload+188-33
Dean Rasheed <dean.a.rasheed@gmail.com> writes:
Attached is a more complete patch, with updated docs and tests.
I took a brief look at this and have a couple of quick suggestions:
* As you mention, keeping some spare bits in the typmod might come
in handy some day, but as given this patch isn't really doing so.
I think it might be advisable to mask the scale off at 11 bits,
preserving the high 5 bits of the low-order half of the word for future
use. The main objection to that I guess is that it would complicate
doing sign extension in TYPMOD_SCALE(). But it doesn't seem like we
use that logic in any really hot code paths, so another instruction
or three probably is not much of a cost.
* I agree with wrapping the typmod construction/extraction into macros
(or maybe they should be inline functions?) but the names you chose
seem generic enough to possibly confuse onlookers. I'd suggest
changing TYPMOD to NUMERIC_TYPMOD or NUM_TYPMOD. The comment for them
should probably also explicitly explain "For purely historical reasons,
VARHDRSZ is added to the typmod value after these fields are combined",
or words to that effect.
* It might be advisable to write NUMERIC_MIN_SCALE with parens:
#define NUMERIC_MIN_SCALE (-1000)
to avoid any precedence gotchas.
* I'd be inclined to leave the num_typemod_test table in place,
rather than dropping it, so that it serves to exercise pg_dump
for these cases during the pg_upgrade test.
Haven't read the code in detail yet.
regards, tom lane
On Wed, 21 Jul 2021 at 22:33, Tom Lane <tgl@sss.pgh.pa.us> wrote:
I took a brief look at this and have a couple of quick suggestions:
Thanks for looking at this!
* As you mention, keeping some spare bits in the typmod might come
in handy some day, but as given this patch isn't really doing so.
I think it might be advisable to mask the scale off at 11 bits,
preserving the high 5 bits of the low-order half of the word for future
use. The main objection to that I guess is that it would complicate
doing sign extension in TYPMOD_SCALE(). But it doesn't seem like we
use that logic in any really hot code paths, so another instruction
or three probably is not much of a cost.
Yeah, that makes sense, and it's worth documenting where the spare bits are.
Interestingly, gcc recognised the bit hack I used for sign extension
and turned it into (x << 21) >> 21 using x86 shl and sar instructions,
though I didn't write it that way because apparently that's not
portable.
* I agree with wrapping the typmod construction/extraction into macros
(or maybe they should be inline functions?) but the names you chose
seem generic enough to possibly confuse onlookers. I'd suggest
changing TYPMOD to NUMERIC_TYPMOD or NUM_TYPMOD. The comment for them
should probably also explicitly explain "For purely historical reasons,
VARHDRSZ is added to the typmod value after these fields are combined",
or words to that effect.
I've turned them into inline functions, since that makes them easier
to read, and debug if necessary.
All your other suggestions make sense too. Attached is a new version.
Regards,
Dean
Attachments:
numeric-scale-v3.patchtext/x-patch; charset=US-ASCII; name=numeric-scale-v3.patchDownload+221-33
Dean Rasheed <dean.a.rasheed@gmail.com> writes:
All your other suggestions make sense too. Attached is a new version.
OK, I've now studied this more closely, and have some additional
nitpicks:
* I felt the way you did the documentation was confusing. It seems
better to explain the normal case first, and then describe the two
extended cases.
* As long as we're encapsulating typmod construction/extraction, let's
also encapsulate the checks for valid typmods.
* Other places are fairly careful to declare typmod values as "int32",
so I think this code should too.
Attached is a proposed delta patch making those changes.
(I made the docs mention that the extension cases are allowed as of v15.
While useful in the short run, that will look like noise in ten years;
so I could go either way on whether to do that.)
If you're good with these, then I think it's ready to go.
I'll mark it RfC in the commitfest.
regards, tom lane
Attachments:
numeric-scale-v3-delta.patchtext/x-diff; charset=us-ascii; name=numeric-scale-v3-delta.patchDownload+55-35
On Fri, 23 Jul 2021 at 16:50, Tom Lane <tgl@sss.pgh.pa.us> wrote:
OK, I've now studied this more closely, and have some additional
nitpicks:* I felt the way you did the documentation was confusing. It seems
better to explain the normal case first, and then describe the two
extended cases.
OK, that looks much better. Re-reading the entire section, I think
it's much clearer now.
* As long as we're encapsulating typmod construction/extraction, let's
also encapsulate the checks for valid typmods.
Good idea.
* Other places are fairly careful to declare typmod values as "int32",
so I think this code should too.
OK, that seems sensible.
Attached is a proposed delta patch making those changes.
(I made the docs mention that the extension cases are allowed as of v15.
While useful in the short run, that will look like noise in ten years;
so I could go either way on whether to do that.)
Hmm, yeah. In general,I find such things in the documentation useful
for quite a few years. I'm regularly looking to see when a particular
feature was added, to see if I can use it in a particular situation.
But eventually, it'll become irrelevant, and I don't know if anyone
will go around tidying these things up. I have left it in, but perhaps
there is a wider discussion to be had about whether we should be doing
that more (or less) often. FWIW, I like the way some docs include an
"available since" tag (e.g,, Java's @since tag).
If you're good with these, then I think it's ready to go.
I'll mark it RfC in the commitfest.
Thanks. That all looked good, so I have pushed it.
Regards,
Dean