Best data type to use for sales tax percent

Started by Mike Christensenover 16 years ago23 messagesgeneral
Jump to latest
#1Mike Christensen
mike@kitchenpc.com

(Sorry for the super-easy question)

I want to store sales tax (as a percent) in the DB, such as 9.5%.
What's the best data type for this? I'm guessing numeric(2,3) should
be fine, yes? I'm not too familiar with the numeric type (I was using
"real" before), but as I understand the data will be stored using the
necessary number of bits on the disk? Thanks!

Mike

#2Steve Crawford
scrawford@pinpointresearch.com
In reply to: Mike Christensen (#1)
Re: Best data type to use for sales tax percent

Mike Christensen wrote:

(Sorry for the super-easy question)

I want to store sales tax (as a percent) in the DB, such as 9.5%.
What's the best data type for this? I'm guessing numeric(2,3) should
be fine, yes? I'm not too familiar with the numeric type (I was using
"real" before), but as I understand the data will be stored using the
necessary number of bits on the disk? Thanks!

Mike

Peruse: http://www.postgresql.org/docs/8.4/interactive/datatype-numeric.html

Pay special attention to nuggets like:

The data types real and double precision are inexact...

and

If you require exact storage and calculations (such as for monetary
amounts), use the numeric type instead....

Cheers,
Steve

#3Christophe Pettus
xof@thebuild.com
In reply to: Mike Christensen (#1)
Re: Best data type to use for sales tax percent

On Oct 8, 2009, at 3:04 PM, Mike Christensen wrote:

I'm guessing numeric(2,3) should
be fine, yes?

Numeric is definitely what you want. You probably want at least four
fractional digits, since sales tax (at least in California) is
routinely set to a quarter of a point. For example, the basic sales
tax rate in Los Angeles is 9.75%, so 0.0975. (There are other
subtleties in sales tax calculation in California; feel free to ask
off-list if you want more utterly non-PostgreSQL-related detail. :) )

--
-- Christophe Pettus
xof@thebuild.com

#4Merlin Moncure
mmoncure@gmail.com
In reply to: Mike Christensen (#1)
Re: Best data type to use for sales tax percent

On Thu, Oct 8, 2009 at 6:04 PM, Mike Christensen <mike@kitchenpc.com> wrote:

(Sorry for the super-easy question)

I want to store sales tax (as a percent) in the DB, such as 9.5%.
What's the best data type for this?  I'm guessing numeric(2,3) should
be fine, yes?  I'm not too familiar with the numeric type (I was using
"real" before), but as I understand the data will be stored using the
necessary number of bits on the disk?  Thanks!

numeric(2,3) is not possible :-). IMO, a percentage should be stored
numeric(6,3) or so. That gives up to 100% down to thousandth of a
percent.

merlin

#5Mike Christensen
mike@kitchenpc.com
In reply to: Merlin Moncure (#4)
Re: Best data type to use for sales tax percent

Oops sorry I was thinking 2,3 meant 2 significant digits to the left
of the decimal point and 3 to the right. I just re-read the docs and
now see what you mean. 6,3 would work fine though is maybe a bit
overkill since a 100% sales tax rate would cause a violent revolution
and lead to beheadings, at which point Postgres data integrity would
be the least of our worries..

I'll probably just use 3,3 and store this value between 0 and 1, since
all I'll be doing with this number is using it to multiply against a
subtotal. 3,3 gives me 0.000 through 0.999, correct?

Mike

Show quoted text

On Thu, Oct 8, 2009 at 4:38 PM, Merlin Moncure <mmoncure@gmail.com> wrote:

On Thu, Oct 8, 2009 at 6:04 PM, Mike Christensen <mike@kitchenpc.com> wrote:

(Sorry for the super-easy question)

I want to store sales tax (as a percent) in the DB, such as 9.5%.
What's the best data type for this?  I'm guessing numeric(2,3) should
be fine, yes?  I'm not too familiar with the numeric type (I was using
"real" before), but as I understand the data will be stored using the
necessary number of bits on the disk?  Thanks!

numeric(2,3) is not possible :-). IMO, a percentage should be stored
numeric(6,3) or so.  That gives up to 100% down to thousandth of a
percent.

merlin

#6Rich Shepard
rshepard@appl-ecosys.com
In reply to: Mike Christensen (#5)
Re: Best data type to use for sales tax percent

On Thu, 8 Oct 2009, Mike Christensen wrote:

I'll probably just use 3,3 and store this value between 0 and 1, since all
I'll be doing with this number is using it to multiply against a subtotal.
3,3 gives me 0.000 through 0.999, correct?

Mike,

No. The two digits represent the width of the column and the number of
significant digits. Try (4,3).

Rich

#7Lew
noone@lwsc.ehost-services.com
In reply to: Rich Shepard (#6)
Re: Best data type to use for sales tax percent

Rich Shepard wrote:

On Thu, 8 Oct 2009, Mike Christensen wrote:

I'll probably just use 3,3 and store this value between 0 and 1, since
all
I'll be doing with this number is using it to multiply against a
subtotal. 3,3 gives me 0.000 through 0.999, correct?

Mike,

No. The two digits represent the width of the column and the number of
significant digits. Try (4,3).

That will fail for the scenario that Christope Pettus pointed out. California
is not the only state with sales tax specified to a quarter point., or even a
tenth of a point as in Missouri, where a local sales tax can be 9.241%, and
equal or exceed 10%, as in Alabama, Arizona, California and Illinois.
<http://en.wikipedia.org/wiki/Sales_taxes_in_the_United_States&gt;

It's a good idea to research the domain before deciding on the representation.

--
Lew

#8Mike Christensen
mike@kitchenpc.com
In reply to: Rich Shepard (#6)
Re: Best data type to use for sales tax percent

Wouldn't (4,3) let me store 0.000 through 9.999? Maybe I'm still not
following what both numbers mean.

I understand the point about states/counties with 3 decimal digits of
sales tax, so I'd probably want to do (5,5) which should give me
0.00000 - 0.99999, and store 9.825% sales tax as .09825. I'm
suggesting storing sales tax as a number between 0 and 1 so I can
easily multiply it against a subtotal to get the tax amount, storing
anything over 1.0 is unnecessary.

Also, if you just say "numeric" (without any numbers) then Postgres
lets you store any number you wish and will never do any rounding of
any sort, correct? If there a price you pay for this in terms of
perf, bytes on disk, etc?

Another idea is if I'm tying myself down to a certain level of decimal
accuracy in the first place, why not just store everything as an Int2?
9.825% would be stored as 9825 and I'll divide everything by 100000
when I calc sales tax. If I'm not mistaken, integral data types are
faster for Postgres and less bytes on disk, right? BTW, I will never
be doing any math using Postgres, it's just for pure storage..

Mike

Show quoted text

On Thu, Oct 8, 2009 at 5:23 PM, Rich Shepard <rshepard@appl-ecosys.com> wrote:

On Thu, 8 Oct 2009, Mike Christensen wrote:

I'll probably just use 3,3 and store this value between 0 and 1, since all
I'll be doing with this number is using it to multiply against a subtotal.
3,3 gives me 0.000 through 0.999, correct?

Mike,

 No. The two digits represent the width of the column and the number of
significant digits. Try (4,3).

Rich

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#9Peter Eisentraut
peter_e@gmx.net
In reply to: Mike Christensen (#8)
Re: Best data type to use for sales tax percent

On Fri, 2009-10-09 at 00:10 -0700, Mike Christensen wrote:

Wouldn't (4,3) let me store 0.000 through 9.999? Maybe I'm still not
following what both numbers mean.

Yes. If you want 0.000 through 0.999, use numeric(3,3). Adding a check
constraint might increase clarity. And put it in a domain for extra
style points. :-)

#10Sam Mason
sam@samason.me.uk
In reply to: Mike Christensen (#8)
Re: Best data type to use for sales tax percent

On Fri, Oct 09, 2009 at 12:10:41AM -0700, Mike Christensen wrote:

Wouldn't (4,3) let me store 0.000 through 9.999? Maybe I'm still not
following what both numbers mean.

I think Rich was getting confused about how you wanted to represent your
percentages.

I understand the point about states/counties with 3 decimal digits of
sales tax, so I'd probably want to do (5,5) which should give me
0.00000 - 0.99999, and store 9.825% sales tax as .09825. I'm
suggesting storing sales tax as a number between 0 and 1 so I can
easily multiply it against a subtotal to get the tax amount, storing
anything over 1.0 is unnecessary.

This is how I'd normally do it. Ratios for inside the code, just
"format" them as percentages when you want the user to see them.

Also, if you just say "numeric" (without any numbers) then Postgres
lets you store any number you wish and will never do any rounding of
any sort, correct? If there a price you pay for this in terms of
perf, bytes on disk, etc?

It's not possible to do division accurately (not sure about the caveats
in other operators). For example, 1/3 is represented as "0.33333" and
multiplying this by three again will give "0.99999". When people say
that numeric types are "exact" they're not giving you whole truth.

Another idea is if I'm tying myself down to a certain level of decimal
accuracy in the first place, why not just store everything as an Int2?
9.825% would be stored as 9825 and I'll divide everything by 100000
when I calc sales tax. If I'm not mistaken, integral data types are
faster for Postgres and less bytes on disk, right? BTW, I will never
be doing any math using Postgres, it's just for pure storage..

Not sure what range of values you have to cover; you wouldn't be able to
do this with fixed width integer types:

select numeric '100' ^ 300;

Numeric types allow you to do the above, the flexibility of allowing the
representation of a number to get this wide that causes things to be
slower. It's not much slower though, I'd benchmark a test case that's
meaningful to you and then can you make a sensible decision.

--
Sam http://samason.me.uk/

#11Rich Shepard
rshepard@appl-ecosys.com
In reply to: Lew (#7)
Re: Best data type to use for sales tax percent

On Thu, 8 Oct 2009, Lew wrote:

That will fail for the scenario that Christope Pettus pointed out.
California is not the only state with sales tax specified to a quarter
point., or even a tenth of a point as in Missouri, where a local sales tax
can be 9.241%, and equal or exceed 10%, as in Alabama, Arizona, California
and Illinois.
<http://en.wikipedia.org/wiki/Sales_taxes_in_the_United_States&gt;

Good points, Lew. My response was specific to making the field
specification the same as the significant digits specification, not the data
domain, but I accept your correction as completely valid.

Here in Oregon there is no sales tax. Each time it's proposed it's
rejected by voters as heretical. Of course we have some of the highest
income and property taxes around and other "fees" that serve in lieu of
sales tax revenues. But, the majority of voting folks think that having
out-of-state visitor also pay for government services by paying a sales tax
is a Bad Thing. We also are not allowed to fuel our own vehicles (New Jersey
is the only other state with that prohabition), and that's another religious
issue with most voters. Sigh. Can I generalze by writing that every state
government is dysfunctional?

Rich

#12Mike Christensen
mike@kitchenpc.com
In reply to: Sam Mason (#10)
Re: Best data type to use for sales tax percent

So back to my question about representing sales tax, it looks like I
have two choices:

1) Use a numeric(5,5) column. This has the advantage of storing the
sales tax in the exact representation of a percent (I can directly
multiply it against any subtotal to get the sales tax). It also
"looks" nicer in the DB and probably has some style points. It
appears to me there is a massive amount of overhead with these columns
though, something like 8bytes plus whatever storage is required for
the precision. However, even if I had a million rows that's only like
8 megs on the disk. I'm more worried about perf than disk size. I
would assume there's only a perf hit doing math with these types, not
so much just loading them into a dataset.

2) Use an Int2. I'd have to use a multiplier in my source code after
loading the data. In the DB, the data would look kinda funky since
9.825% would be represented as 9825. I think this is a more
efficient number for storage since it only requires 2 bytes. I'd be
able to store up to 65% or so which is fine for any sales tax I've
ever heard of.

I'm kinda leaning towards using the numeric column type, simply
because it seems "cleaner" to me. At work, we use multipliers all
over the place in our DB and it has turned into a complete nightmare.
I'm somewhat of a believer in just storing data exactly how you need
to use it.

Thanks!

Mike

Show quoted text

On Fri, Oct 9, 2009 at 4:13 AM, Sam Mason <sam@samason.me.uk> wrote:

On Fri, Oct 09, 2009 at 12:10:41AM -0700, Mike Christensen wrote:

Wouldn't (4,3) let me store 0.000 through 9.999?  Maybe I'm still not
following what both numbers mean.

I think Rich was getting confused about how you wanted to represent your
percentages.

I understand the point about states/counties with 3 decimal digits of
sales tax, so I'd probably want to do (5,5) which should give me
0.00000 - 0.99999, and store 9.825% sales tax as .09825.  I'm
suggesting storing sales tax as a number between 0 and 1 so I can
easily multiply it against a subtotal to get the tax amount, storing
anything over 1.0 is unnecessary.

This is how I'd normally do it.  Ratios for inside the code, just
"format" them as percentages when you want the user to see them.

Also, if you just say "numeric" (without any numbers) then Postgres
lets you store any number you wish and will never do any rounding of
any sort, correct?  If there a price you pay for this in terms of
perf, bytes on disk, etc?

It's not possible to do division accurately (not sure about the caveats
in other operators).  For example, 1/3 is represented as "0.33333" and
multiplying this by three again will give "0.99999".  When people say
that numeric types are "exact" they're not giving you whole truth.

Another idea is if I'm tying myself down to a certain level of decimal
accuracy in the first place, why not just store everything as an Int2?
 9.825% would be stored as 9825 and I'll divide everything by 100000
when I calc sales tax.  If I'm not mistaken, integral data types are
faster for Postgres and less bytes on disk, right?  BTW, I will never
be doing any math using Postgres, it's just for pure storage..

Not sure what range of values you have to cover; you wouldn't be able to
do this with fixed width integer types:

 select numeric '100' ^ 300;

Numeric types allow you to do the above, the flexibility of allowing the
representation of a number to get this wide that causes things to be
slower.  It's not much slower though, I'd benchmark a test case that's
meaningful to you and then can you make a sensible decision.

--
 Sam  http://samason.me.uk/

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#13Mike Christensen
mike@kitchenpc.com
In reply to: Peter Eisentraut (#9)
Re: Best data type to use for sales tax percent

Can you explain what you mean by "put it in a domain" - I'd love extra
style points, but this sounds like a feature I haven't learned about
yet.

Show quoted text

On Fri, Oct 9, 2009 at 3:38 AM, Peter Eisentraut <peter_e@gmx.net> wrote:

On Fri, 2009-10-09 at 00:10 -0700, Mike Christensen wrote:

Wouldn't (4,3) let me store 0.000 through 9.999?  Maybe I'm still not
following what both numbers mean.

Yes.  If you want 0.000 through 0.999, use numeric(3,3).  Adding a check
constraint might increase clarity.  And put it in a domain for extra
style points. :-)

#14Christophe Pettus
xof@thebuild.com
In reply to: Mike Christensen (#13)
Re: Best data type to use for sales tax percent

On Oct 9, 2009, at 11:36 AM, Mike Christensen wrote:

Can you explain what you mean by "put it in a domain" - I'd love extra
style points, but this sounds like a feature I haven't learned about
yet.

http://www.postgresql.org/docs/8.4/interactive/sql-createdomain.html

Domains are basically type aliases with an optional CHECK clause, so
you could do something like:

CREATE DOMAN sales_tax_rate AS DECIMAL(5,5) CHECK (VALUE >= 0);

Then, you can use the type "sales_tax_rate" in your tables, etc. just
as a normal first-class type. (The only limitation, right now, is
that you can't create an array of them.)

In response to the other email, DECIMAL is definitely the better
solution for what you are looking for.

--
-- Christophe Pettus
xof@thebuild.com

#15Mike Christensen
mike@kitchenpc.com
In reply to: Christophe Pettus (#14)
Re: Best data type to use for sales tax percent

Thanks for the info! I was thinking this would be a short thread but
I definitely appreciate all the information.

I will definitely create a domain for this (and probably for some
other types in my app since I now know about this). However, is the
CHECK really necessary? A numeric(5,5) already has a maximum value of
10^0, so it would already create an overflow error if you set it
higher. Is there an advantage of using the CHECK constraint? Perhaps
this is faster or doesn't lock the row on update or something?

Just to point out, NUMERIC and DECIMAL are one and the same yes?

Mike

Show quoted text

On Fri, Oct 9, 2009 at 11:46 AM, Christophe Pettus <xof@thebuild.com> wrote:

On Oct 9, 2009, at 11:36 AM, Mike Christensen wrote:

Can you explain what you mean by "put it in a domain" - I'd love extra
style points, but this sounds like a feature I haven't learned about
yet.

       http://www.postgresql.org/docs/8.4/interactive/sql-createdomain.html

Domains are basically type aliases with an optional CHECK clause, so you
could do something like:

       CREATE DOMAN sales_tax_rate AS DECIMAL(5,5) CHECK (VALUE >= 0);

Then, you can use the type "sales_tax_rate" in your tables, etc. just as a
normal first-class type.  (The only limitation, right now, is that you can't
create an array of them.)

In response to the other email, DECIMAL is definitely the better solution
for what you are looking for.

--
-- Christophe Pettus
  xof@thebuild.com

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#16Peter Eisentraut
peter_e@gmx.net
In reply to: Christophe Pettus (#14)
Re: Best data type to use for sales tax percent

On Fri, 2009-10-09 at 11:46 -0700, Christophe Pettus wrote:

Domains are basically type aliases with an optional CHECK clause, so
you could do something like:

CREATE DOMAN sales_tax_rate AS DECIMAL(5,5) CHECK (VALUE >= 0);

Then, you can use the type "sales_tax_rate" in your tables, etc. just
as a normal first-class type. (The only limitation, right now, is
that you can't create an array of them.)

Actually I wouldn't bother with the precision and scale at all. I'd go
with something like

CREATE DOMAN sales_tax_rate AS DECIMAL CHECK (VALUE >= 0 AND VALUE <=1);

#17Christophe Pettus
xof@thebuild.com
In reply to: Mike Christensen (#15)
Re: Best data type to use for sales tax percent

On Oct 9, 2009, at 12:14 PM, Mike Christensen wrote:

I will definitely create a domain for this (and probably for some
other types in my app since I now know about this). However, is the
CHECK really necessary? A numeric(5,5) already has a maximum value of
10^0, so it would already create an overflow error if you set it
higher. Is there an advantage of using the CHECK constraint? Perhaps
this is faster or doesn't lock the row on update or something?

Just to point out, NUMERIC and DECIMAL are one and the same yes?

Sorry, NUMERIC is correct; DECIMAL was my brain misfiring.

Since NUMERICs can go negative regardless of scale, the CHECK is handy
to make sure that you do not insert a negative number. (This is
assuming, of course, that a sales tax rate can't actually be negative
for some reason.) It's not required, of course, but it's always
advisable to put as much reasonable data checking into the database as
you can.

Peter Eisentraut's suggestion of just not putting a scale or precision
on the type at all and using CHECK to validate the values is also a
fine way of handling it.
--
-- Christophe Pettus
xof@thebuild.com

#18Jasen Betts
jasen@xnet.co.nz
In reply to: Mike Christensen (#1)
Re: Best data type to use for sales tax percent

On 2009-10-08, Mike Christensen <mike@kitchenpc.com> wrote:

(Sorry for the super-easy question)

I want to store sales tax (as a percent) in the DB, such as 9.5%.
What's the best data type for this?

real, or numeric, probably numeric.

I'm guessing numeric(2,3) should be fine, yes?

depends on the range of values you want to store.

generally percentages are most usefully represented as decimal fractions
especially if you intend to do arithmetic with them.

numeric reccomended useful for financial stuff.

#19Jasen Betts
jasen@xnet.co.nz
In reply to: Mike Christensen (#1)
Re: Best data type to use for sales tax percent

On 2009-10-09, Peter Eisentraut <peter_e@gmx.net> wrote:

On Fri, 2009-10-09 at 11:46 -0700, Christophe Pettus wrote:

Domains are basically type aliases with an optional CHECK clause, so
you could do something like:

CREATE DOMAN sales_tax_rate AS DECIMAL(5,5) CHECK (VALUE >= 0);

Then, you can use the type "sales_tax_rate" in your tables, etc. just
as a normal first-class type. (The only limitation, right now, is
that you can't create an array of them.)

Actually I wouldn't bother with the precision and scale at all. I'd go
with something like

CREATE DOMAN sales_tax_rate AS DECIMAL CHECK (VALUE >= 0 AND VALUE <=1);

why the latter check ( VALUE <=1 )?

#20Roderick A. Anderson
raanders@cyber-office.net
In reply to: Jasen Betts (#19)
Re: Best data type to use for sales tax percent

Jasen Betts wrote:

On 2009-10-09, Peter Eisentraut <peter_e@gmx.net> wrote:

On Fri, 2009-10-09 at 11:46 -0700, Christophe Pettus wrote:

Domains are basically type aliases with an optional CHECK clause, so
you could do something like:

CREATE DOMAN sales_tax_rate AS DECIMAL(5,5) CHECK (VALUE >= 0);

Then, you can use the type "sales_tax_rate" in your tables, etc. just
as a normal first-class type. (The only limitation, right now, is
that you can't create an array of them.)

Actually I wouldn't bother with the precision and scale at all. I'd go
with something like

CREATE DOMAN sales_tax_rate AS DECIMAL CHECK (VALUE >= 0 AND VALUE <=1);

why the latter check ( VALUE <=1 )?

I think the initial post implied it was to be used directly in the
calculations, no "sales_tax_rate/100", so a fraction is needed.
Hopefully no one is experiencing 99%+ tax rates.

\\||/
Rod
--

#21Christophe Pettus
xof@thebuild.com
In reply to: Jasen Betts (#19)
#22Steve Crawford
scrawford@pinpointresearch.com
In reply to: Christophe Pettus (#21)
In reply to: Steve Crawford (#22)