Thread: Best data type to use for sales tax percent

Best data type to use for sales tax percent

From
Mike Christensen
Date:
(Sorry for the super-easy question)

I want to store sales tax (as a percent) in the DB, such as 9.5%.
What's the best data type for this?  I'm guessing numeric(2,3) should
be fine, yes?  I'm not too familiar with the numeric type (I was using
"real" before), but as I understand the data will be stored using the
necessary number of bits on the disk?  Thanks!

Mike

Re: Best data type to use for sales tax percent

From
Steve Crawford
Date:
Mike Christensen wrote:
> (Sorry for the super-easy question)
>
> I want to store sales tax (as a percent) in the DB, such as 9.5%.
> What's the best data type for this?  I'm guessing numeric(2,3) should
> be fine, yes?  I'm not too familiar with the numeric type (I was using
> "real" before), but as I understand the data will be stored using the
> necessary number of bits on the disk?  Thanks!
>
> Mike
>
>
Peruse: http://www.postgresql.org/docs/8.4/interactive/datatype-numeric.html

Pay special attention to nuggets like:

The data types real and double precision are inexact...

and

If you require exact storage and calculations (such as for monetary
amounts), use the numeric type instead....

Cheers,
Steve


Re: Best data type to use for sales tax percent

From
Christophe Pettus
Date:
On Oct 8, 2009, at 3:04 PM, Mike Christensen wrote:
> I'm guessing numeric(2,3) should
> be fine, yes?

Numeric is definitely what you want.  You probably want at least four
fractional digits, since sales tax (at least in California) is
routinely set to a quarter of a point.  For example, the basic sales
tax rate in Los Angeles is 9.75%, so 0.0975.  (There are other
subtleties in sales tax calculation in California; feel free to ask
off-list if you want more utterly non-PostgreSQL-related detail. :) )

--
-- Christophe Pettus
    xof@thebuild.com


Re: Best data type to use for sales tax percent

From
Merlin Moncure
Date:
On Thu, Oct 8, 2009 at 6:04 PM, Mike Christensen <mike@kitchenpc.com> wrote:
> (Sorry for the super-easy question)
>
> I want to store sales tax (as a percent) in the DB, such as 9.5%.
> What's the best data type for this?  I'm guessing numeric(2,3) should
> be fine, yes?  I'm not too familiar with the numeric type (I was using
> "real" before), but as I understand the data will be stored using the
> necessary number of bits on the disk?  Thanks!

numeric(2,3) is not possible :-). IMO, a percentage should be stored
numeric(6,3) or so.  That gives up to 100% down to thousandth of a
percent.

merlin

Re: Best data type to use for sales tax percent

From
Mike Christensen
Date:
Oops sorry I was thinking 2,3 meant 2 significant digits to the left
of the decimal point and 3 to the right.  I just re-read the docs and
now see what you mean.  6,3 would work fine though is maybe a bit
overkill since a 100% sales tax rate would cause a violent revolution
and lead to beheadings, at which point Postgres data integrity would
be the least of our worries..

I'll probably just use 3,3 and store this value between 0 and 1, since
all I'll be doing with this number is using it to multiply against a
subtotal.  3,3 gives me 0.000 through 0.999, correct?

Mike

On Thu, Oct 8, 2009 at 4:38 PM, Merlin Moncure <mmoncure@gmail.com> wrote:
> On Thu, Oct 8, 2009 at 6:04 PM, Mike Christensen <mike@kitchenpc.com> wrote:
>> (Sorry for the super-easy question)
>>
>> I want to store sales tax (as a percent) in the DB, such as 9.5%.
>> What's the best data type for this?  I'm guessing numeric(2,3) should
>> be fine, yes?  I'm not too familiar with the numeric type (I was using
>> "real" before), but as I understand the data will be stored using the
>> necessary number of bits on the disk?  Thanks!
>
> numeric(2,3) is not possible :-). IMO, a percentage should be stored
> numeric(6,3) or so.  That gives up to 100% down to thousandth of a
> percent.
>
> merlin
>

Re: Best data type to use for sales tax percent

From
Rich Shepard
Date:
On Thu, 8 Oct 2009, Mike Christensen wrote:

> I'll probably just use 3,3 and store this value between 0 and 1, since all
> I'll be doing with this number is using it to multiply against a subtotal.
> 3,3 gives me 0.000 through 0.999, correct?

Mike,

   No. The two digits represent the width of the column and the number of
significant digits. Try (4,3).

Rich

Re: Best data type to use for sales tax percent

From
Lew
Date:
Rich Shepard wrote:
> On Thu, 8 Oct 2009, Mike Christensen wrote:
>
>> I'll probably just use 3,3 and store this value between 0 and 1, since
>> all
>> I'll be doing with this number is using it to multiply against a
>> subtotal. 3,3 gives me 0.000 through 0.999, correct?
>
> Mike,
>
>   No. The two digits represent the width of the column and the number of
> significant digits. Try (4,3).

That will fail for the scenario that Christope Pettus pointed out.  California
is not the only state with sales tax specified to a quarter point., or even a
tenth of a point as in Missouri, where a local sales tax can be 9.241%, and
equal or exceed 10%, as in Alabama, Arizona, California and Illinois.
<http://en.wikipedia.org/wiki/Sales_taxes_in_the_United_States>

It's a good idea to research the domain before deciding on the representation.

--
Lew

Re: Best data type to use for sales tax percent

From
Mike Christensen
Date:
Wouldn't (4,3) let me store 0.000 through 9.999?  Maybe I'm still not
following what both numbers mean.

I understand the point about states/counties with 3 decimal digits of
sales tax, so I'd probably want to do (5,5) which should give me
0.00000 - 0.99999, and store 9.825% sales tax as .09825.  I'm
suggesting storing sales tax as a number between 0 and 1 so I can
easily multiply it against a subtotal to get the tax amount, storing
anything over 1.0 is unnecessary.

Also, if you just say "numeric" (without any numbers) then Postgres
lets you store any number you wish and will never do any rounding of
any sort, correct?  If there a price you pay for this in terms of
perf, bytes on disk, etc?

Another idea is if I'm tying myself down to a certain level of decimal
accuracy in the first place, why not just store everything as an Int2?
 9.825% would be stored as 9825 and I'll divide everything by 100000
when I calc sales tax.  If I'm not mistaken, integral data types are
faster for Postgres and less bytes on disk, right?  BTW, I will never
be doing any math using Postgres, it's just for pure storage..

Mike

On Thu, Oct 8, 2009 at 5:23 PM, Rich Shepard <rshepard@appl-ecosys.com> wrote:
> On Thu, 8 Oct 2009, Mike Christensen wrote:
>
>> I'll probably just use 3,3 and store this value between 0 and 1, since all
>> I'll be doing with this number is using it to multiply against a subtotal.
>> 3,3 gives me 0.000 through 0.999, correct?
>
> Mike,
>
>  No. The two digits represent the width of the column and the number of
> significant digits. Try (4,3).
>
> Rich
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>

Re: Best data type to use for sales tax percent

From
Peter Eisentraut
Date:
On Fri, 2009-10-09 at 00:10 -0700, Mike Christensen wrote:
> Wouldn't (4,3) let me store 0.000 through 9.999?  Maybe I'm still not
> following what both numbers mean.

Yes.  If you want 0.000 through 0.999, use numeric(3,3).  Adding a check
constraint might increase clarity.  And put it in a domain for extra
style points. :-)



Re: Best data type to use for sales tax percent

From
Sam Mason
Date:
On Fri, Oct 09, 2009 at 12:10:41AM -0700, Mike Christensen wrote:
> Wouldn't (4,3) let me store 0.000 through 9.999?  Maybe I'm still not
> following what both numbers mean.

I think Rich was getting confused about how you wanted to represent your
percentages.

> I understand the point about states/counties with 3 decimal digits of
> sales tax, so I'd probably want to do (5,5) which should give me
> 0.00000 - 0.99999, and store 9.825% sales tax as .09825.  I'm
> suggesting storing sales tax as a number between 0 and 1 so I can
> easily multiply it against a subtotal to get the tax amount, storing
> anything over 1.0 is unnecessary.

This is how I'd normally do it.  Ratios for inside the code, just
"format" them as percentages when you want the user to see them.

> Also, if you just say "numeric" (without any numbers) then Postgres
> lets you store any number you wish and will never do any rounding of
> any sort, correct?  If there a price you pay for this in terms of
> perf, bytes on disk, etc?

It's not possible to do division accurately (not sure about the caveats
in other operators).  For example, 1/3 is represented as "0.33333" and
multiplying this by three again will give "0.99999".  When people say
that numeric types are "exact" they're not giving you whole truth.

> Another idea is if I'm tying myself down to a certain level of decimal
> accuracy in the first place, why not just store everything as an Int2?
>  9.825% would be stored as 9825 and I'll divide everything by 100000
> when I calc sales tax.  If I'm not mistaken, integral data types are
> faster for Postgres and less bytes on disk, right?  BTW, I will never
> be doing any math using Postgres, it's just for pure storage..

Not sure what range of values you have to cover; you wouldn't be able to
do this with fixed width integer types:

  select numeric '100' ^ 300;

Numeric types allow you to do the above, the flexibility of allowing the
representation of a number to get this wide that causes things to be
slower.  It's not much slower though, I'd benchmark a test case that's
meaningful to you and then can you make a sensible decision.

--
  Sam  http://samason.me.uk/

Re: Best data type to use for sales tax percent

From
Rich Shepard
Date:
On Thu, 8 Oct 2009, Lew wrote:

> That will fail for the scenario that Christope Pettus pointed out.
> California is not the only state with sales tax specified to a quarter
> point., or even a tenth of a point as in Missouri, where a local sales tax
> can be 9.241%, and equal or exceed 10%, as in Alabama, Arizona, California
> and Illinois.
> <http://en.wikipedia.org/wiki/Sales_taxes_in_the_United_States>

   Good points, Lew. My response was specific to making the field
specification the same as the significant digits specification, not the data
domain, but I accept your correction as completely valid.

   Here in Oregon there is no sales tax. Each time it's proposed it's
rejected by voters as heretical. Of course we have some of the highest
income and property taxes around and other "fees" that serve in lieu of
sales tax revenues. But, the majority of voting folks think that having
out-of-state visitor also pay for government services by paying a sales tax
is a Bad Thing. We also are not allowed to fuel our own vehicles (New Jersey
is the only other state with that prohabition), and that's another religious
issue with most voters. Sigh. Can I generalze by writing that every state
government is dysfunctional?

Rich

Re: Best data type to use for sales tax percent

From
Mike Christensen
Date:
So back to my question about representing sales tax, it looks like I
have two choices:

1) Use a numeric(5,5) column.  This has the advantage of storing the
sales tax in the exact representation of a percent (I can directly
multiply it against any subtotal to get the sales tax).  It also
"looks" nicer in the DB and probably has some style points.  It
appears to me there is a massive amount of overhead with these columns
though, something like 8bytes plus whatever storage is required for
the precision.  However, even if I had a million rows that's only like
8 megs on the disk.  I'm more worried about perf than disk size.  I
would assume there's only a perf hit doing math with these types, not
so much just loading them into a dataset.

2) Use an Int2.  I'd have to use a multiplier in my source code after
loading the data.  In the DB, the data would look kinda funky since
9.825% would be represented as 9825.   I think this is a more
efficient number for storage since it only requires 2 bytes.  I'd be
able to store up to 65% or so which is fine for any sales tax I've
ever heard of.

I'm kinda leaning towards using the numeric column type, simply
because it seems "cleaner" to me.  At work, we use multipliers all
over the place in our DB and it has turned into a complete nightmare.
I'm somewhat of a believer in just storing data exactly how you need
to use it.

Thanks!

Mike

On Fri, Oct 9, 2009 at 4:13 AM, Sam Mason <sam@samason.me.uk> wrote:
> On Fri, Oct 09, 2009 at 12:10:41AM -0700, Mike Christensen wrote:
>> Wouldn't (4,3) let me store 0.000 through 9.999?  Maybe I'm still not
>> following what both numbers mean.
>
> I think Rich was getting confused about how you wanted to represent your
> percentages.
>
>> I understand the point about states/counties with 3 decimal digits of
>> sales tax, so I'd probably want to do (5,5) which should give me
>> 0.00000 - 0.99999, and store 9.825% sales tax as .09825.  I'm
>> suggesting storing sales tax as a number between 0 and 1 so I can
>> easily multiply it against a subtotal to get the tax amount, storing
>> anything over 1.0 is unnecessary.
>
> This is how I'd normally do it.  Ratios for inside the code, just
> "format" them as percentages when you want the user to see them.
>
>> Also, if you just say "numeric" (without any numbers) then Postgres
>> lets you store any number you wish and will never do any rounding of
>> any sort, correct?  If there a price you pay for this in terms of
>> perf, bytes on disk, etc?
>
> It's not possible to do division accurately (not sure about the caveats
> in other operators).  For example, 1/3 is represented as "0.33333" and
> multiplying this by three again will give "0.99999".  When people say
> that numeric types are "exact" they're not giving you whole truth.
>
>> Another idea is if I'm tying myself down to a certain level of decimal
>> accuracy in the first place, why not just store everything as an Int2?
>>  9.825% would be stored as 9825 and I'll divide everything by 100000
>> when I calc sales tax.  If I'm not mistaken, integral data types are
>> faster for Postgres and less bytes on disk, right?  BTW, I will never
>> be doing any math using Postgres, it's just for pure storage..
>
> Not sure what range of values you have to cover; you wouldn't be able to
> do this with fixed width integer types:
>
>  select numeric '100' ^ 300;
>
> Numeric types allow you to do the above, the flexibility of allowing the
> representation of a number to get this wide that causes things to be
> slower.  It's not much slower though, I'd benchmark a test case that's
> meaningful to you and then can you make a sensible decision.
>
> --
>  Sam  http://samason.me.uk/
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>

Re: Best data type to use for sales tax percent

From
Mike Christensen
Date:
Can you explain what you mean by "put it in a domain" - I'd love extra
style points, but this sounds like a feature I haven't learned about
yet.

On Fri, Oct 9, 2009 at 3:38 AM, Peter Eisentraut <peter_e@gmx.net> wrote:
> On Fri, 2009-10-09 at 00:10 -0700, Mike Christensen wrote:
>> Wouldn't (4,3) let me store 0.000 through 9.999?  Maybe I'm still not
>> following what both numbers mean.
>
> Yes.  If you want 0.000 through 0.999, use numeric(3,3).  Adding a check
> constraint might increase clarity.  And put it in a domain for extra
> style points. :-)
>
>
>

Re: Best data type to use for sales tax percent

From
Christophe Pettus
Date:
On Oct 9, 2009, at 11:36 AM, Mike Christensen wrote:

> Can you explain what you mean by "put it in a domain" - I'd love extra
> style points, but this sounds like a feature I haven't learned about
> yet.


    http://www.postgresql.org/docs/8.4/interactive/sql-createdomain.html

Domains are basically type aliases with an optional CHECK clause, so
you could do something like:

    CREATE DOMAN sales_tax_rate AS DECIMAL(5,5) CHECK (VALUE >= 0);

Then, you can use the type "sales_tax_rate" in your tables, etc. just
as a normal first-class type.  (The only limitation, right now, is
that you can't create an array of them.)

In response to the other email, DECIMAL is definitely the better
solution for what you are looking for.

--
-- Christophe Pettus
    xof@thebuild.com


Re: Best data type to use for sales tax percent

From
Mike Christensen
Date:
Thanks for the info!  I was thinking this would be a short thread but
I definitely appreciate all the information.

I will definitely create a domain for this (and probably for some
other types in my app since I now know about this).  However, is the
CHECK really necessary?  A numeric(5,5) already has a maximum value of
10^0, so it would already create an overflow error if you set it
higher.  Is there an advantage of using the CHECK constraint?  Perhaps
this is faster or doesn't lock the row on update or something?

Just to point out, NUMERIC and DECIMAL are one and the same yes?

Mike

On Fri, Oct 9, 2009 at 11:46 AM, Christophe Pettus <xof@thebuild.com> wrote:
>
> On Oct 9, 2009, at 11:36 AM, Mike Christensen wrote:
>
>> Can you explain what you mean by "put it in a domain" - I'd love extra
>> style points, but this sounds like a feature I haven't learned about
>> yet.
>
>
>        http://www.postgresql.org/docs/8.4/interactive/sql-createdomain.html
>
> Domains are basically type aliases with an optional CHECK clause, so you
> could do something like:
>
>        CREATE DOMAN sales_tax_rate AS DECIMAL(5,5) CHECK (VALUE >= 0);
>
> Then, you can use the type "sales_tax_rate" in your tables, etc. just as a
> normal first-class type.  (The only limitation, right now, is that you can't
> create an array of them.)
>
> In response to the other email, DECIMAL is definitely the better solution
> for what you are looking for.
>
> --
> -- Christophe Pettus
>   xof@thebuild.com
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>

Re: Best data type to use for sales tax percent

From
Peter Eisentraut
Date:
On Fri, 2009-10-09 at 11:46 -0700, Christophe Pettus wrote:
> Domains are basically type aliases with an optional CHECK clause, so
> you could do something like:
>
>     CREATE DOMAN sales_tax_rate AS DECIMAL(5,5) CHECK (VALUE >= 0);
>
> Then, you can use the type "sales_tax_rate" in your tables, etc. just
> as a normal first-class type.  (The only limitation, right now, is
> that you can't create an array of them.)

Actually I wouldn't bother with the precision and scale at all.  I'd go
with something like

CREATE DOMAN sales_tax_rate AS DECIMAL CHECK (VALUE >= 0 AND VALUE <=1);




Re: Best data type to use for sales tax percent

From
Christophe Pettus
Date:
On Oct 9, 2009, at 12:14 PM, Mike Christensen wrote:
> I will definitely create a domain for this (and probably for some
> other types in my app since I now know about this).  However, is the
> CHECK really necessary?  A numeric(5,5) already has a maximum value of
> 10^0, so it would already create an overflow error if you set it
> higher.  Is there an advantage of using the CHECK constraint?  Perhaps
> this is faster or doesn't lock the row on update or something?
>
> Just to point out, NUMERIC and DECIMAL are one and the same yes?

Sorry, NUMERIC is correct; DECIMAL was my brain misfiring.

Since NUMERICs can go negative regardless of scale, the CHECK is handy
to make sure that you do not insert a negative number.  (This is
assuming, of course, that a sales tax rate can't actually be negative
for some reason.)  It's not required, of course, but it's always
advisable to put as much reasonable data checking into the database as
you can.

Peter Eisentraut's suggestion of just not putting a scale or precision
on the type at all and using CHECK to validate the values is also a
fine way of handling it.
--
-- Christophe Pettus
    xof@thebuild.com


Re: Best data type to use for sales tax percent

From
Jasen Betts
Date:
On 2009-10-08, Mike Christensen <mike@kitchenpc.com> wrote:
> (Sorry for the super-easy question)
>
> I want to store sales tax (as a percent) in the DB, such as 9.5%.
> What's the best data type for this?

real, or numeric, probably numeric.

> I'm guessing numeric(2,3) should be fine, yes?

depends on the range of values you want to store.

generally percentages are most usefully represented as decimal fractions
especially if you intend to do arithmetic with them.

numeric reccomended useful for financial stuff.

Re: Best data type to use for sales tax percent

From
Jasen Betts
Date:
On 2009-10-09, Peter Eisentraut <peter_e@gmx.net> wrote:
> On Fri, 2009-10-09 at 11:46 -0700, Christophe Pettus wrote:
>> Domains are basically type aliases with an optional CHECK clause, so
>> you could do something like:
>>
>>     CREATE DOMAN sales_tax_rate AS DECIMAL(5,5) CHECK (VALUE >= 0);
>>
>> Then, you can use the type "sales_tax_rate" in your tables, etc. just
>> as a normal first-class type.  (The only limitation, right now, is
>> that you can't create an array of them.)
>
> Actually I wouldn't bother with the precision and scale at all.  I'd go
> with something like
>
> CREATE DOMAN sales_tax_rate AS DECIMAL CHECK (VALUE >= 0 AND VALUE <=1);

why the latter check ( VALUE <=1 )?

Re: Best data type to use for sales tax percent

From
"Roderick A. Anderson"
Date:
Jasen Betts wrote:
> On 2009-10-09, Peter Eisentraut <peter_e@gmx.net> wrote:
>> On Fri, 2009-10-09 at 11:46 -0700, Christophe Pettus wrote:
>>> Domains are basically type aliases with an optional CHECK clause, so
>>> you could do something like:
>>>
>>>     CREATE DOMAN sales_tax_rate AS DECIMAL(5,5) CHECK (VALUE >= 0);
>>>
>>> Then, you can use the type "sales_tax_rate" in your tables, etc. just
>>> as a normal first-class type.  (The only limitation, right now, is
>>> that you can't create an array of them.)
>> Actually I wouldn't bother with the precision and scale at all.  I'd go
>> with something like
>>
>> CREATE DOMAN sales_tax_rate AS DECIMAL CHECK (VALUE >= 0 AND VALUE <=1);
>
> why the latter check ( VALUE <=1 )?

I think the initial post implied it was to be used directly in the
calculations, no "sales_tax_rate/100", so a fraction is needed.
Hopefully no one is experiencing 99%+ tax rates.


\\||/
Rod
--


Re: Best data type to use for sales tax percent

From
Christophe Pettus
Date:
On Oct 10, 2009, at 3:33 AM, Jasen Betts wrote:
>> CREATE DOMAN sales_tax_rate AS DECIMAL CHECK (VALUE >= 0 AND VALUE
>> <=1);
>
> why the latter check ( VALUE <=1 )?

Since this version has no scale on the DECIMAL, the second check keeps
it from being larger than 1.0, since it's presumably a percentage from
0% to 99%.

--
-- Christophe Pettus
    xof@thebuild.com


Re: Best data type to use for sales tax percent

From
Steve Crawford
Date:
Christophe Pettus wrote:
>
> On Oct 10, 2009, at 3:33 AM, Jasen Betts wrote:
>>> CREATE DOMAN sales_tax_rate AS DECIMAL CHECK (VALUE >= 0 AND VALUE
>>> <=1);
>>
>> why the latter check ( VALUE <=1 )?
>
> Since this version has no scale on the DECIMAL, the second check keeps
> it from being larger than 1.0, since it's presumably a percentage from
> 0% to 99%.
For general-purpose sales-tax...in the United States...for the time being.

I believe the current tax on new vehicles in Israel is 70+% and seem to
recall that it was well over 100% at one time. Sales taxes already vary
by product (in California, food is 0% for example) as well as state,
county, city. I can certainly imagine some locale pushing for taxes well
over 100% on the sin-du-jour (alcohol, tobacco, sugar, fat..)

Cheers,
Steve


Re: Best data type to use for sales tax percent

From
Peter Geoghegan
Date:
> For general-purpose sales-tax...in the United States...for the time being.
>
> I believe the current tax on new vehicles in Israel is 70+% and seem to
> recall that it was well over 100% at one time. Sales taxes already vary by
> product (in California, food is 0% for example) as well as state, county,
> city. I can certainly imagine some locale pushing for taxes well over 100%
> on the sin-du-jour (alcohol, tobacco, sugar, fat..)


In the E.U., sin taxes are charged as excise duty. A manufacturer pays
duty (a fixed amount per unit of ethanol or whatever), and sells it on
to a retailer at a price that (presumably) factors in the cost to him
of paying that duty, in addition to VAT  - they charge tax on the tax.
The retailer doesn't have to give sinful goods any special treatment.
When selling on those goods they just charge VAT, in turn, at the
standard rate. I imagine it's a similar situation in the U.S.

This minimises tax evasion, carousel fraud, etc, and takes the
burden/responsibility of collecting such massive taxes (effectively
80%+ of the cost of a pack of cigarettes) higher up the supply chain.
Taxes on cars are covered by "vehicle registration tax" and road tax
here in Ireland, in addition to charging VAT at the standard rate on
top of the total, inclusive cost. Charging massive rates like 70% is
avoided. Consumption taxes are supposed to be easy to collect.

If you wonder why a manufacturer charges VAT to a retailer or non-end
user, well, that's one of the distinctions between VAT and sales tax
(they pay, but they claim it back later, whereas an individual cannot
claim it back).

Regards,
Peter Geoghegan