Thread: Best data type to use for sales tax percent
(Sorry for the super-easy question) I want to store sales tax (as a percent) in the DB, such as 9.5%. What's the best data type for this? I'm guessing numeric(2,3) should be fine, yes? I'm not too familiar with the numeric type (I was using "real" before), but as I understand the data will be stored using the necessary number of bits on the disk? Thanks! Mike
Mike Christensen wrote: > (Sorry for the super-easy question) > > I want to store sales tax (as a percent) in the DB, such as 9.5%. > What's the best data type for this? I'm guessing numeric(2,3) should > be fine, yes? I'm not too familiar with the numeric type (I was using > "real" before), but as I understand the data will be stored using the > necessary number of bits on the disk? Thanks! > > Mike > > Peruse: http://www.postgresql.org/docs/8.4/interactive/datatype-numeric.html Pay special attention to nuggets like: The data types real and double precision are inexact... and If you require exact storage and calculations (such as for monetary amounts), use the numeric type instead.... Cheers, Steve
On Oct 8, 2009, at 3:04 PM, Mike Christensen wrote: > I'm guessing numeric(2,3) should > be fine, yes? Numeric is definitely what you want. You probably want at least four fractional digits, since sales tax (at least in California) is routinely set to a quarter of a point. For example, the basic sales tax rate in Los Angeles is 9.75%, so 0.0975. (There are other subtleties in sales tax calculation in California; feel free to ask off-list if you want more utterly non-PostgreSQL-related detail. :) ) -- -- Christophe Pettus xof@thebuild.com
On Thu, Oct 8, 2009 at 6:04 PM, Mike Christensen <mike@kitchenpc.com> wrote: > (Sorry for the super-easy question) > > I want to store sales tax (as a percent) in the DB, such as 9.5%. > What's the best data type for this? I'm guessing numeric(2,3) should > be fine, yes? I'm not too familiar with the numeric type (I was using > "real" before), but as I understand the data will be stored using the > necessary number of bits on the disk? Thanks! numeric(2,3) is not possible :-). IMO, a percentage should be stored numeric(6,3) or so. That gives up to 100% down to thousandth of a percent. merlin
Oops sorry I was thinking 2,3 meant 2 significant digits to the left of the decimal point and 3 to the right. I just re-read the docs and now see what you mean. 6,3 would work fine though is maybe a bit overkill since a 100% sales tax rate would cause a violent revolution and lead to beheadings, at which point Postgres data integrity would be the least of our worries.. I'll probably just use 3,3 and store this value between 0 and 1, since all I'll be doing with this number is using it to multiply against a subtotal. 3,3 gives me 0.000 through 0.999, correct? Mike On Thu, Oct 8, 2009 at 4:38 PM, Merlin Moncure <mmoncure@gmail.com> wrote: > On Thu, Oct 8, 2009 at 6:04 PM, Mike Christensen <mike@kitchenpc.com> wrote: >> (Sorry for the super-easy question) >> >> I want to store sales tax (as a percent) in the DB, such as 9.5%. >> What's the best data type for this? I'm guessing numeric(2,3) should >> be fine, yes? I'm not too familiar with the numeric type (I was using >> "real" before), but as I understand the data will be stored using the >> necessary number of bits on the disk? Thanks! > > numeric(2,3) is not possible :-). IMO, a percentage should be stored > numeric(6,3) or so. That gives up to 100% down to thousandth of a > percent. > > merlin >
On Thu, 8 Oct 2009, Mike Christensen wrote: > I'll probably just use 3,3 and store this value between 0 and 1, since all > I'll be doing with this number is using it to multiply against a subtotal. > 3,3 gives me 0.000 through 0.999, correct? Mike, No. The two digits represent the width of the column and the number of significant digits. Try (4,3). Rich
Rich Shepard wrote: > On Thu, 8 Oct 2009, Mike Christensen wrote: > >> I'll probably just use 3,3 and store this value between 0 and 1, since >> all >> I'll be doing with this number is using it to multiply against a >> subtotal. 3,3 gives me 0.000 through 0.999, correct? > > Mike, > > No. The two digits represent the width of the column and the number of > significant digits. Try (4,3). That will fail for the scenario that Christope Pettus pointed out. California is not the only state with sales tax specified to a quarter point., or even a tenth of a point as in Missouri, where a local sales tax can be 9.241%, and equal or exceed 10%, as in Alabama, Arizona, California and Illinois. <http://en.wikipedia.org/wiki/Sales_taxes_in_the_United_States> It's a good idea to research the domain before deciding on the representation. -- Lew
Wouldn't (4,3) let me store 0.000 through 9.999? Maybe I'm still not following what both numbers mean. I understand the point about states/counties with 3 decimal digits of sales tax, so I'd probably want to do (5,5) which should give me 0.00000 - 0.99999, and store 9.825% sales tax as .09825. I'm suggesting storing sales tax as a number between 0 and 1 so I can easily multiply it against a subtotal to get the tax amount, storing anything over 1.0 is unnecessary. Also, if you just say "numeric" (without any numbers) then Postgres lets you store any number you wish and will never do any rounding of any sort, correct? If there a price you pay for this in terms of perf, bytes on disk, etc? Another idea is if I'm tying myself down to a certain level of decimal accuracy in the first place, why not just store everything as an Int2? 9.825% would be stored as 9825 and I'll divide everything by 100000 when I calc sales tax. If I'm not mistaken, integral data types are faster for Postgres and less bytes on disk, right? BTW, I will never be doing any math using Postgres, it's just for pure storage.. Mike On Thu, Oct 8, 2009 at 5:23 PM, Rich Shepard <rshepard@appl-ecosys.com> wrote: > On Thu, 8 Oct 2009, Mike Christensen wrote: > >> I'll probably just use 3,3 and store this value between 0 and 1, since all >> I'll be doing with this number is using it to multiply against a subtotal. >> 3,3 gives me 0.000 through 0.999, correct? > > Mike, > > No. The two digits represent the width of the column and the number of > significant digits. Try (4,3). > > Rich > > -- > Sent via pgsql-general mailing list (pgsql-general@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-general >
On Fri, 2009-10-09 at 00:10 -0700, Mike Christensen wrote: > Wouldn't (4,3) let me store 0.000 through 9.999? Maybe I'm still not > following what both numbers mean. Yes. If you want 0.000 through 0.999, use numeric(3,3). Adding a check constraint might increase clarity. And put it in a domain for extra style points. :-)
On Fri, Oct 09, 2009 at 12:10:41AM -0700, Mike Christensen wrote: > Wouldn't (4,3) let me store 0.000 through 9.999? Maybe I'm still not > following what both numbers mean. I think Rich was getting confused about how you wanted to represent your percentages. > I understand the point about states/counties with 3 decimal digits of > sales tax, so I'd probably want to do (5,5) which should give me > 0.00000 - 0.99999, and store 9.825% sales tax as .09825. I'm > suggesting storing sales tax as a number between 0 and 1 so I can > easily multiply it against a subtotal to get the tax amount, storing > anything over 1.0 is unnecessary. This is how I'd normally do it. Ratios for inside the code, just "format" them as percentages when you want the user to see them. > Also, if you just say "numeric" (without any numbers) then Postgres > lets you store any number you wish and will never do any rounding of > any sort, correct? If there a price you pay for this in terms of > perf, bytes on disk, etc? It's not possible to do division accurately (not sure about the caveats in other operators). For example, 1/3 is represented as "0.33333" and multiplying this by three again will give "0.99999". When people say that numeric types are "exact" they're not giving you whole truth. > Another idea is if I'm tying myself down to a certain level of decimal > accuracy in the first place, why not just store everything as an Int2? > 9.825% would be stored as 9825 and I'll divide everything by 100000 > when I calc sales tax. If I'm not mistaken, integral data types are > faster for Postgres and less bytes on disk, right? BTW, I will never > be doing any math using Postgres, it's just for pure storage.. Not sure what range of values you have to cover; you wouldn't be able to do this with fixed width integer types: select numeric '100' ^ 300; Numeric types allow you to do the above, the flexibility of allowing the representation of a number to get this wide that causes things to be slower. It's not much slower though, I'd benchmark a test case that's meaningful to you and then can you make a sensible decision. -- Sam http://samason.me.uk/
On Thu, 8 Oct 2009, Lew wrote: > That will fail for the scenario that Christope Pettus pointed out. > California is not the only state with sales tax specified to a quarter > point., or even a tenth of a point as in Missouri, where a local sales tax > can be 9.241%, and equal or exceed 10%, as in Alabama, Arizona, California > and Illinois. > <http://en.wikipedia.org/wiki/Sales_taxes_in_the_United_States> Good points, Lew. My response was specific to making the field specification the same as the significant digits specification, not the data domain, but I accept your correction as completely valid. Here in Oregon there is no sales tax. Each time it's proposed it's rejected by voters as heretical. Of course we have some of the highest income and property taxes around and other "fees" that serve in lieu of sales tax revenues. But, the majority of voting folks think that having out-of-state visitor also pay for government services by paying a sales tax is a Bad Thing. We also are not allowed to fuel our own vehicles (New Jersey is the only other state with that prohabition), and that's another religious issue with most voters. Sigh. Can I generalze by writing that every state government is dysfunctional? Rich
So back to my question about representing sales tax, it looks like I have two choices: 1) Use a numeric(5,5) column. This has the advantage of storing the sales tax in the exact representation of a percent (I can directly multiply it against any subtotal to get the sales tax). It also "looks" nicer in the DB and probably has some style points. It appears to me there is a massive amount of overhead with these columns though, something like 8bytes plus whatever storage is required for the precision. However, even if I had a million rows that's only like 8 megs on the disk. I'm more worried about perf than disk size. I would assume there's only a perf hit doing math with these types, not so much just loading them into a dataset. 2) Use an Int2. I'd have to use a multiplier in my source code after loading the data. In the DB, the data would look kinda funky since 9.825% would be represented as 9825. I think this is a more efficient number for storage since it only requires 2 bytes. I'd be able to store up to 65% or so which is fine for any sales tax I've ever heard of. I'm kinda leaning towards using the numeric column type, simply because it seems "cleaner" to me. At work, we use multipliers all over the place in our DB and it has turned into a complete nightmare. I'm somewhat of a believer in just storing data exactly how you need to use it. Thanks! Mike On Fri, Oct 9, 2009 at 4:13 AM, Sam Mason <sam@samason.me.uk> wrote: > On Fri, Oct 09, 2009 at 12:10:41AM -0700, Mike Christensen wrote: >> Wouldn't (4,3) let me store 0.000 through 9.999? Maybe I'm still not >> following what both numbers mean. > > I think Rich was getting confused about how you wanted to represent your > percentages. > >> I understand the point about states/counties with 3 decimal digits of >> sales tax, so I'd probably want to do (5,5) which should give me >> 0.00000 - 0.99999, and store 9.825% sales tax as .09825. I'm >> suggesting storing sales tax as a number between 0 and 1 so I can >> easily multiply it against a subtotal to get the tax amount, storing >> anything over 1.0 is unnecessary. > > This is how I'd normally do it. Ratios for inside the code, just > "format" them as percentages when you want the user to see them. > >> Also, if you just say "numeric" (without any numbers) then Postgres >> lets you store any number you wish and will never do any rounding of >> any sort, correct? If there a price you pay for this in terms of >> perf, bytes on disk, etc? > > It's not possible to do division accurately (not sure about the caveats > in other operators). For example, 1/3 is represented as "0.33333" and > multiplying this by three again will give "0.99999". When people say > that numeric types are "exact" they're not giving you whole truth. > >> Another idea is if I'm tying myself down to a certain level of decimal >> accuracy in the first place, why not just store everything as an Int2? >> 9.825% would be stored as 9825 and I'll divide everything by 100000 >> when I calc sales tax. If I'm not mistaken, integral data types are >> faster for Postgres and less bytes on disk, right? BTW, I will never >> be doing any math using Postgres, it's just for pure storage.. > > Not sure what range of values you have to cover; you wouldn't be able to > do this with fixed width integer types: > > select numeric '100' ^ 300; > > Numeric types allow you to do the above, the flexibility of allowing the > representation of a number to get this wide that causes things to be > slower. It's not much slower though, I'd benchmark a test case that's > meaningful to you and then can you make a sensible decision. > > -- > Sam http://samason.me.uk/ > > -- > Sent via pgsql-general mailing list (pgsql-general@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-general >
Can you explain what you mean by "put it in a domain" - I'd love extra style points, but this sounds like a feature I haven't learned about yet. On Fri, Oct 9, 2009 at 3:38 AM, Peter Eisentraut <peter_e@gmx.net> wrote: > On Fri, 2009-10-09 at 00:10 -0700, Mike Christensen wrote: >> Wouldn't (4,3) let me store 0.000 through 9.999? Maybe I'm still not >> following what both numbers mean. > > Yes. If you want 0.000 through 0.999, use numeric(3,3). Adding a check > constraint might increase clarity. And put it in a domain for extra > style points. :-) > > >
On Oct 9, 2009, at 11:36 AM, Mike Christensen wrote: > Can you explain what you mean by "put it in a domain" - I'd love extra > style points, but this sounds like a feature I haven't learned about > yet. http://www.postgresql.org/docs/8.4/interactive/sql-createdomain.html Domains are basically type aliases with an optional CHECK clause, so you could do something like: CREATE DOMAN sales_tax_rate AS DECIMAL(5,5) CHECK (VALUE >= 0); Then, you can use the type "sales_tax_rate" in your tables, etc. just as a normal first-class type. (The only limitation, right now, is that you can't create an array of them.) In response to the other email, DECIMAL is definitely the better solution for what you are looking for. -- -- Christophe Pettus xof@thebuild.com
Thanks for the info! I was thinking this would be a short thread but I definitely appreciate all the information. I will definitely create a domain for this (and probably for some other types in my app since I now know about this). However, is the CHECK really necessary? A numeric(5,5) already has a maximum value of 10^0, so it would already create an overflow error if you set it higher. Is there an advantage of using the CHECK constraint? Perhaps this is faster or doesn't lock the row on update or something? Just to point out, NUMERIC and DECIMAL are one and the same yes? Mike On Fri, Oct 9, 2009 at 11:46 AM, Christophe Pettus <xof@thebuild.com> wrote: > > On Oct 9, 2009, at 11:36 AM, Mike Christensen wrote: > >> Can you explain what you mean by "put it in a domain" - I'd love extra >> style points, but this sounds like a feature I haven't learned about >> yet. > > > http://www.postgresql.org/docs/8.4/interactive/sql-createdomain.html > > Domains are basically type aliases with an optional CHECK clause, so you > could do something like: > > CREATE DOMAN sales_tax_rate AS DECIMAL(5,5) CHECK (VALUE >= 0); > > Then, you can use the type "sales_tax_rate" in your tables, etc. just as a > normal first-class type. (The only limitation, right now, is that you can't > create an array of them.) > > In response to the other email, DECIMAL is definitely the better solution > for what you are looking for. > > -- > -- Christophe Pettus > xof@thebuild.com > > > -- > Sent via pgsql-general mailing list (pgsql-general@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-general >
On Fri, 2009-10-09 at 11:46 -0700, Christophe Pettus wrote: > Domains are basically type aliases with an optional CHECK clause, so > you could do something like: > > CREATE DOMAN sales_tax_rate AS DECIMAL(5,5) CHECK (VALUE >= 0); > > Then, you can use the type "sales_tax_rate" in your tables, etc. just > as a normal first-class type. (The only limitation, right now, is > that you can't create an array of them.) Actually I wouldn't bother with the precision and scale at all. I'd go with something like CREATE DOMAN sales_tax_rate AS DECIMAL CHECK (VALUE >= 0 AND VALUE <=1);
On Oct 9, 2009, at 12:14 PM, Mike Christensen wrote: > I will definitely create a domain for this (and probably for some > other types in my app since I now know about this). However, is the > CHECK really necessary? A numeric(5,5) already has a maximum value of > 10^0, so it would already create an overflow error if you set it > higher. Is there an advantage of using the CHECK constraint? Perhaps > this is faster or doesn't lock the row on update or something? > > Just to point out, NUMERIC and DECIMAL are one and the same yes? Sorry, NUMERIC is correct; DECIMAL was my brain misfiring. Since NUMERICs can go negative regardless of scale, the CHECK is handy to make sure that you do not insert a negative number. (This is assuming, of course, that a sales tax rate can't actually be negative for some reason.) It's not required, of course, but it's always advisable to put as much reasonable data checking into the database as you can. Peter Eisentraut's suggestion of just not putting a scale or precision on the type at all and using CHECK to validate the values is also a fine way of handling it. -- -- Christophe Pettus xof@thebuild.com
On 2009-10-08, Mike Christensen <mike@kitchenpc.com> wrote: > (Sorry for the super-easy question) > > I want to store sales tax (as a percent) in the DB, such as 9.5%. > What's the best data type for this? real, or numeric, probably numeric. > I'm guessing numeric(2,3) should be fine, yes? depends on the range of values you want to store. generally percentages are most usefully represented as decimal fractions especially if you intend to do arithmetic with them. numeric reccomended useful for financial stuff.
On 2009-10-09, Peter Eisentraut <peter_e@gmx.net> wrote: > On Fri, 2009-10-09 at 11:46 -0700, Christophe Pettus wrote: >> Domains are basically type aliases with an optional CHECK clause, so >> you could do something like: >> >> CREATE DOMAN sales_tax_rate AS DECIMAL(5,5) CHECK (VALUE >= 0); >> >> Then, you can use the type "sales_tax_rate" in your tables, etc. just >> as a normal first-class type. (The only limitation, right now, is >> that you can't create an array of them.) > > Actually I wouldn't bother with the precision and scale at all. I'd go > with something like > > CREATE DOMAN sales_tax_rate AS DECIMAL CHECK (VALUE >= 0 AND VALUE <=1); why the latter check ( VALUE <=1 )?
Jasen Betts wrote: > On 2009-10-09, Peter Eisentraut <peter_e@gmx.net> wrote: >> On Fri, 2009-10-09 at 11:46 -0700, Christophe Pettus wrote: >>> Domains are basically type aliases with an optional CHECK clause, so >>> you could do something like: >>> >>> CREATE DOMAN sales_tax_rate AS DECIMAL(5,5) CHECK (VALUE >= 0); >>> >>> Then, you can use the type "sales_tax_rate" in your tables, etc. just >>> as a normal first-class type. (The only limitation, right now, is >>> that you can't create an array of them.) >> Actually I wouldn't bother with the precision and scale at all. I'd go >> with something like >> >> CREATE DOMAN sales_tax_rate AS DECIMAL CHECK (VALUE >= 0 AND VALUE <=1); > > why the latter check ( VALUE <=1 )? I think the initial post implied it was to be used directly in the calculations, no "sales_tax_rate/100", so a fraction is needed. Hopefully no one is experiencing 99%+ tax rates. \\||/ Rod --
On Oct 10, 2009, at 3:33 AM, Jasen Betts wrote: >> CREATE DOMAN sales_tax_rate AS DECIMAL CHECK (VALUE >= 0 AND VALUE >> <=1); > > why the latter check ( VALUE <=1 )? Since this version has no scale on the DECIMAL, the second check keeps it from being larger than 1.0, since it's presumably a percentage from 0% to 99%. -- -- Christophe Pettus xof@thebuild.com
Christophe Pettus wrote: > > On Oct 10, 2009, at 3:33 AM, Jasen Betts wrote: >>> CREATE DOMAN sales_tax_rate AS DECIMAL CHECK (VALUE >= 0 AND VALUE >>> <=1); >> >> why the latter check ( VALUE <=1 )? > > Since this version has no scale on the DECIMAL, the second check keeps > it from being larger than 1.0, since it's presumably a percentage from > 0% to 99%. For general-purpose sales-tax...in the United States...for the time being. I believe the current tax on new vehicles in Israel is 70+% and seem to recall that it was well over 100% at one time. Sales taxes already vary by product (in California, food is 0% for example) as well as state, county, city. I can certainly imagine some locale pushing for taxes well over 100% on the sin-du-jour (alcohol, tobacco, sugar, fat..) Cheers, Steve
> For general-purpose sales-tax...in the United States...for the time being. > > I believe the current tax on new vehicles in Israel is 70+% and seem to > recall that it was well over 100% at one time. Sales taxes already vary by > product (in California, food is 0% for example) as well as state, county, > city. I can certainly imagine some locale pushing for taxes well over 100% > on the sin-du-jour (alcohol, tobacco, sugar, fat..) In the E.U., sin taxes are charged as excise duty. A manufacturer pays duty (a fixed amount per unit of ethanol or whatever), and sells it on to a retailer at a price that (presumably) factors in the cost to him of paying that duty, in addition to VAT - they charge tax on the tax. The retailer doesn't have to give sinful goods any special treatment. When selling on those goods they just charge VAT, in turn, at the standard rate. I imagine it's a similar situation in the U.S. This minimises tax evasion, carousel fraud, etc, and takes the burden/responsibility of collecting such massive taxes (effectively 80%+ of the cost of a pack of cigarettes) higher up the supply chain. Taxes on cars are covered by "vehicle registration tax" and road tax here in Ireland, in addition to charging VAT at the standard rate on top of the total, inclusive cost. Charging massive rates like 70% is avoided. Consumption taxes are supposed to be easy to collect. If you wonder why a manufacturer charges VAT to a retailer or non-end user, well, that's one of the distinctions between VAT and sales tax (they pay, but they claim it back later, whereas an individual cannot claim it back). Regards, Peter Geoghegan