Re: reducing NUMERIC size for 9.1 - Mailing list pgsql-hackers
| From | Robert Haas |
|---|---|
| Subject | Re: reducing NUMERIC size for 9.1 |
| Date | |
| Msg-id | AANLkTinHRCH7h+rY1nba3yV2Fz-cjiJQ77xVzzZRC+sh@mail.gmail.com |
| In response to | Re: reducing NUMERIC size for 9.1 (Tom Lane <tgl@sss.pgh.pa.us>) |
| Responses | Re: reducing NUMERIC size for 9.1, Re: reducing NUMERIC size for 9.1 |
| List | pgsql-hackers |
On Fri, Jul 30, 2010 at 9:16 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> On Fri, Jul 30, 2010 at 1:13 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> Perhaps, but I think you're confused on at least one point.
>>> numeric(2,1) has to be able to hold 2 decimal digits, not 2
>>> NumericDigits (which'd actually be 8 decimal digits given
>>> the current code).
>
>> I get that. The point is: if one of those 2 decimal digits is before
>> the decimal point and the other is after it, then two NumericDigits
>> will be used.
>
> Ah, I see. Maybe we should allow for one more NumericDigit in the
> calculation for such cases. I guess you could look at the scale too
> to detect if the case is possible, but not sure it's worth the trouble.

Since this just needs to be a reasonable upper bound, probably not.

Another small but related issue is that this comment is out of date:

    /* Numeric stores 2 decimal digits/byte, plus header */
    return (precision + 1) / 2 + NUMERIC_HDRSZ;

Currently, numeric stores 4 decimal digits per 2 bytes, which is not quite the same thing. I think that for clarity it would make sense to break this calculation into two parts: first, estimate the number of NumericDigits that could be needed; then, estimate the amount of space that could be needed to store them. Maybe something like this, obviously with a suitable comment which I haven't written yet:

    numeric_digits = (precision + 6) / 4;
    return (numeric_digits * sizeof(int16)) + NUMERIC_HDRSZ;

The first line would be incorrect for precision 0 (it would produce 1, not 0), but precision 0 isn't allowed anyway, and overestimating just that one case would be OK even if it were. As far as I can see, it's correct for all higher values. If you have 1 decimal digit, you can't need more than one NumericDigit, but if you have 2 decimal digits, you can need one for each side of the decimal point. You can't manage to need a third NumericDigit until you get up to 6 decimal digits, because with only 5 you can split them 1-4 or 3-2 across the decimal point, but getting to a second NumericDigit on either side of the decimal point uses up all 5 digits, leaving nothing for the other side. Similarly, to get a fourth NumericDigit, you have to have enough decimal digits to split them 1-4-4-1 (with the decimal point somewhere in the middle), so 10 is the minimum.

Another question here is whether we should just fix this in CVS HEAD, or whether there's any reason to back-patch it. I can't see much reason to back-patch, unless there's third-party code depending on this for something more than what we use it for. The only callers of type_maximum_size are get_typavgwidth(), which doesn't need to be exact, and needs_toast_table(). So it seems that the worst thing that could happen as a result of this problem is that we might fail to create a toast table when one is needed, and even that seems extremely unlikely. First, you'd need a table containing no text or unlimited-length varchar columns, because those will force toast table creation anyway. Second, numerics are typically going to be stored with a short varlena header on disk, so a 2-byte underestimate of the max size is going to be swamped by a 3-byte savings on the varlena header. Third, even if you can find a case where the above doesn't matter, it's not that hard to force a toast table to get created.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
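
As a cross-check of the bound argued above, here is a minimal standalone C sketch (not part of any patch; the DEC_DIGITS value of 4 and the helper names are assumptions for illustration). It brute-forces the worst-case NumericDigit count over every split of the decimal digits across the decimal point and compares it with (precision + 6) / 4:

    /*
     * Sanity-check that (precision + 6) / 4 is an upper bound on the
     * number of base-10000 NumericDigits needed, for every way the
     * decimal digits can be split across the decimal point.
     */
    #include <stdio.h>

    #define DEC_DIGITS 4            /* decimal digits per NumericDigit */

    static int
    ceil_div(int a, int b)
    {
        return (a + b - 1) / b;
    }

    int
    main(void)
    {
        for (int precision = 1; precision <= 1000; precision++)
        {
            int worst = 0;

            /* try every split: "before" digits left of the point, rest right */
            for (int before = 0; before <= precision; before++)
            {
                int after = precision - before;
                int ndigits = ceil_div(before, DEC_DIGITS) +
                              ceil_div(after, DEC_DIGITS);

                if (ndigits > worst)
                    worst = ndigits;
            }

            if (worst > (precision + 6) / 4)
                printf("precision %d: worst case %d exceeds estimate %d\n",
                       precision, worst, (precision + 6) / 4);
        }

        printf("done\n");
        return 0;
    }

If the formula is a valid upper bound, the loop prints nothing before "done"; the worst cases occur exactly at the 2, 6, and 10 decimal-digit thresholds described in the message.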