Re: reducing NUMERIC size for 9.1 - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | Re: reducing NUMERIC size for 9.1 |
Date | |
Msg-id | AANLkTindFE6p5owv7ZARW222xgistX-QqVLEWk6UjY28@mail.gmail.com Whole thread Raw |
In response to | Re: reducing NUMERIC size for 9.1 (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: reducing NUMERIC size for 9.1
|
List | pgsql-hackers |
On Wed, Jul 28, 2010 at 3:00 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> On Fri, Jul 16, 2010 at 2:39 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> I don't like the way you did that either (specifically, not the kluge >>> in NUMERIC_DIGITS()). It would probably work better if you declared >>> two different structs, or a union of same, to represent the two layout >>> cases. >>> >>> n_sign_dscale is now pretty inappropriately named, probably better to >>> change the field name. This will also help to catch anything that's >>> not using the macros. (Renaming the n_weight field, or at least burying >>> it in an extra level of struct, would be helpful for the same reason.) > >> I'm not sure what you have in mind here. If we create a union of two >> structs, we'll still have to pick one of them to use to check the high >> bits of the first word, so I'm not sure we'll be adding all that much >> in terms of clarity. > > No, you can do something like this: > > typedef struct numeric_short > { > uint16 word1; > NumericDigit digits[1]; > } numeric_short; > > typedef struct numeric_long > { > uint16 word1; > int16 weight; > NumericDigit digits[1]; > } numeric_long; > > typedef union numeric > { > uint16 word1; > numeric_short short; > numeric_long long; > } numeric; That doesn't quite work because there's also a varlena header that has to be accounted for, so the third member of the union can't be a simple uint16. I'm wondering if it makes sense to do something along these lines: typedef struct NumericData { int32 varlen; int16 n_header; union { struct { char n_data[1]; } short; struct { uint16 n_weight; char n_data[1]; } long; }; } NumericData; Why n_data as char[1] instead of NumericDigit, you ask? It's that way now, mostly I think so that the rest of the system isn't allowed to know what underlying type is being used for NumericDigit; it looks like previously it was signed char, but now it's int16. >>> It seems like you've handled the NAN case a bit awkwardly. Since the >>> weight is uninteresting for a NAN, it's okay to not store the weight >>> field, so I think what you should do is consider that the dscale field >>> is still full-width, ie the format of the first word remains old-style >>> not new-style. I don't remember whether dscale is meaningful for a NAN, >>> but if it is, your approach is constraining what is possible to store, >>> and is also breaking compatibility with old databases. > >> There is only one NaN value. Neither weight or dscale is meaningful. >> I think if the high two bits of the first word are 11 we never examine >> anything else - do you see somewhere that we're doing otherwise? > > I hadn't actually looked. I think though that it's a mistake to break > compatibility on both dscale and weight when you only need to break one. > Also, weight is *certainly* uninteresting for NaNs since it's not even > meaningful unless there are digits. dscale could conceivably be worth > something. I don't think I'm breaking compatibility on anything. Can you clarify what part of the code you're referring to here? I'm sort of lost. >>> The sign extension code in the NUMERIC_WEIGHT() macro seems a bit >>> awkward; I wonder if there's a better way. One solution might be to >>> offset the value (ie, add or subtract NUMERIC_SHORT_WEIGHT_MIN) rather >>> than try to sign-extend per se. > >> Hmm... so, if the weight is X we store the value >> X-NUMERIC_SHORT_WEIGHT_MIN as an unsigned integer? That's kind of a >> funny representation - I *think* it works out to sign extension with >> the high bit flipped. I guess we could do it that way, but it might >> make it harder/more confusing to do bit arithmetic with the weight >> sign bit later on. > > Yeah, it was just an idea. It seems like there should be an easier way > to extract the sign-extended value, though. It seemed a bit awkward to me, too, but I'm not sure there's a better one. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company
pgsql-hackers by date: