Re: A varint implementation for PG? - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | Re: A varint implementation for PG? |
Date | |
Msg-id | CA+TgmoYyz3MXtNuxrOfx=HFOAaE=+WxbQPuxtkTPdLb=oeBTkA@mail.gmail.com Whole thread Raw |
In response to | Re: A varint implementation for PG? (Andres Freund <andres@anarazel.de>) |
Responses |
Re: A varint implementation for PG?
Re: A varint implementation for PG? Re: A varint implementation for PG? |
List | pgsql-hackers |
[ resurrecting this 2-year-old thread ] On Fri, Dec 13, 2019 at 12:45 AM Andres Freund <andres@anarazel.de> wrote: > > If baking a new variant integer format now, I think limiting it to 64 bits > > is probably a mistake given how long-lived PostgreSQL is, and how hard it > > can be to change things in the protocol, on disk, etc. > > I don't think it's ever going to be sensible to transport 64bit quanta > of data. Also, uh, it'd be larger than the data a postgres instance > could really contain, given LSNs are 64 bit. We already use 128-bit integers in some places in the source code, and it seems more likely than not that use of 128-bit integer data types will continue to expand over time. If we want to represent 64 bit integers in a variable number of bits today, it does seem pretty likely that some day we will want to do the same thing with 128 bit integers, and maybe eventually larger. And I think that all of that is true even though exhausting a 64-bit LSN space looks difficult today and will likely remain difficult for the foreseeable future. So to me, the right thing to do here probably depends a bit on the use case. If we are talking about using this for strictly internal purposes - e.g. cramming stuff into an internal data structure into which users have no direct visibility - then I think it's probably best to pick a representation that corresponds exactly to the C data type that we're trying to store. If we're starting out with a int64 and we want to byte-squeeze it, the representation you've chosen here seems like just the thing. And it's easy to imagine why we might also want to have similar transformations for uint64, int32, and uint32, and maybe eventually int128 and uint128. And if the details of the format are a little different in each case that's fine; for these sorts of use cases we're always going to unpack into the same data type that we packed originally, so it's not important to have exact compatibility across differences in signedness or bit count. However, I suspect that the whole approach should be completely revised for a user-visible data type. On the one hand, there's no telling how large a value some user will want to represent, so limiting ourselves to 64 bits does seem shortsighted. And on the othe hand, if we've got a varlena, we already know the length, so it seems like we shouldn't also encode the length in the value. Maybe there's a more efficient way, but the first thing that occurs to me is to just discard high order bytes that are all zeroes or all ones until the high order bit of the next byte doesn't match and plonk the remaining bytes into the varlena. To decompress, just sign-extend out to the target length. Really, this kind of representation can be extended to represent arbitrarily large integers, even bigger than what we can currently do with numeric, which is already crazy huge, and it seems to have some advantage in that every payload byte contains exactly 8 data bits, so we don't need to shift or mask while encoding and decoding. Now, we could think of introducing a new format for variable-length datums, essentially making this a new typlen rather than a new kind of varlena. That might be worth it, because if you are storing a lot of values that are small enough that this format could represent them in 3 bytes or less, which I think would be everything up to +/- 2^20, you could save a pretty significant amount of space even if your table was laid out to avoid alignment padding. However, there would be some distributed overhead to this, because all the places that have special handling for typlen = -1 and typlen = -2 would need to grow new cases for this. I'm not sure how much of a problem that is, really, but I don't think we can go nuts with adding new typlen values. > It'll be a bit problmeatic to deal with all the casting necessary, and > with the likely resulting overload resolution issues. I'm wondering > whether it'd be worthwhile to have a ALTER TABLE ... STORAGE ... option > that encodes int2/4/8 as varints when inside a tuple, but otherwise just > let it be a normal integer. I don't see how a STORAGE option could work. We tend to treat those as hints, rather than critical data, which wouldn't work here. I think there are a number of problems that crop up, but one of them is the same thing we hit with the custom TOAST compression stuff. If you need to make a value of some row type out of a HeapTuple, you're not going to know which settings were used to create that heap tuple, and you're certainly not going to know that when you go to deform that tuple. The only thing you're going to have is the tuple descriptor, which AFAICS means that the representation needs to be a property of the type, not where the value is stored. Maybe you have a clever idea I'm not seeing. As for the casting and overloading issues, that's tripped up quite a few people now when trying to add new numeric data types - unsigned integers being a case that has come up a few times now, I think. I don't think it has to be a blocker. I think the solution is probably to accept that using unsigned or variable-width data types will require inserting more casts than would be required for integer datatypes and numeric. That may not thrill everybody, but it may still be better than deciding we're never ever ever adding any more data types for storing numbers than we have today. -- Robert Haas EDB: http://www.enterprisedb.com
pgsql-hackers by date: