WIP: RangeTypes - Mailing list pgsql-hackers
From | Jeff Davis |
---|---|
Subject | WIP: RangeTypes |
Date | |
Msg-id | 1294737407.18031.3651.camel@jdavis Whole thread Raw |
Responses |
Re: WIP: RangeTypes
Re: WIP: RangeTypes Re: WIP: RangeTypes Re: WIP: RangeTypes Re: WIP: RangeTypes |
List | pgsql-hackers |
Ok, I have made some progress. This is still a proof-of-concept patch, but the important pieces are working together. Synopsis: CREATE TYPE numrange AS RANGE (SUBTYPE=numeric, SUBTYPE_CMP=numeric_cmp); SELECT range_eq('[1,2.2)'::numrange,'[1,2.2]'); SELECT range_lbound('(3.7,9]'::numrange); SELECT range(6.7); SELECT '-'::numrange; -- empty SELECT '[1, NULL]'::numrange; -- ] will become ) SELECT '(INF, 3)'::numrange; I haven't completed many of the other generic functions, because I'd like to make sure I'm on the right track first. The important thing about the functions above is that they show ANYRANGE working in conjunction with ANYELEMENT in various combinations, which was a significant part of this patch. Here are the open items: 1. Generic functions -- most of which are fairly obvious. However, I want to make sure I'm on the right track first. 2. GiST -- I'll need a mechanism to implement the "penalty" function, and perhaps I'll also need additional support for the picksplit function. For the "penalty" function, I think I'll need to require a function to convert the subtype into a float, and I can use that to find a distance (which can be the penalty). That should also satisfy anything that picksplit might need. 3. Typmod -- There is still one annoyance about typmod remaining. I need to treat it like an array in find_typmod_coercion_function(), and then create a coercion expression. Is it worth it? Would typmod on a range be confusing, or should I just finish this item up? 4. Docs 5. Tests 6. pg_dump -- should be pretty easy; I just want to settle some of the other stuff first. 7. Right now the parse function is quite dumb. Is there some example code I should follow to make sure I get this right? 8. In order to properly support the various combinations of ANYRANGE and ANYELEMENT in a function definition (which are all important), we need to be able to determine the range type given a subtype. That means that each subtype can only have one associated range, which sounds somewhat limiting, but it can be worked around by using domains. I don't think this is a major limitation. Comments? 9. Representation -- right now I store the OID of the range type in the range itself, much like arrays, in order to call the find the functions to operate on the subtype. Robert has some justifiable concerns about that 4-byte overhead. Possible ideas: * Forget about ANYRANGE altogether, and generate new catalog entries for the generic functions for each new range type defined. I don't particularly like this approach because it makes it very difficult to define new generic functions. * Somehow fix the type system so that we know the specific types of arguments in all situations. I don't know if this is feasible. * Store a 8- or 16-bit unique number in pg_range, and store that number in the representation. That would be pretty ugly, and limit the total possible range types defined at once, but it saves a couple bytes per value. * Try to somehow mimic what records do. Records use a global array and use the typmod as an index into that array. It looks like a hack to me, but might be worth borrowing anyway. Also related to representation: * Right now I always align the subtypes within the range according to typalign. I could avoid that by packing the bytes tightly, and then copying them around later. Suggestions? And what should the overall alignment of the range type be? * If it's a fixed-length type, we can save the varlena header byte on the overall range; but we lose the ability to save space when one of the boundaries of the range is missing (NULL or INF), and it would complicate the code a little. Thoughts? Regards, Jeff Davis
Attachment
pgsql-hackers by date: