Supporting NULL elements in arrays - Mailing list pgsql-hackers
From | Tom Lane |
---|---|
Subject | Supporting NULL elements in arrays |
Date | |
Msg-id | 29807.1131408544@sss.pgh.pa.us Whole thread Raw |
Responses |
Re: Supporting NULL elements in arrays
Re: Supporting NULL elements in arrays |
List | pgsql-hackers |
I'm starting to think about what it'll take to allow arrays to contain elements that are NULL. The behavioral semantics seem pretty straightforward, but there are a couple of areas that need discussion. One trouble spot is what the I/O representation should look like. Since 8.0, the array input parser has rejected empty elements: regression=# select '{a,,c}'::text[]; ERROR: malformed array literal: "{a,,c}" regression=# select '{a, ,c}'::text[]; ERROR: malformed array literal: "{a, ,c}" -- the right way to specify a zero-length string is: regression=# select '{a,"",c}'::text[]; text ----------{a,"",c} (1 row) and so the most straightforward thing to do is define an empty element as meaning a NULL. But this might be objected to on a couple of grounds: 1. "Since 8.0" isn't really old enough --- there may well be applications still out there that think '{a,,c}'::text[] should produce a zero-length string element and not a NULL element. (Note: this isn't a hazard for reloading old dump files, because the array output routine has dumped empty strings as "" since 7.0 if not before.) 2. Even today, the array documentation athttp://developer.postgresql.org/docs/postgres/arrays.html doesn't actually *say* that empty elements are disallowed. I don't see any alternatives that seem better, though, and the empty-element convention at least has the virtue of being reasonably compatible with what we did for I/O of composite data types. Anyone have a better idea? Another issue is what to do with the internal representation. I think that basically we want to insert a null bitmap just following the dimension info, with the option to omit the bitmap if there are no null values. Now, when the bitmap is present, it would be fairly tedious to calculate the length of the bitmap to determine the offset to the actual data; and that's an operation that we'd like to be cheap. What I'm thinking of doing is commandeering the whole "flags" field of ArrayType (which is currently unused and should be always zero), and redefining it thus:zero: bitmap not present (fully backwards-compatible case)not zero: bitmap is present; value is offsetto start of data I wouldn't propose doing this if I thought we had any pressing reason to save some array flag bits for some other purpose; but I can't think of anything else we might want 'em for. Did anyone have any pet ideas this would foreclose? Also, with respect to the binary I/O representation, I'd suggest replacing the "flag" word with "zero if no bitmap, 1 if bitmap present". This would require both sender and receiver to calculate the bitmap length from the given dimensions, but it seems best to require that for error-checking reasons. Thoughts anyone? regards, tom lane
pgsql-hackers by date: