VARIANT / ANYTYPE datatype - Mailing list pgsql-hackers
| From | Alvaro Herrera | 
|---|---|
| Subject | VARIANT / ANYTYPE datatype | 
| Date | |
| Msg-id | 1304527158-sup-5344@alvh.no-ip.org Whole thread Raw | 
| Responses | Re: VARIANT / ANYTYPE datatype Re: VARIANT / ANYTYPE datatype Re: VARIANT / ANYTYPE datatype Re: VARIANT / ANYTYPE datatype Re: VARIANT / ANYTYPE datatype | 
| List | pgsql-hackers | 
Hello, A customer came to us with this request: a way to store "any" data in a column. We've gone back and forth trying to determine reasonable implementation restrictions, safety and useful semantics for them. I note that this has been requested in the past: http://archives.postgresql.org/pgsql-general/2004-02/msg01266.php and both Oracle and MS-SQL have it and apparently people find them useful. I didn't find any indication that SQL contains anything resembling this. The main idea is to be able to store column values in an audit table like this: change_time timestamptz table_name name column_name name old_value variant new_value variant So per-column changes, which is much more convenient than the regular idea of storing the whole NEW and/or OLD record(s). Currently, they use text for old_value and new_value, but this is, of course, not very satisfactory. My thought was that a variant type would store the datum as <typid><data> so that it would be possible to identify the datatype stored in each column/row and interpret adequately, calling the appropriate output function etc. On input it would be limited to come only from inside the system, not from the outside world, as that would have obvious security implications; so it'd be similar to pg_node_tree in that regard. Now this has obvious limitations: first, any query that tries to extract data would need to include a cast of the variant value to the appropriate type, so that the type can be resolved early. Thus, trying to extract rows of different types would be forbidden. Also, there would be a security problem with a datum storing something whose datatype later changes (consider a user-defined record type or things like that). My first reaction was to do something like CREATE TYPE foo VARIANT OF (int, text, timestamptz); and then you could declare old_value with type foo, which would only allow values of the declared types. This makes it easy to catalogue used types in any variant, and thus easy to restrict modifying or dropping types that are used in some variant. However, this idea was rejected by the customer due to the unusability: you would have to remember to edit the variant to add the new type anytime you added a new column to a table, which would be cumbersome. What the customer suggested was to have a new fork, which stores type OIDs of datatypes used in the variant. Then when a type is to be altered, all forks would be scanned to determine if the type is used, and raise an error if so. I rejected that idea as unworkable. However, as a middle ground we agreed that we could allow a declared variant to store any pinned type without restrictions; those can't be changed or dropped so there's no safety concern. Storing other types (user-defined types, records and so on) would require some safety net. Before spending too much time exploring a detailed design, I'd like to hear some opinions on the basic idea. -- Álvaro Herrera <alvherre@alvh.no-ip.org>
pgsql-hackers by date: