type design guidance needed - Mailing list pgsql-hackers
From | Brook Milligan |
---|---|
Subject | type design guidance needed |
Date | |
Msg-id | 200009222305.RAA03411@biology.nmsu.edu |
Responses |
Re: type design guidance needed
|
List | pgsql-hackers |
I am working on designing some new datatypes and could use some guidance.

Along with each data item, I must keep additional information about the scale of measurement. Further, the relevant scales of measurement fall into a few major families of related scales, so at least one distinct type will be required for each of these major families.

Additionally, I wish to be able to convert data measured according to one scale into other scales (both within the same family and between different families), and these interconversions require relatively large sets of parameters.

It seems that there are several alternative approaches, and I am seeking some guidance from the wizards here who have some understanding of the backend internals, performance tradeoffs, and such issues.

Possible solutions:

1. Store the data and all the scale parameters within the type.

   Advantages:
   - All information is contained within each type.
   - Can be implemented with no backend changes.
   - No access to ancillary tables is required, so processing might be fast.

   Disadvantages:
   - Duplicate information on the scales is recorded in each field of the types; i.e., waste of space.
   - I/O is either cumbersome (if all parameters must be supplied) or the type-handling code has built-in tables for supplying missing parameters, in which case the available types and families cannot be extended by users without recompiling the code.

2. Store only the data and a reference to a compiled-in data table holding the scale parameters.

   Advantages:
   - No duplicate information is stored in the fields.
   - Access to scale data is compiled into the backend, so processing might be fast.

   Disadvantages:
   - Tables of scale data are fixed at compile time, so users cannot add additional scales or families of scales.
   - Requires backend changes to implement, but these changes are relatively minor since all the scale parameters are compiled into the code handling the type.

3. Store only the data and a reference to a new system table (or tables) holding the scale parameters.

   Advantages:
   - No duplicate information is stored in the fields.
   - Access to scale data is _not_ compiled into the backend, so users could add scales or families of scales by modifying the system tables.

   Disadvantages:
   - Requires access to system tables to perform conversions, so processing might be slow.
   - Requires more complex backend changes to implement, including the ability to retrieve information from system tables.

Clearly, option 3 is optimal (more flexible, no data duplication) unless the access to system tables by the backend presents too much overhead. (Other suggestions are welcome, especially if I have misjudged the relative merits of these ideas or missed one altogether.)

The advice I need is the following:

- How much overhead is introduced by requiring the backend to query system tables during tuple processing? Is this unacceptable from the outset, or is it reasonable to consider this option further? Note that these new tables will not be large (probably fewer than 100 tuples), if that matters.

- How does one access system tables from the backend code? I seem to recall that issuing straight queries via SPI is not necessarily the right way to go about this, but I'm not sure where to look for alternatives.

Thanks for your help.

Cheers,
Brook
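
To make the layout shared by options 2 and 3 concrete, here is a minimal sketch in C of a datum that stores only its value and a small reference to a scale, with the scale parameters held in a separate table. Everything in it is an assumption made for illustration: the names (ScaleParams, Measurement, lookup_scale, convert_measurement), the simple linear form of the conversion (canonical = factor * value + offset), and the example length scales. The real parameter sets described above are larger, and conversions between families are not handled here. The in-memory array stands in for the scale table; under option 3 the lookup would instead read a system table.

```c
#include <stddef.h>

/* Hypothetical parameters for one scale, assuming a linear relation:
 * canonical = factor * value + offset. */
typedef struct {
    int    scale_id;
    int    family_id;
    double factor;
    double offset;
} ScaleParams;

/* Hypothetical stored datum: the value plus a compact scale reference,
 * so the parameters themselves are not duplicated in every field. */
typedef struct {
    double value;
    int    scale_id;
} Measurement;

/* Stand-in for the scale table. Under option 2 this would be compiled in;
 * under option 3 the rows would live in a system table instead. */
static const ScaleParams scale_table[] = {
    { 1, 1, 1.0,   0.0},  /* metres (canonical scale of family 1) */
    { 2, 1, 0.01,  0.0},  /* centimetres */
    { 3, 1, 0.001, 0.0},  /* millimetres */
    {10, 2, 1.0,   0.0},  /* a scale from a different family */
};

/* Linear search is adequate for a table of fewer than 100 rows. */
static const ScaleParams *lookup_scale(int scale_id)
{
    size_t n = sizeof(scale_table) / sizeof(scale_table[0]);
    for (size_t i = 0; i < n; i++) {
        if (scale_table[i].scale_id == scale_id)
            return &scale_table[i];
    }
    return NULL;
}

/* Convert *m to the target scale within the same family.
 * Returns 0 on success, -1 for an unknown scale or a family mismatch. */
static int convert_measurement(const Measurement *m, int target_id,
                               Measurement *out)
{
    const ScaleParams *from = lookup_scale(m->scale_id);
    const ScaleParams *to = lookup_scale(target_id);

    if (from == NULL || to == NULL || from->family_id != to->family_id)
        return -1;

    /* Go through the family's canonical scale, then to the target. */
    double canonical = from->factor * m->value + from->offset;
    out->value = (canonical - to->offset) / to->factor;
    out->scale_id = target_id;
    return 0;
}
```

With this layout, converting a Measurement of 250 on scale 2 to scale 1 goes through one lookup per scale, and only lookup_scale would need to change to move from compiled-in data to a table maintained by users.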