type design guidance needed - Mailing list pgsql-hackers

From Brook Milligan
Subject type design guidance needed
Msg-id 200009222305.RAA03411@biology.nmsu.edu
I am working on designing some new datatypes and could use some
guidance.

Along with each data item, I must keep additional information about
the scale of measurement.  Further, the relevant scales of measurement
fall into a few major families of related scales, so at least a
different type will be required for each of these major families.
Additionally, I wish to be able to convert data measured according to
one scale into other scales (both within the same family and between
different families), and these interconversions require relatively
large sets of parameters.

It seems that there are several alternative approaches, and I am
seeking some guidance from the wizards here who have some
understanding of the backend internals, performance tradeoffs, and
such issues.

Possible solutions:

1.  Store the data and all the scale parameters within the type.
   Advantages:  All information contained within each type.  Can be
   implemented with no backend changes.  No access to ancillary tables
   required, so processing might be fast.

   Disadvantages:  Duplicate information on the scales recorded in
   each field of the types; i.e., waste of space.  I/O is either
   cumbersome (if all parameters are required) or the type-handling
   code has built-in tables for supplying missing parameters, in
   which case the available types and families cannot be extended by
   users without recompiling the code.
 

2.  Store only the data and a reference to a compiled-in data table
    holding the scale parameters.

   Advantages:  No duplicate information stored in the fields.
   Access to scale data compiled into backend, so processing might be
   fast.

   Disadvantages:  Tables of scale data fixed at compile time, so
   users cannot add additional scales or families of scales.
   Requires backend changes to implement, but these changes are
   relatively minor since all the scale parameters are compiled into
   the code handling the type.
 

3.  Store only the data and a reference to a new system table (or
    tables) holding the scale parameters.

   Advantages:  No duplicate information stored in the fields.
   Access to scale data _not_ compiled into backend, so users could
   add scales or families of scales by modifying the system tables.

   Disadvantages:  Requires access to system tables to perform
   conversions, so processing might be slow.  Requires more complex
   backend changes to implement, including the ability to retrieve
   information from system tables.
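
For concreteness, the storage difference between option 1 and options
2/3 might look roughly like the sketch below.  All names, the bound on
the number of parameters, and the parameter values are hypothetical
placeholders chosen for illustration; nothing here is existing backend
code.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_SCALE_PARAMS 8      /* hypothetical bound on parameters per scale */

/* Option 1: every stored value carries its own copy of the parameters. */
typedef struct SelfContainedDatum
{
    double value;
    int    nparams;
    double params[MAX_SCALE_PARAMS];
} SelfContainedDatum;

/* Options 2 and 3: every stored value carries only a scale reference. */
typedef struct ReferencedDatum
{
    double value;
    int    scale_id;
} ReferencedDatum;

/* One row of scale data, wherever it lives. */
typedef struct ScaleEntry
{
    int    scale_id;
    int    family_id;
    double params[MAX_SCALE_PARAMS];
} ScaleEntry;

/* Option 2: the table is fixed at compile time (placeholder values). */
static const ScaleEntry builtin_scales[] = {
    {1, 1, {1.0, 0.0}},
    {2, 1, {2.0, 0.5}},
};

/* Linear search is adequate for a table of well under 100 entries. */
static const ScaleEntry *
lookup_scale(const ScaleEntry *table, size_t n, int scale_id)
{
    assert(table != NULL);
    for (size_t i = 0; i < n; i++)
        if (table[i].scale_id == scale_id)
            return &table[i];
    return NULL;                /* unknown scale */
}
```

Under option 3 the same lookup_scale() call would be handed rows
fetched from a system table rather than builtin_scales; the per-value
storage would be unchanged.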
 

Clearly, option 3 is optimal (more flexible, no data duplication)
unless the access to system tables by the backend presents too much
overhead.  (Other suggestions are welcome, especially if I have
misjudged the relative merits of these ideas or missed one
altogether.)  The advice I need is the following:

- How much of an overhead is introduced by requiring the backend to
  query system tables during tuple processing?  Is this unacceptable
  from the outset or is it reasonable to consider this option
  further?  Note that the size of these new tables will not be large
  (probably less than 100 tuples) if that matters.
 

- How does one access system tables from the backend code?  I seem to
  recall that issuing straight queries via SPI is not necessarily the
  right way to go about this, but I'm not sure where to look for
  alternatives.
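
To make the overhead question concrete: since the tables would hold
fewer than about 100 tuples, one conceivable pattern is to read them
once per backend and serve per-tuple lookups from memory.  The sketch
below is only an illustration of that idea.  ScaleLoader is a
hypothetical callback standing in for whatever mechanism actually
reads the table, the two-parameter ScaleRow is a simplification, and
none of these names are PostgreSQL APIs.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_SCALES 100          /* tables expected to stay under ~100 tuples */

/* Simplified row; the real parameter sets would be larger. */
typedef struct ScaleRow
{
    int    scale_id;
    double factor;
    double offset;
} ScaleRow;

/* Hypothetical stand-in for the real table reader; returns rows written. */
typedef size_t (*ScaleLoader)(ScaleRow *out, size_t max_rows);

typedef struct ScaleCache
{
    ScaleRow      rows[MAX_SCALES];
    size_t        nrows;
    int           loaded;
    unsigned long loads;        /* how many times the table was read */
} ScaleCache;

/* Read the table on first use only; later calls search the cached copy. */
static const ScaleRow *
scale_cache_get(ScaleCache *cache, ScaleLoader load, int scale_id)
{
    assert(cache != NULL && load != NULL);
    if (!cache->loaded)
    {
        cache->nrows = load(cache->rows, MAX_SCALES);
        cache->loaded = 1;
        cache->loads++;
    }
    for (size_t i = 0; i < cache->nrows; i++)
        if (cache->rows[i].scale_id == scale_id)
            return &cache->rows[i];
    return NULL;                /* unknown scale */
}

/* Demonstration loader with placeholder data. */
static size_t
demo_loader(ScaleRow *out, size_t max_rows)
{
    size_t n = 0;

    if (n < max_rows)
        out[n++] = (ScaleRow){1, 1.0, 0.0};
    if (n < max_rows)
        out[n++] = (ScaleRow){2, 2.0, 0.5};
    return n;
}
```

With a zero-initialized ScaleCache, repeated calls to
scale_cache_get() invoke the loader only once.  The sketch does not
address how the cached copy would be refreshed after a user modifies
the table, which any real design would have to handle.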
 

Thanks for your help.

Cheers,
Brook


