Re: [HACKERS] Data types - Mailing list pgsql-hackers

From Thomas G. Lockhart
Subject Re: [HACKERS] Data types
Date
Msg-id 352C5660.BD2830C9@alumni.caltech.edu
Whole thread Raw
Responses Re: [HACKERS] Data types
List pgsql-hackers
(We were discussing support for complicated data types last October)

OK, I think I understand most of what the parser does to handle data
types and to handle type resolution. I've started rewriting the code
involved with functions, operators, and targets, and have completed a
trial implementation. I was wondering where some things which must
happen were actually done, and it turns out that they aren't! For
example:

postgres=> create table c1 (c char(4));
postgres=> create table c2 (c char(4));
postgres=> insert into c1 values ('abc', 'abc');
postgres=> insert into c1 values ('def', 'def');
postgres=> insert into c2(c) select c || 'ghi' from c1;
INSERT 0 2
postgres=> select * from c2;
c      |t
-------+-
abc ghi|
def ghi|
(2 rows)

Note that our char(4) column happily ends up with strings 7 characters
long! This is with pristine v6.3.1 code.

OK, so to do things right, for, say, char() and for numeric(), we need a
way to provide target properties to routines even though the
characteristics of a particular value might be stored with the value.
The current system usually stores string or array properties with the
value; but even so we also need to provide any target properties to
correctly handle function and operator output into target columns. In
the example above, each input string of char(4) is concatenated with a
string constant, and stored back into a char(4) column. The target
column info must appear somewhere so a conversion or storage routine can
fix up the values.

> > > What about passing to functions references to some structures ?
> > > struct ... {
> > >         TypeSpec        type_spec;
> > >         ...some other things maybe...
> > >         Datum           data;           /* this is real data */
> > > }
> > > type_spec could be precision/scale for NUMBERs and DECIMALs,
> > > max len for (VAR)CHARs or other type specification value (pointer > > > to) for (other) user-defined types, so
onecould define new type  
> > > with pointing of function(s) to handle type_spec for this type in
> > > CREATE TABLE:
> > >         A_Column                An_User_Type[(type-spec)]
> > > Just like
> > > A NUMBER(5,2)but more general (object-oriented).

OK, so I'm thinking that we need to do something like this, but there
are problems in doing it with explicit structures in the backend:
1) the backend might need to know too much about each type, damaging the
extensibility
2) somehow the backend would need to be able to support dump/reload
operations (so there needs to be a string representation of the type).
3) we need this part to be extensible to new types also.

I'm not happy with the structure above, but I'm not sure why. It just
seems pretty complicated, and doesn't really address issues 1-3 above.

Well, finally a light came on: the characteristics of these complicated
types should be a type also! For example, the numeric type would have a
support definition type which contains two fields, the precision and
scale, and which as part of the definition would know how to print
itself out, read itself in, etc. Then, the backend would just need to be
able to match up the type with the support type, and the adt code would
need to be able to access both.

I haven't gotten farther than this yet, but it seems like any solution
has _got_ to take advantage of some of the existing type mechanisms so
that we have access to input/output routines for the type support info.
SQL doesn't naturally lend itself to full OO extensibility, but perhaps
we can extend the backend to handle syntax like

  typename(characteristic, characteristic,...)

where the characteristics have properties stored in the type/attribute
system and the backend knows how to meld the type with the support type
info (e.g. typeOutput = print("%s(%s)", typename, printTypeSupport) ).

Does this ring a bell with anyone? Vadim?

                      - Tom

> > Yes . For a few data types (bpchar for varchar support) the system already passes
> > more than just a single field structure to the type-handler code. It passes the
> > pointer to the structure, and also a couple of other parameters including the
> > maximum length. We could/should generalize this so that a descriptive structure
> > can be passed in for every type-handler (at least for -in() and -out() functions,
> > perhaps for all calls?) which is specialized for each data type and which is
>               ^^^^^^^^^
> Imho, for all. number_pl() should know about precision/scale of
> both args...
>
> > defined _by_ the data type code itself. This basically means that data types can
> > provide more "methods" to help with the type handling.
> >
> > The default behavior could be that NULL is passed for these extra arguments and
> > type-handlers could choose to ignore it (so perhaps existing handlers could work
> > without change).
>
> My suggestion is passing data in descriptive structure in all cases.
> This will require user to re-write all user-defined functions...
>
> >
> > For example, to implement NUMERIC(5,2) we would need to pass the precision and
> > scale numbers to the type handlers (numeric_in(), etc) after saving them when
> > defining the table/column. We also need a way to either save the original text
> > definition or to reconstruct it so that dump/reload can work.
>                    ^^^^^^^^^^^
> Type_spec handling functions could accept two args: one is
> type_spec itself (in external/internal formats) and second - in/out
> flag: in - to convert (5,2) into internal form, out - from
> internal form to external representation.
> I prefer this way.
>
> We could store type_spec in pg_attribute. This makes
> pg_attribute tuple len variable (if we'll allow type_spec
> has > 4 bytes len) and so many parts of code must be changed.
> For the moment, we could allow only 4 bytes for type_spec -
> it's ok for NUMBERs, DECIMALs, (VAR/BP)CHARs...
>
> Vadim

pgsql-hackers by date:

Previous
From: The Hermit Hacker
Date:
Subject: Re: [HACKERS] Re: [QUESTIONS] warning: tcl support disabled
Next
From: Bruce Momjian
Date:
Subject: Re: [HACKERS] Data types