Thread: Variable length custom data types help

Variable length custom data types help

From
"Morgan Kita"
Date:
Ok I am creating some important custom data types for my DB. The data should be stored as an array of strucs, which
themselvescontain a few different data fields. Anyways I wrote the input function in C when I realized I had a problem.


The arrays will not always contain the same number of structures, and as such should be variable length. I read the
pagesin the manual on custom data type creation,  and I noticed that you can do this by setting the first word of the
storedbytes to be the total length in bytes of the stored object. So my question is, given that this is a typed array,
whatis the best way to represent this in the c code? In the end I am returning a pointer correct? What is the type of
thepointer I should be returning? I guess I really don't understand how the memory returned by the pointer is going to
beparsed by Postgres. I thought of making the first member of the array a dummy structure whose first data field would
containthe length. However, that doesn't sound implementation independant and I think it would be better if I actually
understoodhow the DB will actually parse the pointer that my function returns. 

Also I have a related question... from what I read in the manual; by creating the type as variable length that
automaticallymakes it toasted in the DB. Is that the case, and if so, do I have to do anything else other than untoast
itin my output function? 

Thanks,
Morgan

Re: Variable length custom data types help

From
Michael Fuhr
Date:
On Thu, Mar 31, 2005 at 07:45:12PM -0800, Morgan Kita wrote:
>
> Ok I am creating some important custom data types for my DB. The data
> should be stored as an array of strucs, which themselves contain a few
> different data fields. Anyways I wrote the input function in C when I
> realized I had a problem.
>
> The arrays will not always contain the same number of structures, and
> as such should be variable length. I read the pages in the manual on
> custom data type creation,  and I noticed that you can do this by
> setting the first word of the stored bytes to be the total length in
> bytes of the stored object. So my question is, given that this is a
> typed array, what is the best way to represent this in the c code?

The usual way is to create a structure:

  typedef struct foo {
      int32  total_length;
      /* rest of structure here */
  } foo;

What follows the length is irrelevant to PostgreSQL, so define it
however makes sense in your code.  Just make sure that all data is
in a contiguous chunk of memory that begins with the 4-byte length
(don't embed pointers in the data, because PostgreSQL won't know
anything about them).

> In the end I am returning a pointer correct?

Correct.  Use palloc() to allocate the memory and PG_RETURN_POINTER()
to return the pointer.

> What is the type of the pointer I should be returning?

It doesn't matter; it's just a memory address.  With the structure
above you could declare a foo *.

> I guess I really don't understand how the memory returned by the
> pointer is going to be parsed by Postgres.

PostgreSQL isn't going to parse the data beyond looking at the
4-byte length.  You're returning a pointer to "length + stuff" and
PostgreSQL treats "stuff" as an opaque bunch of bytes.

> I thought of making the first member of the array a dummy structure whose
> first data field would contain the length. However, that doesn't sound
> implementation independant and I think it would be better if I actually
> understood how the DB will actually parse the pointer that my function
> returns.

PostgreSQL only cares about the length in the first 4 bytes, so if
the data begins with an int32 then you should be okay.

> Also I have a related question... from what I read in the manual; by
> creating the type as variable length that automatically makes it toasted
> in the DB. Is that the case, and if so, do I have to do anything else
> other than untoast it in my output function?

CREATE TYPE's "storage" parameter determines the storage strategy,
but you should probably use PG_DETOAST_DATUM() in any case (I think
it's a no-op if the data isn't actually toasted).  I can't think
of any other special handling you'd need to do.

--
Michael Fuhr
http://www.fuhr.org/~mfuhr/

Re: Variable length custom data types help

From
"Morgan Kita"
Date:
Thank you for your response! That was quite enlightening.

Just to clarify in the example foo structure below where the first
member is the length, and then you say /*rest of structure here*/, by
that do you mean the individual data members of the structure that will
be in my array of strucs? I guess it can't be a pointer to the array as
you said postgres will not know that you are storing pointers. If so
that means you have a bit of wasted overhead right? I think I will just
try creating the pointer as void*, set the first 4 bytes to the length,
and then fill in the rest.

Seriously, that was an extremely helpful post!

Thanks,
Morgan


-----Original Message-----
From: Michael Fuhr [mailto:mike@fuhr.org]
Sent: Thursday, March 31, 2005 11:17 PM
To: Morgan Kita
Cc: pgsql-novice@postgresql.org
Subject: Re: [NOVICE] Variable length custom data types help

The usual way is to create a structure:

  typedef struct foo {
      int32  total_length;
      /* rest of structure here */
  } foo;

What follows the length is irrelevant to PostgreSQL, so define it
however makes sense in your code.  Just make sure that all data is
in a contiguous chunk of memory that begins with the 4-byte length
(don't embed pointers in the data, because PostgreSQL won't know
anything about them).

Re: Variable length custom data types help

From
Tom Lane
Date:
"Morgan Kita" <mkita@verseon.com> writes:
> Just to clarify in the example foo structure below where the first
> member is the length, and then you say /*rest of structure here*/, by
> that do you mean the individual data members of the structure that will
> be in my array of strucs? I guess it can't be a pointer to the array as
> you said postgres will not know that you are storing pointers. If so
> that means you have a bit of wasted overhead right? I think I will just
> try creating the pointer as void*, set the first 4 bytes to the length,
> and then fill in the rest.

BTW, if what you have is actually an *array*, namely N instances of the
same kind of animal, maybe you should define your custom type as just
one of that kind of animal, and let PG provide the array superstructure.
That way you could, for example, subscript the array at the SQL level.

            regards, tom lane

Re: Variable length custom data types help

From
"Morgan Kita"
Date:
I thought about that but I am really not that interested in the individual elements of the array. I am always going to
begrabbing and storing them as entire blocks. 

Thanks,
Morgan

"Morgan Kita" <mkita@verseon.com> writes:
> Just to clarify in the example foo structure below where the first
> member is the length, and then you say /*rest of structure here*/, by
> that do you mean the individual data members of the structure that will
> be in my array of strucs? I guess it can't be a pointer to the array as
> you said postgres will not know that you are storing pointers. If so
> that means you have a bit of wasted overhead right? I think I will just
> try creating the pointer as void*, set the first 4 bytes to the length,
> and then fill in the rest.

BTW, if what you have is actually an *array*, namely N instances of the
same kind of animal, maybe you should define your custom type as just
one of that kind of animal, and let PG provide the array superstructure.
That way you could, for example, subscript the array at the SQL level.

            regards, tom lane