Thread: pg_class catalog question...

pg_class catalog question...

From
"Jonah H. Harris"
Date:
I apologize for the lameness of this question upfront :)

Does anyone know how to add a fixed-length char field to pg_class?  I
need to avoid the cost of variable-length so I tried to add the
fixed-length char[64] right before relacl and updated CLASS_TUPLE_SIZE
to reflect the fixed-size of the struct, but it still dies on me.

I've never tried to do a fixed length char in the catalogs... any
suggestions on the field, datatype, or bootstrapping?  Of course, if
there's another way to avoid the cost of variable length, I'm all
ears.

--
Jonah H. Harris, Database Internals Architect
EnterpriseDB Corporation
732.331.1324


Re: pg_class catalog question...

From
Tom Lane
Date:
"Jonah H. Harris" <jonah.harris@gmail.com> writes:
> Does anyone know how to add a fixed-length char field to pg_class?

Changing any of the bootstrap catalogs is fairly tricky --- there are a
lot of places you have to update manually.  I'd suggest looking for a
previous commit that did something similar and studying the diff.
[ digs in CVS log... ]  Here are a couple of possibilities; the first
one is smaller but it's touching pg_proc not pg_class.

2005-03-29 14:44  tgl
* doc/src/sgml/bki.sgml, doc/src/sgml/catalogs.sgml,src/backend/bootstrap/bootstrap.c,
src/backend/catalog/pg_proc.c,src/include/catalog/catversion.h,src/include/catalog/pg_attribute.h,
src/include/catalog/pg_class.h,src/include/catalog/pg_proc.h:Add proallargtypes and proargmodescolumns to pg_proc, as
permy earlier proposal for OUT parametersupport.  The columns don't actually *do* anything yet, they arejust left
NULLs. But I thought I'd commit this part separately asa fairly pure example of the tasks needed when adding a column
topg_procor one of the other core system tables.
 

2002-03-26 14:15  tgl
* doc/src/sgml/catalogs.sgml,
src/backend/access/heap/heapam.c,src/backend/access/index/indexam.c,src/backend/bootstrap/bootparse.y,src/backend/bootstrap/bootstrap.c,
src/backend/catalog/Makefile,src/backend/catalog/aclchk.c,src/backend/catalog/genbki.sh,src/backend/catalog/heap.c,
src/backend/catalog/index.c,src/backend/catalog/indexing.c,
src/backend/catalog/namespace.c,src/backend/commands/cluster.c,
src/backend/commands/command.c,src/backend/commands/comment.c,
src/backend/commands/creatinh.c,src/backend/commands/indexcmds.c,
src/backend/commands/rename.c,src/backend/commands/trigger.c,
src/backend/commands/user.c,src/backend/executor/execMain.c,
src/backend/parser/analyze.c,src/backend/parser/parse_clause.c,src/backend/parser/parse_relation.c,src/backend/rewrite/rewriteDefine.c,
src/backend/tcop/utility.c,src/backend/utils/adt/acl.c,
src/backend/utils/cache/catcache.c,src/backend/utils/cache/lsyscache.c,src/backend/utils/cache/relcache.c,src/backend/utils/cache/syscache.c,
src/include/access/genam.h,src/include/access/heapam.h,
src/include/bootstrap/bootstrap.h,src/include/catalog/catversion.h,
src/include/catalog/heap.h,src/include/catalog/index.h,
src/include/catalog/indexing.h,src/include/catalog/namespace.h,src/include/catalog/pg_attribute.h,
src/include/catalog/pg_class.h,src/include/commands/cluster.h,
src/include/commands/command.h,src/include/commands/comment.h,
src/include/commands/defrem.h,src/include/commands/rename.h,
src/include/nodes/parsenodes.h,src/include/nodes/primnodes.h,
src/include/utils/catcache.h,src/include/utils/lsyscache.h,src/include/utils/rel.h,src/include/utils/relcache.h,
src/include/utils/syscache.h,src/pl/plpgsql/src/pl_comp.c:pg_class has a relnamespace column. You can create and access
tablesin schemas other than the systemnamespace; however, there's no search path yet, and not alloperations work yet on
tablesoutside the system namespace.
 
        regards, tom lane


Re: pg_class catalog question...

From
"Jonah H. Harris"
Date:
On 3/30/06, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Here are a couple of possibilities; the first
> one is smaller but it's touching pg_proc not pg_class.
>

Yeah, I noticed that one.  How would you suggest setting
CLASS_TUPLE_SIZE in that case?

--
Jonah H. Harris, Database Internals Architect
EnterpriseDB Corporation
732.331.1324


Re: pg_class catalog question...

From
"Qingqing Zhou"
Date:
""Jonah H. Harris"" <jonah.harris@gmail.com> wrote
>
> Yeah, I noticed that one.  How would you suggest setting
> CLASS_TUPLE_SIZE in that case?
>

What if you put your char[64] before relhassubclass, then you don't change
CLASS_TUPLE_SIZE.

Regards,
Qingqing




Re: pg_class catalog question...

From
"Jonah H. Harris"
Date:
On 3/31/06, Qingqing Zhou <zhouqq@cs.toronto.edu> wrote:
> What if you put your char[64] before relhassubclass, then you
> don't change CLASS_TUPLE_SIZE.

Thought about that... but it would be an ugly place for this column.
I know I could get around it by renumbering the attribute, but that's
just a kludge.

Now that I've had some sleep, I'm sure I'll get it working :)

--
Jonah H. Harris, Database Internals Architect
EnterpriseDB Corporation
732.331.1324


Re: pg_class catalog question...

From
Alvaro Herrera
Date:
Jonah H. Harris wrote:
> On 3/31/06, Qingqing Zhou <zhouqq@cs.toronto.edu> wrote:
> > What if you put your char[64] before relhassubclass, then you
> > don't change CLASS_TUPLE_SIZE.
> 
> Thought about that... but it would be an ugly place for this column. 
> I know I could get around it by renumbering the attribute, but that's
> just a kludge.

What are you using a char[64] for anyway?  You should probably consider
using NameData, if you want to store an identifier.

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


Re: pg_class catalog question...

From
"Jonah H. Harris"
Date:
On 3/31/06, Alvaro Herrera <alvherre@commandprompt.com> wrote:
> What are you using a char[64] for anyway?  You should probably consider
> using NameData, if you want to store an identifier.

It's just a fixed length string that will never change in size and as
such, I'd like not to add the overhead of any variable-length
handling.

--
Jonah H. Harris, Database Internals Architect
EnterpriseDB Corporation
732.331.1324


Re: pg_class catalog question...

From
"Jim C. Nasby"
Date:
On Fri, Mar 31, 2006 at 10:45:15AM -0500, Jonah H. Harris wrote:
> On 3/31/06, Alvaro Herrera <alvherre@commandprompt.com> wrote:
> > What are you using a char[64] for anyway?  You should probably consider
> > using NameData, if you want to store an identifier.
> 
> It's just a fixed length string that will never change in size and as
> such, I'd like not to add the overhead of any variable-length
> handling.

What about creating a fixed-size general purpose type?

About the only reason I use CHAR in other databases systems is when I
know that the field will always contain the same amount of data, ie:
storing a SHA1. In these cases it's silly to have a 4 byte overhead to
store length. I really wish CHAR in PostgreSQL worked this way, so it
would be a welcome addition to have a type that did work this way. In
fact, I'd argue that CHAR should be made to work that way, and what's
currently called CHAR should be renamed for those who wish to use it.
I've yet to run across a use for CHAR where you might actually have a
variable amount of data stored and just want to enforce a certain number
of space padding.
-- 
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461


Re: pg_class catalog question...

From
Tom Lane
Date:
"Jim C. Nasby" <jnasby@pervasive.com> writes:
> About the only reason I use CHAR in other databases systems is when I
> know that the field will always contain the same amount of data, ie:
> storing a SHA1. In these cases it's silly to have a 4 byte overhead to
> store length. I really wish CHAR in PostgreSQL worked this way, so it
> would be a welcome addition to have a type that did work this way. In
> fact, I'd argue that CHAR should be made to work that way, and what's
> currently called CHAR should be renamed for those who wish to use it.

This argument falls flat when you consider that the width of a CHAR
entry is measured in characters, not bytes, and therefore its physical
size is not fixed even if its logical width is.
        regards, tom lane


Re: pg_class catalog question...

From
"Jonah H. Harris"
Date:
On 3/31/06, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> This argument falls flat when you consider that the width of a CHAR
> entry is measured in characters, not bytes, and therefore its physical
> size is not fixed even if its logical width is.

Gotta love multibyte :)

--
Jonah H. Harris, Database Internals Architect
EnterpriseDB Corporation
732.331.1324


Re: pg_class catalog question...

From
"Jim C. Nasby"
Date:
On Fri, Mar 31, 2006 at 11:29:15AM -0500, Tom Lane wrote:
> "Jim C. Nasby" <jnasby@pervasive.com> writes:
> > About the only reason I use CHAR in other databases systems is when I
> > know that the field will always contain the same amount of data, ie:
> > storing a SHA1. In these cases it's silly to have a 4 byte overhead to
> > store length. I really wish CHAR in PostgreSQL worked this way, so it
> > would be a welcome addition to have a type that did work this way. In
> > fact, I'd argue that CHAR should be made to work that way, and what's
> > currently called CHAR should be renamed for those who wish to use it.
> 
> This argument falls flat when you consider that the width of a CHAR
> entry is measured in characters, not bytes, and therefore its physical
> size is not fixed even if its logical width is.

True, but in every case I've used char it was to store something that
would never be multi-byte, like a GUID, or a SHA1. Though I guess in
retrospect, what would really be handy is 'hex' datatype, that stores a
hex string (possibly with a custom format, such as a GUID) in it's
native binary format.
-- 
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461


Re: pg_class catalog question...

From
Thomas Hallgren
Date:
Jim C. Nasby wrote:
> On Fri, Mar 31, 2006 at 11:29:15AM -0500, Tom Lane wrote:
>> This argument falls flat when you consider that the width of a CHAR
>> entry is measured in characters, not bytes, and therefore its physical
>> size is not fixed even if its logical width is.
> 
> True, but in every case I've used char it was to store something that
> would never be multi-byte, like a GUID, or a SHA1. Though I guess in
> retrospect, what would really be handy is 'hex' datatype, that stores a
> hex string (possibly with a custom format, such as a GUID) in it's
> native binary format.

Why not simply a fixed number of bytes, i.e. byte(16) or octet(16)? Hexadecimal is just a 
convenient human-readable representation.

Regards,
Thomas Hallgren


Re: pg_class catalog question...

From
"Jim C. Nasby"
Date:
On Sat, Apr 01, 2006 at 05:42:34PM +0200, Thomas Hallgren wrote:
> Jim C. Nasby wrote:
> >On Fri, Mar 31, 2006 at 11:29:15AM -0500, Tom Lane wrote:
> >>This argument falls flat when you consider that the width of a CHAR
> >>entry is measured in characters, not bytes, and therefore its physical
> >>size is not fixed even if its logical width is.
> >
> >True, but in every case I've used char it was to store something that
> >would never be multi-byte, like a GUID, or a SHA1. Though I guess in
> >retrospect, what would really be handy is 'hex' datatype, that stores a
> >hex string (possibly with a custom format, such as a GUID) in it's
> >native binary format.
> 
> Why not simply a fixed number of bytes, i.e. byte(16) or octet(16)? 
> Hexadecimal is just a convenient human-readable representation.

Well, hex is much easier to deal with in many regards than raw bytes,
though. But yes, the idea is that you'd just store raw bytes on disk.
byte or octet would work fine if they existed.
-- 
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461


Re: pg_class catalog question...

From
Thomas Hallgren
Date:
Jim C. Nasby wrote:
> On Sat, Apr 01, 2006 at 05:42:34PM +0200, Thomas Hallgren wrote:
>   
>> Why not simply a fixed number of bytes, i.e. byte(16) or octet(16)? 
>> Hexadecimal is just a convenient human-readable representation.
>>     
>
> Well, hex is much easier to deal with in many regards than raw bytes,
> though. But yes, the idea is that you'd just store raw bytes on disk.
> byte or octet would work fine if they existed.
>   
IIRC, Oracle actually uses the term RAW. It makes sense I think. No 
conversion applied, no nothing. Just simple raw data.

- thomas



Re: pg_class catalog question...

From
Tom Lane
Date:
Thomas Hallgren <thomas@tada.se> writes:
> Jim C. Nasby wrote:
>> Well, hex is much easier to deal with in many regards than raw bytes,
>> though. But yes, the idea is that you'd just store raw bytes on disk.
>> byte or octet would work fine if they existed.
>> 
> IIRC, Oracle actually uses the term RAW. It makes sense I think. No 
> conversion applied, no nothing. Just simple raw data.

bytea does that.
        regards, tom lane


Re: pg_class catalog question...

From
"Jonah H. Harris"
Date:
On 4/2/06, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> bytea does that.

Yep.  However, I've wanted to add a constrained, fixed-length version
of bytea for some time now; it's just not high on my priority list.
At EnterpriseDB, we've actually had a lot of customers who would
prefer a constrained bytea (like Oracle's RAW), rather than
variable-length.  However, many people end up liking bytea better just
because they don't like the limitation's of Oracle's RAW data type.
Just depends on the application.

I'll probably get to this in the next couple weeks unless someone
wants to beat me to it :)

--
Jonah H. Harris, Database Internals Architect
EnterpriseDB Corporation
732.331.1324


Re: pg_class catalog question...

From
Tom Lane
Date:
"Jonah H. Harris" <jonah.harris@gmail.com> writes:
> Yep.  However, I've wanted to add a constrained, fixed-length version
> of bytea for some time now; it's just not high on my priority list. 

If you're expecting that you'll be able to write BYTEA(n) and avoid
storing a length word, you'll find that it's not a trivial matter.
        regards, tom lane


Re: pg_class catalog question...

From
"Jonah H. Harris"
Date:
On 4/2/06, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> If you're expecting that you'll be able to write BYTEA(n) and avoid
> storing a length word, you'll find that it's not a trivial matter.

It may not be trivial, but it's certainly not impossible.


--
Jonah H. Harris, Database Internals Architect
EnterpriseDB Corporation
732.331.1324


Re: pg_class catalog question...

From
Tom Lane
Date:
"Jonah H. Harris" <jonah.harris@gmail.com> writes:
> On 4/2/06, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> If you're expecting that you'll be able to write BYTEA(n) and avoid
>> storing a length word, you'll find that it's not a trivial matter.

> It may not be trivial, but it's certainly not impossible.

A word to the wise is sufficient: function result types don't have
known typmods, and for the most part expression results don't either.
Changing that is not "impossible", but the level of pain vastly exceeds
what this feature would be worth.  And that's not even the only
problem.

If you're desperate to have something like this, you could create one
or more fixed-size datatypes (ie, with various positive typlen values).
But I don't see a practical way to use a typmod in determining the
physical width.
        regards, tom lane


Re: pg_class catalog question...

From
"Jonah H. Harris"
Date:
On 4/2/06, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Changing that is not "impossible", but the level of pain vastly exceeds
> what this feature would be worth.

I really like the wording, "the level of pain"... so true :)

> you could create one or more fixed-size datatypes (ie, with various
> positive typlen values).  But I don't see a practical way to use a
> typmod in determining the physical width.

Thanks for the suggestion and caution.  I hope to look into this when
I get some time in the next month or so.  Likewise, I will inform
everyone of my planned implementation after some inspection.

--
Jonah H. Harris, Database Internals Architect
EnterpriseDB Corporation
732.331.1324


Re: pg_class catalog question...

From
Jim Nasby
Date:
On Apr 2, 2006, at 6:13 PM, Tom Lane wrote:

> "Jonah H. Harris" <jonah.harris@gmail.com> writes:
>> On 4/2/06, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> If you're expecting that you'll be able to write BYTEA(n) and avoid
>>> storing a length word, you'll find that it's not a trivial matter.
>
>> It may not be trivial, but it's certainly not impossible.
>
> A word to the wise is sufficient: function result types don't have
> known typmods, and for the most part expression results don't either.
> Changing that is not "impossible", but the level of pain vastly  
> exceeds
> what this feature would be worth.  And that's not even the only
> problem.
>
> If you're desperate to have something like this, you could create one
> or more fixed-size datatypes (ie, with various positive typlen  
> values).
> But I don't see a practical way to use a typmod in determining the
> physical width.

I'm not sure how other databases handle this, but I suspect it would  
be OK performance-wise to tack on a length byte for these types when  
dealing with functions and anything else that isn't directly tied to  
a table where you can easily get length info from the catalog.

Actually, how is this handled with varchar(x)?
--
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461




Re: pg_class catalog question...

From
Martijn van Oosterhout
Date:
On Mon, Apr 03, 2006 at 01:31:50PM -0400, Jim Nasby wrote:
> I'm not sure how other databases handle this, but I suspect it would
> be OK performance-wise to tack on a length byte for these types when
> dealing with functions and anything else that isn't directly tied to
> a table where you can easily get length info from the catalog.
>
> Actually, how is this handled with varchar(x)?

By storing the length in the Datum.

This discussion as about whether we could support something like HEX(n)
without storing the (n) in the data field but only in the catalog.
varchar(n) doesn't have this issue because we always store the length,
so everywhere that needs to know already does.

If your not worried about the length field you could code this up in an
afternoon. In fact, it's probably already been done...
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.