Re: [PATCH] by request: base64 for bytea - Mailing list pgsql-hackers

From Alex Pilosov
Subject Re: [PATCH] by request: base64 for bytea
Date
Msg-id Pine.BSO.4.10.10106241157250.9446-100000@spider.pilosoft.com
In response to Re: [PATCH] by request: base64 for bytea  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: [PATCH] by request: base64 for bytea  (Marko Kreen <marko@l-t.ee>)
List pgsql-hackers
On Sun, 24 Jun 2001, Tom Lane wrote:

> Alex Pilosov <alex@pilosoft.com> writes:
> > Function to cast bytea as text, I think, should do proper checking that
> > input did not contain nulls, and return text data back.
> 
> That is most definitely not good enough.  In MULTIBYTE installations
> you'd have to also check that there were no illegal multibyte sequences.
True, but see below.
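
To make the two checks concrete, here is a minimal sketch in Python (not
PostgreSQL code). The function name bytea_to_text and the choice of UTF-8
as the server encoding are assumptions for illustration only.

```python
def bytea_to_text(raw: bytes, encoding: str = "utf-8") -> str:
    """Convert raw bytes to text, rejecting values that text cannot hold.

    Hypothetical helper: the name and the default encoding are assumptions.
    """
    # Check 1: an embedded NUL byte would truncate a C string.
    if b"\x00" in raw:
        raise ValueError("bytea value contains a NUL byte")
    # Check 2: strict decoding raises UnicodeDecodeError (a ValueError
    # subclass) on an illegal multibyte sequence.
    return raw.decode(encoding, errors="strict")
```

The NUL check has to be explicit because a NUL byte is itself valid UTF-8,
so the decoding step alone would let it through.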

> The whole approach seems misguided to me anyway.  bytea isn't equivalent
> to text and conversion functions based on providing incomplete binary
> equivalence are fundamentally wrong.  hex or base64 encode/decode
> functions seem like reasonable conversion paths, or you could provide
> a function that mimics the existing I/O conversions for bytea, ugly as
> they are.
>
> In the case that Marko is describing, it seems to me he is providing
> two independent sets of encryption functions, one for text and one
> for bytea.  That they happen to share code under the hood is an
> implementation detail of his code, not a reason to contort the type
> system.  If someone wanted to add functions to encrypt, say, polygons,
> would you start looking for ways to create a binary equivalence between
> polygon and text?  I sure hope not.

Well, encrypt/decrypt are a special kind of function. When the data is
decrypted, its type is not known, since it is not stored anywhere in the
data. The caller is responsible for casting the result to whatever he
needs, so there must be some way to cast the decrypted output to any type.

I may be going a bit too far, but, if you think about it, if one wanted to
encrypt a generic type t, these are the alternatives:

a) to encrypt, the caller must use encrypt(t_out(val)), and to decrypt,
t_in(decrypt(val)).

The problem with that is the nonexistence of a CSTRING datatype as of yet,
and its possible inefficiency compared to b).
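
Option a) can be sketched roughly as follows, in Python rather than backend
C. Everything here is an assumption for illustration: point_out/point_in
stand in for a type's text I/O functions, and toy_encrypt is a repeating-key
XOR used only as a placeholder, not real encryption.

```python
from itertools import cycle


def toy_encrypt(data: bytes, key: bytes) -> bytes:
    """Placeholder cipher (repeating-key XOR). NOT secure; illustration only."""
    if not key:
        raise ValueError("key must not be empty")
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


# XOR is its own inverse, so decryption is the same operation.
toy_decrypt = toy_encrypt


def point_out(p: tuple) -> str:
    """Hypothetical t_out: render a point as its text form."""
    return f"({p[0]},{p[1]})"


def point_in(s: str) -> tuple:
    """Hypothetical t_in: parse the text form back into a point."""
    x, y = s.strip("()").split(",")
    return (float(x), float(y))


key = b"k3y"
# encrypt(t_out(val))
ciphertext = toy_encrypt(point_out((1.5, 2.0)).encode("ascii"), key)
# t_in(decrypt(val)): the caller picks the type by choosing which t_in to call
restored = point_in(toy_decrypt(ciphertext, key).decode("ascii"))
```

The cipher only ever sees the text form, so nothing in the ciphertext
records that it was a point; choosing the right t_in is up to the caller.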

b) make encrypt operate on an 'opaque' type, and just encrypt the raw data
in memory, however many bytes there are, and store the original varlen
separately (most encryption algorithms do not preserve data length anyway;
they operate in blocks of n bytes). The question in this situation is what
to do with decrypt. The options are:

b1) make decrypt return opaque and allow conversion from opaque to any
datatype (by blindly setting the oid of the return type). I'm not sure how
hard this one is to do with the current type system, and I do not like the
safety of it, since an ordinary user would be able to put garbage data into
a type that may not be prepared to handle it.

b2) make encrypt store the name of the original type in the encrypted data.
Make decrypt return opaque, which would contain a (type,data,length) triple,
and allow casting opaque to any type, but _checking_ that the opaque has
the correct format and that the type stored in it matches the type it's
being cast to.

This has the additional benefit of being able to serialize/deserialize data,
preserving type, which may be used by something else...
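
A rough sketch of the b2) envelope, again in Python and under stated
assumptions: the header layout (2-byte name length, 4-byte data length),
the 8-byte block size, and the names Opaque, pack, unpack and cast are all
made up for illustration. The cipher itself is left out; pack() produces
what would be encrypted, and unpack() reads what decrypt would hand back.

```python
import struct
from typing import NamedTuple

BLOCK = 8            # assumed cipher block size
HEADER = "!HI"       # assumed layout: name length, original data length


class Opaque(NamedTuple):
    type_name: str
    data: bytes
    length: int


def pack(type_name: str, raw: bytes) -> bytes:
    """Build the plaintext envelope, zero-padded to a whole number of blocks."""
    name = type_name.encode("ascii")
    body = struct.pack(HEADER, len(name), len(raw)) + name + raw
    return body + b"\x00" * (-len(body) % BLOCK)


def unpack(blob: bytes) -> Opaque:
    """Parse a decrypted envelope, checking its format before trusting it."""
    size = struct.calcsize(HEADER)
    if len(blob) < size:
        raise ValueError("opaque value too short")
    name_len, data_len = struct.unpack(HEADER, blob[:size])
    end = size + name_len + data_len
    if end > len(blob):
        raise ValueError("opaque value is truncated or corrupt")
    name = blob[size:size + name_len].decode("ascii")
    # The stored length strips the block padding again.
    return Opaque(name, blob[size + name_len:end], data_len)


def cast(value: Opaque, target_type: str) -> bytes:
    """Allow the cast only if the stored type matches the requested one."""
    if value.type_name != target_type:
        raise TypeError(f"cannot cast {value.type_name} to {target_type}")
    return value.data
```

The checked cast is what separates this from b1): data that does not parse,
or that claims a different type, is rejected before it reaches the target
type's code.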

In my opinion, a) is probably the easiest option to implement. b2) is
(IMHO) the most correct one, but it may be a bit too much work for not
that much benefit?

This may be going a bit too far, since the original question only dealt
with text-bytea conversions, but maybe it's time to look at 'generic'
functions which return generic types.
