Re: pgcrypto: PGP armor headers - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: pgcrypto: PGP armor headers
Date
Msg-id 542BA921.8050603@vmware.com
Whole thread Raw
In response to Re: pgcrypto: PGP armor headers  (Marko Tiikkaja <marko@joh.to>)
Responses Re: pgcrypto: PGP armor headers
List pgsql-hackers
On 09/30/2014 06:39 PM, Marko Tiikkaja wrote:
> On 9/30/14 5:17 PM, Heikki Linnakangas wrote:
>> I'm actually now leaning towards providing just a single function,
>> pgp_armor_headers(text, key OUT text, value OUT text), which returns all
>> the keys and values. That gives maximum flexibility, and leaves it up to
>> the user to decide what to do with duplicate keys. It's pretty easy to
>> use that to extract just a single header, too:
>>
>> <snip>
>>
>> What do you think? Attached patch implements that, but the docs and
>> regression tests now need adjustment.
>
> (You forgot the patch, but I can imagine what it would have been.)

Oops.

> I'm not exactly sure to be honest.  I would personally like to see a
> simple function for extracting a single header value in a scalar context
> without having to deal with all the pain of SRFs, multiple matches and
> no matches.  Sure, that got a lot better in 9.3 with LATERAL but it's
> still way inferior to pgp_armor_header().
>
> I can also see why someone would argue that I should just create the
> function myself and that it doesn't have to be shipped with postgres.
> But on the other hand, this is already an extension one has to
> explicitly go and CREATE, and that considered one more function probably
> wouldn't hurt too much.

Yeah, building the function to extract a single value is pretty simple
once you have the set-returning function:

create function pgp_armor_header(armor text, key text) returns text
language sql as $$
  select string_agg(value, ' ') from pgp_armor_headers($1) where key = $2
$$;

I spent a little time cleaning up the regression tests and docs, and
ended up with the attached. But then I realized that there's a problem
with UTF-8 conversion in the armor() function. It returns the armored
blob as text, but forcibly converts the keys and values to UTF-8. That's
not cool, because you will get invalidly encoded strings into the
database, if you use the function while connected to a database that
uses some other encoding than UTF-8.

RFC4880 says that the headers are in UTF-8, but armor() cannot safely
return UTF-8 encoded text unless the database's encoding is also UTF-8.
It also rightly says that using anything else than plain ASCII, even
though nominally it's UTF-8, is a bad idea, because the whole point of
ASCII-armoring is to make the format safe from encoding conversions.

We have two options:

1. Throw an error if there are any non-ASCII characters in the key/value
arrays.
2. Don't convert them to UTF-8, but use the current database encoding.

Both seem sane to me. If we use the current database encoding, then we
have to also decide what to do with the input, in pgp_armor_headers().
If armor() uses the database encoding, but pgp_armor_headers() treats
the input as UTF-8, then a round-trip with pgp_armor_headers(armor(?))
won't work.

- Heikki


Attachment

pgsql-hackers by date:

Previous
From: Michael Paquier
Date:
Subject: Re: REINDEX CONCURRENTLY 2.0
Next
From: Peter Geoghegan
Date:
Subject: "Value locking" Wiki page