Thread: pgcrypto: PGP armor headers

pgcrypto: PGP armor headers

From
Marko Tiikkaja
Date:
Hi,

Currently there's no way to generate or extract armor headers from the
PGP armored format in pgcrypto.  I've written a patch to add the
support.  For example:

local:marko=#* select armor('zooka', array['Version', 'Comment'],
array['Created by pgcrypto', 'PostgreSQL, the database']);
                armor
-----------------------------------
  -----BEGIN PGP MESSAGE-----      +
  Version: Created by pgcrypto     +
  Comment: PostgreSQL, the database+
                                   +
  em9va2E=                         +
  =D5cR                            +
  -----END PGP MESSAGE-----        +

local:marko=#* select pgp_armor_header(armor('zooka', array['Version',
'Comment'], array['Created by pgcrypto', 'PostgreSQL, the database']),
'Comment');
      pgp_armor_header
--------------------------
  PostgreSQL, the database
(1 row)


.marko

Attachment

Re: pgcrypto: PGP armor headers

From
Marko Tiikkaja
Date:
Hi,

On 8/8/14 3:18 PM, I wrote:
> Currently there's no way to generate or extract armor headers from the
> PGP armored format in pgcrypto.  I've written a patch to add the
> support.

Latest version of the patch here, having fixed some small coding issues.


.marko

Attachment

Re: pgcrypto: PGP armor headers

From
Jeff Janes
Date:
On Fri, Aug 15, 2014 at 1:55 AM, Marko Tiikkaja <marko@joh.to> wrote:
Hi,


On 8/8/14 3:18 PM, I wrote:
Currently there's no way to generate or extract armor headers from the
PGP armored format in pgcrypto.  I've written a patch to add the
support.

Latest version of the patch here, having fixed some small coding issues.

I've built this and tested the installation of the extension, the upgrade from earlier versions, and the basic functions, with and without --enable-cassert

I did occasionally get some failures with 'make check -C contrib/pgcrypto', but I can't reproduce it.  I might have accidentally had some mixture of binaries some with cassert and some without.

No other problems to report.

I didn't do a detailed code review, and I am not a security expert, so I will leave it to the signed-up reviewer to change the status once he takes a look.

One quibble in the documentation, "an error is returned".  Errors get raised, not returned.

This patch will conflict with the pgp signature patch due to the pgcrypto--1.2.sql and kin.

Cheers,

Jeff

Re: pgcrypto: PGP armor headers

From
Joel Jacobson
Date:
Marko, et al,

This is a review of the pgcrypto PGP Armor Headers patch:
http://www.postgresql.org/message-id/53EDCAE8.20004@joh.to

Contents & Purpose
==================
This patch add functions to create and extract OpenPGP Armor Headers.
from OpenPGP messages.

Included in the patch are updated regression test cases and documentation.

Initial Run
===========
The patch applies cleanly to HEAD.

The 144 regression tests all pass successfully against the new patch.

Conclusion
==========
Since I'm using these functions in the BankAPI project,
https://github.com/trustly/bankapi, I have tested them
by actually using them in production, in addition to the provided
regression tests, which is a good sign they are working not just
in theory.

+1 for committer review.

On Fri, Aug 15, 2014 at 10:55 AM, Marko Tiikkaja <marko@joh.to> wrote:
> Hi,
>
>
> On 8/8/14 3:18 PM, I wrote:
>>
>> Currently there's no way to generate or extract armor headers from the
>> PGP armored format in pgcrypto.  I've written a patch to add the
>> support.
>
>
> Latest version of the patch here, having fixed some small coding issues.
>
>
> .marko
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>



Re: pgcrypto: PGP armor headers

From
Heikki Linnakangas
Date:
On 08/15/2014 11:55 AM, Marko Tiikkaja wrote:
> Hi,
>
> On 8/8/14 3:18 PM, I wrote:
>> Currently there's no way to generate or extract armor headers from the
>> PGP armored format in pgcrypto.  I've written a patch to add the
>> support.
>
> Latest version of the patch here, having fixed some small coding issues.

This coding style gives me the willies:

>     guess_len = pgp_armor_enc_len(data_len, num_headers, keys, values);
>     res = palloc(VARHDRSZ + guess_len);
>
>     res_len = pgp_armor_encode((uint8 *) VARDATA(data), data_len,
>                                (uint8 *) VARDATA(res),
>                                num_headers, keys, values);
>     if (res_len > guess_len)
>         ereport(ERROR,
>                 (errcode(ERRCODE_EXTERNAL_ROUTINE_INVOCATION_EXCEPTION),
>                  errmsg("Overflow - encode estimate too small")));

That was OK before this patch, as the length calculation was simple 
enough to verify (although if I were writing it from scratch, I would've 
written it differently). But with this patch, it gets a lot more 
complicated, and I can't easily convince myself that it's correct.

pgp_armor_enc_len might be vulnerable to integer overflow. Consider 1GB 
worth of keys, 1GB worth of values, and 1GB worth of data. I'm not sure 
if you can quite make it overflow a 32-bit unsigned integer, but at 
least you can get nervously close, e.g if you use use max-sized 
key/value arrays, with a single byte in each key and value. Oh, and if 
you use a single-byte server encoding and characters that get expanded 
to multiple bytes in UTF-8, you can go higher.

So I think this (and the corresponding dearmor code too) should be 
refactored to use a StringInfo that gets enlarged as needed, instead of 
hoping to guess the size correctly beforehand. To ease review, might 
make sense to do that as a separate patch over current sources, and the 
main patch on top of that.

BTW, I'm surprised that there is no function to get all the armor 
headers. You can only probe for a particular one with pgp_armor_headder, 
but there is no way to list them all, if you don't know what you're 
looking for.

- Heikki




Re: pgcrypto: PGP armor headers

From
Marko Tiikkaja
Date:
On 9/9/14 10:54 AM, Heikki Linnakangas wrote:
> So I think this (and the corresponding dearmor code too) should be
> refactored to use a StringInfo that gets enlarged as needed, instead of
> hoping to guess the size correctly beforehand. To ease review, might
> make sense to do that as a separate patch over current sources, and the
> main patch on top of that.

Yeah, I thought the same thing while writing that code.  Thanks, I'll do 
it that way.

> BTW, I'm surprised that there is no function to get all the armor
> headers. You can only probe for a particular one with pgp_armor_headder,
> but there is no way to list them all, if you don't know what you're
> looking for.

Right.  The assumption was that due to the nature of how the headers are 
used, the user would always be looking for a certain, predetermined set 
of values.  But I don't oppose to adding a pgp_armor_header_keys() 
function to get the set of all keys, for example.


.marko



Re: pgcrypto: PGP armor headers

From
Marko Tiikkaja
Date:
On 9/9/14 11:01 AM, I wrote:
> On 9/9/14 10:54 AM, Heikki Linnakangas wrote:
>> So I think this (and the corresponding dearmor code too) should be
>> refactored to use a StringInfo that gets enlarged as needed, instead of
>> hoping to guess the size correctly beforehand. To ease review, might
>> make sense to do that as a separate patch over current sources, and the
>> main patch on top of that.
>
> Yeah, I thought the same thing while writing that code.  Thanks, I'll do
> it that way.

Attached is what I have right now.  I started working on the decoding
part, but it has this piece of code:

    /* decode crc */
    if (b64_decode(p + 1, 4, buf) != 3)
        goto out;

which makes the approach a bit uglier.  If I did this the same way, I
would have to create and destroy a StringInfo just for this operation,
which seems ugly.  So I wonder if I shouldn't try and instead keep the
code closer to what it is in HEAD right now; I could call
enlargeStringInfo() first, then hand out a pointer to b64_encode (or
b64_decode()) and finally increment StringInfoData.len by how much was
actually written.  That would keep the code changes a lot smaller, too.

Is either of these approaches anywhere near what you had in mind?

I'm also not sure why we need to keep a copy of the base64
encoding/decoding logic instead of exporting it in utils/adt/encode.c.


.marko

Attachment

Re: pgcrypto: PGP armor headers

From
Heikki Linnakangas
Date:
On 09/10/2014 02:26 PM, Marko Tiikkaja wrote:
> On 9/9/14 11:01 AM, I wrote:
>> On 9/9/14 10:54 AM, Heikki Linnakangas wrote:
>>> So I think this (and the corresponding dearmor code too) should be
>>> refactored to use a StringInfo that gets enlarged as needed, instead of
>>> hoping to guess the size correctly beforehand. To ease review, might
>>> make sense to do that as a separate patch over current sources, and the
>>> main patch on top of that.
>>
>> Yeah, I thought the same thing while writing that code.  Thanks, I'll do
>> it that way.
>
> Attached is what I have right now.  I started working on the decoding
> part, but it has this piece of code:
>
>      /* decode crc */
>      if (b64_decode(p + 1, 4, buf) != 3)
>          goto out;
>
> which makes the approach a bit uglier.  If I did this the same way, I
> would have to create and destroy a StringInfo just for this operation,
> which seems ugly.  So I wonder if I shouldn't try and instead keep the
> code closer to what it is in HEAD right now; I could call
> enlargeStringInfo() first, then hand out a pointer to b64_encode (or
> b64_decode()) and finally increment StringInfoData.len by how much was
> actually written.  That would keep the code changes a lot smaller, too.

Yeah, that sounds reasonable.

> I'm also not sure why we need to keep a copy of the base64
> encoding/decoding logic instead of exporting it in utils/adt/encode.c.

Good question...

- Heikki



Re: pgcrypto: PGP armor headers

From
Marko Tiikkaja
Date:
On 9/10/14 1:38 PM, Heikki Linnakangas wrote:
> On 09/10/2014 02:26 PM, Marko Tiikkaja wrote:
>> So I wonder if I shouldn't try and instead keep the
>> code closer to what it is in HEAD right now; I could call
>> enlargeStringInfo() first, then hand out a pointer to b64_encode (or
>> b64_decode()) and finally increment StringInfoData.len by how much was
>> actually written.  That would keep the code changes a lot smaller, too.
>
> Yeah, that sounds reasonable.

OK, I've attemped to do that in the attached.  I'm pretty sure I didn't
get all of the overflows right, so someone should probably take a really
good look at it.  (I'm not too confident the original code got them
right either, but whatever).

Speaking of good looks, should I add it to the next commitfest as a new
patch, or should we try and get someone to review it like this?

>> I'm also not sure why we need to keep a copy of the base64
>> encoding/decoding logic instead of exporting it in utils/adt/encode.c.
>
> Good question...

I've left this question unanswered for now.  We can fix it later,
independently of this patch.


.marko

Attachment

Re: pgcrypto: PGP armor headers

From
Heikki Linnakangas
Date:
On 09/10/2014 04:35 PM, Marko Tiikkaja wrote:
> Speaking of good looks, should I add it to the next commitfest as a new
> patch, or should we try and get someone to review it like this?

Let's handle this in this commitfest.

- Heikki




Re: pgcrypto: PGP armor headers

From
Heikki Linnakangas
Date:
On 09/10/2014 04:35 PM, Marko Tiikkaja wrote:
> On 9/10/14 1:38 PM, Heikki Linnakangas wrote:
>> On 09/10/2014 02:26 PM, Marko Tiikkaja wrote:
>>> So I wonder if I shouldn't try and instead keep the
>>> code closer to what it is in HEAD right now; I could call
>>> enlargeStringInfo() first, then hand out a pointer to b64_encode (or
>>> b64_decode()) and finally increment StringInfoData.len by how much was
>>> actually written.  That would keep the code changes a lot smaller, too.
>>
>> Yeah, that sounds reasonable.
>
> OK, I've attemped to do that in the attached.  I'm pretty sure I didn't
> get all of the overflows right, so someone should probably take a really
> good look at it.  (I'm not too confident the original code got them
> right either, but whatever).

Looks good, committed. It might've been a tad more efficient to return 
the StringInfo buffer directly from pgp_armor/dearmor, and avoid the 
extra palloc and memcpy, but this isn't performance critical enough for 
it to really matter.

Are you planning to post the main patch rebased on top of this soon? As 
in the next day or two? Otherwise I'll mark this as "Returned with 
feedback" for this commitfest.

- Heikki




Re: pgcrypto: PGP armor headers

From
Marko Tiikkaja
Date:
On 9/25/14 3:50 PM, Heikki Linnakangas wrote:
> On 09/10/2014 04:35 PM, Marko Tiikkaja wrote:
>> OK, I've attemped to do that in the attached.  I'm pretty sure I didn't
>> get all of the overflows right, so someone should probably take a really
>> good look at it.  (I'm not too confident the original code got them
>> right either, but whatever).
>
> Looks good, committed.

Thanks!

> It might've been a tad more efficient to return
> the StringInfo buffer directly from pgp_armor/dearmor, and avoid the
> extra palloc and memcpy, but this isn't performance critical enough for
> it to really matter.

I couldn't see any way of doing that without breaking the VARDATA 
abstraction.  I even went looking for similar cases in the source code, 
but couldn't find any.  If you can come up with a way, feel free to 
change that -- I'd like to learn myself.

But like you, I didn't consider it too important.

> Are you planning to post the main patch rebased on top of this soon? As
> in the next day or two? Otherwise I'll mark this as "Returned with
> feedback" for this commitfest.

Yes.  With good luck I'll get you a rebased one today, otherwise it'll 
have to wait until tomorrow.


.marko



Re: pgcrypto: PGP armor headers

From
Heikki Linnakangas
Date:
On 09/25/2014 04:56 PM, Marko Tiikkaja wrote:
> On 9/25/14 3:50 PM, Heikki Linnakangas wrote:
>> On 09/10/2014 04:35 PM, Marko Tiikkaja wrote:
>> It might've been a tad more efficient to return
>> the StringInfo buffer directly from pgp_armor/dearmor, and avoid the
>> extra palloc and memcpy, but this isn't performance critical enough for
>> it to really matter.
>
> I couldn't see any way of doing that without breaking the VARDATA
> abstraction.  I even went looking for similar cases in the source code,
> but couldn't find any.  If you can come up with a way, feel free to
> change that -- I'd like to learn myself.

You could first append VARHDRSZ zeros to the StringInfo, then append the 
base64-encoded data, and last replace the zeros with the real length, 
using SET_VARSIZE.

>> Are you planning to post the main patch rebased on top of this soon? As
>> in the next day or two? Otherwise I'll mark this as "Returned with
>> feedback" for this commitfest.
>
> Yes.  With good luck I'll get you a rebased one today, otherwise it'll
> have to wait until tomorrow.

Ok, I'll leave this in Waiting on Author state then.

- Heikki




Re: pgcrypto: PGP armor headers

From
Marko Tiikkaja
Date:
On 9/25/14 4:08 PM, Heikki Linnakangas wrote:
> On 09/25/2014 04:56 PM, Marko Tiikkaja wrote:
>> On 9/25/14 3:50 PM, Heikki Linnakangas wrote:
>>> On 09/10/2014 04:35 PM, Marko Tiikkaja wrote:
>>> It might've been a tad more efficient to return
>>> the StringInfo buffer directly from pgp_armor/dearmor, and avoid the
>>> extra palloc and memcpy, but this isn't performance critical enough for
>>> it to really matter.
>>
>> I couldn't see any way of doing that without breaking the VARDATA
>> abstraction.  I even went looking for similar cases in the source code,
>> but couldn't find any.  If you can come up with a way, feel free to
>> change that -- I'd like to learn myself.
>
> You could first append VARHDRSZ zeros to the StringInfo, then append the
> base64-encoded data, and last replace the zeros with the real length,
> using SET_VARSIZE.

That's assuming that VARDATA() is at exactly VARHDRSZ bytes.  I couldn't 
find any callers making that assumption.


.marko



Re: pgcrypto: PGP armor headers

From
Marko Tiikkaja
Date:
Hi,

On 9/25/14, 3:56 PM, I wrote:
> On 9/25/14 3:50 PM, Heikki Linnakangas wrote:
>> Are you planning to post the main patch rebased on top of this soon? As
>> in the next day or two? Otherwise I'll mark this as "Returned with
>> feedback" for this commitfest.
>
> Yes.  With good luck I'll get you a rebased one today, otherwise it'll
> have to wait until tomorrow.

Missed that promise by a day since something unexpected came up
yesterday.  Attached is v3 of the patch.  The changes are:

   - Rebased on top of the current master
   - Added a function pgp_armor_header_keys() to list all keys present
in an armor
   - Changed pgp_armor_header() to use a StringInfo instead of an mbuf
   - Fixed the "error is returned" problem in the documentation pointed
out earlier


.marko

Attachment

Re: pgcrypto: PGP armor headers

From
Heikki Linnakangas
Date:
On 09/27/2014 11:50 PM, Marko Tiikkaja wrote:
> Hi,
>
> On 9/25/14, 3:56 PM, I wrote:
>> On 9/25/14 3:50 PM, Heikki Linnakangas wrote:
>>> Are you planning to post the main patch rebased on top of this soon? As
>>> in the next day or two? Otherwise I'll mark this as "Returned with
>>> feedback" for this commitfest.
>>
>> Yes.  With good luck I'll get you a rebased one today, otherwise it'll
>> have to wait until tomorrow.
>
> Missed that promise by a day since something unexpected came up
> yesterday.  Attached is v3 of the patch.  The changes are:
>
>     - Rebased on top of the current master
>     - Added a function pgp_armor_header_keys() to list all keys present
> in an armor
>     - Changed pgp_armor_header() to use a StringInfo instead of an mbuf
>     - Fixed the "error is returned" problem in the documentation pointed
> out earlier

Thanks! I found the pgp_extract_armor_headers()'s signature quite weird,
so I simplified that by making it always return arrays of keys and
values. The callers is now responsible for returning all the keys
(pgp_armor_header_keys) or finding the single key (pgp_armor_header). I
also partially rewrote the implementation of
pgp_extract_armor_headers(), making it more readable I hope.

If an armor header line ends in CR+LF, pgp_armor_header() returned the
CR as part of the value, with your patch. I don't think that's right,
the line ending should be considered part of the armoring, so I changed
that.

Is there any real life examples or tools out there to generate armors
with headers with duplicate keys? RFC 4880 says:

>    Note that some transport methods are sensitive to line length.  While
>    there is a limit of 76 characters for the Radix-64 data (Section
>    6.3), there is no limit to the length of Armor Headers.  Care should
>    be taken that the Armor Headers are short enough to survive
>    transport.  One way to do this is to repeat an Armor Header key
>    multiple times with different values for each so that no one line is
>    overly long.

Does anyone do that in practice? Is there any precedence for
concatenating the values in other tools that read armor headers?

I wonder if it would make sense to have pgp_armor_header_keys() return
both the keys and values. That would make it easier to use, and it might
then make sense for it to not remove the duplicates or concatenate
values, but just them as is. The caller could then deal with the
duplicates any way he wants. We could keep the function for extracting
the value for a single key, with the concatenating behavior, for
convenience.

- Heikki


Attachment

Re: pgcrypto: PGP armor headers

From
Marko Tiikkaja
Date:
On 9/29/14 3:02 PM, Heikki Linnakangas wrote:
> Thanks! I found the pgp_extract_armor_headers()'s signature quite weird,
> so I simplified that by making it always return arrays of keys and
> values. The callers is now responsible for returning all the keys
> (pgp_armor_header_keys) or finding the single key (pgp_armor_header). I
> also partially rewrote the implementation of
> pgp_extract_armor_headers(), making it more readable I hope.

OK.  Looks good to me.

> If an armor header line ends in CR+LF, pgp_armor_header() returned the
> CR as part of the value, with your patch. I don't think that's right,
> the line ending should be considered part of the armoring, so I changed
> that.

Nice catch.  You are correct.

> Is there any real life examples or tools out there to generate armors
> with headers with duplicate keys? RFC 4880 says:
>
>>     Note that some transport methods are sensitive to line length.  While
>>     there is a limit of 76 characters for the Radix-64 data (Section
>>     6.3), there is no limit to the length of Armor Headers.  Care should
>>     be taken that the Armor Headers are short enough to survive
>>     transport.  One way to do this is to repeat an Armor Header key
>>     multiple times with different values for each so that no one line is
>>     overly long.
>
> Does anyone do that in practice? Is there any precedence for
> concatenating the values in other tools that read armor headers?

Maybe I just suck at $SEARCH_ENGINE, but extracting armor headers 
programmatically doesn't seem to be very popular.  I could only find one 
example, which returned the last instance of the key.  But that seemed 
to be more an accident than anything else; it wasn't documented and the 
source code didn't say anything about it.  I also think that's the worst 
behaviour.  If we can't agree on concatenation, I'd rather see an error.

> I wonder if it would make sense to have pgp_armor_header_keys() return
> both the keys and values. That would make it easier to use, and it might
> then make sense for it to not remove the duplicates or concatenate
> values, but just them as is. The caller could then deal with the
> duplicates any way he wants. We could keep the function for extracting
> the value for a single key, with the concatenating behavior, for
> convenience.

I'd also suggest renaming it to pgp_armor_headers() in that case.  But 
otherwise it seems like a reasonable plan to me.  Want me to do that 
change or are you going to?


.marko



Re: pgcrypto: PGP armor headers

From
Heikki Linnakangas
Date:
On 09/29/2014 05:38 PM, Marko Tiikkaja wrote:
> On 9/29/14 3:02 PM, Heikki Linnakangas wrote:
>> Is there any real life examples or tools out there to generate armors
>> with headers with duplicate keys? RFC 4880 says:
>>
>>>      Note that some transport methods are sensitive to line length.  While
>>>      there is a limit of 76 characters for the Radix-64 data (Section
>>>      6.3), there is no limit to the length of Armor Headers.  Care should
>>>      be taken that the Armor Headers are short enough to survive
>>>      transport.  One way to do this is to repeat an Armor Header key
>>>      multiple times with different values for each so that no one line is
>>>      overly long.
>>
>> Does anyone do that in practice? Is there any precedence for
>> concatenating the values in other tools that read armor headers?
>
> Maybe I just suck at $SEARCH_ENGINE, but extracting armor headers
> programmatically doesn't seem to be very popular.  I could only find one
> example, which returned the last instance of the key.  But that seemed
> to be more an accident than anything else; it wasn't documented and the
> source code didn't say anything about it.  I also think that's the worst
> behaviour.  If we can't agree on concatenation, I'd rather see an error.

May I ask you why you wrote this patch? What are you doing with the headers?

- Heikki




Re: pgcrypto: PGP armor headers

From
Marko Tiikkaja
Date:
On 9/30/14 4:37 PM, Heikki Linnakangas wrote:
> On 09/29/2014 05:38 PM, Marko Tiikkaja wrote:
>> Maybe I just suck at $SEARCH_ENGINE, but extracting armor headers
>> programmatically doesn't seem to be very popular.  I could only find one
>> example, which returned the last instance of the key.  But that seemed
>> to be more an accident than anything else; it wasn't documented and the
>> source code didn't say anything about it.  I also think that's the worst
>> behaviour.  If we can't agree on concatenation, I'd rather see an error.
>
> May I ask you why you wrote this patch? What are you doing with the headers?

We're sending arbitrary messages between systems over HTTP(S), and a 
special header is used to tell the recipient system what type of message 
it is.  The message types are specific to the domain, but you can think 
of them to be roughly equivalent to MIME types.

If what you're trying to get a sense of is why I'd prefer to see 
concatenation, I can't really help you.  For our use case (and perhaps 
for everyone else as well) it would actually make more sense to throw an 
error if pgp_armor_header() is used on a key which appears more than 
once.  The concatenation behaviour was an attempt at a "one size fits 
all" interface, but now that we're going to also have a 
pgp_armor_headers() function for users to implement the behaviour they 
want themselves, there's no real reason to try and guess what everyone 
wants.  I think I'd prefer to see an ERROR in this case now.


.marko



Re: pgcrypto: PGP armor headers

From
Alvaro Herrera
Date:
Heikki Linnakangas wrote:

Happened to land on the middle of the regression test by accident and
noticed:

> +select armor('zooka', array['Version', 'Comment'], array['Created by pgcrypto', 'PostgreSQL, the world''s most most
advancedopen source database']);
 
> +                                  armor                                   
> +--------------------------------------------------------------------------
> + -----BEGIN PGP MESSAGE-----                                             +
> + Version: Created by pgcrypto                                            +
> + Comment: PostgreSQL, the world's most most advanced open source database+
> +                                                                         +
> + em9va2E=                                                                +
> + =D5cR                                                                   +
> + -----END PGP MESSAGE-----                                               +
> + 
> +(1 row)

"most most"?

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Re: pgcrypto: PGP armor headers

From
Heikki Linnakangas
Date:
On 09/30/2014 05:45 PM, Marko Tiikkaja wrote:
> On 9/30/14 4:37 PM, Heikki Linnakangas wrote:
>> On 09/29/2014 05:38 PM, Marko Tiikkaja wrote:
>>> Maybe I just suck at $SEARCH_ENGINE, but extracting armor headers
>>> programmatically doesn't seem to be very popular.  I could only find one
>>> example, which returned the last instance of the key.  But that seemed
>>> to be more an accident than anything else; it wasn't documented and the
>>> source code didn't say anything about it.  I also think that's the worst
>>> behaviour.  If we can't agree on concatenation, I'd rather see an error.
>>
>> May I ask you why you wrote this patch? What are you doing with the headers?
>
> We're sending arbitrary messages between systems over HTTP(S), and a
> special header is used to tell the recipient system what type of message
> it is.  The message types are specific to the domain, but you can think
> of them to be roughly equivalent to MIME types.

Ok. How quaint. :-)

> If what you're trying to get a sense of is why I'd prefer to see
> concatenation, I can't really help you.  For our use case (and perhaps
> for everyone else as well) it would actually make more sense to throw an
> error if pgp_armor_header() is used on a key which appears more than
> once.  The concatenation behaviour was an attempt at a "one size fits
> all" interface, but now that we're going to also have a
> pgp_armor_headers() function for users to implement the behaviour they
> want themselves, there's no real reason to try and guess what everyone
> wants.  I think I'd prefer to see an ERROR in this case now.

I'm actually now leaning towards providing just a single function, 
pgp_armor_headers(text, key OUT text, value OUT text), which returns all 
the keys and values. That gives maximum flexibility, and leaves it up to 
the user to decide what to do with duplicate keys. It's pretty easy to 
use that to extract just a single header, too:

postgres=# select * FROM pgp_armor_headers('
-----BEGIN PGP MESSAGE-----
foo: baar
foo: more foo
singlekey: fsdfsd

em9va2E=
=ZZZZ
-----END PGP MESSAGE-----
') where key = 'singlekey';    key    | value
-----------+-------- singlekey | fsdfsd
(1 row)

And if you want to concatenate possible duplicates:

postgres=# select string_agg(value, ' ') FROM pgp_armor_headers('
-----BEGIN PGP MESSAGE-----
foo: baar
foo: more foo
singlekey: fsdfsd

em9va2E=
=ZZZZ
-----END PGP MESSAGE-----
') where key = 'foo';  string_agg
--------------- baar more foo
(1 row)

What do you think? Attached patch implements that, but the docs and 
regression tests now need adjustment.

- Heikki



Re: pgcrypto: PGP armor headers

From
Marko Tiikkaja
Date:
On 9/30/14 5:17 PM, Heikki Linnakangas wrote:
> I'm actually now leaning towards providing just a single function,
> pgp_armor_headers(text, key OUT text, value OUT text), which returns all
> the keys and values. That gives maximum flexibility, and leaves it up to
> the user to decide what to do with duplicate keys. It's pretty easy to
> use that to extract just a single header, too:
>
> <snip>
>
> What do you think? Attached patch implements that, but the docs and
> regression tests now need adjustment.

(You forgot the patch, but I can imagine what it would have been.)

I'm not exactly sure to be honest.  I would personally like to see a 
simple function for extracting a single header value in a scalar context 
without having to deal with all the pain of SRFs, multiple matches and 
no matches.  Sure, that got a lot better in 9.3 with LATERAL but it's 
still way inferior to pgp_armor_header().

I can also see why someone would argue that I should just create the 
function myself and that it doesn't have to be shipped with postgres. 
But on the other hand, this is already an extension one has to 
explicitly go and CREATE, and that considered one more function probably 
wouldn't hurt too much.

But either way I won't be unhappy, so it's up to you (and/or other 
members of the community) to decide this one.


.marko



Re: pgcrypto: PGP armor headers

From
Heikki Linnakangas
Date:
On 09/30/2014 06:39 PM, Marko Tiikkaja wrote:
> On 9/30/14 5:17 PM, Heikki Linnakangas wrote:
>> I'm actually now leaning towards providing just a single function,
>> pgp_armor_headers(text, key OUT text, value OUT text), which returns all
>> the keys and values. That gives maximum flexibility, and leaves it up to
>> the user to decide what to do with duplicate keys. It's pretty easy to
>> use that to extract just a single header, too:
>>
>> <snip>
>>
>> What do you think? Attached patch implements that, but the docs and
>> regression tests now need adjustment.
>
> (You forgot the patch, but I can imagine what it would have been.)

Oops.

> I'm not exactly sure to be honest.  I would personally like to see a
> simple function for extracting a single header value in a scalar context
> without having to deal with all the pain of SRFs, multiple matches and
> no matches.  Sure, that got a lot better in 9.3 with LATERAL but it's
> still way inferior to pgp_armor_header().
>
> I can also see why someone would argue that I should just create the
> function myself and that it doesn't have to be shipped with postgres.
> But on the other hand, this is already an extension one has to
> explicitly go and CREATE, and that considered one more function probably
> wouldn't hurt too much.

Yeah, building the function to extract a single value is pretty simple
once you have the set-returning function:

create function pgp_armor_header(armor text, key text) returns text
language sql as $$
  select string_agg(value, ' ') from pgp_armor_headers($1) where key = $2
$$;

I spent a little time cleaning up the regression tests and docs, and
ended up with the attached. But then I realized that there's a problem
with UTF-8 conversion in the armor() function. It returns the armored
blob as text, but forcibly converts the keys and values to UTF-8. That's
not cool, because you will get invalidly encoded strings into the
database, if you use the function while connected to a database that
uses some other encoding than UTF-8.

RFC4880 says that the headers are in UTF-8, but armor() cannot safely
return UTF-8 encoded text unless the database's encoding is also UTF-8.
It also rightly says that using anything else than plain ASCII, even
though nominally it's UTF-8, is a bad idea, because the whole point of
ASCII-armoring is to make the format safe from encoding conversions.

We have two options:

1. Throw an error if there are any non-ASCII characters in the key/value
arrays.
2. Don't convert them to UTF-8, but use the current database encoding.

Both seem sane to me. If we use the current database encoding, then we
have to also decide what to do with the input, in pgp_armor_headers().
If armor() uses the database encoding, but pgp_armor_headers() treats
the input as UTF-8, then a round-trip with pgp_armor_headers(armor(?))
won't work.

- Heikki


Attachment

Re: pgcrypto: PGP armor headers

From
Marko Tiikkaja
Date:
On 10/1/14, 9:11 AM, Heikki Linnakangas wrote:
> I spent a little time cleaning up the regression tests and docs, and
> ended up with the attached. But then I realized that there's a problem
> with UTF-8 conversion in the armor() function. It returns the armored
> blob as text, but forcibly converts the keys and values to UTF-8. That's
> not cool, because you will get invalidly encoded strings into the
> database, if you use the function while connected to a database that
> uses some other encoding than UTF-8.
>
> RFC4880 says that the headers are in UTF-8, but armor() cannot safely
> return UTF-8 encoded text unless the database's encoding is also UTF-8.
> It also rightly says that using anything else than plain ASCII, even
> though nominally it's UTF-8, is a bad idea, because the whole point of
> ASCII-armoring is to make the format safe from encoding conversions.

Ugh.  Right.

> We have two options:
>
> 1. Throw an error if there are any non-ASCII characters in the key/value
> arrays.
> 2. Don't convert them to UTF-8, but use the current database encoding.
>
> Both seem sane to me. If we use the current database encoding, then we
> have to also decide what to do with the input, in pgp_armor_headers().
> If armor() uses the database encoding, but pgp_armor_headers() treats
> the input as UTF-8, then a round-trip with pgp_armor_headers(armor(?))
> won't work.

Yeah.  Both options seem fine to me.  Throwing an error perhaps slightly 
more so.


.marko



Re: pgcrypto: PGP armor headers

From
Heikki Linnakangas
Date:
On 10/01/2014 11:58 AM, Marko Tiikkaja wrote:
> On 10/1/14, 9:11 AM, Heikki Linnakangas wrote:
>> We have two options:
>>
>> 1. Throw an error if there are any non-ASCII characters in the key/value
>> arrays.
>> 2. Don't convert them to UTF-8, but use the current database encoding.
>>
>> Both seem sane to me. If we use the current database encoding, then we
>> have to also decide what to do with the input, in pgp_armor_headers().
>> If armor() uses the database encoding, but pgp_armor_headers() treats
>> the input as UTF-8, then a round-trip with pgp_armor_headers(armor(?))
>> won't work.
>
> Yeah.  Both options seem fine to me.  Throwing an error perhaps slightly
> more so.

I went with 1, throw an error. I also added checks that the key or value
doesn't contain any embedded newlines, and that the key doesn't contain
an embedded ": ". Those would cause the armor to be invalid.

I think this is now ready for commit, but since I've changed it quite
significantly from what you originally submitted, please take a moment
to review this.

- Heikki


Attachment

Re: pgcrypto: PGP armor headers

From
Marko Tiikkaja
Date:
On 10/1/14 1:01 PM, Heikki Linnakangas wrote:
> On 10/01/2014 11:58 AM, Marko Tiikkaja wrote:
>> On 10/1/14, 9:11 AM, Heikki Linnakangas wrote:
>>> We have two options:
>>>
>>> 1. Throw an error if there are any non-ASCII characters in the key/value
>>> arrays.
>>> 2. Don't convert them to UTF-8, but use the current database encoding.
>>>
>>> Both seem sane to me. If we use the current database encoding, then we
>>> have to also decide what to do with the input, in pgp_armor_headers().
>>> If armor() uses the database encoding, but pgp_armor_headers() treats
>>> the input as UTF-8, then a round-trip with pgp_armor_headers(armor(?))
>>> won't work.
>>
>> Yeah.  Both options seem fine to me.  Throwing an error perhaps slightly
>> more so.
>
> I went with 1, throw an error. I also added checks that the key or value
> doesn't contain any embedded newlines, and that the key doesn't contain
> an embedded ": ". Those would cause the armor to be invalid.

Great.

> I think this is now ready for commit, but since I've changed it quite
> significantly from what you originally submitted, please take a moment
> to review this.

   1) I see this compiler warning:

     pgp-pgsql.c: In function ‘pg_armor’:
     pgp-pgsql.c:960:18: warning: ‘values’ may be used uninitialized in
this function [-Wmaybe-uninitialized]
     pgp_armor_encode((uint8 *) VARDATA(data), data_len, &buf,

      It's bogus, but worth silencing anyway.

   2) There's what looks like an extra whitespace change in
pgp_armor_encode(), but maybe that's intentional?

   3) Also, I think the attached two corner cases deserve their own tests.

Other than the above, the patch looks good to me.  Huge thanks for your
work on this one!


.marko

Attachment

Re: pgcrypto: PGP armor headers

From
Heikki Linnakangas
Date:
On 10/01/2014 02:47 PM, Marko Tiikkaja wrote:
> On 10/1/14 1:01 PM, Heikki Linnakangas wrote:
>> I think this is now ready for commit, but since I've changed it quite
>> significantly from what you originally submitted, please take a moment
>> to review this.
>
>     1) I see this compiler warning:
>
>       pgp-pgsql.c: In function ‘pg_armor’:
>       pgp-pgsql.c:960:18: warning: ‘values’ may be used uninitialized in
> this function [-Wmaybe-uninitialized]
>       pgp_armor_encode((uint8 *) VARDATA(data), data_len, &buf,
>
>        It's bogus, but worth silencing anyway.
>
>     2) There's what looks like an extra whitespace change in
> pgp_armor_encode(), but maybe that's intentional?
>
>     3) Also, I think the attached two corner cases deserve their own tests.
>
> Other than the above, the patch looks good to me.  Huge thanks for your
> work on this one!

Ok, committed with those fixes.

- Heikki