Re: GSoC 2018: thrift encoding format - Mailing list pgsql-hackers

From Charles Cui
Subject Re: GSoC 2018: thrift encoding format
Msg-id CA+SXE9vKnVbakO5OhTs5MZV2v44RqMjeKMQy2dZZr-oO=yUbVQ@mail.gmail.com
In response to Re: GSoC 2018: thrift encoding format  (Stephen Frost <sfrost@snowman.net>)
List pgsql-hackers
Thanks, guys, for your ideas! As a Postgres beginner, I find it easier to
follow pg_protobuf's approach when designing and implementing pg_thrift.
I can refer to how pg_protobuf uses functions, writes tests, etc. I will
reconsider the return format for lists, sets, structs, etc. when I get to
that part. For now, I assume the input is a sequence of Thrift-encoded
bytes, and I am implementing the decoding logic for simple types.
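
To make "decoding simple types" concrete, here is a minimal sketch of what
that step could look like, assuming the input uses the Thrift Binary
Protocol (fixed-width big-endian integers, IEEE 754 doubles, and strings
stored as an i32 length followed by the bytes). It is written in Python
only for brevity; an extension like pg_thrift would do this in C. The
function name decode_simple and its signature are hypothetical, not part
of pg_thrift or pg_protobuf.

```python
import struct

# Big-endian struct formats for the fixed-width Thrift Binary Protocol types.
# The type names used as keys are illustrative labels, not a real API.
_FIXED = {
    "bool": ">b",
    "byte": ">b",
    "i16": ">h",
    "i32": ">i",
    "i64": ">q",
    "double": ">d",
}


def decode_simple(type_name, data, offset=0):
    """Decode one value at `offset`; return (value, next_offset).

    Hypothetical helper: assumes `data` is Binary Protocol bytes.
    """
    if type_name == "string":
        # A string is an i32 byte length followed by that many raw bytes.
        (length,) = struct.unpack_from(">i", data, offset)
        start = offset + 4
        end = start + length
        if length < 0 or end > len(data):
            raise ValueError("string length exceeds input")
        return data[start:end], end

    fmt = _FIXED[type_name]
    # unpack_from raises struct.error if the buffer is too short.
    (value,) = struct.unpack_from(fmt, data, offset)
    if type_name == "bool":
        value = value != 0
    return value, offset + struct.calcsize(fmt)
```

Returning the next offset alongside the value lets a caller decode several
values in a row, which is likely what struct and container decoding would
build on later.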

2018-05-04 8:42 GMT-07:00 Stephen Frost <sfrost@snowman.net>:
Greetings,

* Aleksander Alekseev (a.alekseev@postgrespro.ru) wrote:
> > I understand that you're open to having it as a new data type or as a
> > bytea, but I don't agree.  This should be a new data type, just as json
> > is a distinct data type and so is jsonb.
>
> Could you please explain in a little more detail why you believe so?

As mentioned elsewhere, there's multiple ways to encode thrift, no?  We
should pick which one makes sense and make that the interface to the
data type and then we might actually store the data differently, not to
mention that we'll likely want to build on things like indexing
capabilities to this data type, as we have for jsonb, and that's much
cleaner to do with a proper data type than if everyone has to use bytea
to store the data and then functional indexes (if we could even make
that happen...  I'm not thrilled with such an idea in any case).
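
As an illustration of the "multiple ways to encode thrift" point, the
sketch below encodes the same i32 value under two of Thrift's wire
formats: the Binary Protocol (four big-endian bytes) and the Compact
Protocol (zigzag followed by a variable-length integer). The function
names are hypothetical and only cover this one type; the full protocols
define much more.

```python
import struct


def encode_i32_binary(n):
    """Binary Protocol: i32 is always four big-endian bytes."""
    return struct.pack(">i", n)


def zigzag32(n):
    """Map a signed 32-bit int to unsigned so small magnitudes stay small."""
    return ((n << 1) ^ (n >> 31)) & 0xFFFFFFFF


def encode_varint(u):
    """Encode an unsigned int 7 bits at a time, low-order group first."""
    out = bytearray()
    while True:
        low = u & 0x7F
        u >>= 7
        if u:
            out.append(low | 0x80)  # more bytes follow
        else:
            out.append(low)
            return bytes(out)


def encode_i32_compact(n):
    """Compact Protocol: i32 is zigzag-encoded, then written as a varint."""
    return encode_varint(zigzag32(n))
```

The two functions produce different byte strings for the same logical
value, so a bytea column alone does not say which encoding it holds; a
dedicated type can fix that choice at its interface.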

Data validation is another thing- if it's a thrift data type then we can
validate that it's correct on the way in, and depend on that correctness
on the way out (to some extent- obviously we have to be wary of
corruption possibilities and such).
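
A rough sketch of what validation "on the way in" could mean, assuming
Binary Protocol input and, to keep it short, a struct whose fields are
only scalars and strings (nested structs, lists, sets, and maps are
rejected here rather than handled). The function name
is_valid_flat_struct is hypothetical; a real type input function would
need to cover the whole protocol.

```python
import struct

# Byte widths of fixed-size field types, keyed by Binary Protocol type ID:
# 2=bool, 3=byte, 4=double, 6=i16, 8=i32, 10=i64.
_WIDTHS = {2: 1, 3: 1, 4: 8, 6: 2, 8: 4, 10: 8}
T_STOP = 0
T_STRING = 11


def is_valid_flat_struct(data):
    """Return True if `data` is one well-formed flat struct, else False.

    Hypothetical helper: each field is a 1-byte type, a 2-byte field id,
    and a value; the struct ends with a single STOP (0) byte.
    """
    pos = 0
    while pos < len(data):
        ftype = data[pos]
        pos += 1
        if ftype == T_STOP:
            # Valid only if nothing follows the STOP byte.
            return pos == len(data)
        pos += 2  # skip the i16 field id
        if ftype == T_STRING:
            if pos + 4 > len(data):
                return False
            (length,) = struct.unpack_from(">i", data, pos)
            pos += 4
            if length < 0:
                return False
            pos += length
        elif ftype in _WIDTHS:
            pos += _WIDTHS[ftype]
        else:
            # Unknown or unsupported (container) type in this sketch.
            return False
        if pos > len(data):
            return False  # value runs past the end of the input
    return False  # input ended without a STOP byte
```

Rejecting malformed input at this point is what lets later accessor
functions skip most bounds checks, within the limits noted above about
corruption.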

We could toss out all of our data types and store everything as bytea's
if we wanted to, but we don't, and for quite a few good reasons, these
are just a couple that I'm thinking of off-hand.

> Also I wonder whether in your opinion the extension should provide
> implicit casts from/to bytea as well.

I wouldn't make them implicit...

Thanks!

Stephen
