Re: GSOC 2018 ideas - Mailing list pgsql-hackers

From Charles Cui
Subject Re: GSOC 2018 ideas
Date
Msg-id CA+SXE9u2hzmb4ofdpF3Htj9aDuY2oK6a3AqBGPc5BMyH649SBw@mail.gmail.com
In response to Re: GSOC 2018 ideas  (Aleksander Alekseev <a.alekseev@postgrespro.ru>)
List pgsql-hackers


2018-03-05 1:42 GMT-08:00 Aleksander Alekseev <a.alekseev@postgrespro.ru>:
Hello Charles,

>    Went through the documents listed by you, and they are helpful!
> It seems the main purpose of extension pg_protobuf is to parse
> a protobuf struct and return the decoded field. May I ask how these kinds
> of extensions are used in PostgreSQL (or in other words, the scenarios to
> use these plugins)?

There are a few ideas behind all of this.

1) Sometimes people are not quite happy with a strict relational schema for
various reasons and prefer something more agile, like XML or JSON. These
formats are indeed more convenient under certain circumstances, for
instance in terms of ease of changing and migrating the schema.

2) One drawback of JSON is redundancy. For instance, you have to store
the names of all document fields. These names don't carry much
information but consume disk space and RAM, thus affecting the overall
performance. The ZSON extension [1] partially solved this issue. However, I
wouldn't call it particularly convenient and the whole approach of
compressing JSON seems to me more like a dirty hack, not a solution. The
problem appeared because of using the wrong data format in the first
place.

3) Unlike JSON, formats like Protobuf or Thrift are binary formats and
most importantly don't store any field names. Thus they avoid the
problem described above. However, PostgreSQL cannot access
Protobuf fields out of the box, for instance to index these fields. This
is what pg_protobuf is for.
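
To make points 2) and 3) concrete, here is a minimal sketch that encodes the
same record once as JSON and once by hand in the Protobuf wire format (varint
tags instead of field names). The record, its field names, and the field
numbers 1-3 are made up for illustration, and the encoder covers only
non-negative integers and strings. It is not how pg_protobuf itself works,
just an illustration of where the size difference comes from.

```python
import json


def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint (7 bits per byte)."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_int_field(field_number: int, value: int) -> bytes:
    """Tag with wire type 0 (varint), followed by the value."""
    return encode_varint((field_number << 3) | 0) + encode_varint(value)


def encode_string_field(field_number: int, text: str) -> bytes:
    """Tag with wire type 2 (length-delimited), then length, then UTF-8 bytes."""
    data = text.encode("utf-8")
    return encode_varint((field_number << 3) | 2) + encode_varint(len(data)) + data


# Hypothetical record; the field names exist only in the JSON form.
record = {"user_id": 150, "display_name": "alice", "login_count": 7}

json_bytes = json.dumps(record, separators=(",", ":")).encode("utf-8")

# Assumed field numbers: user_id = 1, display_name = 2, login_count = 3.
proto_bytes = (
    encode_int_field(1, record["user_id"])
    + encode_string_field(2, record["display_name"])
    + encode_int_field(3, record["login_count"])
)

print("JSON bytes:    ", len(json_bytes))
print("Protobuf bytes:", len(proto_bytes))
```

The JSON form repeats every field name in every document, while the binary
form stores only a small numeric tag per field; the names live once in the
schema. The price is that the database can no longer read the fields without
knowing how to decode them, which is the gap an extension has to fill.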

The idea of using a flexible schema and building indexes on top of it is awesome!
I will definitely submit a proposal and focus on this if I get selected.
Thanks for answering my questions.
 
Hopefully this answers your question. If you have other questions please
don't hesitate to ask!

[1]: https://github.com/postgrespro/zson


--
Best regards,
Aleksander Alekseev
