Re: Proposal to use JSON for Postgres Parser format - Mailing list pgsql-hackers

From Alexander Korotkov
Subject Re: Proposal to use JSON for Postgres Parser format
Date
Msg-id CAPpHfdvxPJNudAqdnoZDL21simo-NNbi+pRn6a0SUa+G2Mg+KQ@mail.gmail.com
Whole thread Raw
In response to Re: Proposal to use JSON for Postgres Parser format  (Alexander Korotkov <aekorotkov@gmail.com>)
List pgsql-hackers
On Tue, Sep 20, 2022 at 1:00 PM Alexander Korotkov <aekorotkov@gmail.com> wrote:
> On Tue, Sep 20, 2022 at 7:48 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > Peter Geoghegan <pg@bowt.ie> writes:
> > > On Mon, Sep 19, 2022 at 8:39 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > >> Our existing format is certainly not great on those metrics, but
> > >> I do not see how "let's use JSON!" is a route to improvement.
> >
> > > The existing format was designed with developer convenience as a goal,
> > > though -- despite my complaints, and in spite of your objections.
> >
> > As Munro adduces nearby, it'd be a stretch to conclude that the current
> > format was designed with any Postgres-related goals in mind at all.
> > I think he's right that it's a variant of some Lisp-y dump format that's
> > probably far hoarier than even Berkeley Postgres.
> >
> > > If it didn't have to be easy (or even practical) for developers to
> > > directly work with the output format, then presumably the format used
> > > internally could be replaced with something lower level and faster. So
> > > it seems like the two goals (developer ergonomics and faster
> > > interchange format for users) might actually be complementary.
> >
> > I think the principal mistake in what we have now is that the storage
> > format is identical to the "developer friendly" text format (plus or
> > minus some whitespace).  First we need to separate those.  We could
> > have more than one equivalent text format perhaps, and I don't have
> > any strong objection to basing the text format (or one of them) on
> > JSON.
>
> +1 for considering storage format and text format separately.
>
> Let's consider what our criteria could be for the storage format.
>
> 1) Storage effectiveness (shorter is better) and
> serialization/deserialization effectiveness (faster is better). On
> this criterion, the custom binary format looks perfect.
> 2) Robustness in the case of corruption. It seems much easier to
> detect the data corruption and possibly make some partial manual
> recovery for textual format.
> 3) Standartness. It's better to use something known worldwide or at
> least used in other parts of PostgreSQL than something completely
> custom. From this perspective, JSON/JSONB is better than custom
> things.

(sorry, I've accidentally cut the last paragraph from the message)

It seems that there is no perfect fit for this multi-criteria
optimization, and we should pick what is more important.  Any
thoughts?

------
Regards,
Alexander Korotkov



pgsql-hackers by date:

Previous
From: Alexander Korotkov
Date:
Subject: Re: Proposal to use JSON for Postgres Parser format
Next
From: Önder Kalacı
Date:
Subject: Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher