Proposal to use JSON for Postgres Parser format - Mailing list pgsql-hackers

From Michel Pelletier
Subject Proposal to use JSON for Postgres Parser format
Date
Msg-id CACxu=vL_SD=WJiFSJyyBuZAp_2v_XBqb1x9JBiqz52a_g9z3jA@mail.gmail.com
List pgsql-hackers
Hello hackers,

As noted in the source:

https://github.com/postgres/postgres/blob/master/src/include/nodes/pg_list.h#L6-L11

 * Once upon a time, parts of Postgres were written in Lisp and used real
 * cons-cell lists for major data structures.  When that code was rewritten
 * in C, we initially had a faithful emulation of cons-cell lists, which
 * unsurprisingly was a performance bottleneck.  A couple of major rewrites
 * later, these data structures are actually simple expansible arrays;
 * but the "List" name and a lot of the notation survives.

The Postgres parser format as described in the wiki page:

https://wiki.postgresql.org/wiki/Query_Parsing

looks almost, but not quite, entirely like JSON:

    SELECT * FROM foo where bar = 42 ORDER BY id DESC LIMIT 23;
       (
          {SELECT
          :distinctClause <>
          :intoClause <>
          :targetList (
             {RESTARGET
             :name <>
             :indirection <>
             :val
                {COLUMNREF
                :fields (
                   {A_STAR
                   }
                )
                :location 7
                }
             :location 7
             }
          )
          :fromClause (
             {RANGEVAR
             :schemaname <>
             :relname foo
             :inhOpt 2
             :relpersistence p
             :alias <>
             :location 14
             }
          )
          ... and so on
       )

This non-standard format is useful for visual inspection and perhaps
simple parsing.  The parsers that do exist for it are generally tied
to a particular language.  If the parser output used a standard format,
tools such as code generators and analyzers could work with the many
libraries that already handle JSON quite well.  A future possibility
would be exposing this data to ddl_command_start event triggers.
Providing a JSON Schema would also help tools that want to validate or
transform the JSON with rule-based systems.
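
To make the tooling argument concrete, here is a minimal sketch of what an analysis tool could look like if the tree were available as JSON. The JSON layout below, including the "node" key that carries the node type, is purely an assumption for illustration; Postgres does not emit this today, and the helper name find_nodes is hypothetical.

```python
import json

# Hypothetical JSON rendering of the excerpt above. The "node" key and the
# overall layout are assumptions, not an existing Postgres output format.
PARSE_TREE_JSON = """
[{"node": "SELECT",
  "targetList": [{"node": "RESTARGET", "name": null,
                  "val": {"node": "COLUMNREF",
                          "fields": [{"node": "A_STAR"}],
                          "location": 7},
                  "location": 7}],
  "fromClause": [{"node": "RANGEVAR", "schemaname": null, "relname": "foo",
                  "inhOpt": 2, "relpersistence": "p", "alias": null,
                  "location": 14}]}]
"""


def find_nodes(tree, node_type):
    """Yield every dict in the tree whose "node" key equals node_type."""
    if isinstance(tree, dict):
        if tree.get("node") == node_type:
            yield tree
        for value in tree.values():
            yield from find_nodes(value, node_type)
    elif isinstance(tree, list):
        for item in tree:
            yield from find_nodes(item, node_type)


tree = json.loads(PARSE_TREE_JSON)

# Collect the relation names referenced in the FROM clause.
relations = [n["relname"] for n in find_nodes(tree, "RANGEVAR")]
print(relations)
```

Nothing here is specific to Postgres beyond the field names taken from the excerpt; the same few lines could be written against any language's stock JSON library, which is the point.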

I would like to propose a discussion about switching, in a future major
release of Postgres, from this custom format to JSON.  The format in
question is written and read by macros and functions found in
`src/backend/nodes/outfuncs.c` and `src/backend/nodes/readfuncs.c`, and
converting them to emit valid JSON would be relatively
straightforward.
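
As a rough illustration of how close the two formats are (this is not the C change itself), the sketch below converts the subset of the node format shown in the excerpt into JSON. It assumes only the token kinds that appear there: parenthesized lists, {NAME :field value} nodes, <> for null, integers, and bare words. The real output has more token kinds than the excerpt shows, and those are not handled. The "node" key and the function names are hypothetical.

```python
import json
import re

# Structural characters are single tokens; everything else is a bare word.
TOKEN = re.compile(r"[(){}]|[^\s(){}]+")


def parse_value(tokens, pos):
    """Parse one value starting at tokens[pos]; return (value, next_pos)."""
    tok = tokens[pos]
    if tok == "(":  # list of values
        items = []
        pos += 1
        while tokens[pos] != ")":
            item, pos = parse_value(tokens, pos)
            items.append(item)
        return items, pos + 1
    if tok == "{":  # node: type name followed by :field value pairs
        node = {"node": tokens[pos + 1]}  # "node" key is an assumption
        pos += 2
        while tokens[pos] != "}":
            field = tokens[pos]
            if not field.startswith(":"):
                raise ValueError(f"expected :field, got {field!r}")
            value, pos = parse_value(tokens, pos + 1)
            node[field[1:]] = value
        return node, pos + 1
    if tok == "<>":  # empty / null
        return None, pos + 1
    if re.fullmatch(r"-?\d+", tok):
        return int(tok), pos + 1
    return tok, pos + 1  # bare word, kept as a string


def node_dump_to_json(text):
    """Convert a node dump (subset shown in the excerpt) to a JSON string."""
    value, _ = parse_value(TOKEN.findall(text), 0)
    return json.dumps(value, indent=2)


SAMPLE = """
({SELECT :distinctClause <> :intoClause <>
  :targetList ({RESTARGET :name <> :indirection <>
                :val {COLUMNREF :fields ({A_STAR}) :location 7}
                :location 7})
  :fromClause ({RANGEVAR :schemaname <> :relname foo :inhOpt 2
                :relpersistence p :alias <> :location 14})})
"""

converted = json.loads(node_dump_to_json(SAMPLE))
print(node_dump_to_json(SAMPLE))
```

A converter like this is a workaround that every consumer currently has to write for itself; emitting JSON directly from the backend would make it unnecessary.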

One downside is that this would not be a forward-compatible binary
change across releases.  Since it is unlikely that much code relies
on this custom format, this should not be a huge problem for most
users.

Thoughts?

-Michel
