Re: WIP: Generic functions for Node types using generated metadata - Mailing list pgsql-hackers

From Andres Freund
Subject Re: WIP: Generic functions for Node types using generated metadata
Msg-id 20190920224354.jihya5waks642e6s@alap3.anarazel.de
In response to Re: WIP: Generic functions for Node types using generated metadata  (Andres Freund <andres@anarazel.de>)
Responses Re: WIP: Generic functions for Node types using generated metadata
List pgsql-hackers
Hi,

On 2019-09-19 22:18:57 -0700, Andres Freund wrote:
> While working on this I evolved the node string format a bit:
> 
> 1) Node types start with their "normal" name, rather than
>    uppercase. There seems little point in having such a divergence.
> 
> 2) The node type is followed by the node-type id. That allows locating
>    the corresponding node metadata more quickly (an array lookup and
>    one name recheck, rather than a binary search). I.e. the node starts
>    with "{Scan 18 " rather than "{SCAN " as before.
> 
> 3) Nodes that contain other nodes as sub-types "inline" still emit {}
>    for the subtype. There's no functional need for this, but I found the
>    output otherwise much harder to read.  E.g. for mergejoin we'd have
>    something like
> 
>    {MergeJoin 37 :join {Join 35 :plan {Plan ...} :jointype JOIN_INNER ...} :skip_mark_restore true ...}
> 
> 4) As seen in the above example, enums are decoded to their string
>    values. I found that makes the output easier to read. Again, not
>    functionally required.
> 
> 5) Value nodes aren't emitted without a {Value ...} anymore. I changed
>    this when I expanded the WRITE/READ tests, and encountered failures
>    because the old encoding is not entirely roundtrip safe
>    (e.g. -INT32_MIN will be parsed as a float at raw parse time, but
>    after write/read, it'll be parsed as an integer). While that could be
>    fixed in other ways (e.g. by emitting a trailing . for all floats), I
>    also found it to be clearer this way - Value nodes are otherwise
>    indistinguishable from raw strings, raw numbers etc, which is not
>    great.
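
To illustrate point 2, here is a minimal Python sketch of the two lookup
strategies, assuming a metadata array indexed by node-type id. The table
contents, NODE_TABLE, SORTED_NAMES and both function names are made up for
illustration; only the ids 18, 35 and 37 come from the examples above.

```python
import bisect

# Hypothetical metadata array indexed by node-type id (assumed layout).
NODE_TABLE = [None] * 38
NODE_TABLE[18] = "Scan"
NODE_TABLE[35] = "Join"
NODE_TABLE[37] = "MergeJoin"

# Old-style lookup data: uppercase names kept sorted for binary search.
SORTED_NAMES = sorted(["JOIN", "MERGEJOIN", "SCAN"])


def lookup_by_id(header):
    """Resolve a '{Name id ' header with one array index and one name recheck."""
    name, id_text = header.lstrip("{").split()[:2]
    type_id = int(id_text)
    if not 0 <= type_id < len(NODE_TABLE) or NODE_TABLE[type_id] != name:
        raise ValueError(f"node name/id mismatch in {header!r}")
    return type_id


def lookup_by_name(header):
    """Resolve a '{NAME ' header by binary search over the sorted names."""
    name = header.lstrip("{").split()[0]
    pos = bisect.bisect_left(SORTED_NAMES, name)
    if pos == len(SORTED_NAMES) or SORTED_NAMES[pos] != name:
        raise ValueError(f"unknown node type in {header!r}")
    return pos
```

The name recheck is what catches a string written with a different id
assignment: lookup_by_id("{Scan 37 ") raises instead of silently reading
the node as a MergeJoin.
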
> 
> It'd also be easier to now just change the node format to something else.

E.g. to just use JSON. Which'd certainly be a lot easier to delve into,
given the amount of tooling (both on the pg SQL level, and for
commandline / editors / etc).  I don't think it'd be any less
efficient. There'd be a few more = signs, but the lexer is smarter /
faster than the one currently in use for the outfuncs format.  And we'd
just reuse pg_parse_json rather than having a dedicated parser.
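
As a rough illustration of what that could look like, here is a Python
sketch that converts a trimmed version of the MergeJoin example above into
JSON. The JSON layout ("type", "id", "fields") and the function names are
assumptions for illustration, not what the patch emits, and the parser only
handles a toy subset of the node string format (no lists, quoted strings or
escapes).

```python
import json
import re

# Tokens are braces or runs of non-space, non-brace characters.
TOKEN = re.compile(r"[{}]|[^\s{}]+")

# Only booleans are converted; every other scalar stays a string.
SCALARS = {"true": True, "false": False}


def parse_node(tokens, pos=0):
    """Parse one '{Name id :field value ...}' node; return (node, next_pos)."""
    if tokens[pos] != "{":
        raise ValueError("expected '{'")
    node = {"type": tokens[pos + 1], "id": int(tokens[pos + 2]), "fields": {}}
    pos += 3
    while tokens[pos] != "}":
        field = tokens[pos]
        if not field.startswith(":"):
            raise ValueError(f"expected field name, got {field!r}")
        pos += 1
        if tokens[pos] == "{":
            # Nested node, e.g. the inlined Join inside MergeJoin.
            value, pos = parse_node(tokens, pos)
        else:
            value = SCALARS.get(tokens[pos], tokens[pos])
            pos += 1
        node["fields"][field[1:]] = value
    return node, pos + 1


def node_string_to_json(text):
    """Convert a node string in the toy subset above to a JSON document."""
    tokens = TOKEN.findall(text)
    node, end = parse_node(tokens)
    if end != len(tokens):
        raise ValueError("trailing tokens after node")
    return json.dumps(node, indent=2)
```

Calling node_string_to_json("{MergeJoin 37 :join {Join 35 :jointype
JOIN_INNER} :skip_mark_restore true}") gives a document that generic JSON
tools can query, e.g. by following fields -> join -> fields -> jointype.
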

- Andres


