Re: remaining sql/json patches - Mailing list pgsql-hackers

From Andres Freund
Subject Re: remaining sql/json patches
Date
Msg-id 20231122193848.vnhi4snatjdgq7uk@awork3.anarazel.de
In response to Re: remaining sql/json patches  (Amit Langote <amitlangote09@gmail.com>)
Responses Re: remaining sql/json patches
Re: remaining sql/json patches
List pgsql-hackers
Hi,

On 2023-11-21 12:52:35 +0900, Amit Langote wrote:
> version     gram.o text bytes  %change  gram.c bytes  %change
>
> 9.6         534010             -        2108984       -
> 10          582554             9.09     2258313       7.08
> 11          584596             0.35     2313475       2.44
> 12          590957             1.08     2341564       1.21
> 13          590381            -0.09     2357327       0.67
> 14          600707             1.74     2428841       3.03
> 15          633180             5.40     2495364       2.73
> 16          653464             3.20     2575269       3.20
> 17-sqljson  672800             2.95     2709422       3.97
>
> So if we put SQL/JSON (including JSON_TABLE()) into 17, we end up with a
> gram.o 2.95% larger than v16, which granted is a somewhat larger bump,
> though also smaller than with some recent releases.

I think it's ok to increase the size if the increases are necessary - but I
also think we've been a bit careless at times, and that that has made the
parser slower.  There's probably also some "infrastructure" work we could do
to combat some of the growth.

I know I triggered the use of the .c bytes and text size, but it'd probably
be more sensible to look at the size of the important tables generated by
bison.  I think the most relevant defines are:

#define YYLAST   117115
#define YYNTOKENS  521
#define YYNNTS  707
#define YYNRULES  3300
#define YYNSTATES  6255
#define YYMAXUTOK   758


I think a lot of the reason we end up with such a big "state transition" space
is that a single addition to e.g. col_name_keyword or unreserved_keyword
increases the state space substantially, because it adds new transitions to so
many places. We're in quadratic territory, I think.  We might be able to do
some lexer hackery to avoid that, but not sure.

Greetings,

Andres Freund


