Re: hint infrastructure setup (v3) - Mailing list pgsql-patches
From | Bruce Momjian |
---|---|
Subject | Re: hint infrastructure setup (v3) |
Date | |
Msg-id | 200404051652.i35GqAN05855@candle.pha.pa.us Whole thread Raw |
In response to | Re: hint infrastructure setup (v3) (Fabien COELHO <coelho@cri.ensmp.fr>) |
List | pgsql-patches |
As another teaching aid idea, is there something that would auto-format queries to they are more readable? --------------------------------------------------------------------------- Fabien COELHO wrote: > > > > Dear Tom, > > > No, just redefine "yyerror" as a macro that passes additional parameters. > > Ok! That's just the kind of the hack-hook I was looking for!! > > > > > The terminals are not available, they must be kept separatly if you > > > want them. This can be done in yylex(). > > > > We don't need them. Any already-shifted tokens are to the left of where > > the error is, no? > > Yes and no. I agree that you don't need them for computing the follow set. > However it would help a lot to generate nicer hints. I'm looking for help > the user, and the follow set is just part of the help. > > Think of typical "SELECT foo, bla, FROM xxx" (it looks stupid on a one > line query, but not so on a very large query) which is quite easier to > detect and hint about because of FROM is just after a comma. > > The rule part is also a real issue. There is no easy way to translate > states into rules, to know whether we're somewhere in a "ShowStmt" or > "AlterTableStmt"... > > If you're after a comma in state 1232, should you just say IDENT... I'd > rather say "user name" or "table name" if I can... Otherwise the hint > stays at a low parser level. Good hints requires to know about the > current context, and it is not easy to get that deep in the automaton. > > > > Yeah, I had come to the same conclusion --- state moves made without > > consuming input would need to be backed out if we wanted to report the > > complete follow set. I haven't yet looked to see how to do that. > > Well, following you're previous suggestion, one can redefine the YYLEX > macro to save the current state each time a token is required. > > > > > As you noted, for things like "SHOW 123", the follow set basically > > > includes all keywords although you can have SHOW ALL or SHOW name. > > > So, as you suggested, you can either say "ident" as a simplification, but > > > > You're ignoring the distinction between classes of keywords. I would > > not recommend treating reserved_keywords as a subclass of ident. > > Ok, I agree that it can help in this very case a little bit here because > ALL is reserved. I did not make this distinction when I played with bison. > > > > > (5) anything that can be done would be hardwired to one version of bison. > > > > I think this argument is completely without merit. > > Possible;-) > > However I don't like writing hacks that depends on other people future > behavior. > > > > > (b) write a new "recursive descendant" parser, and drop "gram.y" > > > > Been there, done that, not impressed with the idea. There's a reason > > why people invented parser generators... > > Sure. It was just a list of options;-) > > > > > As a side effect of my inspection is that the automaton generated by bison > > > could be simplified if a different tradeoff between the lexer, the parser > > > and the post-processing would be chosen. Namelly, all tokens that are > > > just identifiers should be dropped and processed differently. > > > > We are not going to reserve the keywords that are presently unreserved. > > Reserving keywords would help, but I would not think it could be accepted, > because it is not SQL philosophy. SQL is about 30 years also, as old > as yacc ideas... but they did not care at the time;-) When you look at > the syntax, it was designed with a recursive parser in mind. > > Part of my comment was to explain "why" the generated automaton is large. > SQL is a small language, but it has a big automaton in postgresql, and > this is IMHO the explanation. > > > > If you can think of a reasonable way to stop treating them as separate > > tokens inside the grammar without altering the user-visible behavior, > > I'm certainly interested. > > I was thinking about type names especially, and maybe others. > > I join a small proof-of-concept patch to drop some tokens out of the > parser. As a results, 6 tokens, 6 rules and 9 states are removed, > and the automaton table is reduced by 438 entries (1.5%). Not too bad;-) > I think others could be dealt with similarily, or with more work. > > > Thanks for the discussion, > > -- > Fabien. Content-Description: [ Attachment, skipping... ] > > ---------------------------(end of broadcast)--------------------------- > TIP 9: the planner will ignore your desire to choose an index scan if your > joining column's datatypes do not match -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania 19073
pgsql-patches by date: