Thread: Postgresql parser

Postgresql parser

From: andurkar
Hello,
Currently I am working on PostgreSQL. I need to study the gram.y and
scan.l parser files, since I want to do some query modification. Can anyone
please help me to understand these files? What should I do? Is there any
documentation available?

Regards,
Aditi.




Re: Postgresql parser

From: Kerem Kat
On Tue, Sep 27, 2011 at 11:44, andurkar <andurkarad10.comp@coep.ac.in> wrote:
> Hello,
> Currently I am working on PostgreSQL. I need to study the gram.y and
> scan.l parser files, since I want to do some query modification. Can anyone
> please help me to understand these files? What should I do? Is there any
> documentation available?
>
> Regards,
> Aditi.
>

What kind of modifications do you want to do?

regards,

Kerem KAT


Re: Postgresql parser

From: Florian Pflug
On Sep 27, 2011, at 10:44, andurkar wrote:
> Currently I am working on PostgreSQL. I need to study the gram.y and
> scan.l parser files, since I want to do some query modification. Can anyone
> please help me to understand these files? What should I do? Is there any
> documentation available?

scan.l defines the lexer, i.e. the algorithm that splits a string (containing
an SQL statement) into a stream of tokens. A token is usually a single word
(i.e., it contains no spaces and is delimited by whitespace), but it can also be
a whole single- or double-quoted string, for example. The lexer is basically
defined in terms of regular expressions which describe the different token types.
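
For a feel of what such a file looks like, here is a tiny, made-up flex input
in the same spirit as scan.l (all rules and token names below are invented for
illustration; the real scan.l is far larger and, instead of printing, returns
token codes to the parser):

%option noyywrap
%%
[ \t\n]+                  ;   /* ignore whitespace between tokens */
[0-9]+                    { printf("ICONST  %s\n", yytext); }
[A-Za-z_][A-Za-z0-9_]*    { printf("IDENT   %s\n", yytext); }
'[^']*'                   { printf("SCONST  %s\n", yytext); }
.                         { printf("OTHER   %s\n", yytext); }
%%
int main(void) { yylex(); return 0; }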

gram.y defines the grammar (the syntactical structure) of SQL statements,
using the tokens generated by the lexer as basic building blocks. The grammar
is defined in BNF notation. BNF resembles regular expressions but works
on the level of tokens, not characters. Also, patterns (called rules or productions
in BNF) are named and may be recursive, i.e. they may use themselves as sub-patterns.
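
Purely as a sketch (this is nothing like the real gram.y, which has hundreds
of productions), here is a minimal bison grammar with one named, recursive
production; the hand-written toy yylex below merely stands in for the
flex-generated scanner:

%{
#include <stdio.h>
#include <ctype.h>
int yylex(void);
void yyerror(const char *s) { fprintf(stderr, "parse error: %s\n", s); }
%}

%token NUM

%%
/* A named, recursive production: a comma-separated list of numbers. */
list:
      NUM               { printf("first item\n"); }
    | list ',' NUM      { printf("another item\n"); }
    ;
%%

/* Toy lexer; in PostgreSQL this job is done by the scanner built from scan.l. */
int yylex(void)
{
    int c = getchar();

    while (c == ' ' || c == '\t' || c == '\n')
        c = getchar();
    if (c == EOF)
        return 0;                     /* tell the parser the input has ended */
    if (isdigit(c))
    {
        while (isdigit(c = getchar()))
            ;
        ungetc(c, stdin);
        return NUM;
    }
    return c;                         /* single-character token, e.g. ',' */
}

int main(void) { return yyparse(); }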

The actual lexer is generated from scan.l by a tool called flex. You can find
the manual at http://flex.sourceforge.net/manual/

The actual parser is generated from gram.y by a tool called bison. You can find
the manual at http://www.gnu.org/s/bison/.
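
Assuming you saved the two toy files above as toy.l and toy.y (names chosen
only for this sketch), the generation step that the build system performs for
scan.l and gram.y looks roughly like this:

flex toy.l                  # writes lex.yy.c
cc lex.yy.c -o toylex
bison toy.y                 # writes toy.tab.c
cc toy.tab.c -o toyparse
echo "1, 22, 333" | ./toyparse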

Beware, though, that you'll have a rather steep learning curve ahead of you
if you've never used flex or bison before.

best regards,
Florian Pflug



Re: Postgresql parser

From: Alvaro Herrera
Excerpts from Florian Pflug's message of Tue Sep 27 08:28:00 -0300 2011:
> On Sep 27, 2011, at 10:44, andurkar wrote:
> > Currently I am working on PostgreSQL. I need to study the gram.y and
> > scan.l parser files, since I want to do some query modification. Can anyone
> > please help me to understand these files? What should I do? Is there any
> > documentation available?
> 
> scan.l defines the lexer, i.e. the algorithm that splits a string (containing
> an SQL statement) into a stream of tokens. A token is usually a single word
> (i.e., it contains no spaces and is delimited by whitespace), but it can also be
> a whole single- or double-quoted string, for example. The lexer is basically
> defined in terms of regular expressions which describe the different token types.

This seemed like a good answer, so I added it to the developer's FAQ:

http://wiki.postgresql.org/wiki/Developer_FAQ#I_need_to_do_some_changes_to_query_parsing._Can_you_succintly_explain_the_parser_files.3F

Feel free to edit.

-- 
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support