Re: [RFC] nodeToString format and exporting the SQL parser - Mailing list pgsql-hackers

From Jehan-Guillaume (ioguix) de Rorthais
Subject Re: [RFC] nodeToString format and exporting the SQL parser
Date
Msg-id 4BCF389A.2090308@free.fr
In response to Re: [RFC] nodeToString format and exporting the SQL parser  (David Fetter <david@fetter.org>)
Responses Re: [RFC] nodeToString format and exporting the SQL parser  (Pavel Stehule <pavel.stehule@gmail.com>)
Re: [RFC] nodeToString format and exporting the SQL parser  (Bruce Momjian <bruce@momjian.us>)
List pgsql-hackers

On 04/04/2010 18:10, David Fetter wrote:
> On Sat, Apr 03, 2010 at 03:17:30PM +0200, Markus Schiltknecht wrote:
>> Hi,
>>
>> Michael Tharp wrote:
>>> I have been spending a little time making the internal SQL parser
>>> available to clients via a C-language SQL function.
>>
>> This sounds very much like one of the Cluster Features:
>> http://wiki.postgresql.org/wiki/ClusterFeatures#API_into_the_Parser_.2F_Parser_as_an_independent_module
>>
>> Is this what you (or David) have in mind?
> 
> I'm not a fan of statement-based replication of any description.  The
> use cases I have in mind involve things like known-correct syntax
> highlighting in text editors.

The point here is not to expose the internal data structure, but to
deliver a tokenized version of the given SQL script.

There are actually many different use cases for external projects:
 - syntax highlighting
 - rewriting a query with proper indentation
 - replication
 - properly splitting queries from a script
 - determining the type of a query (SELECT? UPDATE/DELETE? DDL?)
 - checking the validity of a query before sending it
 - ...
 

In addition to PgPool's needs, I can see 3 or 4 direct use cases for
pgAdmin and phpPgAdmin.

So it seems to me that having the parser code in a shared library would
be very useful for external C projects, which can link to it. However,
it would be useless for non-C projects that cannot use it directly but
are connected to a PostgreSQL backend anyway (phpPgAdmin, for instance).

What about having a new SQL command like TOKENIZE? It would act somewhat
like EXPLAIN, but return a tokenized version of the given SQL script.
Like EXPLAIN, it could speak XML, YAML, JSON, you name it...

Each token could have:
 - a type ('identifier', 'string', 'sql command', 'sql keyword',
   'variable'...)
 - the start position in the string
 - the value
 - the line number
 - ...
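
To make the token shape concrete, here is a rough Python sketch of a
tokenizer that emits those four fields. It is only an illustration under
loud assumptions: the names Token, COMMANDS, KEYWORDS and tokenize are
hypothetical, the keyword sets are tiny, and the regular expression is
nowhere near PostgreSQL's real lexer. A real TOKENIZE command would reuse
the backend's own scanner instead.

```python
import re
from typing import Iterator, NamedTuple


class Token(NamedTuple):
    type: str   # e.g. 'SQL_COMMAND', 'IDENTIFIER', 'CONSTANT'
    pos: int    # 1-based offset of the token in the script
    value: str  # token text exactly as written
    line: int   # 1-based line number


# Hypothetical, deliberately tiny word lists (not the real SQL grammar).
COMMANDS = {"SELECT", "INSERT", "UPDATE", "DELETE"}
KEYWORDS = {"SET", "FROM", "WHERE", "INTO", "VALUES"}

TOKEN_RE = re.compile(r"""
    (?P<SPACE>\s+)
  | (?P<STRING>'(?:[^']|'')*')
  | (?P<QUOTED>"(?:[^"]|"")*")
  | (?P<NUMBER>\d+(?:\.\d+)?)
  | (?P<WORD>[A-Za-z_][A-Za-z_0-9]*)
  | (?P<DELIMITER>;)
  | (?P<OPERATOR>[-+*/<>=!]+)
  | (?P<OTHER>.)
""", re.VERBOSE)


def tokenize(script: str) -> Iterator[Token]:
    """Yield one Token per lexical element, skipping whitespace."""
    for m in TOKEN_RE.finditer(script):
        kind, text = m.lastgroup, m.group()
        if kind == "SPACE":
            continue
        if kind == "WORD":
            upper = text.upper()
            if upper in COMMANDS:
                kind = "SQL_COMMAND"
            elif upper in KEYWORDS:
                kind = "SQL_KEYWORD"
            else:
                kind = "IDENTIFIER"
        elif kind == "QUOTED":
            kind = "IDENTIFIER"
        elif kind in ("STRING", "NUMBER"):
            kind = "CONSTANT"
        # Line number = newlines seen before the token, plus one.
        line = script.count("\n", 0, m.start()) + 1
        yield Token(kind, m.start() + 1, text, line)


if __name__ == "__main__":
    for tok in tokenize('SELECT 1;\nUPDATE test SET "a"=2;'):
        print(tok)
```

Positions and line numbers are 1-based here simply to match the kind of
output a psql user would expect; that choice is an assumption too.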

A simple example of a tokenizer is the php one: http://fr.php.net/token_get_all

And here is a basic example, returning pseudo rows:

=> TOKENIZE $script$
   SELECT 1;
   UPDATE test SET "a"=2;
$script$;
    type     | pos |  value   | line
-------------+-----+----------+------
 SQL_COMMAND | 1   | 'SELECT' |   1
 CONSTANT    | 8   | '1'      |   1
 DELIMITER   | 9   | ';'      |   1
 SQL_COMMAND | 11  | 'UPDATE' |   2
 IDENTIFIER  | 18  | 'test'   |   2
 SQL_KEYWORD | 23  | 'SET'    |   2
 IDENTIFIER  | 27  | '"a"'    |   2
 OPERATOR    | 30  | '='      |   2
 CONSTANT    | 31  | '2'      |   2
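
As an illustration of how a client such as phpPgAdmin might consume rows
like these, here is a small Python sketch covering two of the use cases
listed earlier: splitting a script into statements and finding the type
of each one. Everything in it is an assumption for the sake of the
example: the (type, pos, value, line) tuple layout, the helper names
split_statements and statement_kind, and the idea that the first
SQL_COMMAND token identifies the statement.

```python
from typing import Iterable, List, Tuple

# Assumed row layout, mirroring the pseudo rows: (type, pos, value, line)
Row = Tuple[str, int, str, int]


def split_statements(rows: Iterable[Row]) -> List[List[Row]]:
    """Group token rows into statements, cutting at DELIMITER tokens."""
    statements: List[List[Row]] = []
    current: List[Row] = []
    for row in rows:
        if row[0] == "DELIMITER":
            if current:
                statements.append(current)
            current = []
        else:
            current.append(row)
    if current:  # trailing statement without a final ';'
        statements.append(current)
    return statements


def statement_kind(statement: List[Row]) -> str:
    """Return the first SQL_COMMAND value, or 'UNKNOWN' if there is none."""
    for type_, _pos, value, _line in statement:
        if type_ == "SQL_COMMAND":
            return value.upper()
    return "UNKNOWN"


rows: List[Row] = [
    ("SQL_COMMAND", 1, "SELECT", 1),
    ("CONSTANT", 8, "1", 1),
    ("DELIMITER", 9, ";", 1),
    ("SQL_COMMAND", 11, "UPDATE", 2),
    ("IDENTIFIER", 18, "test", 2),
    ("SQL_KEYWORD", 23, "SET", 2),
    ("IDENTIFIER", 27, '"a"', 2),
    ("OPERATOR", 30, "=", 2),
    ("CONSTANT", 31, "2", 2),
]

if __name__ == "__main__":
    for stmt in split_statements(rows):
        print(statement_kind(stmt), "starting at position", stmt[0][1])
```

Because the splitting works on tokens rather than on raw text, a ';'
inside a string constant or a dollar-quoted body would not cut a
statement in two, which is the main reason to prefer a server-side
tokenizer over client-side guessing.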
 

> 
> Cheers,
> David.

As a phpPgAdmin dev, I have been thinking about this subject for a long
time. I am interested in trying to write such a patch after discussing
it, if you think it is doable.

-- 
JGuillaume (ioguix) de Rorthais
http://www.dalibo.com

