Thread: Best practice: call an internal postgresql function (e.g. raw_parser) from another C/Rust binary

Best practice: call an internal postgresql function (e.g. raw_parser) from another C/Rust binary

From
Francois-Guillaume Ribreau
Date:
Hello!

(PostgreSQL rocks)

I wonder what is the easiest way to extract and (ab)use the raw_parser function out of postgresql codebase, as a library, so I can use it from my own code in Rust.

My C is rusty (pun intended) so I tried this:

cd src/backend/parser 
gcc -I../../include -fpic -c parser.c

so raw_parser is present (yey):

nm parser.o | grep raw_parser
0000000000000000 T _raw_parser

but I missed lots of other implementation files (doh) ^^

 nm parser.o | grep U        
                 U _ScanKeywordTokens
                 U _ScanKeywords
                 U _base_yyparse
                 U _cancel_scanner_errposition_callback
                 U _core_yylex
                 U _errcode
                 U _errfinish
                 U _errhint
                 U _errmsg
                 U _errmsg_internal
                 U _errstart
                 U _isxdigit
                 U _palloc
                 U _parser_init
                 U _pg_unicode_to_server
                 U _repalloc
                 U _scanner_errposition
                 U _scanner_finish
                 U _scanner_init
                 U _scanner_isspace
                 U _scanner_yyerror
                 U _setup_scanner_errposition_callback
                 U _strlen
                 U _truncate_identifier

I started to include them one by one but the task is tedious and I'm pretty sure there is an easier way :)
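
A minimal sketch of the filtering step above, assuming the usual `nm` text layout where an undefined symbol has no address and appears as `U name`. Matching on the type column avoids the false positives a plain `grep U` gives for any defined symbol whose name merely contains a capital U. The function name `undefined_symbols` is hypothetical; `nm -u` should give a similar listing directly.

```rust
/// Returns the names of undefined symbols found in `nm` output, that is,
/// lines made of exactly two fields where the first field is `U`.
fn undefined_symbols(nm_output: &str) -> Vec<&str> {
    nm_output
        .lines()
        .filter_map(|line| {
            let fields: Vec<&str> = line.split_whitespace().collect();
            match fields.as_slice() {
                // Undefined symbols carry no address column: "U name".
                ["U", name] => Some(*name),
                // Defined symbols ("addr T name") and anything else are skipped.
                _ => None,
            }
        })
        .collect()
}

fn main() {
    // Sample text shaped like the listing above (not real tool output).
    let sample = "0000000000000000 T _raw_parser\n                 U _palloc\n                 U _errstart\n";
    for name in undefined_symbols(sample) {
        println!("{}", name);
    }
}
```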

So I took a look at the various makefiles in the contrib/ folder, but I'm not sure adapting them will do what I want. I do not want to build an extension for PostgreSQL; I want to generate a .a library that I can access from Rust through FFI.
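
For reference, the Rust side of such an FFI call could look roughly like the sketch below. Since raw_parser cannot be linked on its own, it binds `strlen` from the C standard library instead (one of the undefined symbols listed above). This assumes a platform where the C library is linked by default. For a custom static library, a `#[link(name = "...", kind = "static")]` attribute on the `extern` block would name it. The wrapper name `c_strlen` is hypothetical.

```rust
use std::ffi::CString;
use std::os::raw::c_char;

// Declaration of a function implemented in C. `strlen` lives in the C
// standard library, so no extra link attribute is needed here.
extern "C" {
    fn strlen(s: *const c_char) -> usize;
}

/// Safe wrapper: builds a NUL-terminated copy of `s` and passes it to C.
fn c_strlen(s: &str) -> usize {
    let c_string = CString::new(s).expect("input must not contain interior NUL bytes");
    // SAFETY: `c_string` is a valid NUL-terminated buffer that outlives the call.
    unsafe { strlen(c_string.as_ptr()) }
}

fn main() {
    println!("{}", c_strlen("SELECT 1"));
}
```

The same pattern (an `extern "C"` declaration plus a small safe wrapper that owns the string conversion) would apply to a parser entry point exported from a .a file, with the added work of mirroring any C structs it returns.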

Has anyone tried this before?

PS: If you are interested, here is the repository: https://github.com/FGRibreau/poc-pgsql-parser



Re: Best practice: call an internal postgresql function (e.g. raw_parser) from another C/Rust binary

From
Tom Lane

Francois-Guillaume Ribreau <postgresql@fgribreau.com> writes:
> I wonder what is the easiest way to extract and (ab)use the raw_parser
> function out of postgresql codebase, as a library, so I can use it from my
> own code in Rust.

You're not the first to have thought of that.  I'm failing to locate
any relevant threads in our archives, but I distinctly recall having
heard of somebody who'd made a standalone version of our lexer+grammar.
You might try searching on github.

(I make no warranties about how up-to-date any such project may be.)

            regards, tom lane



Re: Best practice: call an internal postgresql function (e.g. raw_parser) from another C/Rust binary

From
Francois-Guillaume Ribreau
Date:
wow thanks, found it (I think) https://github.com/lfittl/libpg_query !

On Tue, Nov 3, 2020 at 10:47 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Francois-Guillaume Ribreau <postgresql@fgribreau.com> writes:
> > I wonder what is the easiest way to extract and (ab)use the raw_parser
> > function out of postgresql codebase, as a library, so I can use it from my
> > own code in Rust.
>
> You're not the first to have thought of that.  I'm failing to locate
> any relevant threads in our archives, but I distinctly recall having
> heard of somebody who'd made a standalone version of our lexer+grammar.
> You might try searching on github.
>
> (I make no warranties about how up-to-date any such project may be.)
>
>                         regards, tom lane



Re: Best practice: call an internal postgresql function (e.g. raw_parser) from another C/Rust binary

Hi all,

> You're not the first to have thought of that.  I'm failing to locate
> any relevant threads in our archives, but I distinctly recall having
> heard of somebody who'd made a standalone version of our lexer+grammar.
> You might try searching on github.

Funnily enough, I was only reading about this yesterday - sometimes
wandering through the interweb has its benefits! :-)

The project is DuckDB https://duckdb.org/.

Specifically, this page: https://duckdb.org/docs/why_duckdb.html#duckdbissimple

"SQL Parser: We use the PostgreSQL parser that was repackaged as a
stand-alone library. The translation to our own parse tree is inspired
by Peloton."

The stand-alone library they use is from here (linked in text above):
https://github.com/lfittl/libpg_query

> (I make no warranties about how up-to-date any such project may be.)

Seems interesting and active - DuckDB's last GitHub update was 18 days
ago! The last update to libpg_query was 3 years ago; however, DuckDB
appears to be maintaining its own fork, available here:

https://github.com/cwida/duckdb/tree/master/third_party/libpg_query
(last update 26 days ago!)


HTH,

Pól...

