Re: regexp_replace to remove sql comments - Mailing list pgsql-general

From Tom Lane
Subject Re: regexp_replace to remove sql comments
Date
Msg-id 4747.1446074267@sss.pgh.pa.us
In response to Re: regexp_replace to remove sql comments  (Mike <mike@wolman.co.uk>)
List pgsql-general
Mike <mike@wolman.co.uk> writes:
> Thanks, with a bit of moving stuff about I think that's sorted it - in
> case anyone ever needs it:
>    SELECT
>     query,
>       trim(regexp_replace(
>          regexp_replace(
>             regexp_replace(query,'\/\*.+\*\/','','g'),
>          '--[^\r\n]*', ' ', 'g')
>       , '\s+', ' ', 'g')) as q
>     FROM public.pg_stat_statements
>     WHERE dbid IN (SELECT oid FROM pg_database WHERE datname = current_database())

This doesn't look too reliable from here:

1. Doesn't handle multiline /* comments.

2. Does wrong thing if more than one /* comment appears on one line.
(You could improve that by using .*? instead of .+, but then it'd
do the wrong thing with nested /* comments.)

3. Breaks things if either -- or /* appears inside a string literal,
double-quoted identifier, or $$ literal.
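To make points 2 and 3 concrete, here is a small sketch that runs the same patterns over a few one-line inputs. It uses Python's re module as a stand-in for the PostgreSQL regex engine, which is an assumption: the two engines differ in some details (newline handling among them), so only single-line cases are shown. The sample queries and variable names are made up for illustration.

```python
import re

# Python's re is used here as a stand-in for PostgreSQL's regex engine
# (an assumption); the patterns are the ones from the quoted query.
GREEDY = re.compile(r"/\*.+\*/")    # original pattern
LAZY = re.compile(r"/\*.*?\*/")     # the .*? variant from point 2
DASHES = re.compile(r"--[^\r\n]*")  # line-comment pattern

# Hypothetical sample inputs, one per failure mode.
two_comments = "SELECT /* a */ 1, /* b */ 2"
nested = "SELECT /* outer /* inner */ tail */ 1"
literal = "SELECT '--not a comment' AS s"

# Greedy: matches from the first /* to the last */, eating "1," as well.
greedy_result = GREEDY.sub("", two_comments)

# Lazy: handles two separate comments on one line...
lazy_result = LAZY.sub("", two_comments)

# ...but stops at the first */, leaving the tail of a nested comment behind.
nested_result = LAZY.sub("", nested)

# The -- pattern cannot tell that it is inside a string literal.
literal_result = DASHES.sub(" ", literal)

for label, value in [
    ("greedy", greedy_result),
    ("lazy", lazy_result),
    ("nested", nested_result),
    ("literal", literal_result),
]:
    print(f"{label}: {value!r}")
```

Each printed result is either missing real query text or still contains comment debris, depending on the case.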

I'm not at all sure that it's possible to handle this requirement 100%
correctly with regexes; they're unable to do context-sensitive processing.
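For comparison, a context-aware approach would be a small hand-written scanner rather than a regex. The following is only a minimal sketch in Python under stated assumptions, not a complete SQL lexer: the function name strip_sql_comments is hypothetical, it assumes standard-conforming strings (a doubled quote is the only escape, so E'...' backslash escapes are not handled), and it does not check that a $tag$ opener is not glued to the end of an identifier.

```python
import re

# Opening delimiter of a dollar-quoted string: $$ or $tag$.
DOLLAR_TAG = re.compile(r"\$(?:[A-Za-z_][A-Za-z_0-9]*)?\$")


def strip_sql_comments(sql: str) -> str:
    """Replace each SQL comment with one space, leaving quoted text alone.

    Hypothetical helper: assumes doubled quotes are the only escape.
    """
    out = []
    i, n = 0, len(sql)
    while i < n:
        ch = sql[i]
        if ch in ("'", '"'):
            # String literal or quoted identifier: copy through the closing
            # quote, treating a doubled quote as an escaped quote.
            j = i + 1
            while j < n:
                if sql[j] == ch:
                    if j + 1 < n and sql[j + 1] == ch:
                        j += 2
                        continue
                    break
                j += 1
            out.append(sql[i:j + 1])
            i = j + 1
        elif ch == "$" and (m := DOLLAR_TAG.match(sql, i)):
            # Dollar-quoted literal: copy through the matching closing tag.
            tag = m.group(0)
            end = sql.find(tag, m.end())
            stop = n if end == -1 else end + len(tag)
            out.append(sql[i:stop])
            i = stop
        elif sql.startswith("--", i):
            # Line comment: skip up to, but not including, the line break.
            while i < n and sql[i] not in "\r\n":
                i += 1
            out.append(" ")
        elif sql.startswith("/*", i):
            # Block comment: track depth so nested comments close correctly.
            depth, i = 1, i + 2
            while i < n and depth:
                if sql.startswith("/*", i):
                    depth, i = depth + 1, i + 2
                elif sql.startswith("*/", i):
                    depth, i = depth - 1, i + 2
                else:
                    i += 1
            out.append(" ")
        else:
            out.append(ch)
            i += 1
    return "".join(out)
```

The scanner keeps track of which construct it is currently inside, which is the context a single regex pass lacks. Collapsing the leftover whitespace afterwards, as the quoted query does with '\s+', would also alter whitespace inside literals, so that step is left out here.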

But so far as pg_stat_statements is concerned, why would you need to
do this at all?  The duplicate-query elimination it does should be
insensitive to comments already.

            regards, tom lane

