Thread: Regex

Regex

From
Theo Galanakis
Date:
<p><font face="Arial" size="2">I know this is not a regex forum, however I had great assistance last time and thought I
wouldask, since the regex is looking through an SQL statement.</font><p><font face="Arial" size="2">I was trying to
parsea SQL statement and get all the tables used. I'm actually doing this in Cold Fusion.</font><p><font face="Arial"
size="2">Ihave gone as far as this </font><p><font face="Arial" size="2">(FROM | JOIN ).*(?<=INNER JOIN|LEFT
JOIN|LEFTOUTER JOIN|AS|WHERE|ORDER BY)</font><p><font face="Arial" size="2">I know I need to look for FROM or JOIN and
getall text within INNER JOIN,LEFT JOIN,LEFT OUTER JOIN,AS,WHERE,ORDER BY... there are various others.</font><p><font
face="Arial"size="2">I have tried a forward reference however this does not work, I think it needs to be between the
.[^INNERJOIN|LEFT JOIN|LEFT OUT JOIN|AS|WHERE|ORDER BY]* somehow!!!</font><p><font face="Arial" size="2">Basically the
regexshould return TABLEA, TABLEB from:</font><p><font face="Arial" size="2">Select * </font><br /><font face="Arial"
size="2">fromTABLEA</font><br /><font face="Arial" size="2">Inner jon TABLEB on tableb.columna =
tablea.columna</font><p><fontface="Arial" size="2">Cheers,</font><br />        <font face="Arial"
size="2">Theo</font><table><tr><tdbgcolor="#ffffff"><font
color="#000000">______________________________________________________________________<br/>This email, including
attachments,is intended only for the addressee<br />and may be confidential, privileged and subject to copyright. If
you<br/>have received this email in error, please advise the sender and delete<br />it. If you are not the intended
recipientof this email, you must not<br />use, copy or disclose its content to anyone. You must not copy or <br
/>communicateto others content that is confidential or subject to <br />copyright, unless you have the consent of the
contentowner.<br /></font></td></tr></table> 

Re: Regex

From
Tom Lane
Date:
Theo Galanakis <Theo.Galanakis@lonelyplanet.com.au> writes:
> Basically the regex should return TABLEA, TABLEB from:

> Select * 
> from TABLEA
> Inner jon TABLEB on tableb.columna = tablea.columna

You realize of course that this problem is mathematically impossible?
Regexes are less powerful than context-free grammars, and so it is a
certainty that there exist legal SQL statements that will fool any regex
that you invent for this purpose.

If you know that the SQL statements are coming from a query generator
that produces only a certain style of SQL code, then you might be able
to come up with a solution that works reliably for the output of that
query generator.  But I wonder if you wouldn't be better off bypassing
the parse-and-deparse hacking and tapping directly into the query
generator.
        regards, tom lane


Re: Regex

From
Theo Galanakis
Date:
<p><font size="2">Thanks Tom, </font><br />        <font size="2">I attacked the issue another way which appears to
work...</font><p>        <font size="2">I used :</font><p>        <font size="2">explain select * from nodes left join
node_nameson node_names.node_id = nodes.node_id</font><p>        <font size="2">which returned :</font><p>        <font
size="2">query</font><br />        <font size="2">  QUERY PLAN </font><br />        <font size="2">1 Merge Right Join
(cost=429.16..793.48rows=4510 width=193) </font><br />        <font size="2">2 Merge Cond: ("outer".node_id =
"inner".node_id)</font><br />        <font size="2">3 -> Index Scan using node_names_node_id_key on node_names
(cost=0.00..278.58rows=                     7253 width=110) </font><p>        <font size="2">4 -> Sort
(cost=429.16..438.89rows=3894 width=83) </font><br />        <font size="2">5 Sort Key: nodes.node_id </font><br
/>       <font size="2">6 -> Seq Scan on nodes (cost=0.00..196.94 rows=3894 width=83) </font><br />        <br
/>       <font size="2">and then programatically searched for lines that begin with :</font><p>        <font
size="2">SeqScan on #table_name#</font><br />        <font size="2">Index Scan using #indexname# on
#table_name#</font><p>       <font size="2">obtaining the #table_name#</font><p>        <font size="2">Being : nodes,
node_names</font><p><fontsize="2">Theo</font><p><font size="2">-----Original Message-----</font><br /><font
size="2">From:Tom Lane [<a href="mailto:tgl@sss.pgh.pa.us">mailto:tgl@sss.pgh.pa.us</a>] </font><br /><font
size="2">Sent:Wednesday, 6 October 2004 1:36 PM</font><br /><font size="2">To: Theo Galanakis</font><br /><font
size="2">Cc:Pgsql-Sql@Postgresql. Org</font><br /><font size="2">Subject: Re: [SQL] Regex </font><br /><p><font
size="2">TheoGalanakis <Theo.Galanakis@lonelyplanet.com.au> writes:</font><br /><font size="2">> Basically the
regexshould return TABLEA, TABLEB from:</font><p><font size="2">> Select *</font><br /><font size="2">> from
TABLEA</font><br/><font size="2">> Inner jon TABLEB on tableb.columna = tablea.columna</font><p><font size="2">You
realizeof course that this problem is mathematically impossible? Regexes are less powerful than context-free grammars,
andso it is a certainty that there exist legal SQL statements that will fool any regex that you invent for this
purpose.</font><p><fontsize="2">If you know that the SQL statements are coming from a query generator that produces
onlya certain style of SQL code, then you might be able to come up with a solution that works reliably for the output
ofthat query generator.  But I wonder if you wouldn't be better off bypassing the parse-and-deparse hacking and tapping
directlyinto the query generator.</font><p>                        <font size="2">regards, tom
lane</font><table><tr><tdbgcolor="#ffffff"><font
color="#000000">______________________________________________________________________<br/>This email, including
attachments,is intended only for the addressee<br />and may be confidential, privileged and subject to copyright. If
you<br/>have received this email in error, please advise the sender and delete<br />it. If you are not the intended
recipientof this email, you must not<br />use, copy or disclose its content to anyone. You must not copy or <br
/>communicateto others content that is confidential or subject to <br />copyright, unless you have the consent of the
contentowner.<br /></font></td></tr></table>