Thread: Regex
<p><font face="Arial" size="2">I know this is not a regex forum, however I had great assistance last time and thought I wouldask, since the regex is looking through an SQL statement.</font><p><font face="Arial" size="2">I was trying to parsea SQL statement and get all the tables used. I'm actually doing this in Cold Fusion.</font><p><font face="Arial" size="2">Ihave gone as far as this </font><p><font face="Arial" size="2">(FROM | JOIN ).*(?<=INNER JOIN|LEFT JOIN|LEFTOUTER JOIN|AS|WHERE|ORDER BY)</font><p><font face="Arial" size="2">I know I need to look for FROM or JOIN and getall text within INNER JOIN,LEFT JOIN,LEFT OUTER JOIN,AS,WHERE,ORDER BY... there are various others.</font><p><font face="Arial"size="2">I have tried a forward reference however this does not work, I think it needs to be between the .[^INNERJOIN|LEFT JOIN|LEFT OUT JOIN|AS|WHERE|ORDER BY]* somehow!!!</font><p><font face="Arial" size="2">Basically the regexshould return TABLEA, TABLEB from:</font><p><font face="Arial" size="2">Select * </font><br /><font face="Arial" size="2">fromTABLEA</font><br /><font face="Arial" size="2">Inner jon TABLEB on tableb.columna = tablea.columna</font><p><fontface="Arial" size="2">Cheers,</font><br /> <font face="Arial" size="2">Theo</font><table><tr><tdbgcolor="#ffffff"><font color="#000000">______________________________________________________________________<br/>This email, including attachments,is intended only for the addressee<br />and may be confidential, privileged and subject to copyright. If you<br/>have received this email in error, please advise the sender and delete<br />it. If you are not the intended recipientof this email, you must not<br />use, copy or disclose its content to anyone. You must not copy or <br />communicateto others content that is confidential or subject to <br />copyright, unless you have the consent of the contentowner.<br /></font></td></tr></table>
Theo Galanakis <Theo.Galanakis@lonelyplanet.com.au> writes: > Basically the regex should return TABLEA, TABLEB from: > Select * > from TABLEA > Inner jon TABLEB on tableb.columna = tablea.columna You realize of course that this problem is mathematically impossible? Regexes are less powerful than context-free grammars, and so it is a certainty that there exist legal SQL statements that will fool any regex that you invent for this purpose. If you know that the SQL statements are coming from a query generator that produces only a certain style of SQL code, then you might be able to come up with a solution that works reliably for the output of that query generator. But I wonder if you wouldn't be better off bypassing the parse-and-deparse hacking and tapping directly into the query generator. regards, tom lane
<p><font size="2">Thanks Tom, </font><br /> <font size="2">I attacked the issue another way which appears to work...</font><p> <font size="2">I used :</font><p> <font size="2">explain select * from nodes left join node_nameson node_names.node_id = nodes.node_id</font><p> <font size="2">which returned :</font><p> <font size="2">query</font><br /> <font size="2"> QUERY PLAN </font><br /> <font size="2">1 Merge Right Join (cost=429.16..793.48rows=4510 width=193) </font><br /> <font size="2">2 Merge Cond: ("outer".node_id = "inner".node_id)</font><br /> <font size="2">3 -> Index Scan using node_names_node_id_key on node_names (cost=0.00..278.58rows= 7253 width=110) </font><p> <font size="2">4 -> Sort (cost=429.16..438.89rows=3894 width=83) </font><br /> <font size="2">5 Sort Key: nodes.node_id </font><br /> <font size="2">6 -> Seq Scan on nodes (cost=0.00..196.94 rows=3894 width=83) </font><br /> <br /> <font size="2">and then programatically searched for lines that begin with :</font><p> <font size="2">SeqScan on #table_name#</font><br /> <font size="2">Index Scan using #indexname# on #table_name#</font><p> <font size="2">obtaining the #table_name#</font><p> <font size="2">Being : nodes, node_names</font><p><fontsize="2">Theo</font><p><font size="2">-----Original Message-----</font><br /><font size="2">From:Tom Lane [<a href="mailto:tgl@sss.pgh.pa.us">mailto:tgl@sss.pgh.pa.us</a>] </font><br /><font size="2">Sent:Wednesday, 6 October 2004 1:36 PM</font><br /><font size="2">To: Theo Galanakis</font><br /><font size="2">Cc:Pgsql-Sql@Postgresql. Org</font><br /><font size="2">Subject: Re: [SQL] Regex </font><br /><p><font size="2">TheoGalanakis <Theo.Galanakis@lonelyplanet.com.au> writes:</font><br /><font size="2">> Basically the regexshould return TABLEA, TABLEB from:</font><p><font size="2">> Select *</font><br /><font size="2">> from TABLEA</font><br/><font size="2">> Inner jon TABLEB on tableb.columna = tablea.columna</font><p><font size="2">You realizeof course that this problem is mathematically impossible? Regexes are less powerful than context-free grammars, andso it is a certainty that there exist legal SQL statements that will fool any regex that you invent for this purpose.</font><p><fontsize="2">If you know that the SQL statements are coming from a query generator that produces onlya certain style of SQL code, then you might be able to come up with a solution that works reliably for the output ofthat query generator. But I wonder if you wouldn't be better off bypassing the parse-and-deparse hacking and tapping directlyinto the query generator.</font><p> <font size="2">regards, tom lane</font><table><tr><tdbgcolor="#ffffff"><font color="#000000">______________________________________________________________________<br/>This email, including attachments,is intended only for the addressee<br />and may be confidential, privileged and subject to copyright. If you<br/>have received this email in error, please advise the sender and delete<br />it. If you are not the intended recipientof this email, you must not<br />use, copy or disclose its content to anyone. You must not copy or <br />communicateto others content that is confidential or subject to <br />copyright, unless you have the consent of the contentowner.<br /></font></td></tr></table>