Thread: WITH clause

WITH clause

From
David Fetter
Date:
Kind people,

I'm looking to the SQL WITH clause as a way to get better regex
support in PostgreSQL.  I've been chatting a little bit about this,
and here's an idea for a behavior.  Implementation details TBD.

WITH res = match (x.foo, '([0-9]+)x([0-9]+)')
SELECT *
FROM x
WHERE y = res[2]
OR    y = res[3];

Here res[1] would be the whole pattern match, res[2] & res[3] would be
the set of digits before x and the set of digits after x,
respectively.

Ideally, there could also be (optionally) res[pre] and res[post] which
would refer to the stuff in foo that preceds the match and stuff in
foo that follows the match, respectively.

Perl weenies may think of $&, $1, ... $n, $` and $'.

Comments, ideas, brickbats, and "you're on crack" comments welcome. :)

Cheers,
D
-- 
David Fetter david@fetter.org http://fetter.org/
phone: +1 510 893 6100    cell: +1 415 235 3778


Re: WITH clause

From
Peter Eisentraut
Date:
David Fetter wrote:
> I'm looking to the SQL WITH clause as a way to get better regex
> support in PostgreSQL.  I've been chatting a little bit about this,
> and here's an idea for a behavior.  Implementation details TBD.
>
> WITH res = match (x.foo, '([0-9]+)x([0-9]+)')
> SELECT *
> FROM x
> WHERE y = res[2]
> OR    y = res[3];

Two questions:

1. What does this have to do with the SQL WITH clause?

2. How does the generalize to arbitrary user-defined operators?



Re: WITH clause

From
Tom Lane
Date:
David Fetter <david@fetter.org> writes:
> I'm looking to the SQL WITH clause as a way to get better regex
> support in PostgreSQL.  I've been chatting a little bit about this,
> and here's an idea for a behavior.  Implementation details TBD.

> WITH res = match (x.foo, '([0-9]+)x([0-9]+)')
> SELECT *
> FROM x
> WHERE y = res[2]
> OR    y = res[3];

This doesn't seem to be even remotely compatible with SQL99's idea
of WITH.
        regards, tom lane


Re: WITH clause

From
David Fetter
Date:
On Fri, Dec 12, 2003 at 07:47:26PM +0100, Peter Eisentraut wrote:
> David Fetter wrote:
> > I'm looking to the SQL WITH clause as a way to get better regex
> > support in PostgreSQL.  I've been chatting a little bit about
> > this, and here's an idea for a behavior.  Implementation details
> > TBD.

> > WITH res = match (x.foo, '([0-9]+)x([0-9]+)')
> > SELECT *
> > FROM x
> > WHERE y = res[2]
> > OR    y = res[3];

> Two questions:
> 
> 1. What does this have to do with the SQL WITH clause?

It seems that the SQL WITH clause is a way to get what I'd think of as
"statement-level environment variables" available to the SQL statement.

> 2. How does the generalize to arbitrary user-defined operators?

I guess that would depend on whether they return arrays or singletons.

Even better, they'd be able to take as arguments and return hashes aka
associative arrays (cf. the TODO for plpgsql named parameters). I'm
not sure what the SQL spec has to say about that last, tho.

Cheers,
D
-- 
David Fetter david@fetter.org http://fetter.org/
phone: +1 510 893 6100    cell: +1 415 235 3778


Re: WITH clause

From
Alvaro Herrera Munoz
Date:
On Fri, Dec 12, 2003 at 10:13:56AM -0800, David Fetter wrote:

> I'm looking to the SQL WITH clause as a way to get better regex
> support in PostgreSQL.  I've been chatting a little bit about this,
> and here's an idea for a behavior.  Implementation details TBD.

I think you could be rather looking at keeping the regmatch_t * from
RE_compile_and_execute somewhere in the TopTransactionContext, and
create accessor functions to it.  You could do

BEGIN;
SELECT ... x ~ '([0-9]+)x([0-9]+)';
SELECT regex_parens(1), regex_parens(2);
COMMIT;

Etc, you get the idea.  Of course you could only access the captured
expressions from the last regex, and only within the same transaction
block.

-- 
Alvaro Herrera (<alvherre[@]dcc.uchile.cl>)
Maybe there's lots of data loss but the records of data loss are also lost.
(Lincoln Yeoh)


Re: WITH clause

From
david@fetter.org (David Fetter)
Date:
In article <20031212190754.GA11548@dcc.uchile.cl> you wrote:
> On Fri, Dec 12, 2003 at 10:13:56AM -0800, David Fetter wrote:
> 
>> I'm looking to the SQL WITH clause as a way to get better regex
>> support in PostgreSQL.  I've been chatting a little bit about this,
>> and here's an idea for a behavior.  Implementation details TBD.

> I think you could be rather looking at keeping the regmatch_t * from
> RE_compile_and_execute somewhere in the TopTransactionContext, and
> create accessor functions to it.  You could do

> BEGIN;
> SELECT ... x ~ '([0-9]+)x([0-9]+)';
> SELECT regex_parens(1), regex_parens(2);
> COMMIT;

> Etc, you get the idea.  Of course you could only access the captured
> expressions from the last regex, and only within the same
> transaction block.

A couple of little issues with this:

1.  It's assuming there will be exactly one regular expression with
associated atoms.  This is probably fixable by making the function
take two arguments, and it would be really, really nice if these were
nameable.  F'rex, regex_matches(regex => 1, match => 2) or
regex_matches(regex => 'the_one_i_wanted', match => 'post')

2.  It doesn't allow for the regex atoms to be used in the SQL
statement.

BTW, I should have said "row-level environment variables" rather than
"statement-level environment variables," although some of the latter
would be nice, too :)

Cheers,
D
-- 
David Fetter david@fetter.org http://fetter.org/
phone: +1 510 893 6100    cell: +1 415 235 3778

Sophomoric (adj.)  The itch to be unique.


Re: WITH clause

From
Christopher Browne
Date:
david@fetter.org (David Fetter) wrote:
> I'm looking to the SQL WITH clause as a way to get better regex
> support in PostgreSQL.  I've been chatting a little bit about this,
> and here's an idea for a behavior.  Implementation details TBD.
>
> WITH res = match (x.foo, '([0-9]+)x([0-9]+)')
> SELECT *
> FROM x
> WHERE y = res[2]
> OR    y = res[3];

So you're proposing what amounts to the LET function in Lisp:
 (let* ((homepath (user-homedir-pathname))        (maildir (make-pathname :directory (append
                (pathname-directory                                              homepath)
            '("Mail"))))        (mailhome (merge-pathnames maildir homepath)))   (do-something-with-mail mailhome)
(do-something-with-homehomepath))
 

Or the ML let structure...

let a = 1 and b = 2 in a + b;;
- : int = 3

I think this could be a "simply smashing" idea, allowing values to get
declared once, up front, offering the further potential for them to be
reused.

Unfortunately, this doesn't seem to fit with the way WITH is defined
in SQL.

And while the LISP presentation may have "way too many parentheses,"
the ML syntax, being pretty much orthogonal to what SQL currently
offers, strikes me as being preferable.
-- 
select 'cbbrowne' || '@' || 'ntlug.org';
http://www.ntlug.org/~cbbrowne/unix.html
:FATAL ERROR -- ILLEGAL ERROR


Re: WITH clause

From
Dennis Bjorklund
Date:
On Sat, 13 Dec 2003, Christopher Browne wrote:

> Unfortunately, this doesn't seem to fit with the way WITH is defined
> in SQL.

How is the WITH construct defined in SQL?

-- 
/Dennis



Re: WITH clause

From
Hannu Krosing
Date:
David Fetter kirjutas R, 12.12.2003 kell 20:13:
> Kind people,
> 
> I'm looking to the SQL WITH clause as a way to get better regex
> support in PostgreSQL.  I've been chatting a little bit about this,
> and here's an idea for a behavior.  Implementation details TBD.
> 
> WITH res = match (x.foo, '([0-9]+)x([0-9]+)')
> SELECT *
> FROM x
> WHERE y = res[2]
> OR    y = res[3];

why not use current standard syntax

SELECT x.* FROM x,     (select match (x.foo, '([0-9]+)x([0-9]+)')       from x innerx      where innerx.pk = x.pk
)as res
 
HAVING y = get_match_group(res, 2)   OR y = get_match_group(res, 3)
;

with functions match(str,regex) which returns a match object
and get_match_group which extracts matched groups from it.

---------------
Hannu



Re: WITH clause

From
David Fetter
Date:
On Sat, Dec 13, 2003 at 10:58:59PM +0200, Hannu Krosing wrote:
> David Fetter kirjutas R, 12.12.2003 kell 20:13:
> > Kind people,

> > I'm looking to the SQL WITH clause as a way to get better regex
> > support in PostgreSQL.  I've been chatting a little bit about
> > this, and here's an idea for a behavior.  Implementation details
> > TBD.

> > WITH res = match (x.foo, '([0-9]+)x([0-9]+)')
> > SELECT *
> > FROM x
> > WHERE y = res[2]
> > OR    y = res[3];
> 
> why not use current standard syntax
> 
> SELECT x.*
>   FROM x, 
>      (select match (x.foo, '([0-9]+)x([0-9]+)')
>         from x innerx
>        where innerx.pk = x.pk
>        ) as res
> HAVING y = get_match_group(res, 2)
>     OR y = get_match_group(res, 3)
> ;
> 
> with functions match(str,regex) which returns a match object and
> get_match_group which extracts matched groups from it.

Hannu, excellent idea!  This is just the kind of discussion I was
trying to start :)

Cheers,
D
-- 
David Fetter david@fetter.org http://fetter.org/
phone: +1 510 893 6100    cell: +1 415 235 3778


Re: WITH clause

From
Hannu Krosing
Date:
David Fetter kirjutas L, 13.12.2003 kell 23:17:
> On Sat, Dec 13, 2003 at 10:58:59PM +0200, Hannu Krosing wrote:
> > David Fetter kirjutas R, 12.12.2003 kell 20:13:
> > > Kind people,
> 
> > > I'm looking to the SQL WITH clause as a way to get better regex
> > > support in PostgreSQL.  I've been chatting a little bit about
> > > this, and here's an idea for a behavior.  Implementation details
> > > TBD.
> 
> > > WITH res = match (x.foo, '([0-9]+)x([0-9]+)')
> > > SELECT *
> > > FROM x
> > > WHERE y = res[2]
> > > OR    y = res[3];
> > 
> > why not use current standard syntax
> > 
> > SELECT x.*
> >   FROM x, 
> >      (select match (x.foo, '([0-9]+)x([0-9]+)')
> >         from x innerx
> >        where innerx.pk = x.pk
> >        ) as res
> > HAVING y = get_match_group(res, 2)
> >     OR y = get_match_group(res, 3)
> > ;
> > 
> > with functions match(str,regex) which returns a match object and
> > get_match_group which extracts matched groups from it.
> 
> Hannu, excellent idea! 

As I understand it the SQL Standard WITH can be rewritten to SELECT in
FROM if the query is not recursive and does not reference other
subqueries. That's why I proposed the above syntax (we don't yet support
WITH). 

>  This is just the kind of discussion I was
> trying to start :)

Thanks ;)

Actually I notice that WHERE could be used there instead of HAVING.
Should be faster too.

------------
Hannu



Re: WITH clause

From
Greg Stark
Date:
Hannu Krosing <hannu@tm.ee> writes:

> SELECT x.*
>   FROM x, 
>      (select match (x.foo, '([0-9]+)x([0-9]+)')
>         from x innerx
>        where innerx.pk = x.pk
>        ) as res
> HAVING y = get_match_group(res, 2)
>     OR y = get_match_group(res, 3)
> ;

Well you don't need to go fetch from the table an extra time. Presumably the
data will be cached but it's still a lot of extra work to process the data
twice.

You could just do

select *  from (       select x.*,               (select match(foo, '([0-9]+)x([0-9]+)') as res      )where y = res[2]
or y = res[3]
 



But what Hannu's saying is that the SQL Standard WITH is precisely syntactic
sugar for subqueries used like above.

It sounds like WITH is to subqueries as let is to lambda....


-- 
greg