Thread: algebraic rewriting

algebraic rewriting

From
sophie yang
Date:
From the big picture of PostgreSQL architecture, I
understand that query processing includes parser,
rewriter, optimizer and executor. I am wondering at
which step, PostgreSQL performs algebraic rewriting?
What rules are used for rewriting? Are these rules
hard wired or are they stored in some catalog table?
If I want to check the source codes, which file should
I look at?

Thanks!!

__________________________________________________
Do You Yahoo!?
Yahoo! Tax Center - online filing with TurboTax
http://taxes.yahoo.com/

Re: algebraic rewriting

From
"Josh Berkus"
Date:
Sophie,

You, my dear, are not a novice.  Get off this list!  <grin>

> >From the big picture of PostgreSQL architecture, I
> understand that query processing includes parser,
> rewriter, optimizer and executor. I am wondering at
> which step, PostgreSQL performs algebraic rewriting?
> What rules are used for rewriting? Are these rules
> hard wired or are they stored in some catalog table?
> If I want to check the source codes, which file should
> I look at?

Check out http://developer.postgresql.org/index.php

And then join the pgsql-hackers mailing list.

-Josh Berkus


Re: algebraic rewriting

From
Tom Lane
Date:
sophie yang <yangsophie@yahoo.com> writes:
> rewriter, optimizer and executor. I am wondering at
> which step, PostgreSQL performs algebraic rewriting?

What do you consider "algebraic rewriting"?  The optimizer does a pass of
constant-subexpression simplification, and also some boolean-expression
normalization, but I'm not sure if that's what you're looking for.

            regards, tom lane

Re: algebraic rewriting

From
sophie yang
Date:
What I mean about "algebraic rewriting" is some
rule-based simplification optimization. For example,
Say we have two relations: R(A, B), S(B, C).
Simplification rules include:
1) R join R = R
2) R - R = NULL
3) NULL join R = NULL
4) SELECT[A=a](R join S) = (SELECT[A=a](R)) join S
   e.g.
   (select * from R, S where R.B=S.B and R.A=a) =>
   (select * from R, S where R.A=a and R.B=S.B)
5) PROJECT[A,B](R JOIN S) = R JOIN (PROJECT[B](S))
6) PROJECT[A](PROJECT[A,B](R)) = PROJECT[A](R)
etc.

I am wondering if progreSQL does such simplification
to prune the plan space before estimate costs?

Sophie Yang

--- Tom Lane <tgl@sss.pgh.pa.us> wrote:
> sophie yang <yangsophie@yahoo.com> writes:
> > rewriter, optimizer and executor. I am wondering
> at
> > which step, PostgreSQL performs algebraic
> rewriting?
>
> What do you consider "algebraic rewriting"?  The
> optimizer does a pass of
> constant-subexpression simplification, and also some
> boolean-expression
> normalization, but I'm not sure if that's what
> you're looking for.
>
>             regards, tom lane
>
> ---------------------------(end of
> broadcast)---------------------------
> TIP 5: Have you checked our extensive FAQ?
>
> http://www.postgresql.org/users-lounge/docs/faq.html


__________________________________________________
Do You Yahoo!?
Yahoo! Games - play chess, backgammon, pool and more
http://games.yahoo.com/

Re: algebraic rewriting

From
Tom Lane
Date:
sophie yang <yangsophie@yahoo.com> writes:
> What I mean about "algebraic rewriting" is some
> rule-based simplification optimization. For example,
> Say we have two relations: R(A, B), S(B, C).
> Simplification rules include:
> 1) R join R = R
> 2) R - R = NULL
> 3) NULL join R = NULL
> 4) SELECT[A=a](R join S) = (SELECT[A=a](R)) join S
>    e.g.
>    (select * from R, S where R.B=S.B and R.A=a) =>
>    (select * from R, S where R.A=a and R.B=S.B)
> 5) PROJECT[A,B](R JOIN S) = R JOIN (PROJECT[B](S))
> 6) PROJECT[A](PROJECT[A,B](R)) = PROJECT[A](R)
> etc.

Ah, I see.

> I am wondering if progreSQL does such simplification
> to prune the plan space before estimate costs?

No, there are no relation-level simplifications like these.  Offhand,
I'm having a hard time thinking of real-world cases where any of these
equivalences would be useful.  Can you cite examples of realistic SQL
queries that these transformations would help for?

We do apply transformations of this kind to boolean expressions
(AND/OR/NOT trees), but there are not any at the level of relational
expressions.  One difficulty in doing that is that the SQL-standard
semantics for INTERSECT/EXCEPT/UNION are not really compatible with the
natural algebraic transformations.  For instance, A UNION A is *not* A
... it is A with duplicate rows eliminated.  A few versions ago we had
an intersect/except implementation that actually transformed UNION,
INTERSECT and EXCEPT query trees into boolean OR, AND, and AND NOT
trees, and then applied the boolean expression simplifier (which knows
about De Morgan's laws, CNF, and so on).  It was way cool; too bad it
did not meet the letter of the SQL-standard semantics, much less do
anything that was worth the cycles it expended for typical queries.
We ripped it out a release or two back.

I'm quite willing to discuss this stuff more, but I agree with Josh
that this is far off-topic for pgsql-novice.  See you on pghackers?

            regards, tom lane