Thread: Calculation dependencies in views

Calculation dependencies in views

From
Rick Delaney
Date:
I have a table with, say, a dozen fields and I want to end up with a view with
around 50 fields, calculated from the original dozen.

So it would be something like this:

CREATE VIEW final AS   SELECT       x, y, z,       (x + y)         as a,       (y + z)         as b,       (x + y) * z
  as c,       (y + z) * x     as d,       (x + y) * z + x as e,       (y + z) * x + x as f   FROM my_table;
 

except my expressions are longer and more complicated.  However, my expressions
do have similar dependencies and redundancies.

My question is what is a good way of dealing with this?  I was going to do
something like

CREATE VIEW one AS   SELECTid,       (x + y)         as a,       (y + z)         as b,  FROM my_table;

CREATE VIEW two AS   SELECTm.id,       o.a * m.z     as c,       o.b * m.x     as d,  FROM my_table m, one o  WHERE
m.id= o.id;
 

etc.  but I'll end up with a lot of levels and joins going this route which I
expect will be pretty slow.

Can someone recommend anything to me?  Should I be approaching this from a
completely different angle i.e other than views?

I'm pretty new at this so any pointers will be appreciated.

--Rick


Re: [SQL] Calculation dependencies in views

From
Jan Wieck
Date:
Rick Delaney wrote:

> I have a table with, say, a dozen fields and I want to end up with a view with
> around 50 fields, calculated from the original dozen.
>
> So it would be something like this:
>
> CREATE VIEW final AS
>     SELECT
>         x, y, z,
>         (x + y)         as a,
>         (y + z)         as b,
>         (x + y) * z     as c,
>         (y + z) * x     as d,
>         (x + y) * z + x as e,
>         (y + z) * x + x as f
>    FROM my_table;
>
> except my expressions are longer and more complicated.  However, my expressions
> do have similar dependencies and redundancies.
    Be aware that 6.5.* might be unable to deal with the resulting size of    the rewrite rule. And I'm not sure that
I'llget TOAST ready for 7.0 to    handle it.
 

> My question is what is a good way of dealing with this?  I was going to do
> something like
>
> CREATE VIEW one AS
>     SELECT
>         id,
>         (x + y)         as a,
>         (y + z)         as b,
>    FROM my_table;
>
> CREATE VIEW two AS
>     SELECT
>         m.id,
>         o.a * m.z     as c,
>         o.b * m.x     as d,
>    FROM my_table m, one o
>    WHERE m.id = o.id;
>
> etc.  but I'll end up with a lot of levels and joins going this route which I
> expect will be pretty slow.
    In the above sample, it will end up in a join, because even after    rewriting there will be more than one used
rangetableentries in the    query. But if you're able to do the nesting in a manner like this:
 

         CREATE VIEW one AS           SELECT             x, y, z,             x + y AS sum_xy,             x + z AS
sum_xz,            y + z AS sum_yz           FROM my_table;
 
         CREATE VIEW final AS           SELECT             x, y, z,             sum_xy AS a,             sum_yz AS b,
         sum_xy * z AS c,             sum_yz * x AS d,             sum_xy * z + x AS e,             sum_yz * x + x AS f
         FROM one;
 
    and keep all views follow strictly one path
    final -> n-1 -> n-2 -> ... -> two -> one
    you'll end up the a construct that requires nested levels of rewriting.    But the resulting query will definitely
bea single table scan having    all your complicated expressions attached.
 
    BTW: if you nest many views, you'll probably get an ERROR about    "infinite recursion" from the rewriter. In that
case,you must raise the    value of the
 
    #define REWRITE_INVOKE_MAX 10
    at line 1466 in src/backend/rewrite/rewriteHander.c to some value fairly    above your nesting level.

> Can someone recommend anything to me?  Should I be approaching this from a
> completely different angle i.e other than views?
    Well, I'm a rewriting- and trigger-man. It really depends on the usual    behaviour of your data and application.
    Keep in mind, that the above view technique requires each value to be    computed for each single view attribute
foreach single row at each    SELECT. So if you usually SELECT relatively small amounts of your data,    or update big
amountsof it between the SELECT's, it wouldn't hurt.
 
    But if you're dealing with huge set's of row's at SELECT time and rarely    changes, you might be better off with
settingup a table having all the    fields in place, and let a trigger put all the calculated values in    place.
 


> I'm pretty new at this so any pointers will be appreciated.
    You're welcome.



Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#========================================= wieck@debis.com (Jan Wieck) #





Re: [SQL] Calculation dependencies in views

From
Tom Lane
Date:
Rick Delaney <rick@consumercontact.com> writes:
> I have a table with, say, a dozen fields and I want to end up with a
> view with around 50 fields, calculated from the original dozen.

> So it would be something like this:

> CREATE VIEW final AS
>     SELECT
>         x, y, z,
>         (x + y)         as a,
>         (y + z)         as b,
>         (x + y) * z     as c,
>         (y + z) * x     as d,
>         (x + y) * z + x as e,
>         (y + z) * x + x as f 
>    FROM my_table;

> except my expressions are longer and more complicated.  However, my
> expressions do have similar dependencies and redundancies.

> My question is what is a good way of dealing with this?

I think what you are getting at is avoiding duplicate computations,
such as the repeated evaluations of x + y in the above example?

My guess is that you are wasting your time to worry about it, unless
the repeated calculations are *really* expensive (a float add is down
in the noise ... though a ten-thousand-digit NUMERIC trig function
might not be).

I don't believe that nested views will buy anything given the current
Postgres implementation of views.  As far as I understand it, the
rewriter works by substitution of expressions, so that a reference to
a second-level view will end up being executed just the same as if
you'd written out the expressions in full.  (This will probably change
once we have the much-discussed, little-implemented ability to write
subselects in FROM clauses; but that's a release or two away yet.)

If you are indeed dealing with 10K-digit numerics or something equally
CPU-hoggish, you might consider storing the intermediate results into
a temp table and then doing a second select using the temp table.
But you'd need to be saving an awful lot of calculations to make that
profitable, I think.

If you're just concerned about avoiding code complexity, then by
all means go with the nested views; that will sweep the complicated
expressions under the rug, and someday it will even buy you some
efficiency.  But it's not going to help much if you need to shave
cycles today.
        regards, tom lane


Re: [SQL] Calculation dependencies in views

From
Rick Delaney
Date:
Jan Wieck wrote:
> 
> Rick Delaney wrote:
> 
> >
>      Be aware that 6.5.* might be unable to deal with the resulting size of
>      the rewrite rule. And I'm not sure that I'll get TOAST ready for 7.0 to
>      handle it.
> 
[snip]
>      But if you're able to do the nesting in a manner like this:
> 
>           CREATE VIEW one AS
>             SELECT
>               x, y, z,
>               x + y AS sum_xy,
>               x + z AS sum_xz,
>               y + z AS sum_yz
>             FROM my_table;
> 
>           CREATE VIEW final AS
>             SELECT
>               x, y, z,
>               sum_xy AS a,
>               sum_yz AS b,
>               sum_xy * z AS c,
>               sum_yz * x AS d,
>               sum_xy * z + x AS e,
>               sum_yz * x + x AS f
>             FROM one;
> 
>      and keep all views follow strictly one path
> 
>      final -> n-1 -> n-2 -> ... -> two -> one
> 
>      you'll end up the a construct that requires nested levels of rewriting.
>      But the resulting query will definitely be a single table scan having
>      all your complicated expressions attached.

I'd have to take a good look at all the expressions again to know if I could do
this but I think I probably can.  Only, with what I've started on I'm already
getting "rule plan string too big" errors.  (What I've started on is just one
view that doesn't have all the expressions in it yet).

So I will just get the same thing with nested views, no?  That is how I
understand the "The Postgres Rule System" section of the manual and Tom Lane's
answer to my post.

If I could structure things as
   subset_1_n -> subset_1_n-1 -> ... -> subset_1_2 -> subset_1_1   subset_2_n -> subset_2_n-1 -> ... -> subset_2_2 ->
subset_2_1  :   :   subset_m_n -> subset_m_n-1 -> ... -> subset_m_2 -> subset_m_1
 

final -> join subset_1_n, subset_2_n, ..., subset_m_n

where each view subset_i_n has a usably-sized rule plan string then could I get
this thing working?

I can (and will) try this out with my stuff but I'm just trying to understand the
general principles.

>      But if you're dealing with huge set's of row's at SELECT time and rarely
>      changes, you might be better off with setting up a table having all the
>      fields in place, and let a trigger put all the calculated values in
>      place.

This will be primarily INSERT/UPDATE a row and then SELECT the view of that row. 
Am I right that a trigger would make more sense in this case then?

Thanks a lot for your help,

--Rick


Re: [SQL] Calculation dependencies in views

From
Rick Delaney
Date:
Tom Lane wrote:
> 
> I think what you are getting at is avoiding duplicate computations,
> such as the repeated evaluations of x + y in the above example?
> 
> My guess is that you are wasting your time to worry about it, unless
> the repeated calculations are *really* expensive (a float add is down
> in the noise ... though a ten-thousand-digit NUMERIC trig function
> might not be).

I was more interested in reducing code complexity but this is good to know
anyway.  Thanks a lot for your help.  Your last remark brings another question to
mind.

If I do this:

rick=> CREATE TABLE foo (foo numeric(9,2));
CREATE
rick=> CREATE VIEW bar AS SELECT foo * foo AS bar FROM foo;
CREATE
rick=> \d foo
Table    = foo
+----------------------------------+--------------------------+-------+
|              Field               |      Type                | Length|
+----------------------------------+--------------------------+-------+
| foo                              | numeric                  |  9.2  |
+----------------------------------+--------------------------+-------+
rick=> \d bar
View    = bar
Query   = SELECT ("foo"."foo" * "foo"."foo") AS "bar" FROM "foo";
+----------------------------------+--------------------------+-------+
|              Field               |      Type                | Length|
+----------------------------------+--------------------------+-------+
| bar                              | numeric                  |65535.65|
+----------------------------------+--------------------------+-------+
            ^^^^^^^^
 
What governs the precision of the numeric in the view "bar"?  I couldn't find the
specifics of this in the docs.  Is it something I need to ever think of?  If so,
can I cast it to a smaller numeric?  I think 18.4 should be sufficient in this
case.

Thanks again,

--Rick


Re: [SQL] Calculation dependencies in views

From
Jan Wieck
Date:
Rick Delaney wrote:

> Jan Wieck wrote:
> >
> > Rick Delaney wrote:
> >
> > >
> >      Be aware that 6.5.* might be unable to deal with the resulting size of
> >      the rewrite rule. And I'm not sure that I'll get TOAST ready for 7.0 to
> >      handle it.
>
> I'd have to take a good look at all the expressions again to know if I could do
> this but I think I probably can.  Only, with what I've started on I'm already
> getting "rule plan string too big" errors.  (What I've started on is just one
> view that doesn't have all the expressions in it yet).
    That's what I meant with "might be unable...". You talked about a 50    column view with complicated expressions,
andthat's surely too large for    the current implementation.
 

> So I will just get the same thing with nested views, no?  That is how I
> understand the "The Postgres Rule System" section of the manual and Tom Lane's
> answer to my post.
    As the author of "The Postgres Rule System" I say yes, except that it    requires some recursions in the rewriter.
Thus,it will take a little    longer to rewrite, but that's a constant time per query, not depending on    the amount
ofdata you're selecting. So if you're happy with the response    times for your test queries, it's not the rewriter any
moreif you get    response time problems later.
 

>
> If I could structure things as
>
>     subset_1_n -> subset_1_n-1 -> ... -> subset_1_2 -> subset_1_1
>     subset_2_n -> subset_2_n-1 -> ... -> subset_2_2 -> subset_2_1
>     :
>     :
>     subset_m_n -> subset_m_n-1 -> ... -> subset_m_2 -> subset_m_1
>
> final -> join subset_1_n, subset_2_n, ..., subset_m_n
>
> where each view subset_i_n has a usably-sized rule plan string then could I get
> this thing working?
    As said, there is a limit of 10 recursions for the rewriter actually    hardcoded in rewriteHandler.c (line 1466).
Raisethat value if you get    "infinite loop" errors.
 

> This will be primarily INSERT/UPDATE a row and then SELECT the view of that row.
> Am I right that a trigger would make more sense in this case then?
    If you only insert one row, then select that one row, the cascaded view    setup is best IMHO. The trigger solution
wouldonly be better, if you    seldom do INSERT/UPDATE, but often SELECT and maybe SELECT many rows.
 

Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#========================================= wieck@debis.com (Jan Wieck) #





Re: [SQL] Calculation dependencies in views

From
Rick Delaney
Date:
Jan Wieck wrote:
> 
> Rick Delaney wrote:
> 
> > Jan Wieck wrote:
> > >
> > >      Be aware that 6.5.* might be unable to deal with the resulting size of
> > >      the rewrite rule. And I'm not sure that I'll get TOAST ready for 7.0 to
> > >      handle it.
> >
> > I'd have to take a good look at all the expressions again to know if I could do
> > this but I think I probably can.  Only, with what I've started on I'm already
> > getting "rule plan string too big" errors.  (What I've started on is just one
> > view that doesn't have all the expressions in it yet).
> 
>      That's what I meant with "might be unable...". You talked about a 50
>      column view with complicated expressions, and that's surely too large for
>      the current implementation.
> 
> > So I will just get the same thing with nested views, no?  That is how I                          ^^^^^^^^^^
          ^^
 
> > understand the "The Postgres Rule System" section of the manual and Tom Lane's
> > answer to my post.
> 
>      As the author of "The Postgres Rule System" I say yes, except that it
     ^^^^^^^^^
 
>      requires some recursions in the rewriter. Thus, it will take a little

Sorry to be stupid but with my terribly phrased question I can't tell if you mean
"Yes, you are correct that you will still get 'rule plan string too big' errors
with nested views" or "Yes, nested views will work fine as I described despite
your erroneous interpretation of what you read (but you may have to fix the
recursion limit as I already said)".  :-)

I'm 99% sure you mean the latter, now, after re-reading "The Postgres Rule
System" and your first reply to me.  I guess the only reason I'm not 100% sure is
I don't understand where the 'rule plan string too big' limit comes from.  Is it
in the storage of the view/rule (related to the 8K limit on row size)?  I guess
that would make sense.

I suppose if I re-read enough times I'll get it.

Thanks again,

--Rick


Re: [SQL] Calculation dependencies in views

From
Jan Wieck
Date:
Rick Delaney wrote:

> Jan Wieck wrote:
> >
> >      As the author of "The Postgres Rule System" I say yes, except that it
>                                                    ^^^^^^^^^
> >      requires some recursions in the rewriter. Thus, it will take a little
>
> Sorry to be stupid but with my terribly phrased question I can't tell if you mean
> "Yes, you are correct that you will still get 'rule plan string too big' errors
> with nested views" or "Yes, nested views will work fine as I described despite
> your erroneous interpretation of what you read (but you may have to fix the
> recursion limit as I already said)".  :-)
>
> I'm 99% sure you mean the latter, now, after re-reading "The Postgres Rule
> System" and your first reply to me.  I guess the only reason I'm not 100% sure is
> I don't understand where the 'rule plan string too big' limit comes from.  Is it
> in the storage of the view/rule (related to the 8K limit on row size)?  I guess
> that would make sense.
    Stop (rereading) recursion now - you got it.
    It's exactly the storage of rules (views are implemented via rewriting    rules) in combo with the 8K limit, that
causesthis "rule plan string too    big" error. Just the the view/rule creation utility does some extra check    for
it.
    Rules are stored as a special kind of printable querytree string. A    querytree is the systems internal
representationof a query, and they are    really verbose and thus - well - huge (compared to what the user typed in).
 
    Example 1:

         CREATE VIEW v1 AS             SELECT a, a + b AS ab, a + b + c AS abc FROM t1;
    Example 2:

         CREATE VIEW v1_sub AS             SELECT a, c, a + b AS ab FROM t1;         CREATE VIEW v1 AS
SELECTa, ab, ab + c AS abc FROM v1_sub;
 
    These two examples will result in exactly the same querytree after    rewriting, if you SELECT from v1. The second
needstwo passes in the    rewriter, but that's the only difference.
 


Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#========================================= wieck@debis.com (Jan Wieck) #





Autonumber column

From
"Bron Till"
Date:
How can I create an autonumber (automatically incremented by 1) column in
table in Postgres? Do I have to use a trigger?




Re: [SQL] Autonumber column

From
Bruce Momjian
Date:
Sequences.  See my book for a chapter on it on web site.

[Charset iso-8859-1 unsupported, filtering to ASCII...]
> How can I create an autonumber (automatically incremented by 1) column in
> table in Postgres? Do I have to use a trigger?
> 
> 
> 
> ************
> 
> 


--  Bruce Momjian                        |  http://www.op.net/~candle maillist@candle.pha.pa.us            |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: [SQL] Calculation dependencies in views

From
Tom Lane
Date:
Jan Wieck <wieck@debis.com> writes:
>      Example 1:
>           CREATE VIEW v1 AS
>               SELECT a, a + b AS ab, a + b + c AS abc FROM t1;

>      Example 2:
>           CREATE VIEW v1_sub AS
>               SELECT a, c, a + b AS ab FROM t1;
>           CREATE VIEW v1 AS
>               SELECT a, ab, ab + c AS abc FROM v1_sub;

>      These two examples will result in exactly the same querytree after
>      rewriting, if you SELECT from v1. The second needs two passes in the
>      rewriter, but that's the only difference.

Actually, there's another big difference: the stored rule plan string
for v1 in the second case is shorter than it is in the first case,
because the a+b additions don't show up as operator nodes in v1's
definition in the second case.  (If the references to v1_sub were
flattened out before the rule were stored, it wouldn't take two passes
of rewriting to expand the rule.  But they aren't, and it does ;-).)

I tried these examples and got:

select rulename,length(ev_action) from pg_rewrite where rulename like '_RETv%';
rulename  |length
----------+------
_RETv1    |  1922
_RETv1_sub|  1558
_RETv1_up |  1566
(3 rows)

So, if your problem is that you need to work around the rule plan string
length limit, then indeed eliminating common subexpressions with nested
views can be a win.  There isn't much win in this example, but then we
only got rid of two additions here.
        regards, tom lane


Re: [SQL] Autonumber column

From
Vladimir Terziev
Date:
   Yes, you can
   SQL type is serial

   Vladimir




Re: [SQL] Calculation dependencies in views

From
Rick Delaney
Date:
Tom Lane wrote:
> 
> Jan Wieck <wieck@debis.com> writes:
> 
> >      These two examples will result in exactly the same querytree after
> >      rewriting, if you SELECT from v1. The second needs two passes in the
> >      rewriter, but that's the only difference.
> 
> Actually, there's another big difference: the stored rule plan string
> for v1 in the second case is shorter than it is in the first case,
> because the a+b additions don't show up as operator nodes in v1's
> definition in the second case.  (If the references to v1_sub were
> flattened out before the rule were stored, it wouldn't take two passes
> of rewriting to expand the rule.  But they aren't, and it does ;-).)

Interesting.  Anyway, I grok it now.  Thank you both for all your help.

Cheers,

--Rick