Thread: Calculation dependencies in views
I have a table with, say, a dozen fields and I want to end up with a view with
around 50 fields, calculated from the original dozen.

So it would be something like this:

    CREATE VIEW final AS
    SELECT
        x, y, z,
        (x + y) AS a,
        (y + z) AS b,
        (x + y) * z AS c,
        (y + z) * x AS d,
        (x + y) * z + x AS e,
        (y + z) * x + x AS f
    FROM my_table;

except my expressions are longer and more complicated. However, my expressions
do have similar dependencies and redundancies.

My question is: what is a good way of dealing with this? I was going to do
something like

    CREATE VIEW one AS
    SELECT
        id,
        (x + y) AS a,
        (y + z) AS b
    FROM my_table;

    CREATE VIEW two AS
    SELECT
        m.id,
        o.a * m.z AS c,
        o.b * m.x AS d
    FROM my_table m, one o
    WHERE m.id = o.id;

etc., but I'll end up with a lot of levels and joins going this route, which I
expect will be pretty slow.

Can someone recommend anything to me? Should I be approaching this from a
completely different angle, i.e. other than views? I'm pretty new at this, so
any pointers will be appreciated.

--Rick
Rick Delaney wrote:
> I have a table with, say, a dozen fields and I want to end up with a view
> with around 50 fields, calculated from the original dozen.
>
> So it would be something like this:
>
>     CREATE VIEW final AS
>     SELECT
>         x, y, z,
>         (x + y) AS a,
>         (y + z) AS b,
>         (x + y) * z AS c,
>         (y + z) * x AS d,
>         (x + y) * z + x AS e,
>         (y + z) * x + x AS f
>     FROM my_table;
>
> except my expressions are longer and more complicated. However, my
> expressions do have similar dependencies and redundancies.

Be aware that 6.5.* might be unable to deal with the resulting size of the
rewrite rule. And I'm not sure that I'll get TOAST ready for 7.0 to handle it.

> My question is what is a good way of dealing with this? I was going to do
> something like
>
>     CREATE VIEW one AS
>     SELECT
>         id,
>         (x + y) AS a,
>         (y + z) AS b
>     FROM my_table;
>
>     CREATE VIEW two AS
>     SELECT
>         m.id,
>         o.a * m.z AS c,
>         o.b * m.x AS d
>     FROM my_table m, one o
>     WHERE m.id = o.id;
>
> etc. but I'll end up with a lot of levels and joins going this route which I
> expect will be pretty slow.

In the above sample it will end up in a join, because even after rewriting
there will be more than one used rangetable entry in the query. But if you're
able to do the nesting in a manner like this:

    CREATE VIEW one AS
    SELECT
        x, y, z,
        x + y AS sum_xy,
        x + z AS sum_xz,
        y + z AS sum_yz
    FROM my_table;

    CREATE VIEW final AS
    SELECT
        x, y, z,
        sum_xy AS a,
        sum_yz AS b,
        sum_xy * z AS c,
        sum_yz * x AS d,
        sum_xy * z + x AS e,
        sum_yz * x + x AS f
    FROM one;

and keep all views following strictly one path

    final -> n-1 -> n-2 -> ... -> two -> one

you'll end up with a construct that requires nested levels of rewriting. But
the resulting query will definitely be a single table scan having all your
complicated expressions attached.

BTW: if you nest many views, you'll probably get an ERROR about "infinite
recursion" from the rewriter.
In that case, you must raise the value of

    #define REWRITE_INVOKE_MAX 10

at line 1466 in src/backend/rewrite/rewriteHandler.c to some value fairly
above your nesting level.

> Can someone recommend anything to me? Should I be approaching this from a
> completely different angle i.e other than views?

Well, I'm a rewriting- and trigger-man. It really depends on the usual
behaviour of your data and application.

Keep in mind that the above view technique requires each value to be computed
for each single view attribute for each single row at each SELECT. So if you
usually SELECT relatively small amounts of your data, or update big amounts of
it between the SELECTs, it wouldn't hurt.

But if you're dealing with huge sets of rows at SELECT time and rare changes,
you might be better off with setting up a table having all the fields in
place, and letting a trigger put all the calculated values in place.

> I'm pretty new at this so any pointers will be appreciated.

You're welcome.


Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#========================================= wieck@debis.com (Jan Wieck) #
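[Editor's note: as an illustration of the trigger approach Jan describes, here
is a minimal sketch in modern PL/pgSQL, reusing the sample columns from the
thread. The table, function, and trigger names are hypothetical, and 6.5-era
syntax would differ slightly.]

```sql
-- Sketch (hypothetical names): store the derived columns in the table
-- itself and let a BEFORE trigger keep them up to date on every write.
CREATE TABLE my_table_calc (
    id integer PRIMARY KEY,
    x  numeric,
    y  numeric,
    z  numeric,
    a  numeric,   -- derived: x + y
    b  numeric    -- derived: y + z
);

CREATE FUNCTION fill_calc() RETURNS trigger AS $$
BEGIN
    -- Compute the derived columns before the row is stored.
    NEW.a := NEW.x + NEW.y;
    NEW.b := NEW.y + NEW.z;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER calc_trig
    BEFORE INSERT OR UPDATE ON my_table_calc
    FOR EACH ROW EXECUTE PROCEDURE fill_calc();
```

With this setup, SELECTs read precomputed values at no per-query cost; the
price is paid once per INSERT/UPDATE instead.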
Rick Delaney <rick@consumercontact.com> writes:
> I have a table with, say, a dozen fields and I want to end up with a view
> with around 50 fields, calculated from the original dozen.
>
> So it would be something like this:
>
>     CREATE VIEW final AS
>     SELECT
>         x, y, z,
>         (x + y) AS a,
>         (y + z) AS b,
>         (x + y) * z AS c,
>         (y + z) * x AS d,
>         (x + y) * z + x AS e,
>         (y + z) * x + x AS f
>     FROM my_table;
>
> except my expressions are longer and more complicated. However, my
> expressions do have similar dependencies and redundancies.
>
> My question is what is a good way of dealing with this?

I think what you are getting at is avoiding duplicate computations, such as
the repeated evaluations of x + y in the above example?

My guess is that you are wasting your time to worry about it, unless the
repeated calculations are *really* expensive (a float add is down in the
noise ... though a ten-thousand-digit NUMERIC trig function might not be).

I don't believe that nested views will buy anything given the current Postgres
implementation of views. As far as I understand it, the rewriter works by
substitution of expressions, so that a reference to a second-level view will
end up being executed just the same as if you'd written out the expressions in
full. (This will probably change once we have the much-discussed,
little-implemented ability to write subselects in FROM clauses; but that's a
release or two away yet.)

If you are indeed dealing with 10K-digit numerics or something equally
CPU-hoggish, you might consider storing the intermediate results into a temp
table and then doing a second select using the temp table. But you'd need to
be saving an awful lot of calculations to make that profitable, I think.

If you're just concerned about avoiding code complexity, then by all means go
with the nested views; that will sweep the complicated expressions under the
rug, and someday it will even buy you some efficiency. But it's not going to
help much if you need to shave cycles today.
regards, tom lane
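[Editor's note: a minimal sketch of the temp-table approach Tom describes,
reusing the column names from the thread; the "partials" table name is
hypothetical.]

```sql
-- Sketch: materialize the shared subexpressions once into a temp table,
-- then compute the final columns from the stored intermediates.
CREATE TEMP TABLE partials AS
    SELECT x, y, z,
           x + y AS sum_xy,
           y + z AS sum_yz
    FROM my_table;

-- Second pass: each subexpression was evaluated only once per row above.
SELECT x, y, z,
       sum_xy     AS a,
       sum_yz     AS b,
       sum_xy * z AS c,
       sum_yz * x AS d
FROM partials;
```

As Tom notes, the extra write and read of the temp table only pays off when
the shared subexpressions are genuinely expensive to compute.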
Jan Wieck wrote:
>
> Rick Delaney wrote:
> >
> Be aware that 6.5.* might be unable to deal with the resulting size of the
> rewrite rule. And I'm not sure that I'll get TOAST ready for 7.0 to handle
> it.
>
[snip]
> But if you're able to do the nesting in a manner like this:
>
>     CREATE VIEW one AS
>     SELECT
>         x, y, z,
>         x + y AS sum_xy,
>         x + z AS sum_xz,
>         y + z AS sum_yz
>     FROM my_table;
>
>     CREATE VIEW final AS
>     SELECT
>         x, y, z,
>         sum_xy AS a,
>         sum_yz AS b,
>         sum_xy * z AS c,
>         sum_yz * x AS d,
>         sum_xy * z + x AS e,
>         sum_yz * x + x AS f
>     FROM one;
>
> and keep all views following strictly one path
>
>     final -> n-1 -> n-2 -> ... -> two -> one
>
> you'll end up with a construct that requires nested levels of rewriting.
> But the resulting query will definitely be a single table scan having all
> your complicated expressions attached.

I'd have to take a good look at all the expressions again to know if I could
do this, but I think I probably can. Only, with what I've started on I'm
already getting "rule plan string too big" errors. (What I've started on is
just one view that doesn't have all the expressions in it yet.) So I will
just get the same thing with nested views, no? That is how I understand "The
Postgres Rule System" section of the manual and Tom Lane's answer to my post.

If I could structure things as

    subset_1_n -> subset_1_n-1 -> ... -> subset_1_2 -> subset_1_1
    subset_2_n -> subset_2_n-1 -> ... -> subset_2_2 -> subset_2_1
    :
    :
    subset_m_n -> subset_m_n-1 -> ... -> subset_m_2 -> subset_m_1

    final -> join subset_1_n, subset_2_n, ..., subset_m_n

where each view subset_i_n has a usably-sized rule plan string, then could I
get this thing working? I can (and will) try this out with my stuff but I'm
just trying to understand the general principles.
> But if you're dealing with huge sets of rows at SELECT time and rare
> changes, you might be better off with setting up a table having all the
> fields in place, and let a trigger put all the calculated values in place.

This will be primarily INSERT/UPDATE a row and then SELECT the view of that
row. Am I right that a trigger would make more sense in this case then?

Thanks a lot for your help,

--Rick
Tom Lane wrote:
>
> I think what you are getting at is avoiding duplicate computations, such as
> the repeated evaluations of x + y in the above example?
>
> My guess is that you are wasting your time to worry about it, unless the
> repeated calculations are *really* expensive (a float add is down in the
> noise ... though a ten-thousand-digit NUMERIC trig function might not be).

I was more interested in reducing code complexity but this is good to know
anyway. Thanks a lot for your help.

Your last remark brings another question to mind. If I do this:

    rick=> CREATE TABLE foo (foo numeric(9,2));
    CREATE
    rick=> CREATE VIEW bar AS SELECT foo * foo AS bar FROM foo;
    CREATE
    rick=> \d foo
    Table    = foo
    +----------------------------------+--------------------------+--------+
    |              Field               |           Type           | Length |
    +----------------------------------+--------------------------+--------+
    | foo                              | numeric                  |    9.2 |
    +----------------------------------+--------------------------+--------+
    rick=> \d bar
    View    = bar
    Query   = SELECT ("foo"."foo" * "foo"."foo") AS "bar" FROM "foo";
    +----------------------------------+--------------------------+--------+
    |              Field               |           Type           | Length |
    +----------------------------------+--------------------------+--------+
    | bar                              | numeric                  |65535.65|
    +----------------------------------+--------------------------+--------+
                                                                   ^^^^^^^^

What governs the precision of the numeric in the view "bar"? I couldn't find
the specifics of this in the docs. Is it something I need to ever think of?
If so, can I cast it to a smaller numeric? I think 18.4 should be sufficient
in this case.

Thanks again,

--Rick
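[Editor's note: one way to address the question above is to pin the view
column's type with an explicit cast. This is a sketch following the thread's
foo/bar example; whether a 6.5-era server carries the cast through to the
view's declared column type would need testing.]

```sql
-- Sketch: constrain the result column to a smaller numeric type with an
-- explicit cast, instead of accepting the default result precision.
CREATE VIEW bar AS
    SELECT CAST(foo * foo AS numeric(18,4)) AS bar
    FROM foo;
```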
Rick Delaney wrote:
> Jan Wieck wrote:
> >
> > Rick Delaney wrote:
> > >
> > Be aware that 6.5.* might be unable to deal with the resulting size of
> > the rewrite rule. And I'm not sure that I'll get TOAST ready for 7.0 to
> > handle it.
>
> I'd have to take a good look at all the expressions again to know if I
> could do this but I think I probably can. Only, with what I've started on
> I'm already getting "rule plan string too big" errors. (What I've started
> on is just one view that doesn't have all the expressions in it yet.)

That's what I meant with "might be unable...". You talked about a 50 column
view with complicated expressions, and that's surely too large for the
current implementation.

> So I will just get the same thing with nested views, no? That is how I
> understand "The Postgres Rule System" section of the manual and Tom Lane's
> answer to my post.

As the author of "The Postgres Rule System" I say yes, except that it
requires some recursions in the rewriter. Thus, it will take a little longer
to rewrite, but that's a constant time per query, not depending on the amount
of data you're selecting. So if you're happy with the response times for your
test queries, it's not the rewriter any more if you get response time
problems later.

> If I could structure things as
>
>     subset_1_n -> subset_1_n-1 -> ... -> subset_1_2 -> subset_1_1
>     subset_2_n -> subset_2_n-1 -> ... -> subset_2_2 -> subset_2_1
>     :
>     :
>     subset_m_n -> subset_m_n-1 -> ... -> subset_m_2 -> subset_m_1
>
>     final -> join subset_1_n, subset_2_n, ..., subset_m_n
>
> where each view subset_i_n has a usably-sized rule plan string then could
> I get this thing working?

As said, there is a limit of 10 recursions for the rewriter actually
hardcoded in rewriteHandler.c (line 1466). Raise that value if you get
"infinite recursion" errors.

> This will be primarily INSERT/UPDATE a row and then SELECT the view of
> that row. Am I right that a trigger would make more sense in this case
> then?
If you only insert one row, then select that one row, the cascaded view setup
is best IMHO. The trigger solution would only be better if you seldom do
INSERT/UPDATE, but often SELECT and maybe SELECT many rows.


Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#========================================= wieck@debis.com (Jan Wieck) #
Jan Wieck wrote:
>
> Rick Delaney wrote:
> >
> > Jan Wieck wrote:
> > >
> > > Be aware that 6.5.* might be unable to deal with the resulting size of
> > > the rewrite rule. And I'm not sure that I'll get TOAST ready for 7.0
> > > to handle it.
> >
> > I'd have to take a good look at all the expressions again to know if I
> > could do this but I think I probably can. Only, with what I've started
> > on I'm already getting "rule plan string too big" errors. (What I've
> > started on is just one view that doesn't have all the expressions in it
> > yet.)
>
> That's what I meant with "might be unable...". You talked about a 50
> column view with complicated expressions, and that's surely too large for
> the current implementation.
>
> > So I will just get the same thing with nested views, no? That is how I
                                             ^^^^^^^^^^^^  ^^
> > understand "The Postgres Rule System" section of the manual and Tom
> > Lane's answer to my post.
>
> As the author of "The Postgres Rule System" I say yes, except that it
                                                ^^^^^^^
> requires some recursions in the rewriter. Thus, it will take a little

Sorry to be stupid, but with my terribly phrased question I can't tell if you
mean "Yes, you are correct that you will still get 'rule plan string too big'
errors with nested views" or "Yes, nested views will work fine as I described
despite your erroneous interpretation of what you read (but you may have to
fix the recursion limit as I already said)". :-)

I'm 99% sure you mean the latter, now, after re-reading "The Postgres Rule
System" and your first reply to me. I guess the only reason I'm not 100% sure
is I don't understand where the 'rule plan string too big' limit comes from.
Is it in the storage of the view/rule (related to the 8K limit on row size)?
I guess that would make sense. I suppose if I re-read enough times I'll get
it.

Thanks again,

--Rick
Rick Delaney wrote:
> Jan Wieck wrote:
> >
> > As the author of "The Postgres Rule System" I say yes, except that it
> > requires some recursions in the rewriter. Thus, it will take a little
>
> Sorry to be stupid but with my terribly phrased question I can't tell if
> you mean "Yes, you are correct that you will still get 'rule plan string
> too big' errors with nested views" or "Yes, nested views will work fine as
> I described despite your erroneous interpretation of what you read (but
> you may have to fix the recursion limit as I already said)". :-)
>
> I'm 99% sure you mean the latter, now, after re-reading "The Postgres Rule
> System" and your first reply to me. I guess the only reason I'm not 100%
> sure is I don't understand where the 'rule plan string too big' limit
> comes from. Is it in the storage of the view/rule (related to the 8K limit
> on row size)? I guess that would make sense.

Stop (rereading) recursion now - you got it.

It's exactly the storage of rules (views are implemented via rewriting rules)
in combination with the 8K limit that causes this "rule plan string too big"
error. Just the view/rule creation utility does some extra check for it.

Rules are stored as a special kind of printable querytree string. A querytree
is the system's internal representation of a query, and they are really
verbose and thus - well - huge (compared to what the user typed in).

Example 1:

    CREATE VIEW v1 AS
        SELECT a, a + b AS ab, a + b + c AS abc FROM t1;

Example 2:

    CREATE VIEW v1_sub AS
        SELECT a, c, a + b AS ab FROM t1;

    CREATE VIEW v1 AS
        SELECT a, ab, ab + c AS abc FROM v1_sub;

These two examples will result in exactly the same querytree after rewriting,
if you SELECT from v1. The second needs two passes in the rewriter, but
that's the only difference.


Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#========================================= wieck@debis.com (Jan Wieck) #
How can I create an autonumber (automatically incremented by 1) column in a table in Postgres? Do I have to use a trigger?
Sequences. See my book on my web site for a chapter on it.

> How can I create an autonumber (automatically incremented by 1) column in
> a table in Postgres? Do I have to use a trigger?

--
  Bruce Momjian                        |  http://www.op.net/~candle
  maillist@candle.pha.pa.us            |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
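[Editor's note: a minimal sketch of the sequence approach; the table and
sequence names here are hypothetical.]

```sql
-- Sketch: SERIAL is shorthand that creates and wires up a sequence for you.
CREATE TABLE items (
    id   SERIAL PRIMARY KEY,
    name text
);

-- Roughly equivalent, done by hand with an explicit sequence:
CREATE SEQUENCE items2_id_seq;
CREATE TABLE items2 (
    id   integer DEFAULT nextval('items2_id_seq') PRIMARY KEY,
    name text
);
```

Either way, no trigger is needed: each INSERT that omits the id column gets
the next value from the sequence automatically.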
Jan Wieck <wieck@debis.com> writes:
> Example 1:
>
>     CREATE VIEW v1 AS
>         SELECT a, a + b AS ab, a + b + c AS abc FROM t1;
>
> Example 2:
>
>     CREATE VIEW v1_sub AS
>         SELECT a, c, a + b AS ab FROM t1;
>
>     CREATE VIEW v1 AS
>         SELECT a, ab, ab + c AS abc FROM v1_sub;
>
> These two examples will result in exactly the same querytree after
> rewriting, if you SELECT from v1. The second needs two passes in the
> rewriter, but that's the only difference.

Actually, there's another big difference: the stored rule plan string for v1
in the second case is shorter than it is in the first case, because the a+b
additions don't show up as operator nodes in v1's definition in the second
case. (If the references to v1_sub were flattened out before the rule were
stored, it wouldn't take two passes of rewriting to expand the rule. But they
aren't, and it does ;-).)

I tried these examples and got:

    select rulename, length(ev_action) from pg_rewrite
        where rulename like '_RETv%';

    rulename  |length
    ----------+------
    _RETv1    |  1922
    _RETv1_sub|  1558
    _RETv1_up |  1566
    (3 rows)

So, if your problem is that you need to work around the rule plan string
length limit, then indeed eliminating common subexpressions with nested views
can be a win. There isn't much win in this example, but then we only got rid
of two additions here.

                        regards, tom lane
Yes, you can. The SQL type is SERIAL.

Vladimir
Tom Lane wrote:
>
> Jan Wieck <wieck@debis.com> writes:
>
> > These two examples will result in exactly the same querytree after
> > rewriting, if you SELECT from v1. The second needs two passes in the
> > rewriter, but that's the only difference.
>
> Actually, there's another big difference: the stored rule plan string for
> v1 in the second case is shorter than it is in the first case, because the
> a+b additions don't show up as operator nodes in v1's definition in the
> second case. (If the references to v1_sub were flattened out before the
> rule were stored, it wouldn't take two passes of rewriting to expand the
> rule. But they aren't, and it does ;-).)

Interesting. Anyway, I grok it now. Thank you both for all your help.

Cheers,

--Rick