Thread: Materialized views in Oracle

Materialized views in Oracle

From
Mike Christensen
Date:

So I used to think materialized views in Postgres would be an awesome feature.  That is until I had to endure the hell hole which is Oracle's implementation..  what a complete joke..  did MS SQL's indexed views do any better?  Hopefully if PG 10 implements this, they'll make it actually useful to people.

Re: Materialized views in Oracle

From
Ben Chobot
Date:
On Sep 21, 2011, at 1:17 PM, Mike Christensen wrote:

So I used to think materialized views in Postgres would be an awesome feature.  That is until I had to endure the hell hole which is Oracle's implementation..  what a complete joke..  did MS SQL's indexed views do any better?  Hopefully if PG 10 implements this, they'll make it actually useful to people.

Not sure if this was a question or just blowing off frustration, but you can do materialized views in PG just fine. It's just that you have to do the work of updating them yourself. You can make a good case that's a good thing, because it lets you optimize the process around your needs..... but it does add for dba complexity.

(Hey, just like replication! :) )

Re: Materialized views in Oracle

From
Mike Christensen
Date:

Hmm I think a materialized view you have to update yourself is called a "table"..  but after dealing with the nightmare that is Oracle 11g, I think it'd be much more fun going that route with triggers and everything.  Yes this thread is a complete vent, and also a plea to the pg guys to do materialized views right!

On Sep 21, 2011 1:54 PM, "Ben Chobot" <bench@silentmedia.com> wrote:
> On Sep 21, 2011, at 1:17 PM, Mike Christensen wrote:
>
>> So I used to think materialized views in Postgres would be an awesome feature. That is until I had to endure the hell hole which is Oracle's implementation.. what a complete joke.. did MS SQL's indexed views do any better? Hopefully if PG 10 implements this, they'll make it actually useful to people.
>>
>
> Not sure if this was a question or just blowing off frustration, but you can do materialized views in PG just fine. It's just that you have to do the work of updating them yourself. You can make a good case that's a good thing, because it lets you optimize the process around your needs..... but it does add for dba complexity.
>
> (Hey, just like replication! :) )

Re: Materialized views in Oracle

From
Scott Marlowe
Date:
On Wed, Sep 21, 2011 at 2:54 PM, Ben Chobot <bench@silentmedia.com> wrote:
> On Sep 21, 2011, at 1:17 PM, Mike Christensen wrote:
>
> So I used to think materialized views in Postgres would be an awesome
> feature.  That is until I had to endure the hell hole which is Oracle's
> implementation..  what a complete joke..  did MS SQL's indexed views do any
> better?  Hopefully if PG 10 implements this, they'll make it actually useful
> to people.
>
> Not sure if this was a question or just blowing off frustration, but you can
> do materialized views in PG just fine. It's just that you have to do the
> work of updating them yourself. You can make a good case that's a good
> thing, because it lets you optimize the process around your needs..... but
> it does add for dba complexity.

Well if you set it up with triggers etc you can have materialized
views that autoupdate, but that may or may not be the best way to do
things for your particular workload.

Re: Materialized views in Oracle

From
Craig Ringer
Date:
On 09/22/2011 06:02 AM, Mike Christensen wrote:
> Hmm I think a materialized view you have to update yourself is called a
> "table"..  but after dealing with the nightmare that is Oracle 11g, I
> think it'd be much more fun going that route with triggers and
> everything.  Yes this thread is a complete vent, and also a plea to the
> pg guys to do materialized views right!

In terms of defining "right," it'd be rather handy to know what's wrong
with the implementation you're currently struggling with. The Oracle
guys won't have set out to do it wrong; whatever's wrong with it won't
be obvious until you try to use it a different way to how they
envisioned it or will be an issue that only comes up in real-world use.

It's entirely possible any future Pg support could fall into the same
issue ... unless opportunities like this are used to learn from the
experience of those who went before.

Exactly what do you find frustrating with the mat views you're using?

What's wrong with them? What's right with them? How are you trying to
use them? What problem are you attempting to solve with them? How do you
*want* and *expect* them to work?

--
Craig Ringer


Re: Materialized views in Oracle

From
Mike Christensen
Date:
>> Hmm I think a materialized view you have to update yourself is called a
>> "table"..  but after dealing with the nightmare that is Oracle 11g, I
>> think it'd be much more fun going that route with triggers and
>> everything.  Yes this thread is a complete vent, and also a plea to the
>> pg guys to do materialized views right!
>
> In terms of defining "right," it'd be rather handy to know what's wrong with
> the implementation you're currently struggling with. The Oracle guys won't
> have set out to do it wrong; whatever's wrong with it won't be obvious until
> you try to use it a different way to how they envisioned it or will be an
> issue that only comes up in real-world use.
>
> It's entirely possible any future Pg support could fall into the same issue
> ... unless opportunities like this are used to learn from the experience of
> those who went before.
>
> Exactly what do you find frustrating with the mat views you're using?
>
> What's wrong with them? What's right with them? How are you trying to use
> them? What problem are you attempting to solve with them? How do you *want*
> and *expect* them to work?

Oh you've asked for it..

Lemme give you a bit of a brain dump about my day.  I have to first be
up front, I'm not a database expert by any means and I'm nowhere close
to an Oracle expert, or even an Oracle novice.  If Oracle was a
swimming pool, I would have those little floaty duck things on my
arms.  I took a contract gig for "a major cell phone company" that
happens to use Oracle, so I've been trying to figure out stuff as I
go.  I'm about 3 weeks in, so that's exactly how much Oracle
experience I have now.

One of the major components I'm working on is this reporting engine
that runs these major huge expensive queries (seriously, some of them
take 2-3 minutes to run) to generate data that gets dumped to an Excel
file for VPs to read (or make pretty charts out of and look smart in
board rooms).  I've been trying to speed some of these queries up, but
unfortunately the SQL code that generates these reports is about 4,000
lines long and is wrapped in a massive Oracle "package".  The SQL code
is not indented nor does it have any whitespace or anything, you'd
have more fun trying to debug into minimized Javascript files on
Facebook.  I've tried to run some SQL formatters on the code, and so
far any formatter will just crash.  Today, I decided to re-write one
of these reports from scratch since they wanted a bunch of changes
anyway.  The data, as you've probably imagined, is highly
un-normalized and the queries are pretty crazy, so I got the idea to
make a few strategic DB changes as well as stick this report in a
materialized view.  That way, we could just query from that, throw in
whatever filters we need and poof, instant report in like half a
second.  If done right.

Problem 1:

I spent quite a long time writing the query that I was planning on
turning into a view.  This involved doing things like a sub-query to
get the minimum date from another set of rows, and building a comma
delimited list out another nested select with the wm_concat function.
After finally getting this query to work, turning it into a MV ended
up being impossible.

Oracle has a few really cool features around materialized views.
There's a REFRESH ON COMMIT option that automatically updates the
materialized view any time the underlying data is changed (otherwise
you have to manually update it, or setup an update schedule for it).
There's also a FAST REFRESH option that will only update the data
that's changed, rather than run the entire query and re-build every
row (Doing a full refresh on every commit would be crazy expensive, so
no)..  In my opinion, doing a FAST REFRESH ON COMMIT seems to be the
only interesting feature - otherwise, there's not really a point -
just make a normal table and re-build it once a day.  Plus, they
demand these reports to be real-time data anyway.  However, if you
want both of these options (fast refresh and update on commit), Oracle
turns into a complete nightmare.

First, apparently there can be no JOINS in your query.  If you create
a view with:

SELECT A.Foo, B.Foo FROM A INNER JOIN B ON A.ID=B.ID

You'll get the cryptic error:

ORA-12054: cannot set the ON COMMIT refresh attribute for the materialized view

Now, you can do the SAME exact query without the ANSI joins:

SELECT A.Foo, B.Foo FROM A, B WHERE A.ID=B.ID

Now all of a sudden it works.  Yes, that makes a lot of sense, and
thanks for the helpful error message too.

Second: You have to create these materialized view logs for all the
underlying tables if you want FAST REFRESH.  This allows Oracle to log
the exact data that changed so it can figure out which parts of your
view table to update.  Why can't Oracle just log its own data when
there's fast refreshing views dependent on it?  You not only have to
create these logs, but you have to create them with special ROWID
parameters otherwise you get more cryptic unhelpful Oracle errors like
the one above.

Next: If you have a SUBSELECT in your query, you're screwed:

SELECT
  A,
  B,
  (SELECT MIN(DATE) FROM MyDates WHERE ID=TABLE1.ID) as EarliestDate
FROM TABLE1

If you do this, you'll get:

ORA-22818: subquery expressions not allowed here

You have to re-write this query in a weird LEFT JOIN instead.  Once
you do that, you'll also break FAST REFRESH with the cryptic error
message from above.

Basically, FAST REFRESH and ON COMMIT are extremely limited in what
you can do with them; any query more complicated than a basic SELECT *
FROM FOO seems to be asking for trouble.  If there are reasons for
these things, both the error message displayed and the Oracle
documentation make trouble shooting next to impossible.  I guess this
is why Oracle consultants make so much money.

So, let's review the actual steps required to make a FAST REFRESH ON
COMMIT view with joins:

First, take the tables:

CREATE TABLE foo(foo NUMBER, CONSTRAINT foo_pk PRIMARY KEY(foo));
CREATE TABLE bar(foo NUMBER, bar NUMBER, CONSTRAINT bar_pk PRIMARY
KEY(foo, bar));

We first have to create materialized view logs for both these guys:

CREATE MATERIALIZED VIEW LOG ON foo WITH ROWID;
CREATE MATERIALIZED VIEW LOG ON bar WITH ROWID;

If we do not do this OR we do not use the ROWID parameter, we get
"Cannot set the ON COMMIT refresh attribute".  We now have to create
the view like this:

CREATE MATERIALIZED VIEW foo_bar
  NOLOGGING
  CACHE
  BUILD IMMEDIATE
  REFRESH FAST ON COMMIT  AS SELECT foo.foo,
                                    bar.bar,
                                    foo.ROWID AS foo_rowid,
                                    bar.ROWID AS bar_rowid
                               FROM foo, bar
                              WHERE foo.foo = bar.foo;

Notice a lot of things: No ANSI JOINs (if you use them, cryptic
error), we have to explicitly say what columns we want, we have to
return the ROWID columns of every unique table.  If we don't do any of
those things, we get "Cannot set the ON COMMIT refresh attribute."

An Oracle expert can probably get into details about why all of these
things are absolutely vital to the database engine, but discovering
these things took over 4 hours of my time today.  I'll also admit
there is this SQL function called DBMS_MVIEW.EXPLAIN_MVIEW that
apparently helps people figure out why their materialized views don't
support certain options, but I couldn't get this function to work, nor
have I gotten any of those DMBS functions to work.  I think they might
be part of some package I either don't have access to or are not
installed on my server; again, the error messages are too cryptic to
tell why.

Problem 2:

Ok so now I have a materialized view.  Since I was never able to do
the nested select with the earliest date (nor LEFT JOIN it in) no
matter how hard I tried, I decided I should stick this value on the
PROJECT table and setup a trigger to update it every time one of the
dates changed.  This should be super easy, right?  Even if you think
it's a bad approach (it might be) or you hate triggers, it should
STILL be easy..  I spent about 4 hours today trying to get the most
basic trigger working at all.

Here was my first attempt:

CREATE TRIGGER Trigger_UpdateTrainingDelivery
    AFTER DELETE OR INSERT OR UPDATE OF STARTDATE
    ON TPM_TRAININGPLAN
    FOR EACH ROW
    BEGIN
       UPDATE TPM_PROJECT SET TRAININGDELIVERYSTART = (SELECT
MIN(TP.STARTDATE) FROM TPM_TRAININGPLAN TP WHERE TP.PROJECTID =
:new.PROJECTID)
       WHERE PROJECTID = :new.PROJECTID;
    END;

So, every time a TRAININGPLAN row changes its STARTDATE field, I want
to recalc TRAININGDELIVERYSTART to be the earliest date in that set.
Should be simple.  However, I tried messing with this query syntax for
hours and got no where.  Each time I tried to create the trigger, I
got the message:

Warnings: --->
   W (1): Warning: execution completed with warning
          <---

That's right - It told me there was a warning, but wasn't actually
nice enough to tell me what the warning is.  There is a "SHOW ERRORS"
command, but it apparently requires SQLPlus and don't get me started
on that thing.  Instead, you have to query a special error table.
When I did this, there was random errors in no particular order, some
didn't even appear to be related to any of this.  There were no
timestamps on the error either, and I found the data totally useless.
However, it still created the trigger.  So then I tried to update one
of the TRAININGPLAN rows.  When I do this, I get:

>[Error] Script lines: 20-22 ------------------------
 ORA-04098: trigger 'TPMDBO.TRIGGER_UPDATETRAININGDELIVERY' is invalid
and failed re-validation
 Script line 20, statement line 1, column 7

So, basically Oracle was fine creating my trigger even though it knew
it was invalid.  Now my database is broken, and it won't even tell me
what's wrong with the trigger.

Well, I never actually did figure out how to fix my TRIGGER but I did
learn that even if I do figure out the magical trigger syntax, I'll
get nothing but "the table is mutating" errors (How X-Men, will my
table have special mutant powers too?).  Since TRAININGPLAN is in a
state of flux until the transaction is committed, it's apparently
impossible to select those rows and get the minimum date.  I don't
know if Postgres has this problem, but it seems to me the engine
should be able to tell me what the minimum date will be since the
trigger is running AFTER the rows are updated, and then commit
everything at once.  I don't see a reason for this limitation, nor do
I see a reason why Oracle can't spit out any meaningful error message
about my invalid trigger either when I create the trigger or the
trigger runs.

It seems that I'm going to have to change my middle-ware to update
this column, which will be a mess because my middle-ware is a giant
nightmare of .NET Entity classes and bad code which I did not write.
If I wrote it, it would be good code.  .NET is my specialty.  One
could argue this is the proper place to do this anyway.

In conclusion:

Overall, I've found Oracle near impossible to work with relative to
Postgres.  The documentation is hard to read, there's very few good
examples, and the tools are expensive, buggy, and crash a lot.
There's no good error messages, there's obscure limitations, and
everything is designed way more complicated than it needs to be.

I'd rather work on a PG database any time of the night or day, no
questions about it.  Maybe this message will show just how important
good error messages and solid documentation are, especially for
beginners.  Anyway, sorry for the massive email but hopefully this'll
be of use to someone, or at least make people chuckle as they
reminisce about their early days as a DBO.  Thanks for listening!

Mike

Re: Materialized views in Oracle

From
Craig Ringer
Date:
On 09/22/2011 01:34 PM, Mike Christensen wrote:

[lots]

I'm going to heavily paraphrase a summary, then a summarized response.
If you disagree with the way I interpret and condense the full message,
please feel free to grump. Roughly, I read your comments as:

- You're focused on a single use-case that you're having problems with,
rather than a more general issue with mat. views. [Nonetheless, this
experience highlights some issues that probably apply to all Oracle
materialized views].

- You want materialized views to support selective updating ("fast
refresh") for any query, and they don't. You appear to find the
limitations on what queries are and are not supported frustrating. The
inability to recognise the equivalance of "a INNER JOIN b ON a.id=b.id"
and "a,b WHERE a.id=b.id" is particularly annoying, as is the inability
to do a selective update on a materialized view involving  an outer join
or a subquery.

- The frustration created by these limitations is caused more by the
awful error messages and the limited documentation, rather than from the
limitations themselves. It sounds like you wouldn't mind so much if you
knew what you could and couldn't do with materialized views in advance
or at least got descriptive error messages.

- You don't see materialized views without selective updating ("fast
refresh") as useful. [I disagree, though I can see how it wouldn't be
very useful for the particular use case you're facing.]



My viewpoint here is:


- Your use case isn't exactly a poster-child for materialized views in
the first place, so there's a degree of round-peg vs square-hole going
on. It'd be really nice if a materialized view could solve your problem,
but it'd be really nice if I could cycle at 100 km/h, too. In the real
world, everything has trade-offs and limitations.

- I don't see the limitations on which queries can be used with FAST
REFRESH materialized views as unreasonable, but cannot believe how bad
the error messages about them are.

- The error messages are horrible, and it sounds like the documentation
could certainly use some improvement. That seems consistent with Oracle
in general, and not unique to materialized views.

- Personally, I'm very impressed that FAST REFRESH materialized views
work at all! That would be a HARD feature to implement in a general way
and make perform even half-decently. Oracle deserve some credit for
this. They've managed to make things like a journal-to-ledger
materialized view where only a given customer's ledger entries are
updated when a journal entry is added possible - without any custom coding.

- Everything seems to be very manual, with nothing done automagically
behind the scenes. That seems to fit the general Oracle philosophy of
how things should work. I don't see any sign that it's significantly
different for materialized views than it is for the rest of Oracle.

- The inability to transform join syntax is pretty lame. I blame the SQL
standard for having so many stupid, arbitrary ways to write the same
thing, and having a language so incredibly painful to work with that
producing an easily transformed and analysed representation of a query
is nearly impossible. (eg: Seriously, who came up with the window
function syntax, and what were they smoking?).

- A materialized view that requires a full query re-execution on every
commit is IMO certainly still very useful. If the source table(s) are
not updated particularly often and the view query is very expensive to
run, but the view is queried extremely frequently, then this will still
be a huge win.



[inline response to long version]:

> The data, as you've probably imagined, is highly
> un-normalized and the queries are pretty crazy, so I got the idea to
> make a few strategic DB changes as well as stick this report in a
> materialized view.  That way, we could just query from that, throw in
> whatever filters we need and poof, instant report in like half a
> second.  If done right.

Yeah, there's your problem. That's a pretty bad use case for a
materialized view IMO. That's never going to be anything except painful
and slow.

Unless you need this data extremely frequently as part of other queries
that're made much cheaper by the use of the materialized view, it's
going to be a net loss. You're better off just cleaning up the report as
much as you can and running it on demand or on a schedule.

> Oracle has a few really cool features around materialized views.
> There's a REFRESH ON COMMIT option that automatically updates the
> materialized view any time the underlying data is changed (otherwise
> you have to manually update it, or setup an update schedule for it).
> There's also a FAST REFRESH option that will only update the data
> that's changed, rather than run the entire query and re-build every
> row (Doing a full refresh on every commit would be crazy expensive, so
> no)..  In my opinion, doing a FAST REFRESH ON COMMIT seems to be the
> only interesting feature - otherwise, there's not really a point -
> just make a normal table and re-build it once a day.

There's a lot to be said for a regular refresh on commit, too. Sometimes
you need that data to always be consistent, and if the data it's drawn
from is updated once an hour but the view is queried 10 times a second,
the cost of the update will be well worth it.

> Plus, they demand these reports to be real-time data anyway.

Thar's yer problem, maytee. Executives.

How real-time is real-time? Do they want it "as soon as I press the
button, with no delay?" Do they actually need it up-to-date as of five
seconds before it was requested? Or just reasonably fresh and with no
waiting around to get it?

Maybe a bit of work refining their requirements might help. Being
executives at a telco, there's a fair chance they'll demand that it not
only be instant and perfectly up to date, but real-time linked into
their presentation so it changes (with pretty animations) as they're
giving their talk ... but you never know, you might get a sane one.
There must be one or two somewhere in the phone industry.

> First, apparently there can be no JOINS in your query.  If you create
> a view with:
>
> SELECT A.Foo, B.Foo FROM A INNER JOIN B ON A.ID=B.ID
>
> You'll get the cryptic error:
>
> ORA-12054: cannot set the ON COMMIT refresh attribute for the materialized view
>
> Now, you can do the SAME exact query without the ANSI joins:
>
> SELECT A.Foo, B.Foo FROM A, B WHERE A.ID=B.ID
>
> Now all of a sudden it works.  Yes, that makes a lot of sense, and
> thanks for the helpful error message too.

Yeah, it's a bit surprising that they can't recognise those as
equivalent, but it's hardly the end of the world. The error message is a
bit nasty though.

> Second: You have to create these materialized view logs for all the
> underlying tables if you want FAST REFRESH.  This allows Oracle to log
> the exact data that changed so it can figure out which parts of your
> view table to update.  Why can't Oracle just log its own data when
> there's fast refreshing views dependent on it?  You not only have to
> create these logs, but you have to create them with special ROWID
> parameters otherwise you get more cryptic unhelpful Oracle errors like
> the one above.

That just seems to be the Oracle philosophy of making everything an
explicit admin action. The cynic in me wonders if this is a scheme to
keep the armies of Oracle admins doing endless training course
refreshes. In truth I imagine it's more about giving admins control and
allowing them to avoid unintended side-effects like adding a whole lot
of logging overhead for a hot table they didn't realize would be
referenced by a materialized view that goes through several intermediate
views.

> Next: If you have a SUBSELECT in your query, you're screwed:
>
> SELECT
>    A,
>    B,
>    (SELECT MIN(DATE) FROM MyDates WHERE ID=TABLE1.ID) as EarliestDate
> FROM TABLE1
>
> If you do this, you'll get:
>
> ORA-22818: subquery expressions not allowed here

Fair enough. Remember, they have to figure out how to do this selective
updating magic. To do that, they need to be able to prove how an update
to a given table will affect a subset of rows in the materialized view,
and need to be able to re-calculate all those rows.

How would *you* implement that when any ol' subquery is allowed?

Writing rules and proof engines for this sort of stuff is *hard*. The
more restrictions you apply, the easier it gets. It's also often best to
start out with something limited but working, then progressively enhance
it by removing limitations.

> You have to re-write this query in a weird LEFT JOIN instead.  Once
> you do that, you'll also break FAST REFRESH with the cryptic error
> message from above.

Again, that's no great surprise. A LEFT OUTER JOIN isn't going to be the
easiest thing to do set proofs about.

> Basically, FAST REFRESH and ON COMMIT are extremely limited in what
> you can do with them; any query more complicated than a basic SELECT *
> FROM FOO seems to be asking for trouble.

Personally I'm kind of impressed by what FAST REFRESH *can* do. That
sort of thing is hard to do, in the mathematical sense of "hard".

> We first have to create materialized view logs for both these guys:
>
> CREATE MATERIALIZED VIEW LOG ON foo WITH ROWID;
> CREATE MATERIALIZED VIEW LOG ON bar WITH ROWID;

Yep, so you're making the costs of the view explicit. Fair enough to me,
though I suspect PostgreSQL's approach would be a 'NOTICE' message
informing you the log was being created - if it needed a log-based
approach, that is.

> CREATE MATERIALIZED VIEW foo_bar
>    NOLOGGING
>    CACHE
>    BUILD IMMEDIATE
>    REFRESH FAST ON COMMIT  AS SELECT foo.foo,
>                                      bar.bar,
>                                      foo.ROWID AS foo_rowid,
>                                      bar.ROWID AS bar_rowid
>                                 FROM foo, bar
>                                WHERE foo.foo = bar.foo;
>
> Notice a lot of things: No ANSI JOINs (if you use them, cryptic
> error)

Yup, that one's weird, but then Oracle has always had funny ideas about
how joins should be written.

> we have to explicitly say what columns we want

Well, you're always doing that anyway. Right? RIGHT!?!

Seriously, "*" is the worst damn feature in SQL. Its use outside
interactive testing should be punishable by cattle prod.

> we have to return the ROWID columns of every unique table.

I'm not at all surprised that they need that data to support fast
refresh, but I'll grant it is a bit weird that you have to explicitly
write them in the query result set. I'd expect the DB to transparently
add them behind the scenes if they weren't specified.

> If we don't do any of
> those things, we get "Cannot set the ON COMMIT refresh attribute."

Yup, it's the error messages that kill me. My attempts to try out Oracle
freebie editions were quickly crushed by the horrible interface of their
interactive SQL tool and the absolutely miserable error messages the
database produces.

> An Oracle expert can probably get into details about why all of these
> things are absolutely vital to the database engine, but discovering
> these things took over 4 hours of my time today.

Again, it sounds like the real issue is bad documentation and bad error
messages. You're *ALWAYS* going to have limitations on what a
selectively updated materialized view can do; it's just the way you're
informed of them that sucks.

[snip]

> Warnings: --->
>     W (1): Warning: execution completed with warning
>            <---
>
> That's right - It told me there was a warning, but wasn't actually
> nice enough to tell me what the warning is.  There is a "SHOW ERRORS"
> command, but it apparently requires SQLPlus and don't get me started
> on that thing.  Instead, you have to query a special error table.

OK, and here we have "oracle has a miserable UI from hell" issues
that're well known, but nothing to do with mat views.

> Since TRAININGPLAN is in a
> state of flux until the transaction is committed, it's apparently
> impossible to select those rows and get the minimum date.  I don't
> know if Postgres has this problem, but it seems to me the engine
> should be able to tell me what the minimum date will be since the
> trigger is running AFTER the rows are updated, and then commit
> everything at once.

Correct. An AFTER trigger in PostgreSQL will see the transaction's local
view of the uncommitted data. If the transaction rolls back, the effects
of the trigger will too so there's no risk of a dirty read leaking out
and becoming visible outside the transaction. If the transaction
commits, the data is valid.

This can get complicated when you have triggers acting recursively on a
table and it isn't always that easy to understand exactly what a trigger
will see.

--
Craig Ringer

Re: Materialized views in Oracle

From
Thomas Kellerer
Date:
Craig Ringer, 22.09.2011 08:34:
> - You don't see materialized views without selective updating ("fast
> refresh") as useful. [I disagree, though I can see how it wouldn't be
> very useful for the particular use case you're facing.]

One thing that is often overlooked and that I find most useful is the rewrite magic that Oracle can use with MVIEWS.

Assume an application is running the following statement:

select department_id,
        count(*) as num_employees
from employees
group by department_id;

and due to the size of the table this statement is slow. One can create a materialized view like this:

create materialized view num_emps
refresh complete on commit
enable query rewrite
as
select department_id,
        count(*) as num_employees
from employees
group by department_id;

Now each time the application runs the original statement, will silently rewrite the query into "SELECT * FROM
num_emps"which is a lot faster than the real statement. 

Oracle will know whether the view is stale and will do the rewriting only if this is applicable.

Regards
Thomas

Re: Materialized views in Oracle

From
Jayadevan M
Date:
Hello,
..
> up front, I'm not a database expert by any means and I'm nowhere close
> to an Oracle expert, or even an Oracle novice.  If Oracle was a
> swimming pool, I would have those little floaty duck things on my
> arms.  
Please go through this link

http://asktom.oracle.com/pls/asktom/f?p=100:11:0::NO::P11_QUESTION_ID:15695764787749
Going through this and other queries/replies from Tom helps much more than going through Oracle documentation. Tom is far more helpful than the floaty ducks.
By the way, I have used MVs in Oracle for a long time and did not run into major issues with them. But I started working with Oracle sometime ago. So what comes across as 'strange' SQL to non-Oracle users is 'very normal' SQL for me:).
Do be careful with on COMMIT FAST REFRESH MVs in OLTP systems.

See..
http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:4541191739042

"
4) basically, your commits will tend to serialize. Whether this is an issue depends on how long your materialized view takes to refresh itself. Remember - the commits will serialize - not entire transactions, just the very end of them. "

"As mentioned above, materialized views typically add overhead to individual transactions and, if created with REFRESH ON COMMIT, will introduce contention. The overhead arises from the need to track the changes made by a transaction, these changes will either be maintained in the session state or in log tables. In a high end OLTP system, this overhead is not desirable. "


Regards,
Jayadevan





DISCLAIMER:


"The information in this e-mail and any attachment is intended only for the person to whom it is addressed and may contain confidential and/or privileged material. If you have received this e-mail in error, kindly contact the sender and destroy all copies of the original communication. IBS makes no warranty, express or implied, nor guarantees the accuracy, adequacy or completeness of the information contained in this email or any attachment and is not liable for any errors, defects, omissions, viruses or for resultant loss or damage, if any, direct or indirect."





Re: Materialized views in Oracle

From
Rob Sargent
Date:

On 09/21/2011 11:34 PM, Mike Christensen wrote:

[ much omitted ]
> One of the major components I'm working on is this reporting engine
> that runs these major huge expensive queries (seriously, some of them
> take 2-3 minutes to run) to generate data that gets dumped to an Excel
> file for VPs to read (or make pretty charts out of and look smart in
> board rooms).  I've been trying to speed some of these queries up, but

I see a serious disconnect between "realtime" and "dumped to excel".
Even executives or large cell phone companies learn pretty quickly that
they had better run the 3 minute query 5 minutes before the have to go
to that meeting.

Re: Materialized views in Oracle

From
Greg Smith
Date:
On 09/22/2011 01:34 AM, Mike Christensen wrote:
> If Oracle was a swimming pool, I would have those little floaty duck
> things on my arms.

Yes, it's too bad the license to get Oracle 11g with Floaties is cost
prohibitive for most companies.

--
Greg Smith   2ndQuadrant US    greg@2ndQuadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us


Re: Materialized views in Oracle

From
Mike Christensen
Date:
On Wed, Sep 21, 2011 at 11:34 PM, Craig Ringer <ringerc@ringerc.id.au> wrote:
[snip]
> This can get complicated when you have triggers acting recursively on a
> table and it isn't always that easy to understand exactly what a trigger
> will see.

I do agree with most all your points.  The value I got out of this
experience was the perspective I gained dealing with materialized
views and being forced to think about the logic behind their
implementation.  As you said, this feature is friggen tough to do
right.  I read up a bit about Microsoft SQL’s “indexed views” and they
too have a long list of things you can’t do, and from what I can tell,
they only support the idea of keeping them up to date on every commit.
So, since we’re not on this list to discuss how we can improve Oracle,
I’d like to not make that the focus of my response.  I’d rather have a
discussion on what materialized views could mean in the Postgres
world.  I think my expectations, now slightly more validated through
my realization of what’s possible with Oracle, would go something like
this:

1) Though I might have given the impression that a “manual complete
refresh” is not useful, I definitely see value in this especially for
data warehousing scenarios.  However, I’d almost want to call this a
“snapshot” and not a “materialized view” – the two things are so
different, I think it warrants different syntax.  I think the ability
to create a snapshot in time would be quite useful:

CREATE SNAPSHOT Foo AS
   SELECT * FROM Bar;

Now, I have “Foo” as a record in time, and can refer to it as I could
any other table. I’d also like the ability to refresh it (via a
schedule or a trigger):

REFRESH SNAPSHOT Foo;

Snapshots would remember their underlying query and could thus easily
update at any time.

2) The huge feature here (which MS SQL implements as indexed views)
are views that automatically update as the data underneath them
changes.  I’ve come to the conclusion that, while it’s impressive that
Oracle can do anything close to this and have it actually work pretty
well in a lot of scenarios, Oracle simply can’t have an
all-encompassing knowledge of exactly how my database works and where
all the data comes from.  You know who does have that knowledge?  Me.
I believe I should have the power to instruct Postgres exactly when
and how to update my view in situations where it cannot be
automatically ascertained through the engine, rather than the DB
saying “Sorry I can’t be perfect thus you can’t do that.”  For me, I
see this “chore” as a lot more appealing than trying to figure out why
I can’t make the view that I want.

I expect to be able to create two kinds of materialized views: Ones
that COMPLETE refresh any time any referred column changes, and ones
that do a FAST refresh.  If I specify FAST but the engine can’t infer
what it needs, I should get a NOTICE and it should fall back to a
COMPLETE.  If I specify nothing, it should create FAST if it can, and
if not fall back to COMPLETE without notice.

When creating a materialized view, I believe warnings should be issued
when a column’s source cannot be inferred by the parser, but at the
risk of being controversial, I think I should still be able to create
the view anyway.  I’ve always been partial to systems that allow you
to shoot yourself in the foot.  I could see doing something like:

CREATE MATERIALIZED VIEW Foo AS
   SELECT ID, Name FROM Users;

Postgres knows that ID is a primary key, and can thus update the view
when Users changes.  Had I not put in a primary key, I think one
should be generated for me automatically based on every primary key in
the referred tables.  If tables do not have primary keys, you’d get a
warning that the view has to be re-created on any change to the
referred keyless tables.

CREATE MATERIALIZED VIEW Foo AS
   SELECT State, AVG(Age) FROM Users GROUP BY State;

Ok this is a tough one to figure out.  Since the Age column is
referred to, we could force a refresh every time Age in any row
changes.  In theory, the engine could be smart enough to realize the
aggregate age came from a group, and thus update the appropriate
“State” row when any Age within that state changed.  Wow, this is
getting tough; I see why Oracle just said no way on aggregate
functions.

CREATE MATERIALIZED VIEW Foo AS
   SELECT ID, GetStateFromZip(ZipCode) as State FROM Users;

Ouch, a function.  Well, the engine could look at the volatility of my
GetStateFromZip function and make an assumption that passing in value
X would always result in Y for nonvolatile functions.  Then, update
row ID when ZipCode changes.  However, if the function is volatile or
the data that the function itself uses changes, then we’d definitely
run into issues.  Two options: One, you’d issue a NOTICE and say
something like “The source of column ‘State’ cannot be inferred.” and
create the view anyway.  Maybe there could be some STRICT option or
pragma to simply not allow creating the dangerous views that could get
stale.  Another option, the user could provide the insight the engine
needs to create the view.

CREATE MATERIALIZED VIEW Foo
   REFRESH ON (ZipCodes Z WHERE Z.Zip = Users.Zip) AS
   SELECT ID, GetStateFromZip (ZipCode) as State FROM Users;

Now we’ve told the engine that when any column in the ZipCodes table
changes (you could probably qualify this further), then to update the
materialized view of any Users row with that zipcode.  Ok, this isn’t
100% ironed out but the idea is that I can provide instructions on how
to handle situations that cannot be inferred automatically.

I really like the idea of having the database do the best it can with
the data it has, and issue NOTICEs if your view won’t be perfect.
Then, allowing me to provide hints as to when and how to update the
pieces of data that cannot be automatically inferred if I want to get
around those warnings.  To me, it seems more polite than insisting the
database knows all and when it doesn’t, denying the very existence of
the view completely.

One could argue that if you’re going this far, you’ve basically turned
the whole trigger mechanism inside out and one could already implement
this whole thing on Postgres 9 using real tables and triggers when the
appropriate data changes.  This is something I’m struggling with as
well.  Materialized views seem to be great for quick database
snapshots and very simple selects, but anything too complicated and
it’s either not an option, or you’d have to explain so much logic that
you’re better off using a series of triggers and writing a custom
solution anyway.

This makes me really question the fundamental use of materialized
views.  In other words, what is a real solid scenario that you would
use one for that simply cannot be done currently using triggers?  Or,
is it simply the “ease of creation” people are after?  There must be
things I’m just not seeing, as this is the number one feature request
on postgresql.uservoice.com by a massive landslide, and I fail to
believe all those people are just “lazy.”

Re: Materialized views in Oracle

From
Josh Kupershmidt
Date:
On Fri, Sep 23, 2011 at 3:03 AM, Mike Christensen <mike@kitchenpc.com> wrote:
> 1) Though I might have given the impression that a “manual complete
> refresh” is not useful, I definitely see value in this especially for
> data warehousing scenarios.  However, I’d almost want to call this a
> “snapshot” and not a “materialized view” – the two things are so
> different, I think it warrants different syntax.
[snip]

I like the terminology employed by this page:
  http://tech.jonathangardner.net/wiki/PostgreSQL/Materialized_Views
which clarifies between "snapshot", "eager", "lazy", and "very lazy"
materialized views.

[snip]

> One could argue that if you’re going this far, you’ve basically turned
> the whole trigger mechanism inside out and one could already implement
> this whole thing on Postgres 9 using real tables and triggers when the
> appropriate data changes.  This is something I’m struggling with as
> well.  Materialized views seem to be great for quick database
> snapshots and very simple selects, but anything too complicated and
> it’s either not an option, or you’d have to explain so much logic that
> you’re better off using a series of triggers and writing a custom
> solution anyway.
>
> This makes me really question the fundamental use of materialized
> views.  In other words, what is a real solid scenario that you would
> use one for that simply cannot be done currently using triggers?  Or,
> is it simply the “ease of creation” people are after?  There must be
> things I’m just not seeing, as this is the number one feature request
> on postgresql.uservoice.com by a massive landslide, and I fail to
> believe all those people are just “lazy.”

As a preface, I think materialized views are awesome when used
correctly. I don't have first-hand experience with Oracle's
implementation of them.

As far as I know, the maintenance of the matview itself can always be
done just as well using triggers as it can by a database engine (like
Oracle). In fact, I'd bet that for any reasonably complicated matview,
you're going to be able to wring out quite a bit more performance from
the hand-written one, as you can employ optimizations that the
database engine just doesn't know about.

For example, a while back I wrote a heavily-queried ~50M row matview
which was derived from ~12 or so base relations, some of which were
heavily updated. I was able to make the load imposed by the matview a
small part of the bulk-loading of the base tables by keeping track of
which changes to the base tables were actually "interesting" to the
matview, where "interesting" meant "could possibly cause a significant
change to the corresponding row in the matview". Plus little tricks
like knowing when it'd be cheaper to perform a full refresh of the
matview vs. slogging through millions of rows of UPDATEs. I imagine a
database-engine maintained matview in Oracle would have imposed more
system load and been slower to keep up-to-date.

Of course, one cool thing a database-engine-supported matview can do
is automatic query-rewriting, if it can prove that the materialized
view is both up to date, and capable of satisfying the user's query.
But then, if you are using materialized views, is it really that hard
to change your performance-intensive queries to select from the view
instead of the base table?

As for materialized views being the #1 most-requested feature on
uservoice... well, perhaps the voters don't have a good idea of how
much effort it would be to implement (vs. other compelling features we
could add), and perhaps also don't understand how powerful and
flexible trigger-maintained matviews can be. The fact that the
suggestion claims it would "... boost performance for every web
application" makes me pretty skeptical right off the bat. I'm not
saying it wouldn't be cool to have, just that there is still a lot
more low-hanging fruit that the data-warehousing crowd would benefit
from.

Josh