Thread: Re: - GSoC - snapshot materialized view (work-in-progress) patch

Re: - GSoC - snapshot materialized view (work-in-progress) patch

From
Robert Haas
Date:
2010/7/8 Pavel Baroš <baros.p@seznam.cz>:
> Description of patch:
> 1) an MV can be created, and it is created without being populated with data
>   CREATE MATERIALIZED VIEW mvname AS SELECT ...

This doesn't seem acceptable.  It should populate it on creation.

> 2) can refresh MV
>   ALTER MATERIALIZED VIEW mvname REFRESH
>
> 3) MV cannot be modified by DML commands (INSERT, UPDATE and DELETE are not
> permitted)
>
> 4) index can be created and used with MV
>
> 5) pg_dump is fixed; with the previous patch the dump threw an error, and now it
> does not, but it is still a stopgap. I want to reach a state where the refresh
> command is issued after all COPY statements (when all data are in the tables).
> In this patch the REFRESH command comes right after the CREATE MV command.

Hmm... ISTM that you probably need some kind of dependency stuff in
here to make the materialized view get created after the tables it
depends on have been populated with data.  It needs to work with
parallel restore, too.  I'm not sure exactly how the dependency stuff
in pg_dump works, though.
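
The ordering constraint described above can be pictured as a topological sort over restore steps. The following is only a minimal sketch in Python, not pg_dump's actual code: the step names (`create_table t`, `copy_data t`, `create_mv mv`, `refresh_mv mv`) and the `restore_order` helper are hypothetical, and it assumes a single materialized view built on a single table.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def restore_order(deps):
    """Return restore steps in an order that respects every dependency.

    deps maps each step to the set of steps that must finish before it.
    """
    return list(TopologicalSorter(deps).static_order())


# Hypothetical restore steps for one table t and one materialized view mv.
deps = {
    "create_table t": set(),
    "copy_data t": {"create_table t"},
    "create_mv mv": {"create_table t"},
    # The refresh must wait for both the view definition and the table data.
    "refresh_mv mv": {"create_mv mv", "copy_data t"},
}

order = restore_order(deps)
print(order)
```

In this sketch `copy_data t` and `create_mv mv` do not depend on each other, so a parallel restore would be free to run them at the same time, as long as the refresh waits for both.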

A subtle point here is that if you dump and restore a database
containing a materialized view, the new database might not be quite
the same as the old one, because the materialized view might have been
out of date before, and when you recreate it, it'll get refreshed.
I'm not sure there's much we can/should do about that, though.

> 6) psql works too; a new command \dm[S+] was added to the list
>  \d[S+] [PATTERN]   - lists all db objects like tables, views, materialized
> views and sequences
>  \dm[S+] [PATTERN]  - lists all materialized views
>
> 7) there are some docs too, but I guess they are not enough; at least my
> English will need correcting

If we're going to treat materialized views as a separate object type,
you probably need to break out the docs for CREATE MATERIALIZED VIEW,
ALTER MATERIALIZED VIEW, and DROP MATERIALIZED VIEW into their own
pages, rather than having them mixed up with corresponding pages for
regular views.

> 8) some ALTER TABLE commands work, i.e. RENAME TO, OWNER TO, SET SCHEMA, SET
> TABLESPACE
>
> 9) MVs and their columns can be commented on
>
> 10) some functions also behave as expected, but if you know of any I did
> not mention that could fail when used with an MV, I would appreciate your hints
>     pg_get_viewdef()
>     pg_get_ruledef()
>     pg_relation_filenode()
>     pg_relation_filepath()
>     pg_table_size()
>
>
> In progress:
> - regression tests
> - behavior of various ALTER commands, i.e. SET STATISTICS, CLUSTER ON,
> ENABLE/DISABLE RULE, etc.

This isn't right:

rhaas=# create view v as select * from t;
CREATE VIEW
rhaas=# alter view v refresh;
ERROR:  unrecognized alter table type: 41

Please add your patch here, so that it will be reviewed during the
about-to-begin CommitFest.

https://commitfest.postgresql.org/action/commitfest_view/open

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company


Re: - GSoC - snapshot materialized view (work-in-progress) patch

From
Pavel Baroš
Date:
On 9.7.2010 21:33, Robert Haas wrote:
> 2010/7/8 Pavel Baroš<baros.p@seznam.cz>:
>> Description of patch:
>> 1) an MV can be created, and it is created without being populated with data
>>    CREATE MATERIALIZED VIEW mvname AS SELECT ...
>
> This doesn't seem acceptable.  It should populate it on creation.
>

Yes, that would be better. In addition, this behavior will be required
if incremental MVs are to be implemented in the near future.

>> 2) can refresh MV
>>    ALTER MATERIALIZED VIEW mvname REFRESH
>>
>> 3) MV cannot be modified by DML commands (INSERT, UPDATE and DELETE are not
>> permitted)
>>
>> 4) index can be created and used with MV
>>
>> 5) pg_dump is fixed; with the previous patch the dump threw an error, and now it
>> does not, but it is still a stopgap. I want to reach a state where the refresh
>> command is issued after all COPY statements (when all data are in the tables).
>> In this patch the REFRESH command comes right after the CREATE MV command.
>
> Hmm... ISTM that you probably need some kind of dependency stuff in
> here to make the materialized view get created after the tables it
> depends on have been populated with data.  It needs to work with
> parallel restore, too.  I'm not sure exactly how the dependency stuff
> in pg_dump works, though.
>

That will not matter if the MV is populated on creation.

> A subtle point here is that if you dump and restore a database
> containing a materialized view, the new database might not be quite
> the same as the old one, because the materialized view might have been
> out of date before, and when you recreate it, it'll get refreshed.
> I'm not sure there's much we can/should do about that, though.
>

Yes, that is interesting. Of course, there can be real-life cases where
population on creation is needed and cases where it is not, so I'm thinking
of a solution similar to Oracle or DB2: add an option to the MV creation
command that enables or disables population on creation:

http://www.ibm.com/developerworks/data/library/techarticle/dm-0708khatri/

Oracle:
  CREATE MATERIALIZED VIEW mvname
  [ BUILD [IMMEDIATE | DEFERRED] ]
  AS SELECT ..

DB2:
  CREATE TABLE mvname
  AS SELECT ...
  [ INITIALLY DEFERRED | IMMEDIATE ]
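
The effect of such an option can be sketched with a toy model. This is an illustration in Python under stated assumptions, not the patch's implementation: the `SnapshotMV` class, its `populate` flag, and the `query` callable standing in for the view's SELECT are all hypothetical names.

```python
class SnapshotMV:
    """Toy snapshot materialized view: stores the result of a query."""

    def __init__(self, query, populate=True):
        self.query = query      # callable standing in for the view's SELECT
        self.rows = None        # None means "not yet populated"
        if populate:            # comparable to an immediate build
            self.refresh()

    def refresh(self):
        # Re-run the query and replace the stored snapshot with a copy.
        self.rows = list(self.query())

    def read(self):
        if self.rows is None:
            raise RuntimeError("materialized view has not been populated")
        return list(self.rows)


base_table = [1, 2, 3]

immediate = SnapshotMV(lambda: base_table, populate=True)
deferred = SnapshotMV(lambda: base_table, populate=False)

base_table.append(4)           # the base table changes after creation
stale = immediate.read()       # still the snapshot taken at creation
deferred.refresh()             # first population happens on demand
fresh = deferred.read()
```

The same model illustrates the dump-and-restore point quoted above: a view that is refreshed on restore sees the current table contents, while the original may still have held an older snapshot.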

>> 6) psql works too; a new command \dm[S+] was added to the list
>>   \d[S+] [PATTERN]   - lists all db objects like tables, views, materialized
>> views and sequences
>>   \dm[S+] [PATTERN]  - lists all materialized views
>>

I also noticed I forgot to handle the \dp and \dpp options; this should
be fixed in the next version of the patch.

>> 7) there are some docs too, but I guess they are not enough; at least my
>> English will need correcting
>
> If we're going to treat materialized views as a separate object type,
> you probably need to break out the docs for CREATE MATERIALIZED VIEW,
> ALTER MATERIALIZED VIEW, and DROP MATERIALIZED VIEW into their own
> pages, rather than having them mixed up with corresponding pages for
> regular views.
>

Yes, that was just how I solved the problem here, but I admit that
separate pages would be better.


>> In progress:
>> - regression tests
>> - behavior of various ALTER commands, i.e. SET STATISTICS, CLUSTER ON,
>> ENABLE/DISABLE RULE, etc.
>
> This isn't right:
>
> rhaas=# create view v as select * from t;
> CREATE VIEW
> rhaas=# alter view v refresh;
> ERROR:  unrecognized alter table type: 41
>

I know, and there will be more cases like that. That's why I'm working
on good tests now.

> Please add your patch here, so that it will be reviewed during the
> about-to-begin CommitFest.
>
> https://commitfest.postgresql.org/action/commitfest_view/open
>

OK, but will you help me with that form? Do you think I can fill it in
like this? I'm not sure about a few fields:

Name:             Snapshot materialized views
CommitFest Topic: [ Miscellaneous | SQL Features ] ???
Patch Status:     Needs review
Author:           me
Reviewers:        You?
Committers:       who?

and I guess the fields 'Date Closed' and 'Message-ID for Original Patch'
will be filled in later.


thanks a lot


Pavel Baros



Re: - GSoC - snapshot materialized view (work-in-progress) patch

From
"Kevin Grittner"
Date:
Pavel Baroš<baros.p@seznam.cz> wrote:
> On 9.7.2010 21:33, Robert Haas wrote:
>> Please add your patch here, so that it will be reviewed during
>> the about-to-begin CommitFest.
>>
>> https://commitfest.postgresql.org/action/commitfest_view/open
>>
>
> OK, but will you help me with that form? Do you think I can fill
> it in like this? I'm not sure about a few fields:
>
> Name:             Snapshot materialized views
> CommitFest Topic: [ Miscellaneous | SQL Features ] ???

SQL Features seems reasonable to me.

> Patch Status:     Needs review
> Author:           me
> Reviewers:        You?

Leave empty.  Reviewers will sign up or be assigned.

> Committers:       who?

That comes much later -- when the patch is complete and has a
favorable review, then a committer will pick it up.

> and I guess the fields 'Date Closed' and 'Message-ID for Original
> Patch' will be filled in later.

Date closed is only set for patches which are committed, returned
with feedback (for a later CommitFest), or rejected.  When you make
an entry which references a post to the lists, you should fill in
the Message-ID from the email header of the post.  You may be able
to get this from your email software as soon as you send the post;
if not, you can find it on the archive page for the post.

-Kevin


Re: - GSoC - snapshot materialized view (work-in-progress) patch

From
Thom Brown
Date:
2010/7/12 Kevin Grittner <Kevin.Grittner@wicourts.gov>:
> Pavel Baroš<baros.p@seznam.cz> wrote:
>> On 9.7.2010 21:33, Robert Haas wrote:
>
>>> Please add your patch here, so that it will be reviewed during
>>> the about-to-begin CommitFest.
>>>
>>> https://commitfest.postgresql.org/action/commitfest_view/open
>>>
>>
>> OK, but will you help me with that form? Do you think I can fill
>> it in like this? I'm not sure about a few fields:
>>
>> Name:             Snapshot materialized views
>> CommitFest Topic: [ Miscellaneous | SQL Features ] ???
>
> SQL Features seems reasonable to me.
>
>> Patch Status:     Needs review
>> Author:           me
>> Reviewers:        You?
>
> Leave empty.  Reviewers will sign up or be assigned.
>
>> Committers:       who?
>
> That comes much later -- when the patch is complete and has a
> favorable review, then a committer will pick it up.
>
>> and I guess the fields 'Date Closed' and 'Message-ID for Original
>> Patch' will be filled in later.
>
> Date closed is only set for patches which are committed, returned
> with feedback (for a later CommitFest), or rejected.  When you make
> an entry which references a post to the lists, you should fill in
> the Message-ID from the email header of the post.  You may be able
> to get this from your email software as soon as you send the post;
> if not, you can find it on the archive page for the post.

This topic hasn't been touched on in nearly a year, but is the work
that's been done so far salvageable?  I'm not sure what happens to
GSoC project work that doesn't get finished in time.

--
Thom Brown
Twitter: @darkixion
IRC (freenode): dark_ixion
Registered Linux user: #516935

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company