GSoC - proposal - Materialized Views in PostgreSQL - Mailing list pgsql-hackers

From pavelbaros
Subject GSoC - proposal - Materialized Views in PostgreSQL
Date
Msg-id hpo33i$7am$1@news.hub.org
Whole thread Raw
Responses Re: GSoC - proposal - Materialized Views in PostgreSQL  ("Kevin Grittner" <Kevin.Grittner@wicourts.gov>)
Re: GSoC - proposal - Materialized Views in PostgreSQL  (Robert Haas <robertmhaas@gmail.com>)
Re: GSoC - proposal - Materialized Views in PostgreSQL  (Greg Smith <greg@2ndquadrant.com>)
Re: GSoC - proposal - Materialized Views in PostgreSQL  (Josh Berkus <josh@agliodbs.com>)
List pgsql-hackers
Hello,

I am sending my proposal on GSoC. Details are listed below. Please, if 
you have ideas, tips, or if you only want to say you opinion about my 
project, go ahead.

thanks,
Pavel Baros

Abstract:

It is effort to implement snapshot materialized view (are only updated 
when refreshed) in PostgreSQL. In this time I finished some part of it 
and I am trying to publish my present work on git.postgresql.org


Benefits to the PostgreSQL Community

First of all, it would be the best if my work is helpful to everybody 
who misses materialized views in PostgreSQL, because PostgreSQL do not 
have still implemented materialized views. In addition, MV is mentioned 
as feature in TODO list.


Deliverables

First of all, at the end of whole my project is not only writing 
bachelors thesis, but finish it as patch and if possible, get patch into 
next PostgrSQL release, or keep git repository actual to last PosgreSQL 
version. I have also personal goals. Arouse the interest about 
implementing MV in PostgreSQL, or at least arouse discussion about it.


Project Schedule

My work goes quite good, I am on good way to finish main parts on 
backend in few weeks. After that I will make and run tests and implement 
related changes to PosgreSQL tools (psql, pg_dump, etc.). I am also 
waiting for approval for my repository named "materialized_view" on 
git.postgresql.org, so I could publish completed parts. For now next 
step will be to discuss implementation on postgresql.hackers.


Bio

I am from Czech Republic and I am studying on Faculty of Electrical 
Engineering on Czech Technical University in Prague www.fel.cvut.cz/en/. 
My bachelor thesis is based on this project, implementing MV in PostgreSQL.

I've experienced many different jobs. The best experience for me was, 
when I've worked as tester and software engineer in C/C++ and C# for 
Radiant Systems Inc. for more than year. After that I've worked as Web 
developer with Internet technologies (PHP, HTML, CSS, ...), where the 
goal was to make an internal system for an advertising agency. Finally 
my recent job was as Windows Mobile Developer. Except the first 
experience, others lasts only few months mainly because those were 
temporary projects. For now I am looking for some part time job, of 
course, preferably something closer to database systems.


Implementation:  could be divided to few steps:

1) create materialized view
- modify grammar (parser): CREATE MATERIALIZED VIEW mv_name AS SELECT ...
- change executor, so that it will create physical table defined by 
select statement

2) change rewriter
- usually, view is relation with defined rule and when rewriting, rule 
is fired and relation (view) is replaced by definition of view. If 
relation do not have rule, planner and executor behave to it as physical 
table (relation). In case of materialized view we want to rewrite select 
statement only in case when we refreshing MV. In other cases rewriter 
should skip rewriting and pick up physical relation. Exclude situation 
when other rewrite rules which are not related to MV definition are 
specified.

3) create command that takes snapshot (refresh MV)
- modify grammar: ALTER MATERIALIZED VIEW mv_name REFRESH;
- taking snapshot (refreshing) is similar to command "SELECT INTO ..." 
and I decided to follow the way it works. After parsing query and before 
transformation is MANUALLY created tree representation of "SELECT * INTO 
..." with flag IntoClause->isrefresh set true, indicating it is 
refreshing materialized view. Everithing acts as it would be regular 
"SELECT INTO ..." except functions OpenIntoRel() and CloseIntoRel(). In 
function OpenIntoRel is created temp table (without catalog) and set as 
destination for result of select. In function CloseIntoRel executor swap 
relfilenode's of temp table and original table and finally delete temp 
table. Behavior of CloseIntoRel function is inspired by CLUSTER statement.


Contacts:  baros.p@seznam.cz


pgsql-hackers by date:

Previous
From: Andrew Dunstan
Date:
Subject: Re: Gsoc XQuery
Next
From: Jaime Casanova
Date:
Subject: testing hot standby