Re: GSOC PostgreSQL partitioning issue - Mailing list pgsql-hackers

From Necati Batur
Subject Re: GSOC PostgreSQL partitioning issue
Date
Msg-id z2w7c3006191004090610u8600c63dka4d1c70c780de9b6@mail.gmail.com
In response to GSOC PostgreSQL partitioning issue  (Necati Batur <necatibatur@gmail.com>)
Responses Re: GSOC PostgreSQL partitioning issue  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
Hi all,
I am new to open source projects; however, from a user's point of view I must confess that usability is a really tough issue, even if the performance of a database is crucial.

My idea for improving PostgreSQL is as follows. The Caveats section of
http://www.postgresql.org/docs/current/interactive/ddl-partitioning.html mentions that
"The schemes shown here assume that the partition key column(s) of a row never change, or at least do not change enough to require it to move to another partition. An UPDATE that attempts to do that will fail because of the CHECK constraints. If you need to handle such cases, you can put suitable update triggers on the partition tables, but it makes management of the structure much more complicated."

Fixing this issue will help improve the usability of partitions, since users do not want to deal with low-level integrity issues such as CHECK constraints.

Roughly, if we want to deal with this issue, the first step would be to write a trigger that checks whether an update operation would move a row from one partition to another. Then, if the move is unavoidable, the user should be prompted about what they are doing. Warning the system or the user would generally cause more trouble, so at this point we need to decide on the possible ways of fixing it and give more details about which choice leads to which results. Then, creating a temporary table before committing anything will help us to control completeness and correctness.
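
To make the check described above concrete, here is a minimal sketch in Python rather than an actual PostgreSQL trigger. It only models the logic: each partition has a CHECK-style range, and an update to the key either stays in place or becomes a delete from the old partition plus an insert into the new one. The partition names, the `Partition` class, and the `update_key` function are all hypothetical, and the staged copy only stands in loosely for the temporary table idea.

```python
from dataclasses import dataclass, field


@dataclass
class Partition:
    """One child table with a CHECK-style range constraint: lo <= key < hi."""
    name: str
    lo: int
    hi: int
    rows: dict = field(default_factory=dict)  # row id -> row (a dict)

    def accepts(self, key: int) -> bool:
        return self.lo <= key < self.hi


def find_partition(partitions, key):
    """Return the partition whose range contains key, or raise if none does."""
    for part in partitions:
        if part.accepts(key):
            return part
    raise ValueError(f"no partition accepts key {key}")


def update_key(partitions, row_id, new_key):
    """Change a row's partition key, moving the row if its partition changes.

    Returns the name of the partition that holds the row afterwards.
    """
    source = next((p for p in partitions if row_id in p.rows), None)
    if source is None:
        raise KeyError(f"row {row_id} not found in any partition")
    # Resolve the target first, so a key that fits nowhere fails
    # before anything has been modified.
    target = find_partition(partitions, new_key)
    # Staged copy of the row with the new key (hypothetical stand-in
    # for the temporary table mentioned in the proposal).
    staged = dict(source.rows[row_id], key=new_key)
    if target is not source:
        del source.rows[row_id]   # the DELETE half of the move
    target.rows[row_id] = staged  # in-place UPDATE, or the INSERT half
    return target.name


# Hypothetical yearly partitions with one row in the 2009 partition.
partitions = [
    Partition("sales_2009", 2009, 2010),
    Partition("sales_2010", 2010, 2011),
]
partitions[0].rows[1] = {"key": 2009, "amount": 50}
```

Calling `update_key(partitions, 1, 2010)` moves row 1 from `sales_2009` to `sales_2010`, while a key that no partition accepts raises an error and leaves the row where it was. A real implementation would have to do the same inside one transaction.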

I have tried to give more details about what I want to do. If you think anything should be fixed in my proposal, please warn me.
Thanks 

2010/4/8 Necati Batur <necatibatur@gmail.com>
Benefits of Project

Partitioning refers to splitting what is logically one large table
into smaller physical pieces. Partitioning can provide several
benefits:

Query performance can be improved dramatically in certain situations,
particularly when most of the heavily accessed rows of the table are
in a single partition or a small number of partitions. The
partitioning substitutes for leading columns of indexes, reducing
index size and making it more likely that the heavily-used parts of
the indexes fit in memory.

When queries or updates access a large percentage of a single
partition, performance can be improved by taking advantage of
sequential scan of that partition instead of using an index and random
access reads scattered across the whole table.

Bulk loads and deletes can be accomplished by adding or removing
partitions, if that requirement is planned into the partitioning
design. ALTER TABLE is far faster than a bulk operation. It also
entirely avoids the VACUUM overhead caused by a bulk DELETE.

Seldom-used data can be migrated to cheaper and slower storage media.

Deliverables

*Trigger-based operations can be done automatically

*Stored procedures can help us to handle some functionality, such as
the CHECK constraint problem

*Manual VACUUM or ANALYZE commands can be handled by using triggers

*DBMS SQL can help to provide faster execution

*Some more functionality can be added to UPDATE operations to make
administration easier

Timeline (not exact but most probably)

Start on June 7 and end around September 7

*Warm up to the PostgreSQL environment (1-2 weeks)

*Determine the exact operations to be added to PostgreSQL

*Initial coding according to the work breakdown structure

*Start implementing on a distributed environment to check that the initial functions work

*Write test cases for the code

*Further implementation to support the full functionality of the ideas

*Post it to the discussion site and collect feedback

*More support based on the feedback

*Final tests and documentation of the final operations

About me

I am a senior computer engineering student at IZTECH in Turkey. My
areas of interest are information management, OOP (Object Oriented
Programming) and currently bioinformatics. I have been working with an
Assistant Professor (Jens Allmer) in the molecular biology and genetics
department for one year. First, we worked on a protein database, 2DB,
and we presented the project at HIBIT09. The project was
“Database management system independence by amending 2DB with a
database access layer”. Currently, I am working on another project
(Kerb) as my senior project, which is a general sequential task
management system intended to reduce errors and save time
in biological experiments. We will present this project at HIBIT2010
too. Moreover, I am good at data structures and their implementation in C.


Contact: e-mails; necatibatur@gmail.com , necati_batur@hotmail.com(msn)
