Re: [HACKERS] [POC] hash partitioning - Mailing list pgsql-hackers

From Yugo Nagata
Subject Re: [HACKERS] [POC] hash partitioning
Date
Msg-id 20170417175042.161c71b4.nagata@sraoss.co.jp
Whole thread Raw
In response to Re: [HACKERS] [POC] hash partitioning  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: [HACKERS] [POC] hash partitioning
List pgsql-hackers
On Fri, 14 Apr 2017 09:05:14 -0400
Robert Haas <robertmhaas@gmail.com> wrote:

> On Fri, Apr 14, 2017 at 4:23 AM, Yugo Nagata <nagata@sraoss.co.jp> wrote:
> > On Thu, 13 Apr 2017 16:40:29 -0400
> > Robert Haas <robertmhaas@gmail.com> wrote:
> >> On Fri, Mar 17, 2017 at 7:57 AM, Yugo Nagata <nagata@sraoss.co.jp> wrote:
> >> > I also understanded that my design has a problem during pg_dump and
> >> > pg_upgrade, and that some information to identify the partition
> >> > is required not depending the command order. However, I feel that
> >> > Amul's design is a bit complicated with the rule to specify modulus.
> >> >
> >> > I think we can use simpler syntax, for example, as below.
> >> >
> >> >  CREATE TABLE h1 PARTITION OF h FOR (0);
> >> >  CREATE TABLE h2 PARTITION OF h FOR (1);
> >> >  CREATE TABLE h3 PARTITION OF h FOR (2);
> >>
> >> I don't see how that can possibly work.  Until you see all the table
> >> partitions, you don't know what the partitioning constraint for any
> >> given partition should be, which seems to me to be a fatal problem.
> >
> > If a partition has an id, the partitioning constraint can be written as
> >
> >  hash_func(hash_key) % N = id
> >
> > wehre N is the number of paritions. Doesn't it work?
> 
> Only if you know the number of partitions.  But with your syntax,
> after seeing only the first of the CREATE TABLE .. PARTITION OF
> commands, what should the partition constraint be?  It depends on how
> many more such commands appear later in the dump file, which you do
> not know at that point.

I thought that the partition constraint could be decided every
time a new partition is created or attached, and that it woule be
needed to relocate records automatically when the partition configuration
changes. However, I have come to think that the automatic relocation
might not be needed at this point.

> 
> >> I agree that Amul's syntax - really, I proposed it to him - is not the
> >> simplest, but I think all the details needed to reconstruct the
> >> partitioning constraint need to be explicit.  Otherwise, I'm pretty
> >> sure things we're going to have lots of problems that we can't really
> >> solve cleanly.  We can later invent convenience syntax that makes
> >> common configurations easier to set up, but we should invent the
> >> syntax that spells out all the details first.
> >
> > I have a question about Amul's syntax. After we create partitions
> > as followings,
> >
> >  create table foo (a integer, b text) partition by hash (a);
> >  create table foo1 partition of foo with (modulus 2, remainder 0);
> >  create table foo2 partition of foo with (modulus 2, remainder 1);
> >
> > we cannot create any additional partitions for the partition.
> >
> > Then, after inserting records into foo1 and foo2, how we can
> > increase the number of partitions?
> 
> You can detach foo1, create two new partitions with modulus 4 and
> remainders 0 and 2, and move the data over from the old partition.
> 
> I realize that's not as automated as you might like, but it's no worse
> than what is currently required for list and range partitioning when
> you split a partition.  Someday we might build in tools to do that
> kind of data migration automatically, but right now we have none.

Thanks. I understood it. The automatic data migration feature 
would be better to be implemented separately.

> 
> -- 
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company


-- 
Yugo Nagata <nagata@sraoss.co.jp>



pgsql-hackers by date:

Previous
From: Arthur Zakirov
Date:
Subject: Re: [HACKERS] Extracting GiST index structure stats?
Next
From: Fabien COELHO
Date:
Subject: Re: [HACKERS] Variable substitution in psql backtick expansion