Re: [HACKERS] [POC] hash partitioning - Mailing list pgsql-hackers
From | amul sul |
---|---|
Subject | Re: [HACKERS] [POC] hash partitioning |
Date | |
Msg-id | CAAJ_b95eMdLuiM2_gAZ=Bz3yoXg+OG1mHs6H4US5eRhVptPwRg@mail.gmail.com Whole thread Raw |
In response to | Re: [HACKERS] [POC] hash partitioning (Robert Haas <robertmhaas@gmail.com>) |
Responses |
Re: [HACKERS] [POC] hash partitioning
|
List | pgsql-hackers |
On Fri, May 19, 2017 at 10:35 PM, Robert Haas <robertmhaas@gmail.com> wrote: > On Fri, May 19, 2017 at 5:32 AM, amul sul <sulamul@gmail.com> wrote: >> Updated patch attached. 0001-patch rebased against latest head. >> 0002-patch also incorporates code comments and error message changes >> as per Robert's & your suggestions. Thanks ! > > - if (spec->strategy != PARTITION_STRATEGY_LIST) > - elog(ERROR, "invalid strategy in partition bound spec"); > + Assert(spec->strategy == PARTITION_STRATEGY_LIST); > > Let's just drop these hunks. I realize this is a response to a review > comment I made, but I take it back. If the existing code is already > doing it this way, there's no real need to revise it. The patch > doesn't even make it consistent anyway, since elsewhere you elog() for > a similar case. Perhaps elog() is best anyway. > Done. > - partitioning methods include range and list, where each partition is > - assigned a range of keys and a list of keys, respectively. > + partitioning methods include hash, range and list, where each partition is > + assigned a modulus and remainder of keys, a range of keys and a list of > + keys, respectively. > > I think this sentence has become too long and unwieldy, and is more > unclear than helpful. I'd just write "The currently supported > partitioning methods are list, range, and hash." The use of the word > include is actually wrong here, because it implies that there are more > not mentioned here, which is false. > Done. > - expression. If no btree operator class is specified when creating a > - partitioned table, the default btree operator class for the datatype will > - be used. If there is none, an error will be reported. > + expression. List and range partitioning uses only btree operator class. > + Hash partitioning uses only hash operator class. If no operator class is > + specified when creating a partitioned table, the default operator class > + for the datatype will be used. If there is none, an error will be > + reported. > + </para> > > I suggest: If no operator class is specified when creating a > partitioned table, the default operator class of the appropriate type > (btree for list and range partitioning, hash for hash partitioning) > will be used. If there is none, an error will be reported. > Done. > + <para> > + Since hash partitiong operator class, provide only equality, > not ordering, > + collation is not relevant in hash partition key column. An error will be > + reported if collation is specified. > > partitiong -> partitioning. Also, remove the comma after "operator > class" and change "not relevant in hash partition key column" to "not > relevant for hash partitioning". Also change "if collation is > specified" to "if a collation is specified". > Done. > + Create a hash partitioned table: > +<programlisting> > +CREATE TABLE orders ( > + order_id bigint not null, > + cust_id bigint not null, > + status text > +) PARTITION BY HASH (order_id); > +</programlisting></para> > > Move this down so it's just above the example of creating partitions. > Done. > + * For range and list partitioned tables, datums is an array of datum-tuples > + * with key->partnatts datums each. > + * For hash partitioned tables, it is an array of datum-tuples with 2 datums, > + * modulus and remainder, corresponding to a given partition. > > Second line is very short; reflow as one paragraph. > Done > * In case of range partitioning, it stores one entry per distinct range > * datum, which is the index of the partition for which a given datum > * is an upper bound. > + * In the case of hash partitioning, the number of the entries in the indexes > + * array is same as the greatest modulus amongst all partitions. For a given > + * partition key datum-tuple, the index of the partition which would > accept that > + * datum-tuple would be given by the entry pointed by remainder produced when > + * hash value of the datum-tuple is divided by the greatest modulus. > > Insert line break before the new text as a paragraph break. Will wait for more inputs on Ashutosh's explanation upthread. > > + char strategy; /* hash, list or range bounds? */ > > Might be clearer to just write /* hash, list, or range? */ or /* > bounds for hash, list, or range? */ > Done, used "hash, list, or range?" > > +static uint32 compute_hash_value(PartitionKey key, Datum *values, > bool *isnull); > + > > I think there should be a blank line after this but not before it. > Done. > I don't really see why hash partitioning needs to touch > partition_bounds_equal() at all. Why can't the existing logic work > for hash partitioning without change? > Unlike list and range partition, ndatums does not represents size of the indexes array, also dimension of datums array in not the same as a key->partnatts. > + valid_bound = true; > > valid_modulus, maybe? > Sure, added. > - errmsg("data type %s has no default btree operator class", > - format_type_be(atttype)), > - errhint("You must specify a btree operator > class or define a default btree operator class for the data type."))); > + errmsg("data type %s has no default %s operator class", > + format_type_be(atttype), am_method), > + errhint("You must specify a %s operator > class or define a default %s operator class for the data type.", > + am_method, am_method))); > > Let's use this existing wording from typecmds.c: > > errmsg("data type %s has no default operator > class for access method \"%s\"", > > and for the hint, maybe: You must specify an operator class or define > a default operator class for the data type. Leave out the %s, in > other words. > Done. > + /* > + * Hash operator classes provide only equality, not ordering. > + * Collation, which is relevant for ordering and not for equality, is > + * irrelevant for hash partitioning. > + */ > + if (*strategy == PARTITION_STRATEGY_HASH && pelem->collation != NIL) > + ereport(ERROR, > + (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), > + errmsg("cannot use collation for hash partitioning"), > + parser_errposition(pstate, pelem->location))); > > This error message is not very informative, and it requires > propagating information about the partitioning type into parts of the > code that otherwise don't require it. I was waffling before on > whether to ERROR here; I think now I'm in favor of ignoring the > problem. The collation won't do any harm; it just won't affect the > behavior. > Removed. > + * Identify opclass to use. For list and range partitioning we use > + * only btree operator class, which seems enough for those. For hash > + * partitioning, we use hash operator class. > > Strange wording. Suggest: Identify the appropriate operator class. > For list and range partitioning, we use a btree operator class; hash > partitioning uses a hash operator class. > Done > + FOR VALUES WITH '(' hash_partbound ')' /*TODO: syntax is > not finalised*/ > > Remove the comment. > Done. > + foreach (lc, $5) > + { > + DefElem *opt = (DefElem *) lfirst(lc); > + > + if (strcmp(opt->defname, "modulus") == 0) > + n->modulus = defGetInt32(opt); > + else if (strcmp(opt->defname, "remainder") == 0) > + n->remainder = defGetInt32(opt); > + else > + ereport(ERROR, > + (errcode(ERRCODE_SYNTAX_ERROR), > + errmsg("unrecognized hash > partition bound specification \"%s\"", > + opt->defname), > + parser_errposition(opt->location))); > + } > > This logic doesn't complain if the same option is specified more than > once. I suggest adding a check for that, and also pushing this logic > out into a helper function that gets called here instead of including > it inline. > Added duplicate error. About separate helper function, can't we have as it is, because, imo, we might not going to use that elsewhere? > + errmsg("hash partition modulus must be a positive > integer"))); > > modulus for hash partition > > + errmsg("hash partition remainder must be a > non-negative integer"))); > > remainder for hash partition > > + errmsg("hash partition modulus must be greater than remainder"))); > > modulus for hash partition must be greater than remainder > Done. Similar changes in gram.y as well. > +-- values are hashed, row may map to different partitions, which result in > > the row > > +-- regression failure. To avoid this, let's create non-default hash function > > create a non-default Done. Updated patch attached. Thanks a lot for review. Regards, Amul -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Attachment
pgsql-hackers by date: