Re: Multi-Column List Partitioning - Mailing list pgsql-hackers
From | Nitin Jadhav |
---|---|
Subject | Re: Multi-Column List Partitioning |
Date | |
Msg-id | CAMm1aWaXqSca6Zv4Wez+278vcvinpJT8ZTJZNCkhj88hh+CGzg@mail.gmail.com Whole thread Raw |
In response to | Re: Multi-Column List Partitioning (Jeevan Ladhe <jeevan.ladhe@enterprisedb.com>) |
List | pgsql-hackers |
Thanks Jeevan for looking into this thread.
> I did not review the patch in detail, but a quick look at it leaves me
> with following comments:
I will incorporate these changes.
> ...FOR VALUES IN (1, 2, 3, 4), (11, 22, 33, 44), where the first set
> is the list for values of column A and second list is for column B. We
> can treat these lists as A X B possible values or simply (a1, b1), (a2,
> b2) internally. However I see other proprietary databases already have
> syntax something similar that you are proposing here. So, I leave it
> open for the thoughts from experts. Also, though what I propose might be
> easy from a user perspective, but might not be that easy for
> implementation, given that for a larger number of columns in partition list
> e.g. A X B X C X D lists become unmanageable.
I feel this is also not easy from a user's perspective. For example for a partition
with 2 partition keys (a,b) for values like (1,1), (1,2), (1,3), (1,4),(1,5). This
would be converted to (1,1,1,1,1), (1,2,3,4,5). It is difficult to match the values
of column 'a' to 'b'. Anyways let's wait for the other's opinion about this.
Thanks & Regards,
Nitin Jadhav
On Fri, May 7, 2021 at 7:36 PM Jeevan Ladhe <jeevan.ladhe@enterprisedb.com> wrote:
> While reviewing one of the 'Table partitioning' related patches,> I found that Postgres does not support multiple column based LIST> partitioning. Based on this understanding, I have started working on> this feature. I also feel that 'Multi-Column List Partitioning' can> be benefited to the Postgres users in future.+1 for the feature. I also think this can help users deal with someuseful cases.> CREATE TABLE t2_1 PARTITION OF t2 FOR VALUES IN (1, 2), (1, 5), (2,> 2),(2, 10);IMHO, listing every single tuple like this might be a bit cumbersome forthe user. What about something like this:...FOR VALUES IN (1, 2, 3, 4), (11, 22, 33, 44), where the first setis the list for values of column A and second list is for column B. Wecan treat these lists as A X B possible values or simply (a1, b1), (a2,b2) internally. However I see other proprietary databases already havesyntax something similar that you are proposing here. So, I leave itopen for the thoughts from experts. Also, though what I propose might beeasy from a user perspective, but might not be that easy forimplementation, given that for a larger number of columns in partition liste.g. A X B X C X D lists become unmanageable.I did not review the patch in detail, but a quick look at it leaves mewith following comments:1.> + * list. Then this function will continue the serach and return the index ofTypo:s/serach/search2.A compiler warning:partprune.c: In function ‘get_matching_list_bounds’:partprune.c:2731:20: error: passing argument 5 of ‘partition_list_bsearch’ makes pointer from integer without a cast [-Werror=int-conversion]2731 | nvalues, value, &is_equal);| ^~~~~| || Datum {aka long unsigned int}In file included from partprune.c:53:../../../src/include/partitioning/partbounds.h:120:32: note: expected ‘Datum *’ {aka ‘long unsigned int *’} but argument is of type ‘Datum’ {aka ‘long unsigned int’}120 | int nvalues, Datum *value, bool *is_equal);| ~~~~~~~^~~~~3.And, a server crash with following case:postgres=# CREATE TABLE t1 (a int) PARTITION BY LIST (a);CREATE TABLEpostgres=# CREATE TABLE t1p1 PARTITION OF t1 FOR VALUES IN (1, 2, 3);CREATE TABLEpostgres=# \d+ t1p1server closed the connection unexpectedlyThis probably means the server terminated abnormallybefore or while processing the request.The connection to the server was lost. Attempting reset: Failed.!?>Stacktrace:(gdb) bt#0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50#1 0x00007f5d273c5859 in __GI_abort () at abort.c:79#2 0x000055779d2eb69d in ExceptionalCondition (conditionName=0x55779d4978d8 "ptr == NULL || nodeTag(ptr) == type", errorType=0x55779d4978c3 "FailedAssertion",fileName=0x55779d4978a0 "../../../src/include/nodes/nodes.h", lineNumber=603) at assert.c:69#3 0x000055779d03a684 in castNodeImpl (type=T_Const, ptr=0x55779e457b18) at ../../../src/include/nodes/nodes.h:603#4 0x000055779d04368a in get_qual_for_list (parent=0x7f5d1df829b8, spec=0x55779e457950) at partbounds.c:4155#5 0x000055779d03ac60 in get_qual_from_partbound (rel=0x7f5d1df82570, parent=0x7f5d1df829b8, spec=0x55779e457950) at partbounds.c:272#6 0x000055779d2cf630 in generate_partition_qual (rel=0x7f5d1df82570) at partcache.c:379#7 0x000055779d2cf468 in get_partition_qual_relid (relid=32771) at partcache.c:308#8 0x000055779d2592bf in pg_get_partition_constraintdef (fcinfo=0x55779e44ee50) at ruleutils.c:2019#9 0x000055779cec7221 in ExecInterpExpr (state=0x55779e44dfb0, econtext=0x55779e407fe8, isnull=0x7ffddf9b109f) at execExprInterp.c:744#10 0x000055779cec954f in ExecInterpExprStillValid (state=0x55779e44dfb0, econtext=0x55779e407fe8, isNull=0x7ffddf9b109f) at execExprInterp.c:1819#11 0x000055779cf1d58a in ExecEvalExprSwitchContext (state=0x55779e44dfb0, econtext=0x55779e407fe8, isNull=0x7ffddf9b109f)at ../../../src/include/executor/executor.h:338#12 0x000055779cf1d602 in ExecProject (projInfo=0x55779e44dfa8) at ../../../src/include/executor/executor.h:372#13 0x000055779cf1db2f in ExecNestLoop (pstate=0x55779e407ed0) at nodeNestloop.c:241#14 0x000055779cedf136 in ExecProcNodeFirst (node=0x55779e407ed0) at execProcnode.c:462#15 0x000055779ced3053 in ExecProcNode (node=0x55779e407ed0) at ../../../src/include/executor/executor.h:257#16 0x000055779ced5a87 in ExecutePlan (estate=0x55779e407c80, planstate=0x55779e407ed0, use_parallel_mode=false, operation=CMD_SELECT, sendTuples=true, numberTuples=0,direction=ForwardScanDirection, dest=0x55779e425a88, execute_once=true) at execMain.c:1551#17 0x000055779ced372d in standard_ExecutorRun (queryDesc=0x55779e453520, direction=ForwardScanDirection, count=0, execute_once=true) at execMain.c:361#18 0x000055779ced353c in ExecutorRun (queryDesc=0x55779e453520, direction=ForwardScanDirection, count=0, execute_once=true) at execMain.c:305#19 0x000055779d13d287 in PortalRunSelect (portal=0x55779e398800, forward=true, count=0, dest=0x55779e425a88) at pquery.c:912#20 0x000055779d13cec0 in PortalRun (portal=0x55779e398800, count=9223372036854775807, isTopLevel=true, run_once=true, dest=0x55779e425a88, altdest=0x55779e425a88,qc=0x7ffddf9b14f0) at pquery.c:756#21 0x000055779d1361ce in exec_simple_query (query_string=0x55779e3367a0 "SELECT inhparent::pg_catalog.regclass,\n pg_catalog.pg_get_expr(c.relpartbound, c.oid),\n inhdetachpending,\n pg_catalog.pg_get_partition_constraintdef(c.oid)\nFROM pg_catalog.pg_class c JOIN pg_catalo"...) at postgres.c:1214#22 0x000055779d13ad8b in PostgresMain (argc=1, argv=0x7ffddf9b1710, dbname=0x55779e3626f8 "postgres", username=0x55779e3626d8 "hadoop") at postgres.c:4476#23 0x000055779d0674d3 in BackendRun (port=0x55779e358380) at postmaster.c:4488#24 0x000055779d066d8c in BackendStartup (port=0x55779e358380) at postmaster.c:4210#25 0x000055779d062f9b in ServerLoop () at postmaster.c:1742#26 0x000055779d062734 in PostmasterMain (argc=3, argv=0x55779e3308b0) at postmaster.c:1414#27 0x000055779cf5805f in main (argc=3, argv=0x55779e3308b0) at main.c:209Regards,Jeevan LadheOn Thu, May 6, 2021 at 7:33 PM Nitin Jadhav <nitinjadhavpostgres@gmail.com> wrote:Hi,While reviewing one of the 'Table partitioning' related patches, I found that Postgres does not support multiple column based LIST partitioning. Based on this understanding, I have started working on this feature. I also feel that 'Multi-Column List Partitioning' can be benefited to the Postgres users in future.I am attaching the WIP patch for this feature here. It supports 'Multi-Column List Partitioning', however some tasks are still pending. I would like to know your thoughts about this, So that I can continue the work with improvising the current patch.Following things are handled in the patch.1. SyntaxCREATE TABLE table_name (attrs) PARTITION BY LIST(list_of_columns);Earlier there was no provision to mention multiple columns as part of the 'list_of_columns' clause. Now we can mention the list of columns separated by comma.CREATE TABLE table_name_p1 PARTITION OF table_name FOR VALUES IN list_of_values.Whereas list_of_columns can bea. (value [,...])b. (value [,...]) [,...]I would like to list a few examples here for better understanding.Ex-1:CREATE TABLE t1(a int) PARTITION BY LIST(a);CREATE TABLE t1_1 PARTITION OF t1 FOR VALUES IN (1, 2, 10, 5, 7);Ex-2:CREATE TABLE t2(a int, b int) PARTITION BY LIST(a,b);CREATE TABLE t2_1 PARTITION OF t2 FOR VALUES IN (1, 2), (1, 5), (2, 2),(2, 10);Please share if any changes are required in the above syntax.2. Modified transformation logic to support above syntax.3. Modified the data structures to store the information caused by above syntax. Also modified the searching logic to route the tuple to the appropriate partition.4. Done a few basic testing and verified CREATE TABLE, INSERT INTO and SELECT are working fine.Following items are pending and I am working on it.1. Handling of 'NULL' values.2. Support multi column case in partition pruning.3. Add test cases to the regression test suite.Please share your thoughts.Thanks & Regards,Nitin Jadhav
pgsql-hackers by date: