Thread: [HACKERS] Re: [BUGS] BUG #14657: Server process segmentation fault in v10, May10th dev snapshot
[HACKERS] Re: [BUGS] BUG #14657: Server process segmentation fault in v10, May10th dev snapshot
From
Amit Langote
Date:
On 2017/05/18 10:49, Amit Langote wrote: > On 2017/05/18 2:14, Dilip Kumar wrote: >> On Wed, May 17, 2017 at 7:41 PM, <sveinn.sveinsson@gmail.com> wrote: >>> (gdb) bt >>> #0 0x000000000061ab1b in list_nth () >>> #1 0x00000000005e4081 in ExecLockNonLeafAppendTables () >>> #2 0x00000000005f4d52 in ExecInitMergeAppend () >>> #3 0x00000000005e0365 in ExecInitNode () >>> #4 0x00000000005f35a7 in ExecInitLimit () >>> #5 0x00000000005e00f3 in ExecInitNode () >>> #6 0x00000000005dd207 in standard_ExecutorStart () >>> #7 0x00000000006f96d2 in PortalStart () >>> #8 0x00000000006f5c7f in exec_simple_query () >>> #9 0x00000000006f6fac in PostgresMain () >>> #10 0x0000000000475cdc in ServerLoop () >>> #11 0x0000000000692ffa in PostmasterMain () >>> #12 0x0000000000476600 in main () > > Thanks for the test case Sveinn and thanks Dilip for analyzing. > >> Seems like the issue is that the plans under multiple subroots are >> pointing to the same partitioned_rels. > > That's correct. > >> If I am not getting it wrong "set_plan_refs(PlannerInfo *root, Plan >> *plan, int rtoffset)" the rtoffset is specific to the subroot. Now, >> problem is that set_plan_refs called for different subroot is updating >> the same partition_rel info and make this value completely wrong which >> will ultimately make ExecLockNonLeafAppendTables to access the out of >> bound "rte" index. > > Yes. > >> set_plan_refs >> { >> [clipped] >> case T_MergeAppend: >> { >> [clipped] >> >> foreach(l, splan->partitioned_rels) >> { >> lfirst_int(l) += rtoffset; >> >> >> I think the solution should be that create_merge_append_path make the >> copy of partitioned_rels list? > > Yes, partitioned_rels should be copied. > >> Attached patch fixes the problem but I am not completely sure about the fix. > > Thanks for creating the patch, although I think a better fix would be to > make get_partitioned_child_rels() do the list_copy. That way, any other > users of partitioned_rels will not suffer the same issue. Attached patch > implements that, along with a regression test. > > Added to the open items. Oops, forgot to cc -hackers. Patch attached again. Thanks, Amit -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Attachment
[HACKERS] Re: [BUGS] BUG #14657: Server process segmentation fault in v10, May10th dev snapshot
From
Sveinn Sveinsson
Date:
The patch fixed the problem, thanks a lot. Regards, Sveinn. On fim 18.maí 2017 01:53, Amit Langote wrote: > On 2017/05/18 10:49, Amit Langote wrote: >> On 2017/05/18 2:14, Dilip Kumar wrote: >>> On Wed, May 17, 2017 at 7:41 PM, <sveinn.sveinsson@gmail.com> wrote: >>>> (gdb) bt >>>> #0 0x000000000061ab1b in list_nth () >>>> #1 0x00000000005e4081 in ExecLockNonLeafAppendTables () >>>> #2 0x00000000005f4d52 in ExecInitMergeAppend () >>>> #3 0x00000000005e0365 in ExecInitNode () >>>> #4 0x00000000005f35a7 in ExecInitLimit () >>>> #5 0x00000000005e00f3 in ExecInitNode () >>>> #6 0x00000000005dd207 in standard_ExecutorStart () >>>> #7 0x00000000006f96d2 in PortalStart () >>>> #8 0x00000000006f5c7f in exec_simple_query () >>>> #9 0x00000000006f6fac in PostgresMain () >>>> #10 0x0000000000475cdc in ServerLoop () >>>> #11 0x0000000000692ffa in PostmasterMain () >>>> #12 0x0000000000476600 in main () >> Thanks for the test case Sveinn and thanks Dilip for analyzing. >> >>> Seems like the issue is that the plans under multiple subroots are >>> pointing to the same partitioned_rels. >> That's correct. >> >>> If I am not getting it wrong "set_plan_refs(PlannerInfo *root, Plan >>> *plan, int rtoffset)" the rtoffset is specific to the subroot. Now, >>> problem is that set_plan_refs called for different subroot is updating >>> the same partition_rel info and make this value completely wrong which >>> will ultimately make ExecLockNonLeafAppendTables to access the out of >>> bound "rte" index. >> Yes. >> >>> set_plan_refs >>> { >>> [clipped] >>> case T_MergeAppend: >>> { >>> [clipped] >>> >>> foreach(l, splan->partitioned_rels) >>> { >>> lfirst_int(l) += rtoffset; >>> >>> >>> I think the solution should be that create_merge_append_path make the >>> copy of partitioned_rels list? >> Yes, partitioned_rels should be copied. >> >>> Attached patch fixes the problem but I am not completely sure about the fix. >> Thanks for creating the patch, although I think a better fix would be to >> make get_partitioned_child_rels() do the list_copy. That way, any other >> users of partitioned_rels will not suffer the same issue. Attached patch >> implements that, along with a regression test. >> >> Added to the open items. > Oops, forgot to cc -hackers. Patch attached again. > > Thanks, > Amit
Re: [HACKERS] Re: [BUGS] BUG #14657: Server process segmentationfault in v10, May 10th dev snapshot
From
Ashutosh Bapat
Date:
On Thu, May 18, 2017 at 7:23 AM, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote: > On 2017/05/18 10:49, Amit Langote wrote: >> On 2017/05/18 2:14, Dilip Kumar wrote: >>> On Wed, May 17, 2017 at 7:41 PM, <sveinn.sveinsson@gmail.com> wrote: >>>> (gdb) bt >>>> #0 0x000000000061ab1b in list_nth () >>>> #1 0x00000000005e4081 in ExecLockNonLeafAppendTables () >>>> #2 0x00000000005f4d52 in ExecInitMergeAppend () >>>> #3 0x00000000005e0365 in ExecInitNode () >>>> #4 0x00000000005f35a7 in ExecInitLimit () >>>> #5 0x00000000005e00f3 in ExecInitNode () >>>> #6 0x00000000005dd207 in standard_ExecutorStart () >>>> #7 0x00000000006f96d2 in PortalStart () >>>> #8 0x00000000006f5c7f in exec_simple_query () >>>> #9 0x00000000006f6fac in PostgresMain () >>>> #10 0x0000000000475cdc in ServerLoop () >>>> #11 0x0000000000692ffa in PostmasterMain () >>>> #12 0x0000000000476600 in main () >> >> Thanks for the test case Sveinn and thanks Dilip for analyzing. >> >>> Seems like the issue is that the plans under multiple subroots are >>> pointing to the same partitioned_rels. >> >> That's correct. >> >>> If I am not getting it wrong "set_plan_refs(PlannerInfo *root, Plan >>> *plan, int rtoffset)" the rtoffset is specific to the subroot. Now, >>> problem is that set_plan_refs called for different subroot is updating >>> the same partition_rel info and make this value completely wrong which >>> will ultimately make ExecLockNonLeafAppendTables to access the out of >>> bound "rte" index. >> >> Yes. >> >>> set_plan_refs >>> { >>> [clipped] >>> case T_MergeAppend: >>> { >>> [clipped] >>> >>> foreach(l, splan->partitioned_rels) >>> { >>> lfirst_int(l) += rtoffset; >>> >>> >>> I think the solution should be that create_merge_append_path make the >>> copy of partitioned_rels list? >> >> Yes, partitioned_rels should be copied. >> >>> Attached patch fixes the problem but I am not completely sure about the fix. >> >> Thanks for creating the patch, although I think a better fix would be to >> make get_partitioned_child_rels() do the list_copy. That way, any other >> users of partitioned_rels will not suffer the same issue. Attached patch >> implements that, along with a regression test. >> >> Added to the open items. > > Oops, forgot to cc -hackers. Patch attached again. May be we should add a comment as to why the copy is needed. We still have the same copy shared across multiple append paths and set_plan_refs would change change it underneath those. May not be a problem right now but may be a problem in the future. Another option, which consumes a bit less memory is to make a copy at the time of planning if the path gets selected as the cheapest path. -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company
Re: [HACKERS] Re: [BUGS] BUG #14657: Server process segmentationfault in v10, May 10th dev snapshot
From
Robert Haas
Date:
On Fri, May 19, 2017 at 6:07 AM, Ashutosh Bapat <ashutosh.bapat@enterprisedb.com> wrote: > We still have the same copy shared across multiple append paths and > set_plan_refs would change change it underneath those. May not be a > problem right now but may be a problem in the future. I agree. I think it's better for the path-creation functions to copy the list, so that there is no surprising sharing of substructure. set_plan_refs() obviously expects this data to be unshared, and this seems like the best way to ensure that's true in all cases. Committed that way. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company