Re: [HACKERS] Event triggers + table partitioning cause server crash in current master - Mailing list pgsql-hackers

From Mark Dilger
Subject Re: [HACKERS] Event triggers + table partitioning cause server crash in current master
Date
Msg-id 422A86B8-9F76-4FF5-8A8A-6F331459493C@gmail.com
In response to Re: [HACKERS] Event triggers + table partitioning cause server crash in current master  (Amit Langote <Langote_Amit_f8@lab.ntt.co.jp>)
Responses Re: [HACKERS] Event triggers + table partitioning cause server crash in current master  (Mark Dilger <hornschnorter@gmail.com>)
List pgsql-hackers
> On May 14, 2017, at 11:02 PM, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote:
>
> On 2017/05/14 12:03, Mark Dilger wrote:
>> Hackers,
>>
>> I discovered a reproducible crash using event triggers in the current
>> development version, 29c7d5e4844443acaa74a0d06dd6c70b320bb315.
>> I was getting a crash before this version, and cloned a fresh copy of
>> the sources to be sure I was up to date, so I don't think the bug can be
>> attributed to Andres' commit.  (The prior version I was testing against
>> was heavily modified by me, so I recreated the bug using the latest
>> standard, unmodified sources.)
>>
>> I create both before and after event triggers early in the regression test
>> schedule, which then fire here and there during the following tests, leading
>> fairly reproducibly to the server crashing somewhere during the test suite.
>> These crashes do not happen for me without the event triggers being added
>> to the tests.  Many tests show as 'FAILED' simply because the logging
>> that happens in the event triggers creates unexpected output for the test.
>> Those "failures" are expected.  The server crashes are not.
>>
>> The server logs suggest the crashes might be related to partitioned tables.
>>
>> Please find attached the patch that includes my changes to the sources
>> for recreating this bug.  The logs and regression.diffs are a bit large; let
>> me know if you need them.
>>
>> I built using the command
>>
>> ./configure --enable-cassert --enable-tap-tests && make -j4 && make check
>
> Thanks for the report and providing steps to reproduce.
>
> It seems that it is indeed a bug related to creating range-partitioned
> tables.  DefineRelation() calls AlterTableInternal() to add NOT NULL
> constraints on the range partition key columns, but the code fails to
> first initialize the event trigger context information.  Attached patch
> should fix that.
>
> Thanks to the above test case, I also discovered that in the case of
> creating a partition, manipulations performed by MergeAttributes() on the
> input schema list may cause it to become invalid, that is, the List
> metadata (length) will no longer match the reality, because while the
> ListCells are deleted from the input list, the List pointer passed to
> list_delete_cell does not point to the same list.  This caused a crash
> when the CreateStmt in question was subsequently passed to copyObject,
> which tried to access CreateStmt.tableElts that has become invalid as just
> described.  The attached patch also takes care of that.

I can confirm that this fixes the crash that I was seeing.  I have read
through the patch briefly, but will give it a more thorough review in the
next few hours.

Many thanks for your attention to this!

Mark Dilger

