Re: POC: postgres_fdw insert batching - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: POC: postgres_fdw insert batching
Date
Msg-id 261fbedb-1ced-82f7-e978-60cfe17bce07@enterprisedb.com
Whole thread Raw
In response to Re: POC: postgres_fdw insert batching  (Tomas Vondra <tomas.vondra@enterprisedb.com>)
Responses Re: POC: postgres_fdw insert batching  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On 1/21/21 12:52 AM, Tomas Vondra wrote:
> Hmm, seems that florican doesn't like this :-(
> 
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=florican&dt=2021-01-20%2023%3A08%3A15 
> 
> 
> It's a i386 machine running FreeBSD, so not sure what exactly it's picky 
> about. But when I tried running this under valgrind, I get some strange 
> failures in the new chunk in ExecInitModifyTable:
> 
>    /*
>     * Determine if the FDW supports batch insert and determine the batch
>     * size (a FDW may support batching, but it may be disabled for the
>     * server/table).
>     */
>    if (!resultRelInfo->ri_usesFdwDirectModify &&
>        operation == CMD_INSERT &&
>        resultRelInfo->ri_FdwRoutine != NULL &&
>        resultRelInfo->ri_FdwRoutine->GetForeignModifyBatchSize &&
>        resultRelInfo->ri_FdwRoutine->ExecForeignBatchInsert)
>        resultRelInfo->ri_BatchSize =
> 
> resultRelInfo->ri_FdwRoutine->GetForeignModifyBatchSize(resultRelInfo);
>    else
>        resultRelInfo->ri_BatchSize = 1;
> 
>    Assert(resultRelInfo->ri_BatchSize >= 1);
> 
> It seems as if the resultRelInfo is not initialized, or something like 
> that. I wouldn't be surprised if the 32-bit machine was pickier and 
> failing because of that.
> 
> A sample of the valgrind log is attached. It's pretty much just 
> repetitions of these three reports.
> 

OK, it's definitely accessing uninitialized memory, because the 
resultRelInfo (on line 2801, i.e. the "if" condition) looks like this:

(gdb) p resultRelInfo
$1 = (ResultRelInfo *) 0xe595988
(gdb) p *resultRelInfo
$2 = {type = 2139062142, ri_RangeTableIndex = 2139062143, 
ri_RelationDesc = 0x7f7f7f7f7f7f7f7f, ri_NumIndices = 2139062143, 
ri_IndexRelationDescs = 0x7f7f7f7f7f7f7f7f, ri_IndexRelationInfo = 
0x7f7f7f7f7f7f7f7f,
   ri_TrigDesc = 0x7f7f7f7f7f7f7f7f, ri_TrigFunctions = 
0x7f7f7f7f7f7f7f7f, ri_TrigWhenExprs = 0x7f7f7f7f7f7f7f7f, 
ri_TrigInstrument = 0x7f7f7f7f7f7f7f7f, ri_ReturningSlot = 
0x7f7f7f7f7f7f7f7f, ri_TrigOldSlot = 0x7f7f7f7f7f7f7f7f,
   ri_TrigNewSlot = 0x7f7f7f7f7f7f7f7f, ri_FdwRoutine = 
0x7f7f7f7f7f7f7f7f, ri_FdwState = 0x7f7f7f7f7f7f7f7f, 
ri_usesFdwDirectModify = 127, ri_NumSlots = 2139062143, ri_BatchSize = 
2139062143, ri_Slots = 0x7f7f7f7f7f7f7f7f,
   ri_PlanSlots = 0x7f7f7f7f7f7f7f7f, ri_WithCheckOptions = 
0x7f7f7f7f7f7f7f7f, ri_WithCheckOptionExprs = 0x7f7f7f7f7f7f7f7f, 
ri_ConstraintExprs = 0x7f7f7f7f7f7f7f7f, ri_GeneratedExprs = 
0x7f7f7f7f7f7f7f7f,
   ri_NumGeneratedNeeded = 2139062143, ri_junkFilter = 
0x7f7f7f7f7f7f7f7f, ri_returningList = 0x7f7f7f7f7f7f7f7f, 
ri_projectReturning = 0x7f7f7f7f7f7f7f7f, ri_onConflictArbiterIndexes = 
0x7f7f7f7f7f7f7f7f,
   ri_onConflict = 0x7f7f7f7f7f7f7f7f, ri_PartitionCheckExpr = 
0x7f7f7f7f7f7f7f7f, ri_PartitionRoot = 0x7f7f7f7f7f7f7f7f, 
ri_RootToPartitionMap = 0x8, ri_PartitionTupleSlot = 0x8, 
ri_ChildToRootMap = 0xe5952b0,
   ri_CopyMultiInsertBuffer = 0xe596740}
(gdb)

I may be wrong, but the most likely explanation seems to be this is due 
to the junk filter initialization, which simply moves past the end of 
the mtstate->resultRelInfo array.

It kinda seems the GetForeignModifyBatchSize call should happen before 
that block. The attached patch fixes this for me (i.e. regression tests 
pass with no valgrind reports.

Or did I get that wrong?

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachment

pgsql-hackers by date:

Previous
From: Zhihong Yu
Date:
Subject: Re: POC: postgres_fdw insert batching
Next
From: Kyotaro Horiguchi
Date:
Subject: Re: Printing LSN made easy