Re: brininsert optimization opportunity - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: brininsert optimization opportunity
Msg-id 56a0517d-1878-5f9d-e4b2-7ed8ba63d586@enterprisedb.com
In response to Re: brininsert optimization opportunity  (Soumyadeep Chakraborty <soumyadeep2007@gmail.com>)
Responses Re: brininsert optimization opportunity
List pgsql-hackers
On 12/13/23 09:12, Soumyadeep Chakraborty wrote:
> On Tue, Dec 12, 2023 at 3:25 AM Tomas Vondra
> <tomas.vondra@enterprisedb.com> wrote:
> 
> 
>> The attached 0001 patch fixes this by adding the call to validate_index,
>> which seems like the proper place as it's where the indexInfo is
>> allocated and destroyed.
> 
> Yes, and by reading validate_index's header comment, there is a clear
> expectation that we will be adding tuples to the index in the table AM call
> table_index_validate_scan (albeit "validating" doesn't necessarily convey this
> side effect). So, it makes perfect sense to call it here.
> 

OK

> 
>> But this makes me think - are there any other places that might call
>> index_insert without then also doing the cleanup? I did grep for the
>> index_insert() calls, and it seems OK except for this one.
>>
>> There's a call in toast_internals.c, but that seems OK because that only
>> deals with btree indexes (and those don't need any cleanup). The same
>> logic applies to unique_key_recheck(). The rest goes through
>> execIndexing.c, which should do the cleanup in ExecCloseIndices().
>>
>> Note: We should probably call the cleanup even in the btree cases, even
>> if only to make it clear it needs to be called after index_insert().
> 
> Agreed. Doesn't feel great, but yeah, all of this btree-specific code does call
> through index_* functions instead of calling btree functions directly. So,
> ideally we should follow through with that pattern and call the cleanup
> explicitly. But that introduces per-tuple overhead that is totally wasted.

I haven't tried but I very much doubt this will be measurable. It's just
a trivial check if a pointer is NULL. We do far more expensive stuff in
this code path.
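To make the cost concrete, here is a minimal standalone sketch (mock types and names, not the actual PostgreSQL definitions from execnodes.h/amapi.h) of the pattern such a cleanup entry point would follow: when no cache was ever set up, the call is a single NULL test and an immediate return.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for PostgreSQL's IndexInfo and AM routine structs;
 * illustration only, the real definitions live in the PostgreSQL tree. */
typedef struct IndexInfoMock {
    void *ii_AmCache;               /* NULL until the AM caches something */
} IndexInfoMock;

typedef struct AmRoutineMock {
    void (*aminsertcleanup)(IndexInfoMock *);  /* NULL for AMs with nothing
                                                * to clean up (e.g. btree) */
} AmRoutineMock;

static int cleanup_calls = 0;

/* A BRIN-style AM callback: a real AM would release buffers, tuple
 * descriptors, etc. here. */
static void brin_style_cleanup(IndexInfoMock *ii)
{
    cleanup_calls++;
}

/* The per-call overhead under discussion: if no cache was initialized,
 * this is just one pointer comparison. */
static void index_insert_cleanup_mock(AmRoutineMock *am, IndexInfoMock *ii)
{
    if (ii->ii_AmCache == NULL)
        return;                     /* btree-style callers take this branch */
    if (am->aminsertcleanup != NULL)
        am->aminsertcleanup(ii);
    ii->ii_AmCache = NULL;          /* may be re-initialized later */
}
```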

> Maybe we can add a comment instead like:
> 
> void
> toast_close_indexes(Relation *toastidxs, int num_indexes, LOCKMODE lock)
> {
>     int         i;
> 
>     /*
>      * Save a few cycles by choosing not to call index_insert_cleanup(). Toast
>      * indexes are btree, which don't have an aminsertcleanup() anyway.
>      */
> 
>     /* Close relations and clean up things */
>     ...
> }
> 
> And add something similar for unique_key_recheck()? That said, I am also not
> opposed to just accepting these wasted cycles, if the commenting seems wonky.
> 

I really don't want to do this sort of stuff unless we know it actually
saves something.

>> I propose we do a much simpler thing instead - allow the cache to be
>> initialized / cleaned up repeatedly, and make sure it gets reset at a
>> convenient place (typically after index_insert calls that don't go
>> through execIndexing). That'd mean the cleanup does not need to happen
>> very far from the index_insert(), which makes the reasoning much easier.
>> 0002 does this.
> 
> That kind of goes against the ethos of the ii_AmCache, which is meant
> to capture state to be used across multiple index inserts.

Why would it be against the ethos? The point is that we reuse stuff over
multiple index_insert() calls. If we can do that for all inserts, cool.
But if that's difficult, it's maybe better to cache for smaller batches
of inserts (instead of making it more complex for all index AMs, even
those not doing any caching).

> Also, quoting the current docs:
> 
> "If the index AM wishes to cache data across successive index insertions
> within an SQL statement, it can allocate space
> in <literal>indexInfo->ii_Context</literal> and store a pointer to the
> data in <literal>indexInfo->ii_AmCache</literal> (which will be NULL
> initially). After the index insertions complete, the memory will be freed
> automatically. If additional cleanup is required (e.g. if the cache contains
> buffers and tuple descriptors), the AM may define an optional function
> <literal>indexinsertcleanup</literal>, called before the memory is released."
> 
> The memory will be freed automatically (as soon as ii_Context goes away). So,
> why would we explicitly free it, like in the attached 0002 patch? And the whole
> point of the cleanup function is to do something other than freeing memory, as
> the docs note highlights so well.
> 

Not sure I follow. The whole reason for having the index_insert_cleanup
callback is that we can't rely on ii_Context going away, exactly because
that just throws away the memory while we still need to release buffers etc.

The only reason why the 0002 patch does pfree() is that it clearly
indicates whether the cache is initialized. We could have a third state
"allocated but not initialized", but that doesn't seem worth it.

If you're saying the docs are misleading in some way, then maybe we need
to clarify that.

> Also, the 0002 patch does introduce per-tuple function call overhead in
> heapam_index_validate_scan().
> 

How come? The cleanup is done only once after the scan completes. Isn't
that exactly what we want to do?

> Also, now we have split brain...in certain situations we want to call
> index_insert_cleanup() per index insert and in certain cases we would like it
> called once for all inserts. Not very easy to understand IMO.
> 
> Finally, I don't think that any index AM would have anything to clean up after
> every insert.
> 

But I didn't suggest to call the cleanup after every index insert. If a
function does a bunch of index_insert calls, it'd do the cleanup only
once. The point is that it'd happen "close" to the inserts, when we know
it needs to be done. Yes, it might happen multiple times for the same
query, but that still likely saves quite a bit of work (compared to
per-insert init+cleanup).
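A standalone sketch of that batched lifecycle, using hypothetical names (not the PostgreSQL API): the AM lazily builds its cache on the first insert of a batch, and the caller runs the cleanup once per batch, close to the inserts, rather than once per tuple.

```c
#include <assert.h>
#include <stdlib.h>

/* Mock per-statement state; stands in for IndexInfo->ii_AmCache. */
typedef struct {
    void *cache;     /* NULL means "not initialized" */
    int   inits;     /* how many times the cache was built */
    int   cleanups;  /* how many times it was torn down */
} MockIndexInfo;

/* Insert: lazily initialize the cache on first use, the way brininsert
 * sets up its revmap access state. */
static void mock_insert(MockIndexInfo *ii)
{
    if (ii->cache == NULL) {
        ii->cache = malloc(1);
        ii->inits++;
    }
    /* ... actual per-tuple insert work would go here ... */
}

/* Cleanup: release the cache. Safe to call repeatedly; a later batch of
 * inserts may re-initialize the cache from scratch. */
static void mock_cleanup(MockIndexInfo *ii)
{
    if (ii->cache == NULL)
        return;
    free(ii->cache);
    ii->cache = NULL;
    ii->cleanups++;
}

/* One batch: many inserts, a single cleanup "close" to them. */
static void run_batch(MockIndexInfo *ii, int ninserts)
{
    for (int i = 0; i < ninserts; i++)
        mock_insert(ii);
    mock_cleanup(ii);
}
```

The init+cleanup pair thus runs once per batch, not once per tuple, which is the saving described above.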

We need to call the cleanup at some point, and the only alternative I
can think of is to call it in every place that calls BuildIndexInfo
(unless we can guarantee the place can't do index_insert).

> I had tried (abused) an approach with MemoryContextCallbacks upthread and that
> seems a no-go. And yes I agree, having a dual to makeIndexInfo() (like
> destroyIndexInfo()) means that we lose the benefits of ii_Context. That could
> be disruptive to existing AMs in-core and outside.
> 

What do you mean by "dual makeIndexInfo"?


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


