Re: Batch update of indexes - Mailing list pgsql-hackers

From Konstantin Knizhnik
Subject Re: Batch update of indexes
Date
Msg-id 569F9FCD.5080500@postgrespro.ru
Whole thread Raw
In response to Re: Batch update of indexes  (Anastasia Lubennikova <a.lubennikova@postgrespro.ru>)
Responses Re: Batch update of indexes  (Simon Riggs <simon@2ndQuadrant.com>)
Re: Batch update of indexes  (Anastasia Lubennikova <a.lubennikova@postgrespro.ru>)
List pgsql-hackers
Hi,

> Hi, I glad to see that you interested in that too.
> I think this is a good feature and I think it will be very useful to have.
> I have already mentioned some related problems and possible 
> improvements in my presentation.
> http://www.slideshare.net/AnastasiaLubennikova/indexes-dont-mean-slow-inserts
> Last two slides concern to this thread. Briefly, I've suggested to 
> think about insertion buffer. Something very similar to it is already 
> implemented in BRIN. It does not index last data from heap, while the 
> number of last pages is less than pages_per_block.

Do you mean GIN-like usage of insertion buffer (here it is called 
"pending list")?
So that we have to combine search in the main tree and in the insert buffer?
Actually this is what I want to avoided (because at least in case of GIN 
pending list cause significant degrade of performance,
while up-to-date state of full text index is rarely required).


> The next point, I've thought about is a bulk update. Problem is that 
> update like "UPDATE mytable set a = a+1;" causes N searches from the 
> root of B-tree. I looks very strange to me, and I'd like to fix it 
> somehow. The obvious solution is to update all tuples on the page at a 
> time, and keep the number of last updated page. But, maybe it's a bit 
> off-thread here.

Bulk update is the second question (but very important).
First I just want to be able to append index concurrently, not blocking 
insert.

>
>> One interesting approach of solving this problem is discussed in this 
>> article:
>>
>> https://mark.zealey.org/2016/01/08/how-we-tweaked-postgres-upsert-performance-to-be-2-3-faster-than-mongodb
>>
>> Them are using materialized views to build indexes in background.
>> Interesting idea, but copying content of the whole table just to be 
>> able to build index concurrently seems to be overkill.
>
> This approach seems like a tricky crutch to me. And I agree that it 
> requires a lot of extra work.

It will be very interesting to know how people are using materialized views.
Delayed building of indexes seems to be one of the popular use cases, 
although requiring large overhead, first of all storage overhead.

>
>>
>> Please notice that such alter table statement, changing condition for 
>> partial index, is not supported now.
>
> Don't you think, that this feature could be used in a very wrong way? 
> Do not take it as criticism, just a bit of thoughts.
>

Everything which can be misused, will be misused:)
But I do not worry much about it...
If it can address real challenges, then it will be good thing in any case.

Ideally we should be able to alter everything. Naive implementation of 
such alter clause can just to build new index with temporary name, then 
drop old index and rename new index.


>
> There was the discussion of the patch for partial indexes.
> http://postgresql.nabble.com/PATCH-index-only-scans-with-partial-indexes-td5857568.html
>  Since I haven't watched it closely, It seems to be open still. I 
> think it'll be interesting to you.
>

So small patch...
Why it was not accepted?
I do no see any problems with it...


> -- 
> Anastasia Lubennikova
> Postgres Professional:http://www.postgrespro.com
> The Russian Postgres Company

-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: Set search_path + server-prepared statements = cached plan must not change result type
Next
From: Craig Ringer
Date:
Subject: Re: WIP: Failover Slots