Re: faster ETL / bulk data load for heap tables - Mailing list pgsql-hackers

From: Amit Kapila
Subject: Re: faster ETL / bulk data load for heap tables
Msg-id: CAA4eK1JzPrwyVLyfAO6nhxiy=9uiJnxBW+AzT9AjNG1ifmypcw@mail.gmail.com
In response to: Re: faster ETL / bulk data load for heap tables (Luc Vlaming <luc@swarm64.com>)
List: pgsql-hackers
On Sat, Jan 2, 2021 at 3:34 PM Luc Vlaming <luc@swarm64.com> wrote:
>
> On 02-01-2021 08:36, Amit Kapila wrote:
> > On Fri, Jan 1, 2021 at 7:37 PM Luc Vlaming <luc@swarm64.com> wrote:
> >>
> >> Hi,
> >>
> >> In an effort to speed up bulk data loading/transforming, I noticed
> >> that considerable time is spent waiting on the relation extension
> >> lock.
> >>
> >
> > We already extend the relation in bulk when there is contention on
> > the relation extension lock, via RelationAddExtraBlocks. I wonder
> > why that is not able to address this kind of workload. On a quick
> > look at your patch, it seems you always try to extend the relation
> > by 128 blocks for a copy operation after acquiring the lock,
> > whereas the current mechanism has some smarts where it decides
> > based on the number of waiters. Now, it is possible that we should
> > extend by a larger number of blocks than we do now, but using some
> > ad-hoc number might lead to space wastage. Have you tried to fiddle
> > with the current scheme of bulk-extension to see if that accounts
> > for some of the gains you are seeing? I see that you have made
> > quite a few other changes that might be helping here, but still, it
> > is better to see how much of a bottleneck the relation extension
> > lock is and whether that can be addressed with the current
> > mechanism rather than by changing things in a different way.
> >
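To be concrete, the heuristic I was referring to lives in
RelationAddExtraBlocks() in src/backend/access/heap/hio.c. Roughly
(paraphrased, not the exact source), it scales the batch size with the
number of backends waiting on the extension lock:

    /* Use the lock wait queue length to judge how much to extend. */
    lockWaiters = RelationExtensionLockWaiterCount(relation);
    if (lockWaiters <= 0)
        return;                 /* no contention, no bulk extension */

    /* 20 pages per waiter, capped at 512. */
    extraBlocks = Min(512, lockWaiters * 20);

    do
    {
        /* Extend by one page, init it, and advertise it in the FSM. */
        buffer = ReadBufferBI(relation, P_NEW, RBM_ZERO_AND_LOCK, bistate);
        page = BufferGetPage(buffer);
        PageInit(page, BufferGetPageSize(buffer), 0);
        MarkBufferDirty(buffer);
        blockNum = BufferGetBlockNumber(buffer);
        freespace = PageGetHeapFreeSpace(page);
        UnlockReleaseBuffer(buffer);
        RecordPageWithFreeSpace(relation, blockNum, freespace);
    } while (--extraBlocks > 0);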
>
> Hi,
>
> Thanks for looking at the patch!
>
> Yes, I tried that. I guess I should have also shared the other
> things I tried before I ended up with these patches.
> I've tried improving RelationAddExtraBlocks to allocate blocks more
> aggressively and to not put them in the FSM immediately (as this
> also seemed like a point of contention), extending ReadBufferBI to
> allocate several pages in a loop, and a few more variants like this.
> To be sure, I just tried a few variants again, setting e.g.
> extraBlocks=128, 256, etc., and disabling e.g. the FSM code. None of
> those yields very big performance gains, and some actually make it
> slower, suggesting the code is already parameterized quite well for
> the current design.
>
> The main problem is that the relation extension lock is taken to
> extend one block at a time, whilst doing (expensive) syscalls like
> pwrite(). Even though we then put these blocks in the FSM right
> away, the bottleneck remains the extension of the file itself, no
> matter how many blocks we allocate in a loop in
> RelationAddExtraBlocks.
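
To make sure we are talking about the same path: I assume you mean the
one in RelationGetBufferForTuple(), where the lock is held across the
ReadBufferBI() call that performs the actual write for the new block.
Paraphrased:

    if (needLock)
        LockRelationForExtension(relation, ExclusiveLock);

    /*
     * P_NEW extends the file: smgrextend() issues a write for the new
     * block while the extension lock is still held, so concurrent
     * loaders serialize on that syscall.
     */
    buffer = ReadBufferBI(relation, P_NEW, RBM_ZERO_AND_LOCK, bistate);

    if (needLock)
        UnlockRelationForExtension(relation, ExclusiveLock);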
>
> Therefore what I've set out to do is:
> - hold the relation extension lock for as short a time as possible.
> - reduce the time spent on syscalls as much as possible.
>
> This resulted in a design which hands out blocks to a specific backend
> so that everything after the file extension can be done safely without
> locks.
>

Oh, so you mean to say that the benefit comes from the fact that you
have dissociated relation extension from buffer allocation: you extend
the relation in multiples and then separately allocate buffers for all
those new blocks. I don't immediately see anything wrong with that,
but I think it would be better if we knew the benefit of each
optimization step. Basically, as of now, you have done multiple things
to get the benefits shown, but it is not clear how much benefit each
of those optimizations has generated.
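
If I have understood correctly, the shape of the idea is something
like the following. The helper names here are hypothetical, just to
spell out my reading of the patch; smgrextend_n() stands in for
whatever grows the file by N blocks in one call:

    /* Under the extension lock: only grow the file, nothing else. */
    LockRelationForExtension(relation, ExclusiveLock);
    RelationOpenSmgr(relation);
    firstBlock = smgrnblocks(relation->rd_smgr, MAIN_FORKNUM);
    smgrextend_n(relation->rd_smgr, MAIN_FORKNUM, firstBlock, nblocks);
    UnlockRelationForExtension(relation, ExclusiveLock);

    /*
     * This backend now owns blocks [firstBlock, firstBlock + nblocks),
     * so buffer allocation and page initialization can proceed without
     * any lock (WAL-logging and FSM handling omitted here).
     */
    for (i = 0; i < nblocks; i++)
    {
        buffer = ReadBufferExtended(relation, MAIN_FORKNUM,
                                    firstBlock + i, RBM_ZERO_AND_LOCK,
                                    NULL);
        PageInit(BufferGetPage(buffer), BufferGetPageSize(buffer), 0);
        MarkBufferDirty(buffer);
        UnlockReleaseBuffer(buffer);
    }

Is that a fair summary of the intent?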

> Given that the current API did not allow this to be expressed
> properly, for now I added extra functions whose sole purpose is to
> extend the relation, so that they could be purpose-built for this. I
> did not mean to suggest, however, that this is the best API there
> could be. The current state of the patch is in that sense still very
> crude; I just wrote the code I needed to be able to do performance
> testing. So, as you rightfully pointed out, it is quite ad-hoc and
> not very generic, not the right design, has quite some code
> duplication, etc.
>
> At this point I was mostly looking for feedback on the
> design/approach. If this requires a (much) cleaner and more
> production-ready patch, then I can arrange that. However, I thought
> I would first find out whether this approach makes sense at all
> before doing more testing to make it more generic.
>

I think that, irrespective of which approach is used, there is always
a better chance of getting meaningful feedback on cleaner patches.

-- 
With Regards,
Amit Kapila.


