Re: POC: Cleaning up orphaned files using undo logs - Mailing list pgsql-hackers

From Dilip Kumar
Subject Re: POC: Cleaning up orphaned files using undo logs
Date
Msg-id CAFiTN-v3CikDEqn8NAJN=h9rOwnws0pqiFx=7Dpahfh6KCCbbA@mail.gmail.com
In response to Re: POC: Cleaning up orphaned files using undo logs  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers
On Fri, Aug 16, 2019 at 10:56 AM Andres Freund <andres@anarazel.de> wrote:
>
> Hi,
>
> On 2019-08-16 09:44:25 +0530, Dilip Kumar wrote:
> > On Wed, Aug 14, 2019 at 2:48 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > >
> > > On Wed, Aug 14, 2019 at 12:27 PM Andres Freund <andres@anarazel.de> wrote:
> >
> > > >   I think that batch reading should just copy the underlying data into a
> > > >   char* buffer. Only the records that currently are being used by
> > > >   higher layers should get exploded into an unpacked record. That will
> > > > >   reduce memory usage quite noticeably (and I suspect it also drastically
> > > > >   reduces the overhead due to a large context with a lot of small
> > > >   allocations that then get individually freed).
> > >
> > > Ok, I got your idea.  I will analyze it further and work on this if
> > > there is no problem.
> >
> > I think there is one problem: currently, while unpacking an undo
> > record, if the record is compressed (i.e. some of the fields do not
> > exist in the record) then we read those fields from the first record
> > on the page.  But if we just memcpy the undo pages to the buffers and
> > delay the unpacking until it's needed, it seems that we would need to
> > know the page boundary, and we would also need to know the offset of
> > the first complete record on the page from where we can get that
> > information (which is currently in the undo page header).
>
> I don't understand why that's a problem?

Okay, I was assuming that we would copy only the data part, not the
complete page including the page header.  If we copy the page header
as well, we might be able to unpack compressed records too.
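
To make the point above concrete, here is a minimal sketch of resolving a
compressed record from a buffer that holds whole pages, headers included.
Everything in it is an assumption made for illustration: the names
(SketchPageHeader, sketch_unpack, and so on), the tiny page size, and the
record layout (one flag byte, an optional 4-byte xid, a 4-byte block
number) are hypothetical and are not the structures used by the patch.  It
also assumes that a record never crosses a page boundary.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 64     /* tiny page size, for illustration only */
#define SKETCH_HAS_XID   0x01   /* record carries its own transaction id */

/* Hypothetical page header: where the first complete record starts. */
typedef struct SketchPageHeader
{
    uint16_t    first_record_offset;
} SketchPageHeader;

/* Hypothetical unpacked form of a record. */
typedef struct SketchUnpacked
{
    uint32_t    xid;
    uint32_t    block;
} SketchUnpacked;

/* Write one packed record at dst and return its length in bytes. */
static size_t
sketch_pack(char *dst, int has_xid, uint32_t xid, uint32_t block)
{
    size_t      len = 0;

    dst[len++] = has_xid ? SKETCH_HAS_XID : 0;
    if (has_xid)
    {
        memcpy(dst + len, &xid, sizeof(xid));
        len += sizeof(xid);
    }
    memcpy(dst + len, &block, sizeof(block));
    len += sizeof(block);
    return len;
}

/* Copy npages whole pages, headers included, into the batch buffer. */
static void
sketch_copy_pages(char *batch, const char *pages, size_t npages)
{
    memcpy(batch, pages, npages * SKETCH_PAGE_SIZE);
}

/* Read the xid of a complete (uncompressed) record. */
static uint32_t
sketch_read_xid(const char *rec)
{
    uint32_t    xid;

    assert(((uint8_t) rec[0]) & SKETCH_HAS_XID);
    memcpy(&xid, rec + 1, sizeof(xid));
    return xid;
}

/*
 * Unpack the record at 'offset' in the batch buffer.  The page boundary is
 * derived from the offset; a missing xid is taken from the first complete
 * record on the same page, located through the copied page header.
 */
static void
sketch_unpack(const char *batch, size_t offset, SketchUnpacked *out)
{
    const char *rec = batch + offset;
    const char *page = batch + (offset / SKETCH_PAGE_SIZE) * SKETCH_PAGE_SIZE;
    const char *p = rec + 1;

    if (((uint8_t) rec[0]) & SKETCH_HAS_XID)
    {
        out->xid = sketch_read_xid(rec);
        p += sizeof(uint32_t);
    }
    else
    {
        SketchPageHeader hdr;

        memcpy(&hdr, page, sizeof(hdr));
        out->xid = sketch_read_xid(page + hdr.first_record_offset);
    }
    memcpy(&out->block, p, sizeof(out->block));
}
```

Because the header travels with the page, the unpacking step needs nothing
beyond the record's offset within the buffer.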

>
>
> > As of now even if we leave this issue apart I am not very clear what
> > benefit you are seeing in the way you are describing compared to the
> > way I am doing it now?
> >
> > a) Is it the multiple palloc? If so then we can allocate memory at
> > once and flatten the undo records in that.  Earlier, I was doing that
> > but we need to align each unpacked undo record so that we can access
> > them directly and based on Robert's suggestion I have modified it to
> > multiple palloc.
>
> Part of it.
>
> > b) Is it the memory size problem that the unpacked undo record will take
> > more memory compared to the packed record?
>
> Part of it.
>
> > c) Do you think that we will not need to unpack all the records?  But,
> > I think eventually, at the higher level we will have to unpack all the
> > undo records (I understand that it will be one at a time).
>
> Part of it. There's a *huge* difference between having a few hundred to
> thousand unpacked records, each consisting of several independent
> allocations, in memory and having one large block containing all
> packed records in a batch, and a few allocations for the few unpacked
> records that need to exist.
>
> There's also d) we don't need separate tiny memory copies while holding
> buffer locks etc.

Yeah, that too.  Yet another problem is how we are going to process
those records.  For that we need to know all the undo record pointers
between start_urecptr and end_urecptr, right?  We just have the big
memory chunk, with no idea how many undo records it contains or what
their undo record pointers are.  Without that information, I am unable
to imagine how we are going to sort them by block number.
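
For the sake of discussion, here is one way the missing information could
be recovered, as a sketch only.  It assumes that every packed record starts
with its own total length followed by its block number, so that a single
pass over the chunk yields each record's offset (from which a record
pointer could be derived relative to the start of the chunk).  The layout
and all names (SketchIndexEntry, sketch_build_index, and so on) are
hypothetical; the real packed format may not allow such a simple walk.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical packed header: uint16 total length, then uint32 block. */
#define SKETCH_REC_HDR (sizeof(uint16_t) + sizeof(uint32_t))

/* One small index entry per packed record; the record stays packed. */
typedef struct SketchIndexEntry
{
    size_t      offset;         /* start of the record within the chunk */
    uint32_t    block;          /* block number the record applies to */
} SketchIndexEntry;

/* Append a packed record with a zeroed payload; return the new end offset. */
static size_t
sketch_append(char *chunk, size_t off, uint32_t block, uint16_t payload_len)
{
    uint16_t    len = (uint16_t) (SKETCH_REC_HDR + payload_len);

    assert(payload_len <= UINT16_MAX - SKETCH_REC_HDR);
    memcpy(chunk + off, &len, sizeof(len));
    memcpy(chunk + off + sizeof(len), &block, sizeof(block));
    memset(chunk + off + SKETCH_REC_HDR, 0, payload_len);
    return off + len;
}

/* Walk the chunk once and record where each packed record starts. */
static size_t
sketch_build_index(const char *chunk, size_t chunk_len,
                   SketchIndexEntry *index, size_t max_entries)
{
    size_t      off = 0;
    size_t      n = 0;

    while (off + SKETCH_REC_HDR <= chunk_len && n < max_entries)
    {
        uint16_t    len;

        memcpy(&len, chunk + off, sizeof(len));
        if (len < SKETCH_REC_HDR || off + len > chunk_len)
            break;              /* truncated or corrupt record: stop */
        index[n].offset = off;
        memcpy(&index[n].block, chunk + off + sizeof(len), sizeof(uint32_t));
        n++;
        off += len;
    }
    return n;
}

/* Order by block number; ties keep their original order in the chunk. */
static int
sketch_cmp_block(const void *a, const void *b)
{
    const SketchIndexEntry *ea = a;
    const SketchIndexEntry *eb = b;

    if (ea->block != eb->block)
        return ea->block < eb->block ? -1 : 1;
    if (ea->offset != eb->offset)
        return ea->offset < eb->offset ? -1 : 1;
    return 0;
}

/* Sort only the index; the packed records themselves are never moved. */
static void
sketch_sort_index(SketchIndexEntry *index, size_t n)
{
    qsort(index, n, sizeof(index[0]), sketch_cmp_block);
}
```

Only the index entries are allocated and sorted, so a record would need to
be unpacked only when the higher layer actually reaches it.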

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
