Home > mailing lists

Re: POC: Cleaning up orphaned files using undo logs - Mailing list pgsql-hackers

From	Amit Kapila
Subject	Re: POC: Cleaning up orphaned files using undo logs
Date	July 13, 2019 13:25:51
Msg-id	CAA4eK1K_eDXnkYWTeFbXzHMamMLfdxnJBB3wLE5c_6idERXMpg@mail.gmail.com Whole thread Raw
In response to	Re: POC: Cleaning up orphaned files using undo logs (Robert Haas <robertmhaas@gmail.com>)
Responses	Re: POC: Cleaning up orphaned files using undo logs (Robert Haas <robertmhaas@gmail.com>) Re: POC: Cleaning up orphaned files using undo logs (Andres Freund <andres@anarazel.de>)
List	pgsql-hackers

Tree view

On Fri, Jul 12, 2019 at 7:08 PM Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Fri, Jul 12, 2019 at 5:40 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> > Apart from this, the duplicate key (ex. for size queues the size of
> > two requests can be same) handling might need some work.  Basically,
> > either special combine function needs to be written (not sure yet what
> > we should do there) or we always need to ensure that the key is unique
> > like (size + start_urec_ptr).  If the size is the same, then we can
> > decide based on start_urec_ptr.
>
> I think that this problem is somewhat independent of whether we use an
> rbtree or a binaryheap or some other data structure.
>

I think then I am missing something because what I am talking about is
below code in rbt_insert:
rbt_insert()
{
..
cmp = rbt->comparator(data, current, rbt->arg);
if (cmp == 0)
{
/*
* Found node with given key.  Apply combiner.
*/
rbt->combiner(current, data, rbt->arg);
*isNew = false;
return current;
}
..
}

If you see, here it doesn't add the duplicate key in the tree and that
is not the case with binary_heap as far as I can understand.

>  I would be
> inclined to use XID as a tiebreak for the size queue, so that it's
> more likely to match the ordering of the XID queue, but if that's
> inconvenient, then some other arbitrary value like start_urec_ptr
> should be fine.
>

I think it would be better to use start_urec_ptr because XID can be
non-unique in our case.  As I explained in one of the emails above [1]
that we register the requests for logged and unlogged relations
separately, so XID can be non-unique.

> >
> > I think even if we currently go with a binary heap, it will be
> > possible to change it to rbtree later, but I am fine either way.
>
> Well, I don't see much point in revising all of this logic twice. We
> should pick the way we want it to work and make it work that way.
>

Yeah, I agree.  So, I am assuming here that as you have discussed this
idea with Andres offlist, he is on board with changing it as he has
originally suggested using binary_heap.  Andres, do let us know if you
think differently here.  It would be good if anyone else following the
thread can also weigh in.

[1] - https://www.postgresql.org/message-id/CAA4eK1LEKyPZD5Dy4j1u2smUUyMzxgC2YLj8E%2BaJpsvG7sVJYA%40mail.gmail.com

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

pgsql-hackers by date:

From: Jose Luis Tallon
Date: 13 July 2019, 13:00:48
Subject: Re: [PATCH] Implement uuid_version()

From: Thomas Munro
Date: 13 July 2019, 14:17:00
Subject: Re: refactoring - share str2*int64 functions

Re: POC: Cleaning up orphaned files using undo logs - Mailing list pgsql-hackers

Previous

Next