Re: POC: Cleaning up orphaned files using undo logs - Mailing list pgsql-hackers

From Andres Freund
Subject Re: POC: Cleaning up orphaned files using undo logs
Date
Msg-id 20190718055625.7e2afih3f3c2xuug@alap3.anarazel.de
In response to Re: POC: Cleaning up orphaned files using undo logs  (Amit Kapila <amit.kapila16@gmail.com>)
List pgsql-hackers
Hi,

On 2019-07-18 11:15:05 +0530, Amit Kapila wrote:
> On Wed, Jul 17, 2019 at 3:37 AM Andres Freund <andres@anarazel.de> wrote:
> > I'm not yet sure whether we'd want the rbtree nodes being pointed to
> > directly by the hashtable, or whether we'd want one indirection.
> >
> > e.g. either something like:
> >
> >
> > typedef struct UndoWorkerQueue
> > {
> >     /* priority ordered tree */
> >     RBTree *tree;
> >     ....
> > } UndoWorkerQueue;
> >
> 
> I think we also need the size of rbtree (aka how many nodes/undo
> requests it has) to know whether we can add more.  This information is
> available in binary heap, but here I think we need to track it in
> UndoWorkerQueue.  Basically, at each enqueue/dequeue, we need to
> increment/decrement the same.
> 
> > typedef struct UndoWorkerQueueEntry
> > {
> >      RBTNode tree_node;
> >
> >      /*
> >       * Reference hashtable via key, not pointers, entries might be
> >       * moved.
> >       */
> >      RollbackHashKey rollback_key;
> >      ...
> > } UndoWorkerQueueEntry;
> >
> 
> In UndoWorkerQueueEntry, we might also want to include some other info
> like dbid, request_size, next_retry_at, err_occurred_at so that while
> accessing queue entry in comparator functions or other times, we don't
> always need to perform a hash table search.  OTOH, we can do hash_search
> as well, but maybe code-wise it will be better to keep the additional
> information.

The dots signal that additional fields are needed in those places.
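
To make the two points from the quoted text concrete (a per-queue entry count, and a few fields cached in the queue entry so that a comparator need not do a hash lookup), here is a rough stand-alone sketch. It is only an illustration under assumptions: the contents of RollbackHashKey, the field types, the max_entries limit, the "larger request first" ordering, and all function names are hypothetical, and the RBTree/RBTNode members are left out so that the snippet compiles outside the PostgreSQL tree.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical key layout; the real key fields are not given in this thread. */
typedef struct RollbackHashKey
{
    uint64_t    full_xid;
    uint64_t    start_urec_ptr;
} RollbackHashKey;

typedef struct UndoWorkerQueueEntry
{
    /* In the real struct an RBTNode tree_node would come first. */
    RollbackHashKey rollback_key;   /* refers to the hashtable entry by key */
    uint32_t    dbid;               /* cached copy (assumed type) */
    uint64_t    request_size;       /* cached copy (assumed type) */
} UndoWorkerQueueEntry;

typedef struct UndoWorkerQueue
{
    /* In the real struct: RBTree *tree; */
    size_t      nentries;           /* maintained on every enqueue/dequeue */
    size_t      max_entries;        /* assumed fixed capacity */
} UndoWorkerQueue;

/*
 * Comparator for a size-ordered queue: larger requests sort first (an
 * assumption), ties are broken by key so distinct entries never compare
 * equal.  Only cached fields are read, so no hash table search is needed.
 */
static int
undo_size_queue_cmp(const UndoWorkerQueueEntry *a,
                    const UndoWorkerQueueEntry *b)
{
    if (a->request_size != b->request_size)
        return (a->request_size > b->request_size) ? -1 : 1;
    if (a->rollback_key.full_xid != b->rollback_key.full_xid)
        return (a->rollback_key.full_xid < b->rollback_key.full_xid) ? -1 : 1;
    if (a->rollback_key.start_urec_ptr != b->rollback_key.start_urec_ptr)
        return (a->rollback_key.start_urec_ptr <
                b->rollback_key.start_urec_ptr) ? -1 : 1;
    return 0;
}

/* Can another request be added to this queue? */
static bool
undo_queue_has_room(const UndoWorkerQueue *queue)
{
    return queue->nentries < queue->max_entries;
}

/* Call after inserting an entry into the tree. */
static void
undo_queue_note_insert(UndoWorkerQueue *queue)
{
    assert(undo_queue_has_room(queue));
    queue->nentries++;
}

/* Call after removing an entry from the tree. */
static void
undo_queue_note_remove(UndoWorkerQueue *queue)
{
    assert(queue->nentries > 0);
    queue->nentries--;
}
```

The trade-off is the one mentioned in the quoted text: cached fields have to be kept in sync with the hashtable entry, in exchange for cheaper comparisons.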


> Another thing is we need some freelist/array for
> UndoWorkerQueueEntries equivalent to the size of the three queues?

I think using the slist as I proposed for the second alternative is
better?
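
For illustration only, a freelist of preallocated entries threaded through an intrusive singly linked list could look roughly like the sketch below. It is stand-alone C rather than the in-tree slist from lib/ilist.h, and every name in it (FreeNode, QueueEntrySlot, EntryFreeList, freelist_*) is made up for the example; the payload field is a placeholder for the real entry contents.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal intrusive singly linked list node (stand-in for slist_node). */
typedef struct FreeNode
{
    struct FreeNode *next;
} FreeNode;

/* Hypothetical preallocated queue entry slot. */
typedef struct QueueEntrySlot
{
    FreeNode    free_link;  /* first member, so a node pointer is the slot */
    int         payload;    /* placeholder for the real entry contents */
} QueueEntrySlot;

typedef struct EntryFreeList
{
    FreeNode   *head;
    size_t      nfree;
} EntryFreeList;

/* Return a slot to the freelist (push at head). */
static void
freelist_put(EntryFreeList *fl, QueueEntrySlot *slot)
{
    slot->free_link.next = fl->head;
    fl->head = &slot->free_link;
    fl->nfree++;
}

/* Take a slot from the freelist, or NULL if all slots are in use. */
static QueueEntrySlot *
freelist_get(EntryFreeList *fl)
{
    FreeNode   *node = fl->head;

    if (node == NULL)
        return NULL;
    fl->head = node->next;
    fl->nfree--;
    /* free_link is the first member, so the cast recovers the slot */
    return (QueueEntrySlot *) node;
}

/* Put every slot of a preallocated array on the freelist. */
static void
freelist_init(EntryFreeList *fl, QueueEntrySlot *slots, size_t nslots)
{
    fl->head = NULL;
    fl->nfree = 0;
    for (size_t i = 0; i < nslots; i++)
        freelist_put(fl, &slots[i]);
}
```

With something like this, enqueueing takes a slot via freelist_get() and dequeueing hands it back via freelist_put(), so no separate array index bookkeeping is needed.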


> BTW, do you have any preference for using dynahash or simplehash for
> RollbackHashTable?

I find simplehash nicer to use in code, personally, and it's faster in
most cases...

Greetings,

Andres Freund
