Re: refactoring relation extension and BufferAlloc(), faster COPY - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: refactoring relation extension and BufferAlloc(), faster COPY
Date
Msg-id f5164e7a-eef6-8972-75a3-8ac622ed0c6e@iki.fi
Whole thread Raw
In response to Re: refactoring relation extension and BufferAlloc(), faster COPY  (Heikki Linnakangas <hlinnaka@iki.fi>)
Responses Re: refactoring relation extension and BufferAlloc(), faster COPY  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers
> v2-0005-bufmgr-Acquire-and-clean-victim-buffer-separately.patch
This can be applied separately from the rest of the patches, which is 
nice. Some small comments on it:

* Needs a rebase, it conflicted slightly with commit f30d62c2fc.

* GetVictimBuffer needs a comment to explain what it does. In 
particular, mention that it returns a buffer that's pinned and known 
!BM_TAG_VALID.

* I suggest renaming 'cur_buf' and other such local variables in 
GetVictimBufffer to just 'buf'. 'cur' prefix suggests that there is some 
other buffer involved too, but there is no 'prev' or 'next' or 'other' 
buffer. The old code called it just 'buf' too, and before this patch it 
actually was a bit confusing because there were two buffers involved. 
But with this patch, GetVictimBuffer only deals with one buffer at a time.

* This FIXME:

>         /* OK, do the I/O */
>         /* FIXME: These used the wrong smgr before afaict? */
>         {
>             SMgrRelation smgr = smgropen(BufTagGetRelFileLocator(&buf_hdr->tag),
>                                          InvalidBackendId);
> 
>             TRACE_POSTGRESQL_BUFFER_WRITE_DIRTY_START(buf_hdr->tag.forkNum,
>                                                       buf_hdr->tag.blockNum,
>                                                       smgr->smgr_rlocator.locator.spcOid,
>                                                       smgr->smgr_rlocator.locator.dbOid,
>                                                       smgr->smgr_rlocator.locator.relNumber);
> 
>             FlushBuffer(buf_hdr, smgr, IOOBJECT_RELATION, io_context);
>             LWLockRelease(content_lock);
> 
>             ScheduleBufferTagForWriteback(&BackendWritebackContext,
>                                           &buf_hdr->tag);
> 
>             TRACE_POSTGRESQL_BUFFER_WRITE_DIRTY_DONE(buf_hdr->tag.forkNum,
>                                                      buf_hdr->tag.blockNum,
>                                                      smgr->smgr_rlocator.locator.spcOid,
>                                                      smgr->smgr_rlocator.locator.dbOid,
>                                                      smgr->smgr_rlocator.locator.relNumber);
>         }

I believe that was intentional. The probes previously reported the block 
and relation whose read *caused* the eviction. It was not just the smgr 
but also the blockNum and forkNum that referred to the block that was 
being read. There's another pair of probe points, 
TRACE_POSTGRESQL_BUFFER_FLUSH_START/DONE, inside FlushBuffer that 
indicate the page that is being flushed.

I see that reporting the evicted page is more convenient with this 
patch, otherwise you'd need to pass the smgr and blocknum of the page 
that's being read to InvalidateVictimBuffer(). IMHO you can just remove 
these probe points. We don't need to bend over backwards to maintain 
specific probe points.

* InvalidateVictimBuffer reads the buffer header with an atomic read op, 
just to check if BM_TAG_VALID is set. If it's not, it does nothing 
(except for a few Asserts). But the caller has already read the buffer 
header. Consider refactoring it so that the caller checks VM_TAG_VALID, 
and only calls InvalidateVictimBuffer if it's set, saving one atomic 
read in InvalidateVictimBuffer. I think it would be just as readable, so 
no loss there. I doubt the atomic read makes any measurable performance 
difference, but it looks redundant.

* I don't understand this comment:

>     /*
>      * Clear out the buffer's tag and flags and usagecount.  We must do
>      * this to ensure that linear scans of the buffer array don't think
>      * the buffer is valid.
>      *
>      * XXX: This is a pre-existing comment I just moved, but isn't it
>      * entirely bogus with regard to the tag? We can't do anything with
>      * the buffer without taking BM_VALID / BM_TAG_VALID into
>      * account. Likely doesn't matter because we're already dirtying the
>      * cacheline, but still.
>      *
>      */
>     ClearBufferTag(&buf_hdr->tag);
>     buf_state &= ~(BUF_FLAG_MASK | BUF_USAGECOUNT_MASK);
>     UnlockBufHdr(buf_hdr, buf_state);

What exactly is wrong with clearing the tag? What does dirtying the 
cacheline have to do with the correctness here?

* pgindent

- Heikki



pgsql-hackers by date:

Previous
From: Andres Freund
Date:
Subject: Re: proposal: psql: psql variable BACKEND_PID
Next
From: Andres Freund
Date:
Subject: Re: proposal: psql: psql variable BACKEND_PID