Re: Scaling XLog insertion (was Re: Moving more work outside WALInsertLock) - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: Scaling XLog insertion (was Re: Moving more work outside WALInsertLock)
Date
Msg-id 4F5479FC.3030304@enterprisedb.com
Whole thread Raw
In response to Re: Scaling XLog insertion (was Re: Moving more work outside WALInsertLock)  (Amit Kapila <amit.kapila@huawei.com>)
List pgsql-hackers
On 20.02.2012 08:00, Amit Kapila wrote:
> I was trying to understand this patch and had few doubts:
>
> 1. In PerformXLogInsert(), why there is need to check freespace when
> already during ReserveXLogInsertLocation(), the space is reserved. Is
> it possible that the record size is more than actually calculted in
> ReserveXLogInsertLocation(), if so in that case what I understand is
> it is moving to next page to write, however isn't it possible that
> some other backend had already reserved that space.

The calculations between PerformXLogInsert (called CopyXLogRecordToWAL() 
in the latest patch version) and ReserveXLogInsertLocation() must always 
match, otherwise we have reserved incorrect amount of WAL and you get 
corrupt WAL. They both need to do the same calculations of how the WAL 
record is split across pages, which depends on how much free space there 
is on the first page. There is an assertion in CopyXLogRecordToWAL() to 
check that once it's finished writing the WAL record, the last byte 
landed on the position that ReserveXLogInsertLocation() calculated it 
would.

Another way to do that would be to remember the calculations done in 
ReserveXLogInsertLocation(), in an extra array or something. But we want 
to keep ReserveXLogInsertLocation() as simple as possible, as that runs 
while holding the spinlock. Any extra CPU cycles there will hurt 
scalability.

> 2. In function WaitForXLogInsertionSlotToBecomeFree(), chances are
> there such that when nextslot equals lastslot, all new backends try
> to  reserve a slot will start waiting on same last slot which can
> lead to serialization for those backends and can impact latency.

True. That warrants some performance testing to see if that effect is 
significant. (it's surely better than the current situation, anyway, 
where all WAL insertions block on the single lock)

> 3. GetXlogBuffer - This will get called twice, once for normal
> buffer, second time for when there is not enough space in current
> page, and both times it can lead to I/O whereas in earlier algorithm,
> the chances of I/O is only once.

I don't see any difference to the previous situation. In both cases, if 
you need a new page to copy the WAL record to, you need to first flush 
out some old pages from the WAL buffers if they're all dirty. The patch 
doesn't change the number of WAL buffers consumed. Note that 
GetXLogBuffer() is very cheap when it doesn't need to do I/O, extra 
calls to it don't matter if the page is already initialized.

--   Heikki Linnakangas  EnterpriseDB   http://www.enterprisedb.com


pgsql-hackers by date:

Previous
From: Michael Meskes
Date:
Subject: Re: ECPG FETCH readahead
Next
From: Shigeru Hanada
Date:
Subject: Re: pgsql_fdw, FDW for PostgreSQL server