Re: [WIP] Pipelined Recovery - Mailing list pgsql-hackers

From Imran Zaheer
Subject Re: [WIP] Pipelined Recovery
Date
Msg-id CA+UBfa=qDfWB90w5AsmX4f3PbeeM++GbaoVagd9ff-DKQDLvWA@mail.gmail.com
In response to Re: [WIP] Pipelined Recovery  (Henson Choi <assam258@gmail.com>)
Responses Re: [WIP] Pipelined Recovery
List pgsql-hackers
>
> Hi Xuneng, Imran, and everyone,
>

Hi Henson and Xuneng.

Thanks for explaining the approaches to Xuneng.

>
> The two approaches target different bottlenecks. The current patch
> parallelizes WAL decoding, which keeps the redo path single-threaded
> and avoids the Hot Standby visibility problem entirely.
>

You are right that the two approaches target different bottlenecks.
The pipeline patch aims to improve overall CPU throughput and to save
CPU time by offloading the steps we can safely do in parallel without
causing synchronization problems.

> One thing I am curious about in the current patch: WAL records are
> already in a serialized format on disk. The producer decodes them and
> then re-serializes into a different custom format for shm_mq. What is
> the advantage of this second serialization format over simply passing
> the raw WAL bytes after CRC validation and letting the consumer decode
> directly? Offloading CRC to a separate core could still improve
> throughput at the cost of higher total CPU usage, without needing the
> custom format.
>

Thanks. You are right, there was no need to serialize the decoded
record again. I was not aware that we already have contiguous bytes in
memory. In my next patch I will remove this extra serialization step.
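
To make the idea concrete, here is a minimal standalone sketch of
"validate the CRC in the producer, forward the raw bytes, decode in the
consumer". It is not code from the patch and it does not use any
PostgreSQL headers. `RawRecordHeader`, `ByteQueue`, `producer_forward`
and `consumer_next` are hypothetical names invented for illustration;
the queue is a single-threaded stand-in for shm_mq, and the record
layout is simplified (the CRC here covers only the payload, which is
not how the real `XLogRecord` computes `xl_crc`).

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical simplified record: fixed header followed by payload. */
typedef struct RawRecordHeader
{
    uint32_t    total_len;  /* header + payload, in bytes */
    uint32_t    crc;        /* CRC-32C over the payload only (assumption) */
} RawRecordHeader;

/* Bitwise CRC-32C (Castagnoli), reflected polynomial 0x82F63B78. */
static uint32_t
crc32c(const unsigned char *buf, size_t len)
{
    uint32_t    crc = 0xFFFFFFFFu;

    for (size_t i = 0; i < len; i++)
    {
        crc ^= buf[i];
        for (int bit = 0; bit < 8; bit++)
            crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
    }
    return crc ^ 0xFFFFFFFFu;
}

/* Hypothetical stand-in for shm_mq: a flat, single-threaded byte queue. */
typedef struct ByteQueue
{
    unsigned char data[4096];
    size_t      head;       /* next byte to read */
    size_t      tail;       /* next byte to write */
} ByteQueue;

/* Producer: check length and CRC, then forward the raw bytes unchanged. */
static bool
producer_forward(ByteQueue *q, const unsigned char *rec, size_t avail)
{
    RawRecordHeader hdr;

    if (avail < sizeof(hdr))
        return false;
    memcpy(&hdr, rec, sizeof(hdr));
    if (hdr.total_len < sizeof(hdr) || hdr.total_len > avail)
        return false;
    if (crc32c(rec + sizeof(hdr), hdr.total_len - sizeof(hdr)) != hdr.crc)
        return false;       /* corrupt record: never reaches the consumer */
    if (hdr.total_len > sizeof(q->data) - q->tail)
        return false;       /* queue full; a real queue would wait or wrap */
    memcpy(q->data + q->tail, rec, hdr.total_len);
    q->tail += hdr.total_len;
    return true;
}

/* Consumer: decode directly from the queued raw bytes, no second format. */
static const unsigned char *
consumer_next(ByteQueue *q, size_t *payload_len)
{
    RawRecordHeader hdr;
    const unsigned char *payload;

    if (q->tail - q->head < sizeof(hdr))
        return NULL;        /* nothing queued */
    memcpy(&hdr, q->data + q->head, sizeof(hdr));
    payload = q->data + q->head + sizeof(hdr);
    *payload_len = hdr.total_len - sizeof(hdr);
    q->head += hdr.total_len;
    return payload;
}
```

The point of the sketch is only that the consumer can trust the length
field and parse in place, because the producer already rejected
anything whose CRC did not match; no intermediate serialization format
is involved.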

> Koichi's approach parallelizes redo (buffer I/O) itself, which attacks
> a larger cost — Jakub's flamegraphs show BufferAlloc ->
> GetVictimBuffer -> FlushBuffer dominating in both p0 and p1 — but at
> the expense of much harder concurrency problems.
>
> Whether the decode pipelining ceiling is high enough, or whether the
> redo parallelization complexity is tractable, seems like the central
> design question for this area.

I still have to investigate the problem related to `GetVictimBuffer`
that Jakub mentioned. Meanwhile, I have been looking at how we can
safely offload the work done by `XLogReadBufferForRedoExtended` to a
separate pipeline worker, or maybe we can try prefetching the buffer
header so that the main redo loop doesn't have to spend time getting
the buffer.

Thanks for the feedback. That was helpful.


Regards,
Imran Zaheer


