Re: Some questions about PostgreSQL’s design. - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: Some questions about PostgreSQL’s design.
Msg-id 8400b24d-37d0-49ee-94c9-ba8709dcf9ab@iki.fi
In response to Some questions about PostgreSQL’s design.  (陈宗志 <baotiao@gmail.com>)
List pgsql-hackers
On 20/08/2024 11:46, 陈宗志 wrote:
> I’ve recently started exploring PostgreSQL’s implementation. I used to
> be a MySQL InnoDB developer, and the PostgreSQL community feels a bit
> strange to me.
> 
> There are some areas where they’ve done really well, but there are
> also some obvious issues that haven’t been improved.
> 
> For example, the B-link tree implementation in PostgreSQL is
> particularly elegant, and the code is very clean.
> But there are some clear areas that could be improved but haven’t been
> addressed. One is the double memory problem, where the buffer pool and
> the kernel page cache both store the same page. Another is the use of
> full-page writes to deal with torn page writes instead of something
> like InnoDB’s double write buffer.
> 
> It seems like these issues have clear solutions, such as using
> DirectIO like InnoDB instead of buffered IO, or using a double write
> buffer instead of relying on the full-page write approach.
> Can anyone explain why?

There are pros and cons. With direct I/O, you cannot take advantage of 
the kernel page cache anymore, so it becomes important to tune 
shared_buffers more precisely. That's a downside: the system requires 
more tuning. For many applications, squeezing the last ounce of 
performance just isn't that important. There are also scaling issues 
with the Postgres buffer cache, which might need to be addressed first.
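
To make the tuning point concrete, here is a toy simulation, not PostgreSQL code. It assumes both the buffer pool and the kernel page cache behave as plain LRU caches, and every name in it (LRUCache, count_disk_reads, the shared_buffers and page_cache parameters) is made up for illustration. Setting page_cache to 0 stands in for direct I/O.

```python
from collections import OrderedDict


class LRUCache:
    """A tiny LRU cache of page numbers (hypothetical stand-in for a
    buffer pool or a kernel page cache)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.pages = OrderedDict()

    def access(self, page):
        """Return True on a hit. On a miss, admit the page and evict
        the least recently used one if the cache is over capacity."""
        if page in self.pages:
            self.pages.move_to_end(page)
            return True
        if self.capacity > 0:
            self.pages[page] = True
            if len(self.pages) > self.capacity:
                self.pages.popitem(last=False)
        return False


def count_disk_reads(workload, shared_buffers, page_cache):
    """Count reads that reach the disk for a sequence of page accesses.

    page_cache=0 models direct I/O: every buffer pool miss goes to disk.
    """
    pool = LRUCache(shared_buffers)
    kernel = LRUCache(page_cache)
    disk_reads = 0
    for page in workload:
        if pool.access(page):
            continue  # served from the buffer pool
        if not kernel.access(page):
            disk_reads += 1  # missed both caches
    return disk_reads


if __name__ == "__main__":
    # Ten sequential passes over a 100-page working set.
    workload = [p for _ in range(10) for p in range(100)]
    print("buffered, pool=50: ", count_disk_reads(workload, 50, 100))
    print("direct,   pool=50: ", count_disk_reads(workload, 50, 0))
    print("direct,   pool=100:", count_disk_reads(workload, 100, 0))
```

In this toy model an undersized buffer pool is cushioned by the page cache under buffered I/O, while under direct I/O the same undersizing sends every miss to disk. The cyclic workload is a worst case for LRU, so the gap is exaggerated compared with real workloads.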

With double write buffering, there are also pros and cons. It also 
requires careful tuning. And replaying WAL that contains full-page 
images can be much faster, because you can write new page images 
"blindly" without reading the old pages first. We have WAL prefetching 
now, which alleviates that, but it's no panacea.
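
The "blind write" point can be sketched with another toy model, again not PostgreSQL code. It assumes a simplified WAL with only two record kinds, a full-page image ("fpi") and a byte-range change ("delta"), and it writes straight to a fake disk rather than going through a buffer pool. The Disk class, the replay function, and the record layout are all hypothetical.

```python
PAGE_SIZE = 8192


class Disk:
    """Fake disk that counts page reads and writes."""

    def __init__(self):
        self.pages = {}
        self.reads = 0
        self.writes = 0

    def read(self, page_no):
        self.reads += 1
        # Pages that were never written read back as zeroes.
        return bytearray(self.pages.get(page_no, bytes(PAGE_SIZE)))

    def write(self, page_no, data):
        self.writes += 1
        self.pages[page_no] = bytes(data)


def replay(disk, wal):
    """Apply WAL records in order (simplified model)."""
    for record in wal:
        if record["kind"] == "fpi":
            # A full-page image replaces the page outright, so the old
            # contents never need to be read.
            disk.write(record["page"], record["image"])
        else:
            # A delta record modifies the existing page, so the page
            # must be read before the change can be applied.
            page = disk.read(record["page"])
            start = record["offset"]
            page[start:start + len(record["data"])] = record["data"]
            disk.write(record["page"], page)


if __name__ == "__main__":
    fpi_wal = [{"kind": "fpi", "page": n, "image": bytes([n]) * PAGE_SIZE}
               for n in range(4)]
    delta_wal = [{"kind": "delta", "page": n, "offset": 0, "data": b"x"}
                 for n in range(4)]

    d1, d2 = Disk(), Disk()
    replay(d1, fpi_wal)
    replay(d2, delta_wal)
    print("fpi replay reads:  ", d1.reads)
    print("delta replay reads:", d2.reads)
```

Replaying the full-page images needs no reads at all, while each delta record costs a read of the old page first. That read is the cost WAL prefetching tries to hide.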

In summary, those are good solutions but they're not obviously better in 
all circumstances.

> However, the PostgreSQL community’s mailing list is truly a treasure
> trove, where you can find really interesting discussions. For
> instance, this discussion on whether lock coupling is needed for
> B-link trees, etc.
> https://www.postgresql.org/message-id/flat/CALJbhHPiudj4usf6JF7wuCB81fB7SbNAeyG616k%2Bm9G0vffrYw%40mail.gmail.com

Yep, there are old threads and patches for double write buffers and 
direct IO too :-).

-- 
Heikki Linnakangas
Neon (https://neon.tech)



