Fsync request queue - Mailing list pgsql-hackers

From	Andres Freund
Subject	Fsync request queue
Date	April 25, 2018 00:00:54
Msg-id	20180424180054.inih6bxfspgowjuc@alap3.anarazel.de
Responses	Re: Fsync request queue
List	pgsql-hackers

Hi,

While thinking about the fsync mess, I started looking at the
fsync request queue. I was primarily wondering whether we can keep FDs
open long enough (by forwarding them to the checkpointer) to guarantee
that we see the error. But that's mostly irrelevant for what I'm
wondering about here.

The fsync request queue often is fairly large. 20 bytes for each
shared buffer isn't a negligible overhead. One reason it needs to be
fairly large is that we do not deduplicate while inserting, we just add
an entry on every single write.
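
To put a number on that overhead, here is a back-of-the-envelope sketch in C. It assumes one queue slot per shared buffer and takes the 20 bytes per request from the figure above. The function name and parameters are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Bytes of queue space when there is one request slot per shared buffer.
 * Hypothetical helper for illustration; not a PostgreSQL function.
 */
static uint64_t
fsync_queue_bytes(uint64_t shared_buffers_bytes,
                  uint64_t block_size,
                  uint64_t bytes_per_request)
{
    uint64_t nbuffers = shared_buffers_bytes / block_size;

    return nbuffers * bytes_per_request;
}
```

With 8 GiB of shared buffers and 8 kB blocks, `fsync_queue_bytes(8ULL << 30, 8192, 20)` works out to 1048576 slots, or 20 MiB of queue.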

ISTM that using a hashtable sounds saner, because we'd deduplicate on
insert. While that'd require locking, we can relatively easily reduce
the overhead of that by keeping track of something like mdsync_cycle_ctr
in MdfdVec, and only insert again if the cycle was incremented since.
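
A minimal sketch of that idea, written as a standalone toy rather than against the real md.c structures. All of it is hypothetical: FsyncTable, SegmentState (a stand-in for the per-segment state in MdfdVec), the field names and the tiny fixed-size open-addressing table. The point is only that the locked insert happens at most once per segment per sync cycle, and that the table deduplicates across processes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define FSYNC_TABLE_SLOTS 64        /* hypothetical capacity, power of two */

/* Hypothetical key: which file segment needs an fsync. */
typedef struct FsyncKey
{
    uint32_t    relfile;            /* stand-in for a relation file id */
    uint32_t    segno;              /* segment number within the relation */
} FsyncKey;

typedef struct FsyncTable
{
    FsyncKey    keys[FSYNC_TABLE_SLOTS];
    bool        used[FSYNC_TABLE_SLOTS];
    int         nentries;
    uint32_t    cycle_ctr;          /* bumped once per sync pass */
    int         locked_inserts;     /* times the (imaginary) lock was taken */
} FsyncTable;

/* Per-process, per-segment state; plays the role of MdfdVec here. */
typedef struct SegmentState
{
    FsyncKey    key;
    uint32_t    registered_cycle;   /* cycle in which we last inserted */
    bool        registered;
} SegmentState;

static uint32_t
hash_key(FsyncKey k)
{
    return (k.relfile * 2654435761u) ^ (k.segno * 40503u);
}

/* Insert with deduplication; returns false only if the table is full. */
static bool
fsync_table_insert(FsyncTable *t, FsyncKey k)
{
    uint32_t    pos = hash_key(k) & (FSYNC_TABLE_SLOTS - 1);

    t->locked_inserts++;            /* real code would hold a lock here */
    for (int probe = 0; probe < FSYNC_TABLE_SLOTS; probe++)
    {
        uint32_t    i = (pos + probe) & (FSYNC_TABLE_SLOTS - 1);

        if (!t->used[i])
        {
            t->keys[i] = k;
            t->used[i] = true;
            t->nentries++;
            return true;
        }
        if (t->keys[i].relfile == k.relfile && t->keys[i].segno == k.segno)
            return true;            /* already queued by someone */
    }
    return false;
}

/* Called on every write; skips the table if already done this cycle. */
static bool
register_dirty_segment(FsyncTable *t, SegmentState *seg)
{
    if (seg->registered && seg->registered_cycle == t->cycle_ctr)
        return true;                /* fast path, no lock taken */
    if (!fsync_table_insert(t, seg->key))
        return false;
    seg->registered = true;
    seg->registered_cycle = t->cycle_ctr;
    return true;
}

/* Sync-pass side: take all entries and start a new cycle. */
static int
fsync_table_drain(FsyncTable *t)
{
    int         n = t->nentries;

    memset(t->used, 0, sizeof(t->used));
    t->nentries = 0;
    t->cycle_ctr++;
    return n;
}
```

In this toy, a thousand writes to one segment within a cycle cost a single locked insert. A second process writing to the same segment takes the lock once but adds no entry. After a drain the counter has moved on, so the next write inserts again.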

Right now if the queue is full and can't be compacted we end up
fsync()ing on every single write, rather than once per checkpoint
afaict. That's fairly horrible.

For the case that there's no space in the map, I'd suggest just doing
10% or so of the pending fsyncs in the poor sod of a process that finds
no space. That's surely better than constantly fsyncing on every single
write. We can also make bgwriter check the size of the hashtable on a
regular basis and do some of them if it gets too full.
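
As a rough illustration of that fallback, here is a toy in C. It uses a flat array instead of a real hashtable to keep it short. PendingSet, sync_segment() and the 80% threshold for the background check are all assumptions made for this sketch, and sync_segment() only counts calls where real code would fsync() a file.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_PENDING 50      /* hypothetical; tiny so the effect is visible */

typedef struct PendingSet
{
    uint32_t    segs[MAX_PENDING];  /* stand-in for segment identifiers */
    int         count;
    int         fsyncs_done;        /* fsyncs done outside the checkpoint */
} PendingSet;

/* Placeholder for the real fsync() of a segment's file. */
static void
sync_segment(PendingSet *p, uint32_t seg)
{
    (void) seg;
    p->fsyncs_done++;
}

static bool
pending_contains(const PendingSet *p, uint32_t seg)
{
    for (int i = 0; i < p->count; i++)
        if (p->segs[i] == seg)
            return true;
    return false;
}

/* Sync about 10% of the entries (at least one) and drop them. */
static void
relieve_pressure(PendingSet *p)
{
    int         n = p->count / 10;

    if (n < 1)
        n = 1;
    if (n > p->count)
        n = p->count;
    while (n-- > 0)
        sync_segment(p, p->segs[--p->count]);
}

/* Writer side: deduplicate, and make room ourselves if the set is full. */
static void
remember_fsync_request(PendingSet *p, uint32_t seg)
{
    if (pending_contains(p, seg))
        return;
    if (p->count == MAX_PENDING)
        relieve_pressure(p);
    p->segs[p->count++] = seg;
}

/* Background side: trim when more than ~80% full (assumed threshold). */
static bool
background_trim(PendingSet *p)
{
    if (p->count * 10 <= MAX_PENDING * 8)
        return false;
    relieve_pressure(p);
    return true;
}
```

With 50 slots, the 51st distinct segment costs the inserting process 5 fsyncs and frees 5 slots, so the next few writes are free again. Without the fallback, every one of those writes would pay for its own fsync.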

I think the hashtable also has some advantages for the future. I've
introduced something very similar in my radix tree based buffer mapping.

Greetings,

Andres Freund
