Re: Reduce WAL logging of INSERT SELECT - Mailing list pgsql-hackers

From Gokulakannan Somasundaram
Subject Re: Reduce WAL logging of INSERT SELECT
Date
Msg-id CAHMh4-acj5sQr77MfqZvSbYN5y0soZqvDu2S0PL39Ws0P_HWsA@mail.gmail.com
In response to Re: Reduce WAL logging of INSERT SELECT  (Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>)
Responses Re: Reduce WAL logging of INSERT SELECT
List pgsql-hackers


However, for small operations it's a net loss - you avoid writing a WAL record, but you have to fsync() the heap instead. If you only modify a few rows, the extra fsync (or fsyncs if there are indexes too) is more expensive than writing the WAL.

We'd need a heuristic to decide whether to write WAL or fsync at the end. For regular INSERTs, UPDATEs and DELETEs, you have the planner's estimate of the number of rows affected. Another thing we should do is move the fsync call from the end of COPY (and other such operations) to the end of the transaction. That way, if you do e.g. one COPY followed by a bunch of smaller INSERTs or UPDATEs, you only need to fsync the files once.
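To make the idea above concrete, here is a rough sketch in Python (not PostgreSQL code). It assumes a hypothetical row-count cutoff, ROW_THRESHOLD, whose real value would have to be measured, and a hypothetical Transaction class that queues files for a single fsync at commit instead of syncing after each statement. All names here are illustrative only.

```python
from dataclasses import dataclass, field

# Hypothetical cutoff: at or above this many estimated rows we assume that
# skipping WAL and fsyncing the file is cheaper. The real break-even point
# would have to be measured.
ROW_THRESHOLD = 1000


@dataclass
class Transaction:
    pending_sync: set = field(default_factory=set)  # files to fsync at commit
    wal_statements: int = 0  # statements that went through WAL
    fsync_calls: int = 0     # fsyncs actually issued

    def modify(self, relfile: str, estimated_rows: int) -> str:
        """Choose a durability strategy for one statement from the row estimate."""
        if estimated_rows >= ROW_THRESHOLD:
            # Large operation: skip WAL and remember the file for commit time.
            self.pending_sync.add(relfile)
            return "fsync"
        # Small operation: writing WAL is assumed cheaper than an extra fsync.
        self.wal_statements += 1
        return "wal"

    def commit(self) -> None:
        """Issue one fsync per distinct file, however many statements touched it."""
        for _relfile in sorted(self.pending_sync):
            self.fsync_calls += 1  # stand-in for a real fsync on the file
        self.pending_sync.clear()
```

With this sketch, two bulk operations on the same file followed by a small INSERT on another file result in a single fsync at commit plus one WAL-logged statement. How such a scheme behaves when the planner's estimate is badly off is not addressed here.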

Have you thought about recovery, especially when the page size is greater than the disk block size (> 4K)? With WAL, there is a way to restore pages to their original state during recovery if there are partial page writes. Is it possible to do the same with direct fsync without WAL?
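To illustrate the concern, here is a toy simulation in Python (not PostgreSQL code). It assumes an 8K page written to a disk that only guarantees atomic 4K writes; the function names torn_write and recover_with_full_page_image are made up for this sketch.

```python
PAGE_SIZE = 8192   # assumed database page size
BLOCK_SIZE = 4096  # assumed unit the disk writes atomically


def torn_write(on_disk: bytes, new_page: bytes, blocks_written: int) -> bytes:
    """Return the on-disk page if a crash hits after only some blocks land."""
    cut = blocks_written * BLOCK_SIZE
    return new_page[:cut] + on_disk[cut:]


def recover_with_full_page_image(on_disk: bytes, wal_image: bytes) -> bytes:
    """With a full-page image in WAL, recovery can overwrite the torn page."""
    return wal_image


old_page = b"A" * PAGE_SIZE
new_page = b"B" * PAGE_SIZE

# Crash after the first 4K block of the 8K page reached the disk.
after_crash = torn_write(old_page, new_page, blocks_written=1)

# The page is now half new and half old, matching neither version.
is_torn = after_crash != old_page and after_crash != new_page

# With a logged page image, recovery gets back to a consistent page.
recovered = recover_with_full_page_image(after_crash, wal_image=new_page)
```

Without a logged page image there is nothing in this model to rebuild the page from, and fsync by itself does not help if the crash happens while the write is still in progress.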

Gokul.
