Re: [PATCHES] Full page writes improvement, code update

From: Koichi Suzuki
Msg-id: 461D9090.60808@oss.ntt.co.jp
In response to: Re: [PATCHES] Full page writes improvement, code update (Hannu Krosing <hannu@skype.net>)
List: pgsql-hackers

I don't fully understand what "transaction log" means here.  If it means
"archived WAL", the current (8.2) code handles WAL as follows:

1) If full_page_writes=off, no full page writes are written to WAL except
during online backup (between pg_start_backup and pg_stop_backup).  The WAL
will be considerably smaller, but it cannot recover from a partial or
inconsistent write to the database files.  We have to go back to the online
backup and apply all the archived logs.

2) If full_page_writes=on, a full page write is written at the first update
of a page after each checkpoint, in addition to the full page writes in 1).
Because we have no means (in 8.2) to optimize the WAL, all we can do is copy
the WAL or gzip it at archive time (see the example settings after this
list).
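
For reference, a typical 8.2 setup along the lines of 2) might look like the
following; the archive directory is just a placeholder:

    # postgresql.conf (8.2) -- archive each WAL segment, compressed or plain
    full_page_writes = on
    archive_command = 'gzip < %p > /mnt/archive/%f.gz'
    # or, without compression:
    # archive_command = 'cp %p /mnt/archive/%f'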

If we'd like to keep a good chance of recovery after a crash, 8.2 offers
only method 2), which leaves the archived logs considerably large.  My
proposal keeps the chance of crash recovery the same as with
full_page_writes=on while reducing the size of the archived logs to roughly
that of full_page_writes=off, as sketched just below.
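
To make that concrete, the proposed tools would be wired into archiving and
recovery roughly as below; the exact invocation is only a sketch and the
archive path is a placeholder:

    # postgresql.conf: strip full page writes while archiving
    archive_command = 'pg_compresslog %p /mnt/archive/%f'

    # recovery.conf: restore full-size WAL segments before replay
    restore_command = 'pg_decompresslog /mnt/archive/%f %p'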

Regards,

Hannu Krosing wrote:
> On Tue, 2007-04-10 at 18:17, Joshua D. Drake wrote:
>>> In terms of idle time for gzip and other commands used to archive WAL
>>> offline, the environment was identical apart from the archive command.
>>> My guess is that because gzip's user time is very large, the scheduler
>>> has more chances to give resources to other processes.  In the case of
>>> cp, idle time is more than 30 times longer than user time.
>>> pg_compresslog uses seven times more idle time than user time.  On the
>>> other hand, gzip uses less idle time than user time.  Considering the
>>> total amount of user time, I think it is a reasonable measure.
>>>
>>> Again, the point of my proposal is not to improve run-time performance.
>>> The point is to decrease the size of the archived log to save storage.
>> Considering the relatively small amount of storage a transaction log
>> takes, it would seem to me that the performance angle is more appropriate.
>
> As I understand it, it's not about the transaction log but about the
> write-ahead log.
>
> And the amount of data in the WAL can become very important once you have
> to keep standby servers in different physical locations (cities, countries,
> or continents), where channel throughput and cost come into play.
>
> With a simple cp (scp/rsync), the amount of WAL data that needs to be
> copied is about 10x more than the data collected by trigger-based solutions
> (Slony/pgQ).  With pg_compresslog, WAL shipping seems to involve roughly
> the same amount of data and thus becomes a viable alternative again.
>
>> Is it more efficient in other ways besides the negligible TPS difference?
>> Possibly more efficient memory usage?  Better restore times for a crashed
>> system?
>
> I think TPS is affected more by the number of writes than by the size of
> each block written, so there is probably not much to gain in TPS, except
> perhaps from better disk cache usage.
>
> For me, pg_compresslog seems to be a winner even if all it does is avoid
> degrading performance.
>


--
Koichi Suzuki

