Re: BUG #7521: Cannot disable WAL log while using pg_dump - Mailing list pgsql-bugs

From: Gezeala M. Bacuño II
Subject: Re: BUG #7521: Cannot disable WAL log while using pg_dump
Msg-id: CAJKO3mXknzYWAqg=E=SB9C0ZbczkfbkQc==+01SfyjTDzkvLyQ@mail.gmail.com
In response to: Re: BUG #7521: Cannot disable WAL log while using pg_dump ("Boy de Laat" <boy@atsc.nl>)
Responses: Re: BUG #7521: Cannot disable WAL log while using pg_dump (Gezeala M. Bacuño II <gezeala@gmail.com>)
List: pgsql-bugs
Adding the pgsql-bugs list in case the OP posts back.

On Fri, Sep 7, 2012 at 11:29 AM, Pavan Deolasee
<pavan.deolasee@gmail.com> wrote:
> (Adding -hackers. Did not realize it got dropped)
>
> On Fri, Sep 7, 2012 at 11:25 PM, Gezeala M. Bacuño II <gezeala@gmail.com>
> wrote:
>>
>> On Fri, Sep 7, 2012 at 7:17 AM, Pavan Deolasee <pavan.deolasee@gmail.com>
>> wrote:
>> >
>> >
>> > On Fri, Sep 7, 2012 at 7:00 PM, Marie Bacuno II <gezeala@gmail.com>
>> > wrote:
>> >>
>> >>
>> >> On Sep 7, 2012, at 2:19, Pavan Deolasee <pavan.deolasee@gmail.com>
>> >> wrote:
>> >>
>> >>
>> >> > or have long-running transactions?
>> >>
>> >> Yes, but I don't think there were any when the snapshot was taken. Does
>> >> the pg_xlog_location_diff() result between the latest checkpoint and the
>> >> prior one upon start-up indicate the size of replayed changes?
>> >>
>> >
>> > That's the amount of additional WAL generated after you started the
>> > server.
>> >
>> >>
>> >> >
>> >> > BTW, the following query returns ~60GB. That's the amount of WAL
>> >> > written between the server start and the end of pg_dump (I don't
>> >> > think pg_xlog_location_diff() is available in the older releases).
>> >> >
>> >> > postgres=# select pg_xlog_location_diff('4450/7A14F280',
>> >> > '4441/5E681F38')/(2^30);
>> >> >    ?column?
>> >> > ------------------
>> >> > 60.1980484202504
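(A minimal sketch of scripting the same measurement around a dump, on
releases new enough to have pg_xlog_location_diff(), 9.2 and later;
'4441/5E681F38' stands in for the position noted before the dump:

postgres=# SELECT pg_current_xlog_location();  -- note the result
## ... run pg_dump ...
postgres=# SELECT pg_xlog_location_diff(pg_current_xlog_location(), '4441/5E681F38') / (2^30);

On 8.4, where the function is missing, the two positions would have to be
diffed by hand as Pavan did above.)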
>> >>
>> >> It'd be great to know what the WALs modified.
>> >
>> >
>> > You would need something like xlogdump to decipher them. I quickly tried
>> > this, and it seems to work against the 8.4 version you are running:
>> > https://github.com/snaga/xlogdump
>> >
>> > Download the source code, compile it, and run it against one of the most
>> > recent WAL files in the cluster against which you ran pg_dump. You would
>> > need to set PATH to contain the pg_config of the server you are running.
>> > Please post the output.
>> >
>> > Thanks,
>> > Pavan
>> >
>> >
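(A minimal sketch of those build steps, assuming git is available;
/usr/local/pgsql/bin is a placeholder for the bin directory of the 8.4
server being inspected:

$ git clone https://github.com/snaga/xlogdump
$ cd xlogdump
## PATH must contain the pg_config of the server you are running
$ export PATH=/usr/local/pgsql/bin:$PATH
$ make && make install
$ xlogdump -S /dbpool/data/pg_xlog/00000001000044500000007A
)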
>>
>> Here you go:
>>
>> ## last WAL
>> $ xlogdump -S /dbpool/data/pg_xlog/00000001000044500000007A
>>
>> /dbpool/data/pg_xlog/00000001000044500000007A:
>>
>> Unexpected page info flags 0003 at offset 0
>> Skipping unexpected continuation record at offset 0
>> ReadRecord: record with zero len at 17488/7A14F310
>> Unexpected page info flags 0001 at offset 150000
>> Skipping unexpected continuation record at offset 150000
>> Unable to read continuation page?
>>  ** maybe continues to next segment **
>> ---------------------------------------------------------------
>> TimeLineId: 1, LogId: 17488, LogSegment: 122
>>
>> Resource manager stats:
>>   [0]XLOG      : 3 records, 120 bytes (avg 40.0 bytes)
>>                  checkpoint: 3, switch: 0, backup end: 0
>>   [1]Transaction: 0 record, 0 byte (avg 0.0 byte)
>>                  commit: 0, abort: 0
>>   [2]Storage   : 0 record, 0 byte (avg 0.0 byte)
>>   [3]CLOG      : 0 record, 0 byte (avg 0.0 byte)
>>   [4]Database  : 0 record, 0 byte (avg 0.0 byte)
>>   [5]Tablespace: 0 record, 0 byte (avg 0.0 byte)
>>   [6]MultiXact : 0 record, 0 byte (avg 0.0 byte)
>>   [7]Reserved 7: 0 record, 0 byte (avg 0.0 byte)
>>   [8]Reserved 8: 0 record, 0 byte (avg 0.0 byte)
>>   [9]Heap2     : 2169 records, 43380 bytes (avg 20.0 bytes)
>>   [10]Heap      : 0 record, 0 byte (avg 0.0 byte)
>>                  ins: 0, upd/hot_upd: 0/0, del: 0
>>   [11]Btree     : 0 record, 0 byte (avg 0.0 byte)
>>   [12]Hash      : 0 record, 0 byte (avg 0.0 byte)
>>   [13]Gin       : 0 record, 0 byte (avg 0.0 byte)
>>   [14]Gist      : 0 record, 0 byte (avg 0.0 byte)
>>   [15]Sequence  : 0 record, 0 byte (avg 0.0 byte)
>>
>> Backup block stats: 2169 blocks, 16551816 bytes (avg 7631.1 bytes)
>>
>
> I think both my theories are holding up. The Heap2 resource manager is
> used only for vacuum freeze, lazy vacuum, or HOT prune. Given your access
> pattern, I bet it's the third activity that's kicking in on your database.
> You've got many pages with dead tuples, and they are getting cleaned at
> the first opportunity, which happens to be the pg_dump that's run
> immediately after the server restart. This is seen in all 2169 WAL records
> in the file being attributed to the Heap2 RM above.
>
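(As a rough check of the dead-tuple theory, the stats views can show which
tables carry them; a minimal sketch against 8.4's pg_stat_user_tables:

postgres=# SELECT relname, n_dead_tup, last_autovacuum
postgres-#   FROM pg_stat_user_tables
postgres-#  ORDER BY n_dead_tup DESC LIMIT 10;
)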
> What's additionally happening is that each of these records is on a
> different heap page. The cleanup activity dirties those pages. Since each
> of these pages is being dirtied for the first time after a recent
> checkpoint and full_page_writes is turned ON, the entire page is backed up
> in the WAL record. You can see the exact number of backup blocks in the
> stats above.
>
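(The arithmetic bears this out: 16,551,816 bytes over 2169 backup blocks is
roughly 7,631 bytes per block, just under the 8,192-byte page size since
the free-space "hole" in each page is omitted from the image, and together
the backup blocks account for nearly the whole 16 MB segment.)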
> I don't think we have any mechanism to control or stop HOT from doing what
> it wants to do, unless you are willing to run a modified server for this
> reason. But you can at least bring down the WAL volume by turning
> full_page_writes OFF.
>
> Thanks,
> Pavan

Great. Finally got some light shed on this. I'll disable full_page_writes
on my next backup run and will post back results tomorrow. Thanks.
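
(A minimal sketch of that change; full_page_writes only needs a reload, not
a restart, and /dbpool/data is the data directory from the commands above:

## in postgresql.conf:
full_page_writes = off

$ pg_ctl reload -D /dbpool/data

postgres=# SHOW full_page_writes;
 full_page_writes
------------------
 off

Note that turning full_page_writes off risks torn pages after an OS or
hardware crash, so it is only safe on storage that prevents partial page
writes.)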


