Thread: Re: [GENERAL] VACUUM touching file but not updating relation

Re: [GENERAL] VACUUM touching file but not updating relation

From
Tom Lane
Date:
Thom Brown <thom@linux.com> writes:
> On 11 November 2011 00:55, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Thom Brown <thom@linux.com> writes:
>>> I just noticed that the VACUUM process touches a lot of relations
>>> (affects mtime) but for one file I looked at, it didn't change.  This
>>> doesn't always happen, and many relations aren't touched at all.

>> No immmediate ideas as to why the mtime would change if the file
>> contents didn't.  It seems like there must be a code path that marked
>> a buffer dirty without having changed it, but we're usually pretty
>> careful about that.

> I checked all files where the time stamp of the file had changed, but
> had the same MD5 sum.  I used the list in the query you mentioned and
> get: [ mostly indexes ]

Hmm, is this on a hot standby master?

I observe that _bt_delitems_vacuum() unconditionally dirties the page
and writes a WAL record, whether it has anything to do or not; and that
if XLogStandbyInfoActive() then btvacuumscan will indeed call it despite
there being (probably) nothing useful to do.  Seems like that could be
improved.  The comment explaining why it's necessary to do that doesn't
make any sense to me, either.

            regards, tom lane

Re: [GENERAL] VACUUM touching file but not updating relation

From
Thom Brown
Date:
On 11 November 2011 23:28, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Thom Brown <thom@linux.com> writes:
>> On 11 November 2011 00:55, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> Thom Brown <thom@linux.com> writes:
>>>> I just noticed that the VACUUM process touches a lot of relations
>>>> (affects mtime) but for one file I looked at, it didn't change.  This
>>>> doesn't always happen, and many relations aren't touched at all.
>
>>> No immmediate ideas as to why the mtime would change if the file
>>> contents didn't.  It seems like there must be a code path that marked
>>> a buffer dirty without having changed it, but we're usually pretty
>>> careful about that.
>
>> I checked all files where the time stamp of the file had changed, but
>> had the same MD5 sum.  I used the list in the query you mentioned and
>> get: [ mostly indexes ]
>
> Hmm, is this on a hot standby master?

It's using a wal_level of hot_standby and has max_wal_senders set to
2, but it's not actually replicating to anywhere else.  But if I
comment out both of these, restart, then compare pre-vacuum and
post-vacuum, I get the following results for unchanged but touched
items:

test=# select oid,relname from pg_class where relfilenode in
(11680,11682,11684,11686,11690,16530);
  oid  |       relname
-------+---------------------
  2619 | pg_statistic
  2840 | pg_toast_2619
  2841 | pg_toast_2619_index
 16530 | cows2
(4 rows)

The items which didn't match a result in this instance were 11686 and
11690, which is surprising since they both have a visibility map and
free space map, indicating they're some kind of table.

> I observe that _bt_delitems_vacuum() unconditionally dirties the page
> and writes a WAL record, whether it has anything to do or not; and that
> if XLogStandbyInfoActive() then btvacuumscan will indeed call it despite
> there being (probably) nothing useful to do.  Seems like that could be
> improved.  The comment explaining why it's necessary to do that doesn't
> make any sense to me, either.

Well the effect, in the single instances I've checked, is certainly
more pronounced for hot_standby, but there still appears to be some
occurrences for minimal wal_level too.

--
Thom Brown
Twitter: @darkixion
IRC (freenode): dark_ixion
Registered Linux user: #516935

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: [GENERAL] VACUUM touching file but not updating relation

From
Thom Brown
Date:
On 12 November 2011 00:08, Thom Brown <thom@linux.com> wrote:
> On 11 November 2011 23:28, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Thom Brown <thom@linux.com> writes:
>>> On 11 November 2011 00:55, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>>> Thom Brown <thom@linux.com> writes:
>>>>> I just noticed that the VACUUM process touches a lot of relations
>>>>> (affects mtime) but for one file I looked at, it didn't change.  This
>>>>> doesn't always happen, and many relations aren't touched at all.
>>
>>>> No immmediate ideas as to why the mtime would change if the file
>>>> contents didn't.  It seems like there must be a code path that marked
>>>> a buffer dirty without having changed it, but we're usually pretty
>>>> careful about that.
>>
>>> I checked all files where the time stamp of the file had changed, but
>>> had the same MD5 sum.  I used the list in the query you mentioned and
>>> get: [ mostly indexes ]
>>
>> Hmm, is this on a hot standby master?
>
> It's using a wal_level of hot_standby and has max_wal_senders set to
> 2, but it's not actually replicating to anywhere else.  But if I
> comment out both of these, restart, then compare pre-vacuum and
> post-vacuum, I get the following results for unchanged but touched
> items:
>
> test=# select oid,relname from pg_class where relfilenode in
> (11680,11682,11684,11686,11690,16530);
>  oid  |       relname
> -------+---------------------
>  2619 | pg_statistic
>  2840 | pg_toast_2619
>  2841 | pg_toast_2619_index
>  16530 | cows2
> (4 rows)
>
> The items which didn't match a result in this instance were 11686 and
> 11690, which is surprising since they both have a visibility map and
> free space map, indicating they're some kind of table.
>
>> I observe that _bt_delitems_vacuum() unconditionally dirties the page
>> and writes a WAL record, whether it has anything to do or not; and that
>> if XLogStandbyInfoActive() then btvacuumscan will indeed call it despite
>> there being (probably) nothing useful to do.  Seems like that could be
>> improved.  The comment explaining why it's necessary to do that doesn't
>> make any sense to me, either.
>
> Well the effect, in the single instances I've checked, is certainly
> more pronounced for hot_standby, but there still appears to be some
> occurrences for minimal wal_level too.

So would you say this is acceptable and normal activity, or is
something awry here?

--
Thom Brown
Twitter: @darkixion
IRC (freenode): dark_ixion
Registered Linux user: #516935

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: [GENERAL] VACUUM touching file but not updating relation

From
Tom Lane
Date:
Thom Brown <thom@linux.com> writes:
>> On 11 November 2011 23:28, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> I observe that _bt_delitems_vacuum() unconditionally dirties the page
>>> and writes a WAL record, whether it has anything to do or not; and that
>>> if XLogStandbyInfoActive() then btvacuumscan will indeed call it despite
>>> there being (probably) nothing useful to do.  Seems like that could be
>>> improved.  The comment explaining why it's necessary to do that doesn't
>>> make any sense to me, either.

>> Well the effect, in the single instances I've checked, is certainly
>> more pronounced for hot_standby, but there still appears to be some
>> occurrences for minimal wal_level too.

> So would you say this is acceptable and normal activity, or is
> something awry here?

Well, it's expected given the current coding in the btree vacuum logic.
It's not clear to me why it was written like that, though.

            regards, tom lane

Re: [GENERAL] VACUUM touching file but not updating relation

From
Simon Riggs
Date:
On Fri, Nov 18, 2011 at 2:47 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Thom Brown <thom@linux.com> writes:
>>> On 11 November 2011 23:28, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>>> I observe that _bt_delitems_vacuum() unconditionally dirties the page
>>>> and writes a WAL record, whether it has anything to do or not; and that
>>>> if XLogStandbyInfoActive() then btvacuumscan will indeed call it despite
>>>> there being (probably) nothing useful to do.  Seems like that could be
>>>> improved.  The comment explaining why it's necessary to do that doesn't
>>>> make any sense to me, either.
>
>>> Well the effect, in the single instances I've checked, is certainly
>>> more pronounced for hot_standby, but there still appears to be some
>>> occurrences for minimal wal_level too.
>
>> So would you say this is acceptable and normal activity, or is
>> something awry here?
>
> Well, it's expected given the current coding in the btree vacuum logic.
> It's not clear to me why it was written like that, though.

I'll take a look.

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Re: [GENERAL] VACUUM touching file but not updating relation

From
Simon Riggs
Date:
On Fri, Nov 18, 2011 at 2:47 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Thom Brown <thom@linux.com> writes:
>>> On 11 November 2011 23:28, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>>> I observe that _bt_delitems_vacuum() unconditionally dirties the page
>>>> and writes a WAL record, whether it has anything to do or not; and that
>>>> if XLogStandbyInfoActive() then btvacuumscan will indeed call it despite
>>>> there being (probably) nothing useful to do.  Seems like that could be
>>>> improved.  The comment explaining why it's necessary to do that doesn't
>>>> make any sense to me, either.
>
>>> Well the effect, in the single instances I've checked, is certainly
>>> more pronounced for hot_standby, but there still appears to be some
>>> occurrences for minimal wal_level too.
>
>> So would you say this is acceptable and normal activity, or is
>> something awry here?
>
> Well, it's expected given the current coding in the btree vacuum logic.
> It's not clear to me why it was written like that, though.

The code works as designed.

_bt_delitems_vacuum() is only ever called with nitems == 0 when it is
the last block of the relation with wal_level = hot standby

As discussed in the comments we must issue a WAL record for the last
block, whatever else has occurred.

So the correct number of WAL records is emitted and I see no bug there.

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Re: [GENERAL] VACUUM touching file but not updating relation

From
Tom Lane
Date:
Simon Riggs <simon@2ndQuadrant.com> writes:
> On Fri, Nov 18, 2011 at 2:47 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Well, it's expected given the current coding in the btree vacuum logic.
>> It's not clear to me why it was written like that, though.

> The code works as designed.

> _bt_delitems_vacuum() is only ever called with nitems == 0 when it is
> the last block of the relation with wal_level = hot standby

> As discussed in the comments we must issue a WAL record for the last
> block, whatever else has occurred.

> So the correct number of WAL records is emitted and I see no bug there.

What Thom's complaining about is that the buffer may be marked dirty
unnecessarily, ie when there has been no actual data change.

            regards, tom lane

Re: [GENERAL] VACUUM touching file but not updating relation

From
Simon Riggs
Date:
On Fri, Nov 18, 2011 at 3:18 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

> What Thom's complaining about is that the buffer may be marked dirty
> unnecessarily, ie when there has been no actual data change.

OK, I'll patch it.

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Re: [GENERAL] VACUUM touching file but not updating relation

From
Simon Riggs
Date:
On Fri, Nov 18, 2011 at 3:18 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

>> So the correct number of WAL records is emitted and I see no bug there.
>
> What Thom's complaining about is that the buffer may be marked dirty
> unnecessarily, ie when there has been no actual data change.

Based upon both your feedback, I made a change to stop the block being
marked dirty, though Tom now wants that removed.

Thom, your earlier analysis showing that the md5 checksum of a
relation had changed is not happening because of the section of code
you identified. The code sets some data on the page, which would cause
the md5 checksum to change. So it cannot be the btree code  at
_bt_delitems_vacuum() causing this.

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services