Thread: [HACKERS] On markers of changed data

[HACKERS] On markers of changed data

From
Andrey Borodin
Date:
Hi, hackers!

Currently I'm working on page-level incremental backups using WAL-G codebase[0]. And I have two questions that I cannot
resolvemyself. 

Incremental backup is a set of changes, that should be applied over preexisting backup. I use page LSN to understand
shouldpage be backup`ed or not. 

Question 1. FSM and VM.
As you can see here [1] FSM and VM files are exempt from incremental tracking and are backuped as whole files. I've
doneit this way, because sanity checks [2] of page headers have indicated a lot of "invalid" pages in FSM and VM files.
Butseems like in some pages headers are valid with sane LSNs. 
Can I use LSNs as history marker on FSM and VM pages? On 1Tb backup I get like 150Mb of FSM+VM, and it's kind of a lot.

Question 2. File dates.
Is it safe to use file modification time to track that file were changes since previous backup? If the file has date
beforestart of previous backup I just add it to "skip list" [3]. 
I have assumption: every time file is changes in filesystem, it's modification date is updated to higher value.
Is this assumption valid for most of used platforms and filesystems? Or can I check this "capacity" of FS?

Thank you for your attention. I'll be glad to receive any information\pointers on this matter.


Best regards, Andrey Borodin, Yandex.

[0] https://github.com/wal-g/wal-g/pull/29
[1] https://github.com/wal-g/wal-g/pull/29/files#diff-d77406e827f5f947d4d4a1e6d76c1f4eR114
[2] https://github.com/wal-g/wal-g/pull/29/files#diff-d77406e827f5f947d4d4a1e6d76c1f4eR50
[3] https://github.com/wal-g/wal-g/pull/29/files#diff-f5c8f0067297f98eb5acc6e2c6b1b234R87

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] On markers of changed data

From
Tom Lane
Date:
Andrey Borodin <x4mmm@yandex-team.ru> writes:
> Is it safe to use file modification time to track that file were changes
> since previous backup?

I'd say no:

1. You don't know the granularity of the filesystem's timestamps, at least
not without making unportable assumptions.

2. There's no guarantee that the system clock can't be set backwards.

3. It's not uncommon for filesystems to have optimizations whereby they
skip or delay some updates of file mtimes.  (I think this is usually
optional, but you couldn't know whether it's turned on.)

#2 is probably the worst of these problems.
        regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] On markers of changed data

From
Michael Paquier
Date:
On Fri, Oct 6, 2017 at 11:22 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Andrey Borodin <x4mmm@yandex-team.ru> writes:
>> Is it safe to use file modification time to track that file were changes
>> since previous backup?
>
> I'd say no:
>
> 1. You don't know the granularity of the filesystem's timestamps, at least
> not without making unportable assumptions.
>
> 2. There's no guarantee that the system clock can't be set backwards.
>
> 3. It's not uncommon for filesystems to have optimizations whereby they
> skip or delay some updates of file mtimes.  (I think this is usually
> optional, but you couldn't know whether it's turned on.)
>
> #2 is probably the worst of these problems.

Or upwards. A simple example of things depending on clock changes is
for example VM snapshotting. Any logic not depending on monotonic
timestamps, with things like clock_gettime(CLOCK_MONOTONIC) is a lot
of fun to investigate until you know that they are not using any
monotonic logic... So the answer is *no*, do not depend on FS-level
timestamps. The only sane method for Postgres is really to scan the
page header LSNs, and of course you already know that.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] On markers of changed data

From
Alvaro Herrera
Date:
Michael Paquier wrote:

> The only sane method for Postgres is really to scan the
> page header LSNs, and of course you already know that.

I hope the idea is not to have to scan every single page in the
database, because that would be too slow.  It should be possible to
build this so that a single summary LSN is kept for a largish group of
pages, allowing large portions of the database to be skipped from even
being read if they are known not to contain any page newer than the
previous backup.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] On markers of changed data

From
Stephen Frost
Date:
Tom, Michael,

* Michael Paquier (michael.paquier@gmail.com) wrote:
> On Fri, Oct 6, 2017 at 11:22 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > Andrey Borodin <x4mmm@yandex-team.ru> writes:
> >> Is it safe to use file modification time to track that file were changes
> >> since previous backup?
> >
> > I'd say no:
> >
> > 1. You don't know the granularity of the filesystem's timestamps, at least
> > not without making unportable assumptions.
> >
> > 2. There's no guarantee that the system clock can't be set backwards.
> >
> > 3. It's not uncommon for filesystems to have optimizations whereby they
> > skip or delay some updates of file mtimes.  (I think this is usually
> > optional, but you couldn't know whether it's turned on.)
> >
> > #2 is probably the worst of these problems.
>
> Or upwards. A simple example of things depending on clock changes is
> for example VM snapshotting. Any logic not depending on monotonic
> timestamps, with things like clock_gettime(CLOCK_MONOTONIC) is a lot
> of fun to investigate until you know that they are not using any
> monotonic logic... So the answer is *no*, do not depend on FS-level
> timestamps. The only sane method for Postgres is really to scan the
> page header LSNs, and of course you already know that.

Really, these comments appear, at least to me, to be based on an
incorrect assumption that's only considering how tools like rsync use
mtime.

No, you can't trust rsync-based backups that look at mtime and only copy
if the mtime of the source file is currently 'more recent' than the
mtime of the destination file.

That doesn't mean that mtime can't be used to perform incremental
backups using PG, but care has to be taken when doing so to minimize the
risk of a file getting skipped that should have been copied.

There's a few things to do to minimize that risk:

Use mtime only as an indication of if the file changed from the last
time you looked at it- doesn't matter if the mtime on the file is newer
or older.  If the mtime is *different*, then you can't trust that the
contents are the same and you need to include it in the backup.  Of
course, combine this with checking the file size has changed, but in PG
there's lots of files of the same size, so that's not a terribly good
indicator.

Further, you have to get the mtime of all the files *before* you start
backing them up.  If you take the mtime of the file at the time you
start actually copying it, then it could possibly be modified while you
copy it but without the mtime being updated from when you initially
pulled it (and that's not even talking about the concerns around the
clock time moving back and forth).  To address the granularity concern,
you should also be sure to wait after you collect all the mtimes but
before actually starting the backup to the level of granularity.  Any
optimization which delays setting the mtime would, certainly, still get
around to updating the mtime before the next backup runs and therefore
that file might get copied even though it hadn't changed, but that's
still technically correct, just slightly more work.  Lastly, don't trust
any times which are from after the time that you collected the mtimes-
either during the initial backup or when you are doing the subsequent
incremental.  Any file whose mtime is different *or* is from after the
time the mtimes were collected should be copied.

This isn't to say that there isn't some risk to using mtime, there still
is- if a backup is made of a file and its mtime collected, and then time
moves backwards, and the file is modified again at the *exact* same
time, leading the 'new' mtime to be identical to the 'old' mtime while
the file's contents are different, and that file is not subsequently
modified before the next backup happens, then the file might not be
included in the backup even though it should be.

Other risks are just blatent corruption happening in the mtime field, or
a kernel-level bug that doesn't update mtime when it should, or the
kernel somehow resetting the mtime back after the file has been changed,
or someone explicitly setting the mtime back after changing a file, or
perhaps other such attacks, though eliminating all of those risks isn't
possible (regardless of solution- someone could go change the LSN on a
page too, for example, and foil a tool which was based on that).

These are risks which I'd love to remove, but they also strike me as
quite small and ones which practical users are willing to accept for
their incremental and differential backups, though it's a reason to also
take full backups regularly.

As Alvaro notes downthread, it's also the only reasonable option
available today.  It'd be great to have a better solution, and perhaps
one which summarizes the LSNs in each file would work and be better, but
that would also only be available for PG11, at the earliest.

Thanks!

Stephen

Re: [HACKERS] On markers of changed data

From
Michael Paquier
Date:

> Le 6 oct. 2017 à 23:44, Alvaro Herrera <alvherre@alvh.no-ip.org> a écrit :
>
> Michael Paquier wrote:
>
>> The only sane method for Postgres is really to scan the
>> page header LSNs, and of course you already know that.
>
> I hope the idea is not to have to scan every single page in the
> database, because that would be too slow.  It should be possible to
> build this so that a single summary LSN is kept for a largish group of
> pages, allowing large portions of the database to be skipped from even
> being read if they are known not to contain any page newer than the
> previous backup.

That’s actually what pg_rman is doing for what it calls incremental backups (perhaps that would be differential backup
inPG terminology?), and the performance is bad as you can imagine. We could have a dedicated LSN map to do such things
with4 bytes per page. I am still not convinced that this much facility and the potential bug risks are worth it though,
Postgresalready knows about differential backups if you shape it as a delta of WAL segments. I think that, in order to
finda LSN map more convincing, we should find first other use cases where it could become useful. Some use cases may
popup with VACUUM, but I have not studied the question hard enough... 
--
Michael

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] On markers of changed data

From
Alvaro Herrera
Date:
Michael Paquier wrote:

> That’s actually what pg_rman is doing for what it calls incremental
> backups (perhaps that would be differential backup in PG
> terminology?), and the performance is bad as you can imagine. We could
> have a dedicated LSN map to do such things with 4 bytes per page. I am
> still not convinced that this much facility and the potential bug
> risks are worth it though, Postgres already knows about differential
> backups if you shape it as a delta of WAL segments. I think that, in
> order to find a LSN map more convincing, we should find first other
> use cases where it could become useful. Some use cases may pop up with
> VACUUM, but I have not studied the question hard enough...

The case I've discussed with barman developers is a large database
(couple dozen of TBs should be enough) where a large fraction (say 95%)
is read-only but there are many changes to the active part of the data,
so that WAL is more massive than size of active data.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] On markers of changed data

From
Stephen Frost
Date:
Alvaro, Michael,

* Alvaro Herrera (alvherre@alvh.no-ip.org) wrote:
> Michael Paquier wrote:
> > That’s actually what pg_rman is doing for what it calls incremental
> > backups (perhaps that would be differential backup in PG
> > terminology?), and the performance is bad as you can imagine. We could
> > have a dedicated LSN map to do such things with 4 bytes per page. I am
> > still not convinced that this much facility and the potential bug
> > risks are worth it though, Postgres already knows about differential
> > backups if you shape it as a delta of WAL segments. I think that, in
> > order to find a LSN map more convincing, we should find first other
> > use cases where it could become useful. Some use cases may pop up with
> > VACUUM, but I have not studied the question hard enough...
>
> The case I've discussed with barman developers is a large database
> (couple dozen of TBs should be enough) where a large fraction (say 95%)
> is read-only but there are many changes to the active part of the data,
> so that WAL is more massive than size of active data.

Yes, we've seen environments like that also.

Thanks!

Stephen

Re: [HACKERS] On markers of changed data

From
Andrey Borodin
Date:
Tom, Alvaro, Michael, and especially Septhen, thank you for your valuable comments.

I feel enlightened about mtime.
My takeaway is:
1. Any other marker would be better (It can be WAL scan during archiving, some new LSN-based mechanics* et c.)
2. mtime could be used, with precautions described by Stephen are taken.

But my other question still seems unanswered: can I use LSN logic for incrementing FSM and VM? Seems like most of the
timethere is valid LSN 


* I like the idea of using something for both incr(diff) backups and VACUUM, it worth thinking about.

Best regards, Andrey Borodin.



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] On markers of changed data

From
Stephen Frost
Date:
Andrey,

* Andrey Borodin (x4mmm@yandex-team.ru) wrote:
> But my other question still seems unanswered: can I use LSN logic for incrementing FSM and VM? Seems like most of the
timethere is valid LSN
 

I haven't gone and audited it myself, but I would certainly expect you
to be able to use the LSN for everything which is WAL'd.  If you have
cases where that's not the case, it'd be useful to see them.

Thanks!

Stephen

Re: [HACKERS] On markers of changed data

From
Andrey Borodin
Date:
> 8 окт. 2017 г., в 20:11, Stephen Frost <sfrost@snowman.net> написал(а):
> * Andrey Borodin (x4mmm@yandex-team.ru) wrote:
>> But my other question still seems unanswered: can I use LSN logic for incrementing FSM and VM? Seems like most of
thetime there is valid LSN 
>
> I haven't gone and audited it myself, but I would certainly expect you
> to be able to use the LSN for everything which is WAL'd.  If you have
> cases where that's not the case, it'd be useful to see them.

Thanks, Stephen, this actually pointed what to look for
VM is WAL-logged [0]
FSM is not [1]

Now I have everything I wanted, and go back coding :)

Best regards, Andrey Borodin.

[0]
https://github.com/postgres/postgres/blob/113b0045e20d40f726a0a30e33214455e4f1385e/src/backend/access/heap/visibilitymap.c#L315
[1]
https://github.com/postgres/postgres/blob/1d25779284fe1ba08ecd57e647292a9deb241376/src/backend/storage/freespace/freespace.c#L593

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] On markers of changed data

From
Michael Paquier
Date:
On Mon, Oct 9, 2017 at 2:23 PM, Andrey Borodin <x4mmm@yandex-team.ru> wrote:
>> I haven't gone and audited it myself, but I would certainly expect you
>> to be able to use the LSN for everything which is WAL'd.  If you have
>> cases where that's not the case, it'd be useful to see them.
>
> Thanks, Stephen, this actually pointed what to look for
> VM is WAL-logged [0]
> FSM is not [1]

If you are willing to go down this road, here are my takes on the matter:
- Any LSN map should be in a different file than FSM and VM. The VM
uses 2 bits per blocks now, and the FSM 8 bits. The FSM is wanted to
remain small, so trying to plug into it more data is not welcome. The
VM bits also are dedicated to purely visibility matters. We may not
want to use that for the VM.
- Those should not be logged, as we would end up with tracking down
WAL records for things that themselves track the effects of other
records.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] On markers of changed data

From
Andrey Borodin
Date:
Hi Michael!
9 окт. 2017 г., в 17:28, Michael Paquier <michael.paquier@gmail.com> написал(а):
VM is WAL-logged [0]
FSM is not [1]

If you are willing to go down this road, here are my takes on the matter:
- Any LSN map should be in a different file than FSM and VM. The VM
uses 2 bits per blocks now, and the FSM 8 bits. The FSM is wanted to
remain small, so trying to plug into it more data is not welcome. The
VM bits also are dedicated to purely visibility matters. We may not
want to use that for the VM.
- Those should not be logged, as we would end up with tracking down
WAL records for things that themselves track the effects of other
records.
I was asking about FSM and VM not because I wanted to place something there, but because I was looking for the way to backup FSM and VM efficiently. VM can be copied page-incrementally, FSM cannot.

But the design you are describing resembles PTRACK[0]: fork for page changes tracking, not WAL-logged, but crash safe due to recovery from others WALs.

Best regards, Andrey Borodin.


[0] https://gist.github.com/stalkerg/ab833d94e2f64df241f1835651e06e4b

Re: [HACKERS] On markers of changed data

From
Peter Geoghegan
Date:
On Sat, Oct 7, 2017 at 6:34 AM, Stephen Frost <sfrost@snowman.net> wrote:
>> > That’s actually what pg_rman is doing for what it calls incremental
>> > backups (perhaps that would be differential backup in PG
>> > terminology?), and the performance is bad as you can imagine. We could
>> > have a dedicated LSN map to do such things with 4 bytes per page. I am
>> > still not convinced that this much facility and the potential bug
>> > risks are worth it though, Postgres already knows about differential
>> > backups if you shape it as a delta of WAL segments. I think that, in
>> > order to find a LSN map more convincing, we should find first other
>> > use cases where it could become useful. Some use cases may pop up with
>> > VACUUM, but I have not studied the question hard enough...
>>
>> The case I've discussed with barman developers is a large database
>> (couple dozen of TBs should be enough) where a large fraction (say 95%)
>> is read-only but there are many changes to the active part of the data,
>> so that WAL is more massive than size of active data.
>
> Yes, we've seen environments like that also.

I'm pretty sure that those cases are cases where there are many more
FPIs than might be expected, due to a lack of locality. (UUID PKs can
make the size of WAL balloon, for example.)

--
Peter Geoghegan


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] On markers of changed data

From
Greg Stark
Date:
On 8 October 2017 at 08:52, Andrey Borodin <x4mmm@yandex-team.ru> wrote:
>
> 1. Any other marker would be better (It can be WAL scan during archiving, some new LSN-based mechanics* et c.)

The general shape of what I would like to see is some log which lists
where each checkpoint starts and ends and what blocks are modified
since the previous checkpoint. Then to generate an incremental backup
from any point in time to the current you union all the block lists
between them and fetch those blocks. There are other ways of using
this aside from incremental backups on disk too -- you could imagine a
replica that has fallen behind requesting the block lists and then
fetching just those blocks instead of needing to receive and apply all
the wal. Or possibly even making a cost-based decision between the two
depending on which would be faster.

It would also be useful for going in the reverse direction: look up
all the records (or just the last record) that modified a given block.
Instead of having to scan all the wal you would only need to scan the
checkpoint eras that had touched that block.



-- 
greg


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] On markers of changed data

From
Alvaro Herrera
Date:
Greg Stark wrote:

> The general shape of what I would like to see is some log which lists
> where each checkpoint starts and ends and what blocks are modified
> since the previous checkpoint. Then to generate an incremental backup
> from any point in time to the current you union all the block lists
> between them and fetch those blocks. There are other ways of using
> this aside from incremental backups on disk too -- you could imagine a
> replica that has fallen behind requesting the block lists and then
> fetching just those blocks instead of needing to receive and apply all
> the wal.

Hmm, this sounds pretty clever.  And we already have the blocks touched
by each record thanks to the work for pg_rewind (so we don't have to do
any nasty tricks like the stuff Suzuki-san did for pg_lesslog, where
each WAL record had to be processed individually to know what blocks it
referenced), so it shouldn't be *too* difficult ...

> It would also be useful for going in the reverse direction: look up
> all the records (or just the last record) that modified a given block.

Well, a LSN map is what I was suggesting.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] On markers of changed data

From
Stephen Frost
Date:
Alvaro,

* Alvaro Herrera (alvherre@alvh.no-ip.org) wrote:
> Greg Stark wrote:
>
> > The general shape of what I would like to see is some log which lists
> > where each checkpoint starts and ends and what blocks are modified
> > since the previous checkpoint. Then to generate an incremental backup
> > from any point in time to the current you union all the block lists
> > between them and fetch those blocks. There are other ways of using
> > this aside from incremental backups on disk too -- you could imagine a
> > replica that has fallen behind requesting the block lists and then
> > fetching just those blocks instead of needing to receive and apply all
> > the wal.
>
> Hmm, this sounds pretty clever.  And we already have the blocks touched
> by each record thanks to the work for pg_rewind (so we don't have to do
> any nasty tricks like the stuff Suzuki-san did for pg_lesslog, where
> each WAL record had to be processed individually to know what blocks it
> referenced), so it shouldn't be *too* difficult ...

Yeah, it sounds interesting, but I was just chatting w/ David about it
and we were thinking about how checkpoints are really rather often done,
so you end up with quite a few of these lists being out there.

Now, if the lists were always kept in a sorted fashion, then perhaps we
would be able to essentially merge-sort them all back together and
de-dup that way, but even then, you're talking about an awful lot if
you're looking at daily incrementals- that's 288 standard 5-minute
checkpoints, each with some 128k pages changed, assuming max_wal_size of
1GB, and I think we can all agree that the default max_wal_size is for
rather small systems.  That ends up being something around 2MB per
checkpoint to store the pages in or half a gig per day just to keep
track of the pages which changed in each checkpoint across that day.

There's a bit of hand-waving in there, but I don't think it's all that
much to reach a conclusion that this might not be the best approach.
David and I were kicking around the notion of a 'last LSN' which is kept
on a per-relation basis, but, of course, that ends up not really being
granular enough, and would likely be a source of contention unless we
could work out a way to make it "lazy" updated somehow, or similar.

> > It would also be useful for going in the reverse direction: look up
> > all the records (or just the last record) that modified a given block.
>
> Well, a LSN map is what I was suggesting.

Not sure I entirely followed what you were getting at here..?

Thanks!

Stephen

Re: [HACKERS] On markers of changed data

From
Greg Stark
Date:
On 10 October 2017 at 23:50, Stephen Frost <sfrost@snowman.net> wrote:
> Yeah, it sounds interesting, but I was just chatting w/ David about it
> and we were thinking about how checkpoints are really rather often done,
> so you end up with quite a few of these lists being out there.
>
> Now, if the lists were always kept in a sorted fashion, then perhaps we
> would be able to essentially merge-sort them all back together and
> de-dup that way but even then, you're talking about an awful lot if
> you're looking at daily incrementals- that's 288 standard 5-minute
> checkpoints, each with some 128k pages changed, assuming max_wal_size of
> 1GB, and I think we can all agree that the default max_wal_size is for
> rather small systems.  That ends up being something around 2MB per
> checkpoint to store the pages in or half a gig per day just to keep
> track of the pages which changed in each checkpoint across that day.

I was actually imagining a bitmap, probably for each 1GB piece of each
table. That's probably how you would maintain this data in memory
anyways. After compression it should be fairly small. You'll probably
be modifying the same blocks frequently or doing bulk loads which will
touch a consecutive range of blocks.

But that's still about the same amount of data. But probably you don't
want to actually keep every checkpoint anyways. The nice thing about
the changelists is that they will tend to reach a maximum size
regardless of how long a time range they span so if you keep one
changelist for every 10 checkpoints or every 100 checkpoints you could
reduce the storage needs and only lose the time precision.


-- 
greg


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] On markers of changed data

From
Robert Haas
Date:
On Fri, Oct 6, 2017 at 10:34 AM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> On Fri, Oct 6, 2017 at 11:22 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> I'd say no:
>>
>> 1. You don't know the granularity of the filesystem's timestamps, at least
>> not without making unportable assumptions.
>>
>> 2. There's no guarantee that the system clock can't be set backwards.
>>
>> 3. It's not uncommon for filesystems to have optimizations whereby they
>> skip or delay some updates of file mtimes.  (I think this is usually
>> optional, but you couldn't know whether it's turned on.)
>>
>> #2 is probably the worst of these problems.
>
> Or upwards. A simple example of things depending on clock changes is
> for example VM snapshotting. Any logic not depending on monotonic
> timestamps, with things like clock_gettime(CLOCK_MONOTONIC) is a lot
> of fun to investigate until you know that they are not using any
> monotonic logic... So the answer is *no*, do not depend on FS-level
> timestamps. The only sane method for Postgres is really to scan the
> page header LSNs, and of course you already know that.

It might also be worth noting that the FSM is not WAL-logged, so it
probably either doesn't have page LSNs or they are not meaningful.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] On markers of changed data

From
Andrey Borodin
Date:
Hello!


> 9 окт. 2017 г., в 10:23, Andrey Borodin <x4mmm@yandex-team.ru> написал(а):
>
> Thanks, Stephen, this actually pointed what to look for
> VM is WAL-logged [0]
> FSM is not [1]
>
> [0]
https://github.com/postgres/postgres/blob/113b0045e20d40f726a0a30e33214455e4f1385e/src/backend/access/heap/visibilitymap.c#L315
> [1]
https://github.com/postgres/postgres/blob/1d25779284fe1ba08ecd57e647292a9deb241376/src/backend/storage/freespace/freespace.c#L593

After tests of binary equivalence before and after backup I've come to conclusion, that Visibility Map cannot be
backupedincrementally: it's bits are unset without page LSN bump. This can lead to wrong results of Index Only Scans
executedon freshly restored backups. 

In my implementation of incremental backup in WAL-G I will disable any FSM, VM and XACT\CLOG incrementation.

Posting this for the record, so that if someone goes this way info will be available. Thank you for your attention.

Best regards, Andrey Borodin.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers