Thread: [ADMIN] Does Postgres ever write to tables without file system timestamps getting updated?

Hi all,

I'm running Postgres 9.6 and backing it up once a while simply by
stopping the cluster and using rsync on file level. One day I've
recognized that some files for tables in my backup and prod system
have the same size and last written timestamp, while the data itself
actually differs. I recognized that using rsync with checksums and
wondered why much more data gets transferred than expected. So I
calculated hash sums for those files and those were different.

The important thing is that after rsync with checksums transmitted
those changed files with unmodified timestamps, the hash sums of the
files were back in sync again. So it seems very unlikely that the
problem is during rsync copying data itself.

I can only think of two reasons: Either Postgres has some behaviour
where data is actually written to files without changing timestamps or
my backed up data gets modified somehow, which sounds like corruption,
because as a backup, it shouldn't get modified of course.

So, is there any such functionality in Postgres, writing data without
changes to timestamps of the file in the file system? Any other ideas
on where those hash differences could come from?

Sounds like to be sure I need to regularly generate hashes of my
backups and compare those to unmodified files in the backup source.
The backups are not stored on checksumming file systems like BTRFS or
ZFS, so silent data corruption might be an aspect.

Thanks for your ideas!

P.S.: Posted that on SU as well, but didn't get much attention.


https://superuser.com/questions/1216259/does-postgres-ever-write-to-tables-without-file-system-timestamps-getting-update

Mit freundlichen Grüßen,

Thorsten Schöning

--
Thorsten Schöning       E-Mail: Thorsten.Schoening@AM-SoFT.de
AM-SoFT IT-Systeme      http://www.AM-SoFT.de/

Telefon...........05151-  9468- 55
Fax...............05151-  9468- 88
Mobil..............0178-8 9468- 04

AM-SoFT GmbH IT-Systeme, Brandenburger Str. 7c, 31789 Hameln
AG Hannover HRB 207 694 - Geschäftsführer: Andreas Muchow



What's the resolution of the timestamps on your file system? It's always possibly that postgres writes, rsync checks,
postgreswrites again within that window--especially if the timestamp granularity is a second rather than a much smaller
window.(Heck, there have been file systems with 2-second granularity.) 

Use pg_start_backup.

> On Jun 6, 2017, at 2:04 AM, Thorsten Schöning <tschoening@am-soft.de> wrote:
>
> Hi all,
>
> I'm running Postgres 9.6 and backing it up once a while simply by
> stopping the cluster and using rsync on file level. One day I've
> recognized that some files for tables in my backup and prod system
> have the same size and last written timestamp, while the data itself
> actually differs. I recognized that using rsync with checksums and
> wondered why much more data gets transferred than expected. So I
> calculated hash sums for those files and those were different.
>
> The important thing is that after rsync with checksums transmitted
> those changed files with unmodified timestamps, the hash sums of the
> files were back in sync again. So it seems very unlikely that the
> problem is during rsync copying data itself.
>
> I can only think of two reasons: Either Postgres has some behaviour
> where data is actually written to files without changing timestamps or
> my backed up data gets modified somehow, which sounds like corruption,
> because as a backup, it shouldn't get modified of course.
>
> So, is there any such functionality in Postgres, writing data without
> changes to timestamps of the file in the file system? Any other ideas
> on where those hash differences could come from?
>
> Sounds like to be sure I need to regularly generate hashes of my
> backups and compare those to unmodified files in the backup source.
> The backups are not stored on checksumming file systems like BTRFS or
> ZFS, so silent data corruption might be an aspect.
>
> Thanks for your ideas!
>
> P.S.: Posted that on SU as well, but didn't get much attention.
>
>
https://superuser.com/questions/1216259/does-postgres-ever-write-to-tables-without-file-system-timestamps-getting-update
>
> Mit freundlichen Grüßen,
>
> Thorsten Schöning
>
> --
> Thorsten Schöning       E-Mail: Thorsten.Schoening@AM-SoFT.de
> AM-SoFT IT-Systeme      http://www.AM-SoFT.de/
>
> Telefon...........05151-  9468- 55
> Fax...............05151-  9468- 88
> Mobil..............0178-8 9468- 04
>
> AM-SoFT GmbH IT-Systeme, Brandenburger Str. 7c, 31789 Hameln
> AG Hannover HRB 207 694 - Geschäftsführer: Andreas Muchow
>
>
>
> --
> Sent via pgsql-admin mailing list (pgsql-admin@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-admin

--
Scott Ribe
scott_ribe@elevated-dev.com
(303) 722-0567



Greetings,

* scott ribe (scott_ribe@elevated-dev.com) wrote:
> What's the resolution of the timestamps on your file system? It's always possibly that postgres writes, rsync checks,
postgreswrites again within that window--especially if the timestamp granularity is a second rather than a much smaller
window.(Heck, there have been file systems with 2-second granularity.) 

Thorsten said that the database was shut down before performing the
backup, which I would generally think to be sufficient.  It is, perhaps,
barely possible that a scripted "shutdown, rsync, startup" would result
in a file being modified before and after the rsync but within the
1-second typical granularity that rsync uses.  That is certainly one of
the risks of using an rsync-based backup solution, though I've only
actually seen corruption happen due to rsync when using it with the
pg_start/stop_backup methodology (and only in a controlled environment,
so far), but then, I don't run into many cases where people are willing
to shut down their database to do a backup.

> Use pg_start_backup.

Just using pg_start_backup, of course, is insufficient if the database
is running and is impossible if the database is not running.  To
properly backup a running PG system, pg_start/stop_backup needs to be
used *and* the WAL created during the backup must be preserved (through
archive_command or pg_receivewal or similar).

Generally speaking, I'd strongly recommend against trying to write your
own backup solution for PG and would, instead, suggest using one of the
existing solutions, such as pg_basebackup, pgBackRest, or barman.

Thanks!

Stephen

Attachment
Thorsten Schöning wrote:
> I'm running Postgres 9.6 and backing it up once a while simply by
> stopping the cluster and using rsync on file level. One day I've
> recognized that some files for tables in my backup and prod system
> have the same size and last written timestamp, while the data itself
> actually differs. I recognized that using rsync with checksums and
> wondered why much more data gets transferred than expected. So I
> calculated hash sums for those files and those were different.
> 
> The important thing is that after rsync with checksums transmitted
> those changed files with unmodified timestamps, the hash sums of the
> files were back in sync again. So it seems very unlikely that the
> problem is during rsync copying data itself.
> 
> I can only think of two reasons: Either Postgres has some behaviour
> where data is actually written to files without changing timestamps or
> my backed up data gets modified somehow, which sounds like corruption,
> because as a backup, it shouldn't get modified of course.
> 
> So, is there any such functionality in Postgres, writing data without
> changes to timestamps of the file in the file system? Any other ideas
> on where those hash differences could come from?

PostgreSQL doesn't know or care on which file system it is running,
it just opens files and writes to them.  So it won't do anything to
bypass the update of file access times.

Yours,
Laurenz Albe

Guten Tag scott ribe,
am Dienstag, 6. Juni 2017 um 14:57 schrieben Sie:

> What's the resolution of the timestamps on your file system? It's
> always possibly that postgres writes, rsync checks, postgres writes
> again within that window[...]

Obviously I wasn't clear enough: Postgres is not running and my
problem is with files and their timestamps in the past. So regarding
the timestamps the files didn't change for weeks and the files have
the same size and timestamp in the source and my backup. But if I
calculate hash sums of those files the results may differ, so rsync
using checksums will back them up, although the files claim that they
haven't changed for days/weeks.

Maybe the following example makes it more clear:

Backup:

> a1171645dc187c498ce05a25b0e5157f  2613.13
> -rw------- 12 109 119 1073741824 May 21 04:58 2613.13

Production:

> f02c1c2724714af2c5c08f8b67ab0f11  2613.13
> -rw------- 1 postgres postgres 1073741824 Mai 21 04:58 2613.13

The exact same file regarding size and timestamp, but actually
different content. After using rsync with checksums, the file in the
backup still has the same size and timestamp, but new content, because
this time the calculated hash is the same as in production.

The file belongs to pg_largeobject and that table contains a lot of
data, hence the named suffixes. Most of those files in the sequence
have old timestamps like the one above, more than a few days without
any writing, and are NOT all backed up and have the same MD5 hash like
in my backup. Only few files once a while differ like the one in the
example.

Being pg_largeobject and because we actually delete large objects from
the database once a while, reusing existing files is perfectly OK by
Postgres and as expected. But all my tests showed that during writes
the timestamp of those files is actually updated and not kept or reset
that much into the past. Our file system in use is ext4, so shouldn't
have a problem with timestamps in general.

And that's what makes me wonder: If Postgres doesn't reset timestamps
to the past or freeze them somehow for some reason, this sounds like
data corruption in my backups.

Mit freundlichen Grüßen,

Thorsten Schöning

--
Thorsten Schöning       E-Mail: Thorsten.Schoening@AM-SoFT.de
AM-SoFT IT-Systeme      http://www.AM-SoFT.de/

Telefon...........05151-  9468- 55
Fax...............05151-  9468- 88
Mobil..............0178-8 9468- 04

AM-SoFT GmbH IT-Systeme, Brandenburger Str. 7c, 31789 Hameln
AG Hannover HRB 207 694 - Geschäftsführer: Andreas Muchow



Guten Tag Thorsten Schöning,
am Dienstag, 6. Juni 2017 um 15:32 schrieben Sie:

> The file belongs to pg_largeobject and that table contains a lot of
> data, hence the named suffixes. Most of those files in the sequence
> have old timestamps like the one above, more than a few days without
> any writing, and are NOT all backed up and have the same MD5 hash like
> in my backup. Only few files once a while differ like the one in the
> example.

From the following very old data files, mostly unchanged for
days/weeks, e.g. 2613.13 got transferred because of different
checksums, while 2613.10 didn't:

> -rw------- 1 postgres postgres 1073741824 Jun  4 04:40 2613
> -rw------- 1 postgres postgres 1073741824 Mai 21 04:42 2613.1
> -rw------- 1 postgres postgres 1073741824 Mai 21 04:56 2613.10
> -rw------- 1 postgres postgres 1073741824 Mai 21 04:57 2613.11
> -rw------- 1 postgres postgres 1073741824 Mai 21 04:57 2613.12
> -rw------- 1 postgres postgres 1073741824 Mai 21 04:58 2613.13
> -rw------- 1 postgres postgres 1073741824 Mai 21 04:59 2613.14
> -rw------- 1 postgres postgres 1073741824 Mai 28 04:40 2613.15
> -rw------- 1 postgres postgres  686645248 Jun  4 04:42 2613.16
> -rw------- 1 postgres postgres 1073741824 Mai 21 04:44 2613.2
> -rw------- 1 postgres postgres 1073741824 Mai 21 04:46 2613.3
> -rw------- 1 postgres postgres 1073741824 Mai 21 04:47 2613.4
> -rw------- 1 postgres postgres 1073741824 Mai 21 04:49 2613.5
> -rw------- 1 postgres postgres 1073741824 Mai 21 04:50 2613.6
> -rw------- 1 postgres postgres 1073741824 Mai 21 04:52 2613.7
> -rw------- 1 postgres postgres 1073741824 Mai 21 04:53 2613.8
> -rw------- 1 postgres postgres 1073741824 Jun  4 04:40 2613.9
> -rw------- 1 postgres postgres    4407296 Jun  4 04:42 2613_fsm
> -rw------- 1 postgres postgres     548864 Jun  4 04:42 2613_vm

Mit freundlichen Grüßen,

Thorsten Schöning

--
Thorsten Schöning       E-Mail: Thorsten.Schoening@AM-SoFT.de
AM-SoFT IT-Systeme      http://www.AM-SoFT.de/

Telefon...........05151-  9468- 55
Fax...............05151-  9468- 88
Mobil..............0178-8 9468- 04

AM-SoFT GmbH IT-Systeme, Brandenburger Str. 7c, 31789 Hameln
AG Hannover HRB 207 694 - Geschäftsführer: Andreas Muchow



Thorsten Schöning <tschoening@am-soft.de> writes:

> From the following very old data files, mostly unchanged for
> days/weeks, e.g. 2613.13 got transferred because of different
> checksums, while 2613.10 didn't:

How many bytes was changed in this file?

--
Sergey Burladyan


On Jun 6, 2017, at 7:32 AM, Thorsten Schöning <tschoening@am-soft.de> wrote:
>
> Obviously I wasn't clear enough: Postgres is not running and my
> problem is with files and their timestamps in the past. So regarding
> the timestamps the files didn't change for weeks and the files have
> the same size and timestamp in the source and my backup. But if I
> calculate hash sums of those files the results may differ, so rsync
> using checksums will back them up, although the files claim that they
> haven't changed for days/weeks.

You were clear, it's too early in the morning here.

You seem to be assuming that the files were changed close to the time you found the problem; as far as I can tell
that'snot actually true, they could have been changed back at the time of the file timestamps, and could have been
out-of-syncfor a while, right? 

My suggestion to use pg_start_backup is irrelevant to the exact problem you're describing. The documentation for it
doesdescribe a good way to get a file-based backup, without stopping the server, and you might want to check into that.
Youcould also look at pg_base_backup. 

--
Scott Ribe
scott_ribe@elevated-dev.com
(303) 722-0567



Guten Tag scott ribe,
am Dienstag, 6. Juni 2017 um 16:28 schrieben Sie:

> You seem to be assuming that the files were changed close to the
> time you found the problem;[...]

No, I assume that the files were last changed at the time of the last
written timestamp in the production system. Which would make my
backup corrupt if it has the same size/timestamp but different data.
And that change in my backup could have been introduced anytime
between the last backup and now, I simply don't know, because I don't
calculate hash sums for my backups or use some checksumming file
system like BTRFS or ZFS. So it might be that my backups are simply
broken for various reasons and I need to have a closer look at this.

OR MAYBE Postgres resets timestamps in production and what I see is a
perfectly valid use case for some reason I don't understand yet. But
seems pretty unlikely, as Laurenz Albe already mentioned that Postgres
doesn't care about timestamps. And that's what I see in my tests,
timestamps always increase if a file is written in production.

Mit freundlichen Grüßen,

Thorsten Schöning

--
Thorsten Schöning       E-Mail: Thorsten.Schoening@AM-SoFT.de
AM-SoFT IT-Systeme      http://www.AM-SoFT.de/

Telefon...........05151-  9468- 55
Fax...............05151-  9468- 88
Mobil..............0178-8 9468- 04

AM-SoFT GmbH IT-Systeme, Brandenburger Str. 7c, 31789 Hameln
AG Hannover HRB 207 694 - Geschäftsführer: Andreas Muchow



On 6/6/17 10:39 AM, Thorsten Schöning wrote:
> Guten Tag scott ribe,
> am Dienstag, 6. Juni 2017 um 16:28 schrieben Sie:
>
>> You seem to be assuming that the files were changed close to the
>> time you found the problem;[...]
>
> No, I assume that the files were last changed at the time of the last
> written timestamp in the production system. Which would make my
> backup corrupt if it has the same size/timestamp but different data.
> And that change in my backup could have been introduced anytime
> between the last backup and now, I simply don't know, because I don't
> calculate hash sums for my backups or use some checksumming file
> system like BTRFS or ZFS. So it might be that my backups are simply
> broken for various reasons and I need to have a closer look at this.

If you used rsync to do the backup originally then it is very possible
to have timestamp/size be the same and the checksums be different.

This is because PostgreSQL may modify a file while rsync is copying it.
If this happens in the same second that rsync built the manifest then
the timestamp will not be updated but the data may change.

This is why it is not safe to do incremental backups of PostgreSQL with
rsync without using checksums.

--
-David
david@pgmasters.net


To be clear, for potential future readers, just using checksum is not enough to back up a live database. You still have
touse pg_start_backup and pay attention to its documentation re WAL files. 

> On Jun 6, 2017, at 8:54 AM, David Steele <david@pgmasters.net> wrote:
>
> This is why it is not safe to do incremental backups of PostgreSQL with
> rsync without using checksums.

--
Scott Ribe
scott_ribe@elevated-dev.com
(303) 722-0567



Guten Tag Sergey Burladyan,
am Dienstag, 6. Juni 2017 um 16:08 schrieben Sie:

> How many bytes was changed in this file?

Don't know what and how many exactly was changed, but it seemed that
rsync was transferring pretty much the whole file again. The download
was only little higher than what my DSL is able to provide, which
means rsync downloaded a lot instead of keeping already present data
like it does when nothing/very little has changed in files.

Mit freundlichen Grüßen,

Thorsten Schöning

--
Thorsten Schöning       E-Mail: Thorsten.Schoening@AM-SoFT.de
AM-SoFT IT-Systeme      http://www.AM-SoFT.de/

Telefon...........05151-  9468- 55
Fax...............05151-  9468- 88
Mobil..............0178-8 9468- 04

AM-SoFT GmbH IT-Systeme, Brandenburger Str. 7c, 31789 Hameln
AG Hannover HRB 207 694 - Geschäftsführer: Andreas Muchow