Thread: strange behavior of WAL files

strange behavior of WAL files

From
Atul Kumar
Date:
HI,

We have a centos 6 enviornment where postgres 9.6 is running on it.

We have strange behavior of WAL files of pg_xlog directory

As we have set archive_command to archive WAL files at different
location and the archive_command is working fine.

So strange behavior is :

We have a WAL file say for example "00000001000036CD000000E2" of
01.06.2021 (1st June 2021) that is getting archive successfully at the
archive location and once it is archived, this file with same name
(00000001000036CD000000E2) is getting generated with the latest
timestamp (as today is 04.06.2021).and all old WAL files are behaving
in same manner.

Old WAL files get archived and once it get archived they get generated
with the same name with latest timestamp.

So please help me in telling why such kind of behavior is occurring.

So why this kind of behavior is happening.

the total number of files in pg_xlog directory is around 4016.

Note: There is no replication configured on the server.




Regards,
Atul




Regards,
Atul



Re: strange behavior of WAL files

From
Jehan-Guillaume de Rorthais
Date:
On Fri, 4 Jun 2021 15:39:30 +0530
Atul Kumar <akumar14871@gmail.com> wrote:

> HI,
> 
> We have a centos 6 enviornment where postgres 9.6 is running on it.
> 
> We have strange behavior of WAL files of pg_xlog directory
> 
> As we have set archive_command to archive WAL files at different
> location and the archive_command is working fine.
> 
> So strange behavior is :
> 
> We have a WAL file say for example "00000001000036CD000000E2" of
> 01.06.2021 (1st June 2021) that is getting archive successfully at the
> archive location and once it is archived, this file with same name
> (00000001000036CD000000E2) is getting generated with the latest
> timestamp (as today is 04.06.2021).and all old WAL files are behaving
> in same manner.

What is you archive_command?

I'm not sure I understand correctly, but keep in mind your
archive_command must be "read only". Do not remove the WAL file after archiving
it.

Regards,



Re: strange behavior of WAL files

From
Atul Kumar
Date:
Hi,


archive_command is 'cp %p /nfslogs/wal/%f'

and no, we are not removing anything from pg_xlog directory.

once old WAL files of pg_xlog directory are archived in
'/nfslogs/wal/' directory then these WAL files are getting generated
with the same name in pg_xlog directory.

my query is Why is this happening ?



please help me with your suggestions.


Regards.




On 6/4/21, Jehan-Guillaume de Rorthais <jgdr@dalibo.com> wrote:
> On Fri, 4 Jun 2021 15:39:30 +0530
> Atul Kumar <akumar14871@gmail.com> wrote:
>
>> HI,
>>
>> We have a centos 6 enviornment where postgres 9.6 is running on it.
>>
>> We have strange behavior of WAL files of pg_xlog directory
>>
>> As we have set archive_command to archive WAL files at different
>> location and the archive_command is working fine.
>>
>> So strange behavior is :
>>
>> We have a WAL file say for example "00000001000036CD000000E2" of
>> 01.06.2021 (1st June 2021) that is getting archive successfully at the
>> archive location and once it is archived, this file with same name
>> (00000001000036CD000000E2) is getting generated with the latest
>> timestamp (as today is 04.06.2021).and all old WAL files are behaving
>> in same manner.
>
> What is you archive_command?
>
> I'm not sure I understand correctly, but keep in mind your
> archive_command must be "read only". Do not remove the WAL file after
> archiving
> it.
>
> Regards,
>



Re: strange behavior of WAL files

From
Vijaykumar Jain
Date:
I have not seen this, so cannot comment, but when I am trying to simulate i do not see issues.

One thing to note, 
It seems your wal is on nfs mount , can you rule out any nfs errors if it is nfs.

On Fri, Jun 4, 2021, 6:24 PM Atul Kumar <akumar14871@gmail.com> wrote:
Hi,


archive_command is 'cp %p /nfslogs/wal/%f'

and no, we are not removing anything from pg_xlog directory.

once old WAL files of pg_xlog directory are archived in
'/nfslogs/wal/' directory then these WAL files are getting generated
with the same name in pg_xlog directory.

my query is Why is this happening ?



please help me with your suggestions.


Regards.




On 6/4/21, Jehan-Guillaume de Rorthais <jgdr@dalibo.com> wrote:
> On Fri, 4 Jun 2021 15:39:30 +0530
> Atul Kumar <akumar14871@gmail.com> wrote:
>
>> HI,
>>
>> We have a centos 6 enviornment where postgres 9.6 is running on it.
>>
>> We have strange behavior of WAL files of pg_xlog directory
>>
>> As we have set archive_command to archive WAL files at different
>> location and the archive_command is working fine.
>>
>> So strange behavior is :
>>
>> We have a WAL file say for example "00000001000036CD000000E2" of
>> 01.06.2021 (1st June 2021) that is getting archive successfully at the
>> archive location and once it is archived, this file with same name
>> (00000001000036CD000000E2) is getting generated with the latest
>> timestamp (as today is 04.06.2021).and all old WAL files are behaving
>> in same manner.
>
> What is you archive_command?
>
> I'm not sure I understand correctly, but keep in mind your
> archive_command must be "read only". Do not remove the WAL file after
> archiving
> it.
>
> Regards,
>


Re: strange behavior of WAL files

From
Vijaykumar Jain
Date:
I will try to simulate this and see if i can reproduce it, currently in between difficult interviews where i have little hope :) 


and you can try pg_waldump
to see what is in the WAL, and if you see any issues.





On Fri, 4 Jun 2021 at 18:45, Vijaykumar Jain <vijaykumarjain.github@gmail.com> wrote:
I have not seen this, so cannot comment, but when I am trying to simulate i do not see issues.

One thing to note, 
It seems your wal is on nfs mount , can you rule out any nfs errors if it is nfs.

On Fri, Jun 4, 2021, 6:24 PM Atul Kumar <akumar14871@gmail.com> wrote:
Hi,


archive_command is 'cp %p /nfslogs/wal/%f'

and no, we are not removing anything from pg_xlog directory.

once old WAL files of pg_xlog directory are archived in
'/nfslogs/wal/' directory then these WAL files are getting generated
with the same name in pg_xlog directory.

my query is Why is this happening ?



please help me with your suggestions.


Regards.




On 6/4/21, Jehan-Guillaume de Rorthais <jgdr@dalibo.com> wrote:
> On Fri, 4 Jun 2021 15:39:30 +0530
> Atul Kumar <akumar14871@gmail.com> wrote:
>
>> HI,
>>
>> We have a centos 6 enviornment where postgres 9.6 is running on it.
>>
>> We have strange behavior of WAL files of pg_xlog directory
>>
>> As we have set archive_command to archive WAL files at different
>> location and the archive_command is working fine.
>>
>> So strange behavior is :
>>
>> We have a WAL file say for example "00000001000036CD000000E2" of
>> 01.06.2021 (1st June 2021) that is getting archive successfully at the
>> archive location and once it is archived, this file with same name
>> (00000001000036CD000000E2) is getting generated with the latest
>> timestamp (as today is 04.06.2021).and all old WAL files are behaving
>> in same manner.
>
> What is you archive_command?
>
> I'm not sure I understand correctly, but keep in mind your
> archive_command must be "read only". Do not remove the WAL file after
> archiving
> it.
>
> Regards,
>




--
Thanks,
Vijay
Mumbai, India

Re: strange behavior of WAL files

From
Tom Lane
Date:
Atul Kumar <akumar14871@gmail.com> writes:
> once old WAL files of pg_xlog directory are archived in
> '/nfslogs/wal/' directory then these WAL files are getting generated
> with the same name in pg_xlog directory.

Are you sure you are describing the behavior accurately?

What I would expect to happen, once an old WAL file has been archived
and the server knows its contents are no longer needed, is for the
WAL file to be "recycled" by renaming it to have a name that's in-the-
future in the WAL name series, whereupon it will wait its turn to be
reused by future WAL writes.  On most filesystems the rename as such
doesn't change the file's mod time, so you'll see files that seem
to be in-the-future according to their names, but have old timestamps.

(There's a limit on how many future WAL files we'll tee up this way,
so it's possible that an old one would just get deleted instead.
But the steady-state behavior is to just rotate them around.)

            regards, tom lane



Re: strange behavior of WAL files

From
Atul Kumar
Date:
Hi Jehan,

Just to add little more info about this issue is : We have set value
4000  for parameter wal_keep_segments.

So is there any chance that after a certain number of WAL files,
postgres will start recycling the WAL with same name ?



Please share your valuable suggestion.



Regards.
Atul









On 6/4/21, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Atul Kumar <akumar14871@gmail.com> writes:
>> once old WAL files of pg_xlog directory are archived in
>> '/nfslogs/wal/' directory then these WAL files are getting generated
>> with the same name in pg_xlog directory.
>
> Are you sure you are describing the behavior accurately?
>
> What I would expect to happen, once an old WAL file has been archived
> and the server knows its contents are no longer needed, is for the
> WAL file to be "recycled" by renaming it to have a name that's in-the-
> future in the WAL name series, whereupon it will wait its turn to be
> reused by future WAL writes.  On most filesystems the rename as such
> doesn't change the file's mod time, so you'll see files that seem
> to be in-the-future according to their names, but have old timestamps.
>
> (There's a limit on how many future WAL files we'll tee up this way,
> so it's possible that an old one would just get deleted instead.
> But the steady-state behavior is to just rotate them around.)
>
>             regards, tom lane
>



Re: strange behavior of WAL files

From
Atul Kumar
Date:
hi Tom,

Please check my findings below

older
-rw------- 1 enterprisedb enterprisedb 16777216 Jun  2 02:47
00000001000036CF000000A4
-rw------- 1 enterprisedb enterprisedb 16777216 Jun  2 02:45
00000001000036CF000000A3
-rw------- 1 enterprisedb enterprisedb 16777216 Jun  2 02:44
00000001000036CF000000A5

please note that above files are of June 2nd and once it is archived
it will be recycled with same name with current timestamp, check
below:

newer

-rw------- 1 enterprisedb enterprisedb 16777216 Jun  4 08:19
00000001000036CF000000A0
-rw------- 1 enterprisedb enterprisedb 16777216 Jun  4 08:20
00000001000036CF000000A1
-rw------- 1 enterprisedb enterprisedb 16777216 Jun  4 08:22
00000001000036CF000000A2
drwx------ 2 enterprisedb enterprisedb   311296 Jun  4 08:22 archive_status
-rw------- 1 enterprisedb enterprisedb 16777216 Jun  4 08:23
00000001000036CF000000A3
-rw------- 1 enterprisedb enterprisedb 16777216 Jun  4 08:23
00000001000036CF000000A4



the file names ending with A3 and A4 are the files that got generated
with same name with the latest timestamp.



So that's why I called it strange behavior, please suggest your opinion.





Regards,
Atul










On 6/4/21, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Atul Kumar <akumar14871@gmail.com> writes:
>> once old WAL files of pg_xlog directory are archived in
>> '/nfslogs/wal/' directory then these WAL files are getting generated
>> with the same name in pg_xlog directory.
>
> Are you sure you are describing the behavior accurately?
>
> What I would expect to happen, once an old WAL file has been archived
> and the server knows its contents are no longer needed, is for the
> WAL file to be "recycled" by renaming it to have a name that's in-the-
> future in the WAL name series, whereupon it will wait its turn to be
> reused by future WAL writes.  On most filesystems the rename as such
> doesn't change the file's mod time, so you'll see files that seem
> to be in-the-future according to their names, but have old timestamps.
>
> (There's a limit on how many future WAL files we'll tee up this way,
> so it's possible that an old one would just get deleted instead.
> But the steady-state behavior is to just rotate them around.)
>
>             regards, tom lane
>



Re: strange behavior of WAL files

From
Vijaykumar Jain
Date:
it gets cleaned up for me.

turn archiving on, simulate success using /bin/true
turn archiving off, simulate success using /bin/false
generate wals by some DMLS.


postgres@go:~/pgsql/data/pg_wal$ grep wal_size ../postgresql.conf
max_wal_size = 100MB
min_wal_size = 80MB
postgres@go:~/pgsql/data/pg_wal$ grep archive_ ../postgresql.conf | grep -v "^#"
archive_mode = on               # enables archiving; off, on, or always
archive_command = '/bin/true'           # command to use to archive a logfile segment

# fake failed archiving 
postgres@go:~/pgsql/data/pg_wal$ sed -i -e 's,/bin/true,/bin/false,' ../postgresql.conf
postgres@go:~/pgsql/data/pg_wal$ grep archive_ ../postgresql.conf | grep -v "^#"
archive_mode = on               # enables archiving; off, on, or always
archive_command = '/bin/false'          # command to use to archive a logfile segment

postgres@go:~/pgsql/data/pg_wal$ stoppg
waiting for server to shut down.... done
server stopped
postgres@go:~/pgsql/data/pg_wal$ startpg
waiting for server to start.... done
server started

postgres@go:~/pgsql/data/pg_wal$ ls
0000000100000000000000B1  0000000100000000000000B3  0000000100000000000000B5  archive_status
0000000100000000000000B2  0000000100000000000000B4  0000000100000000000000B6

# generate wals, switch_wal, check is wal files incresed as achiving failing
postgres@go:~/pgsql/data/pg_wal$ for i in {1..10}; do psql -c 'insert into t select x from generate_series(1, 100) x; delete from t; select pg_switch_wal();'; sleep 1; done
 pg_switch_wal
---------------
 0/B1004690
(1 row)

 pg_switch_wal
---------------
 0/B2002F68
(1 row)

 pg_switch_wal
---------------
 0/B3003098
(1 row)

 pg_switch_wal
---------------
 0/B4003068
(1 row)

 pg_switch_wal
---------------
 0/B50039B8
(1 row)

 pg_switch_wal
---------------
 0/B60030E0
(1 row)

 pg_switch_wal
---------------
 0/B7002F68
(1 row)

 pg_switch_wal
---------------
 0/B8003078
(1 row)

 pg_switch_wal
---------------
 0/B9004128
(1 row)

 pg_switch_wal
---------------
 0/BA003048
(1 row)

postgres@go:~/pgsql/data/pg_wal$ ls
0000000100000000000000B1  0000000100000000000000B4  0000000100000000000000B7  0000000100000000000000BA
0000000100000000000000B2  0000000100000000000000B5  0000000100000000000000B8  0000000100000000000000BB
0000000100000000000000B3  0000000100000000000000B6  0000000100000000000000B9  archive_status

postgres@go:~/pgsql/data/pg_wal$ pg_controldata -D ~/pgsql/data | grep REDO                                                             
Latest checkpoint's REDO location:    0/B9000028
Latest checkpoint's REDO WAL file:    0000000100000000000000B9


# enable success archiving, old wals should get recycled and not appear again.
postgres@go:~/pgsql/data/pg_wal$ sed -i -e 's,/bin/false,/bin/true,' ../postgresql.conf                                                 
postgres@go:~/pgsql/data/pg_wal$ stoppg
waiting for server to shut down....... done
server stopped
postgres@go:~/pgsql/data/pg_wal$ startpg
waiting for server to start.... done
server started

postgres@go:~/pgsql/data/pg_wal$ ls
0000000100000000000000B1  0000000100000000000000B4  0000000100000000000000B7  0000000100000000000000BA  archive_status
0000000100000000000000B2  0000000100000000000000B5  0000000100000000000000B8  0000000100000000000000BB
0000000100000000000000B3  0000000100000000000000B6  0000000100000000000000B9  0000000100000000000000BC
postgres@go:~/pgsql/data/pg_wal$ pg_controldata -D ~/pgsql/data | grep REDO
Latest checkpoint's REDO location:    0/BC000028
Latest checkpoint's REDO WAL file:    0000000100000000000000BC

postgres@go:~/pgsql/data/pg_wal$ for i in {1..5}; do psql -c 'insert into t select x from generate_series(1, 100) x; delete from t; select pg_switch_wal();'; sleep 1; done
 pg_switch_wal
---------------
 0/BC004150
(1 row)

 pg_switch_wal
---------------
 0/BD003068
(1 row)

 pg_switch_wal
---------------
 0/BE003070
(1 row)

 pg_switch_wal
---------------
 0/BF003098
(1 row)

 pg_switch_wal
---------------
 0/C0004170
(1 row)

# old wals cleaned up. 
postgres@go:~/pgsql/data/pg_wal$ ls
0000000100000000000000C0  0000000100000000000000C2  0000000100000000000000C4  archive_status
0000000100000000000000C1  0000000100000000000000C3  0000000100000000000000C5


it seems to work fine in this basic test.

Hence I said, i did not see that earlier.
FYI, i tested this on pg13, i have not worked on pg9.6


On Fri, 4 Jun 2021 at 21:36, Atul Kumar <akumar14871@gmail.com> wrote:
hi Tom,

Please check my findings below

older
-rw------- 1 enterprisedb enterprisedb 16777216 Jun  2 02:47
00000001000036CF000000A4
-rw------- 1 enterprisedb enterprisedb 16777216 Jun  2 02:45
00000001000036CF000000A3
-rw------- 1 enterprisedb enterprisedb 16777216 Jun  2 02:44
00000001000036CF000000A5

please note that above files are of June 2nd and once it is archived
it will be recycled with same name with current timestamp, check
below:

newer

-rw------- 1 enterprisedb enterprisedb 16777216 Jun  4 08:19
00000001000036CF000000A0
-rw------- 1 enterprisedb enterprisedb 16777216 Jun  4 08:20
00000001000036CF000000A1
-rw------- 1 enterprisedb enterprisedb 16777216 Jun  4 08:22
00000001000036CF000000A2
drwx------ 2 enterprisedb enterprisedb   311296 Jun  4 08:22 archive_status
-rw------- 1 enterprisedb enterprisedb 16777216 Jun  4 08:23
00000001000036CF000000A3
-rw------- 1 enterprisedb enterprisedb 16777216 Jun  4 08:23
00000001000036CF000000A4



the file names ending with A3 and A4 are the files that got generated
with same name with the latest timestamp.



So that's why I called it strange behavior, please suggest your opinion.





Regards,
Atul










On 6/4/21, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Atul Kumar <akumar14871@gmail.com> writes:
>> once old WAL files of pg_xlog directory are archived in
>> '/nfslogs/wal/' directory then these WAL files are getting generated
>> with the same name in pg_xlog directory.
>
> Are you sure you are describing the behavior accurately?
>
> What I would expect to happen, once an old WAL file has been archived
> and the server knows its contents are no longer needed, is for the
> WAL file to be "recycled" by renaming it to have a name that's in-the-
> future in the WAL name series, whereupon it will wait its turn to be
> reused by future WAL writes.  On most filesystems the rename as such
> doesn't change the file's mod time, so you'll see files that seem
> to be in-the-future according to their names, but have old timestamps.
>
> (There's a limit on how many future WAL files we'll tee up this way,
> so it's possible that an old one would just get deleted instead.
> But the steady-state behavior is to just rotate them around.)
>
>                       regards, tom lane
>




--
Thanks,
Vijay
Mumbai, India

Re: strange behavior of WAL files

From
Tom Lane
Date:
Atul Kumar <akumar14871@gmail.com> writes:
> Please check my findings below

> older
> -rw------- 1 enterprisedb enterprisedb 16777216 Jun  2 02:47
> 00000001000036CF000000A4
> -rw------- 1 enterprisedb enterprisedb 16777216 Jun  2 02:45
> 00000001000036CF000000A3
> -rw------- 1 enterprisedb enterprisedb 16777216 Jun  2 02:44
> 00000001000036CF000000A5

I suspect these files were archived awhile ago (with different
names) and have already been renamed in preparation for using
them as future WAL segments ...

> -rw------- 1 enterprisedb enterprisedb 16777216 Jun  4 08:23
> 00000001000036CF000000A3
> -rw------- 1 enterprisedb enterprisedb 16777216 Jun  4 08:23
> 00000001000036CF000000A4

... and here we see that they just got overwritten with new WAL data,
which would make their new contents eligible for archiving.

Have you made any attempt to correlate your observations with
the actual WAL write position?  (pg_controldata would give you
at least a rough approximation of that, i.e. the WAL write
position as of the most recent checkpoint.  I think you can
get a more up-to-date result from one or another system view,
but I don't remember which.)

            regards, tom lane



Re: strange behavior of WAL files

From
Ravi Krishna
Date:
this is a very interesting case.  Atul keep us posted.