Thread: Question about .partial WAL file

Question about .partial WAL file

From
"Matsumura, Ryo"
Date:
Hi, Hackers

I noticed something strange. Could it cause any problem?
I have not detected one so far, but it makes me uneasy.

Step:
- Two standbys are connected to the primary.
- Kill the primary and promote one standby.
- Restart the other standby with primary_conninfo reset to connect to the new primary.

I expected the latest WAL segment file in the old timeline to be renamed with the .partial suffix,
but it is not renamed in the restarted standby.

xlog.c says the following, but I don't understand what the problematic situation is.

         * the archive. It's physically present in the new file with new TLI,
         * but recovery won't look there when it's recovering to the older
-->      * timeline. On the other hand, if we archive the partial segment, and
-->      * the original server on that timeline is still running and archives
-->      * the completed version of the same segment later, it will fail. (We
         * used to do that in 9.4 and below, and it caused such problems).
         *
         * As a compromise, we rename the last segment with the .partial
         * suffix, and archive it. Archive recovery will never try to read
         * .partial segments, so they will normally go unused. But in the odd
         * PITR case, the administrator can copy them manually to the pg_wal
         * directory (removing the suffix). They can be useful in debugging,
         * too.
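
For the "odd PITR case" mentioned in the comment above, the manual copy could look roughly like the following Python sketch. The helper name restore_partial_segment and the segment name used in the demo are assumptions made for illustration; they are not part of PostgreSQL.

```python
import shutil
import tempfile
from pathlib import Path

PARTIAL_SUFFIX = ".partial"


def restore_partial_segment(partial_path: Path, pg_wal_dir: Path) -> Path:
    """Copy a .partial segment into pg_wal_dir, dropping the suffix.

    Hypothetical helper: it refuses to overwrite a segment that is
    already present in the target directory.
    """
    if partial_path.suffix != PARTIAL_SUFFIX:
        raise ValueError(f"not a .partial segment: {partial_path.name}")
    # Path.stem strips only the final ".partial" component of the name.
    target = pg_wal_dir / partial_path.stem
    if target.exists():
        raise FileExistsError(f"segment already present: {target}")
    shutil.copy2(partial_path, target)
    return target


if __name__ == "__main__":
    # Demo in a scratch directory; the segment name is a made-up example.
    with tempfile.TemporaryDirectory() as tmp:
        archive = Path(tmp, "archive")
        pg_wal = Path(tmp, "pg_wal")
        archive.mkdir()
        pg_wal.mkdir()
        src = archive / "000000010000000000000003.partial"
        src.write_bytes(b"wal bytes")
        print(restore_partial_segment(src, pg_wal).name)
```

The copy leaves the .partial file in the archive untouched, so the step can be repeated or undone by deleting the copied file.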

Regards
Ryo Matsumura




Re: Question about .partial WAL file

From
Michael Paquier
Date:
On Thu, Apr 11, 2019 at 12:32:21AM +0000, Matsumura, Ryo wrote:
> I expected the latest WAL segment file in the old timeline to be renamed with the .partial suffix,
> but it is not renamed in the restarted standby.

Please note that the last partial segment is only generated on an
instance which has been promoted.  If you replug another standby into
the promoted standby, then this replugged standby will not generate a
.partial file, and it should not.  What behavior do you think is
right, and what did you expect?
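
To check which node actually holds a .partial segment, one could scan a pg_wal directory listing as in the Python sketch below. It relies on the WAL segment naming scheme (24 hexadecimal digits, the first 8 being the timeline ID); the helper names and both file listings are hypothetical examples, not output from a real cluster.

```python
import re
from typing import List, NamedTuple, Optional

# 8 hex digits of timeline ID, 16 hex digits of segment position,
# optionally followed by the ".partial" suffix.
WAL_NAME_RE = re.compile(r"^([0-9A-F]{8})([0-9A-F]{16})(\.partial)?$")


class WalName(NamedTuple):
    timeline: int
    segment: str
    partial: bool


def parse_wal_name(name: str) -> Optional[WalName]:
    """Parse a WAL segment file name; return None for anything else."""
    m = WAL_NAME_RE.match(name)
    if m is None:
        return None
    return WalName(int(m.group(1), 16), m.group(2), m.group(3) is not None)


def partial_segments(names: List[str]) -> List[str]:
    """Return the names in the listing that are .partial WAL segments."""
    result = []
    for name in names:
        parsed = parse_wal_name(name)
        if parsed is not None and parsed.partial:
            result.append(name)
    return result


if __name__ == "__main__":
    # Hypothetical listings for the two standbys in this scenario.
    promoted = ["000000010000000000000003.partial",
                "000000020000000000000003", "00000002.history"]
    replugged = ["000000010000000000000003",
                 "000000020000000000000003", "00000002.history"]
    print("promoted: ", partial_segments(promoted))
    print("replugged:", partial_segments(replugged))
```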

> xlog.c says the following, but I didn't understand the bad situation.
>
>          * the archive. It's physically present in the new file with new TLI,
>          * but recovery won't look there when it's recovering to the older
> -->      * timeline. On the other hand, if we archive the partial segment, and
> -->      * the original server on that timeline is still running and archives
> -->      * the completed version of the same segment later, it will fail. (We
>          * used to do that in 9.4 and below, and it caused such problems).

If using archive_mode = on, then a promoted standby which archives WAL
segments in the same location as the primary may end up creating a
conflict if the previous primary is still running after the standby
has been promoted and is able to finish the segment where WAL has
forked.
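
The conflict can be illustrated with a small Python simulation. It assumes an archive that refuses to overwrite an existing file, which is how archive_command is usually recommended to behave; the function names and the segment name are made up for this sketch.

```python
import tempfile
from pathlib import Path


def archive_segment(archive_dir: Path, name: str, data: bytes) -> None:
    """Store a segment in the archive, refusing to overwrite an existing file."""
    target = archive_dir / name
    # Mode "xb" creates the file exclusively; it raises FileExistsError
    # if a file with that name is already in the archive.
    with open(target, "xb") as f:
        f.write(data)


def old_primary_can_archive(use_partial_suffix: bool) -> bool:
    """Return True if the old primary can still archive its completed segment."""
    segment = "000000010000000000000003"  # hypothetical last segment on timeline 1
    with tempfile.TemporaryDirectory() as tmp:
        archive = Path(tmp)
        # The promoted standby archives its incomplete copy of the segment,
        # either under the plain name (9.4 and below) or with ".partial".
        name = segment + ".partial" if use_partial_suffix else segment
        archive_segment(archive, name, b"partial contents")
        # The old primary, still running, later completes the same segment
        # and tries to archive it under the plain name.
        try:
            archive_segment(archive, segment, b"complete contents")
        except FileExistsError:
            return False
        return True


if __name__ == "__main__":
    print("without .partial:", old_primary_can_archive(False))
    print("with .partial:   ", old_primary_can_archive(True))
```

Without the suffix the two servers compete for the same file name in the shared archive; with it, the partial copy and the completed segment can coexist.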
--
Michael

RE: Question about .partial WAL file

From
"Matsumura, Ryo"
Date:
Michael-san

Thank you for your advice.

> then a promoted standby which archives WAL segments in the same
> location as the primary

> if the previous primary is still running after the standby

I could not think of that combination myself, but I understand now.
Sorry for bothering you.

Regards
Ryo Matsumura