Re: .ready and .done files considered harmful - Mailing list pgsql-hackers
From: Dilip Kumar
Subject: Re: .ready and .done files considered harmful
Msg-id: CAFiTN-tR+3+GjP0Qeys8jwh=jz9VpP2ibhT9ubFDLmgNb1QtMg@mail.gmail.com
In response to: Re: .ready and .done files considered harmful (Robert Haas <robertmhaas@gmail.com>)
List: pgsql-hackers
On Tue, May 4, 2021 at 7:38 PM Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Tue, May 4, 2021 at 12:27 AM Andres Freund <andres@anarazel.de> wrote:
> > On 2021-05-03 16:49:16 -0400, Robert Haas wrote:
> > > I have two possible ideas for addressing this; perhaps other people
> > > will have further suggestions. A relatively non-invasive fix would be
> > > to teach pgarch.c how to increment a WAL file name. After archiving
> > > segment N, check using stat() whether there's a .ready file for
> > > segment N+1. If so, do that one next. If not, then fall back to
> > > performing a full directory scan.
> >
> > Hm. I wonder if it'd not be better to determine multiple files to be
> > archived in one readdir() pass?
>
> I think both methods have some merit. If we had a way to pass a range
> of files to archive_command instead of just one, then your way is
> distinctly better, and perhaps we should just go ahead and invent such
> a thing. If not, your way doesn't entirely solve the O(n^2) problem,
> since you have to choose some upper bound on the number of file names
> you're willing to buffer in memory, but it may lower it enough that it
> makes no practical difference. I am somewhat inclined to think that it
> would be good to start with the method I'm proposing, since it is a
> clear-cut improvement over what we have today and can be done with a
> relatively limited amount of code change and no redesign, and then
> perhaps do something more ambitious afterward.

I agree that if we continue to archive one file at a time with the archive command, then Robert's solution of checking for the existence of the next WAL segment's (N+1) .ready file has an advantage. Note, however, that pgarch_readyXlog currently always treats any history file as the oldest file, and that will no longer hold if we predict the next WAL segment name.
For example, suppose we have archived 000000010000000000000004, so next we will look for 000000010000000000000005. If there is a timeline switch after segment 000000010000000000000005 is generated, the archive status directory will contain both 000000010000000000000005.ready and a .ready file for 00000002.history. The existing archiver would archive 00000002.history first, whereas our code would archive 000000010000000000000005 first. That said, I don't see any problem with this: before archiving any segment file from timeline 2, we will definitely archive the 00000002.history file, because we will not find 000000010000000000000006.ready, fall back to scanning the full directory, and then find 00000002.history as the oldest file.

> > > However, that's still pretty wasteful. Every time we have to wait for
> > > the next file to be ready for archiving, we'll basically fall back to
> > > repeatedly scanning the whole directory, waiting for it to show up.

Is this true, that we only scan the directory when we have to wait for the next file to be ready? If I read the code in "pgarch_ArchiverCopyLoop", for every single file to archive it calls "pgarch_readyXlog", which scans the directory every time. So I did not understand your point that it only needs to scan the full directory when it has to wait for the next .ready file; it appears it always scans the full directory after archiving each WAL segment. What am I missing?

> > Hm. That seems like it's only an issue because .done and .ready are in
> > the same directory? Otherwise the directory would be empty while we're
> > waiting for the next file to be ready to be archived.
>
> I think that's right.

If we agree with your point above that it only needs to scan the full directory when it has to wait for the next file to be ready, then keeping the .done files in a separate directory could improve things a lot, because the scanned directory would be empty and scanning it would not be costly.
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com