Thread: why pg_walfile_name() cannot be executed during recovery?
Hello Hackers,
Why pg_walfile_name() can't be executed under recovery? What is the best way for me to get the current timeline and/or the file being recovering on the standby using a postgres query? I know I can get it via process title but don't want to go that route.
Thanks,
Satya
On Fri, Apr 2, 2021 at 4:23 AM SATYANARAYANA NARLAPURAM <satyanarlapuram@gmail.com> wrote: > Why pg_walfile_name() can't be executed under recovery? I believe the issue is that the backend executing the function might not have an accurate idea about which TLI to use. But I don't understand why we can't find some solution to that problem. > What is the best way for me to get the current timeline and/or the file being recovering on the standby using a postgresquery? I know I can get it via process title but don't want to go that route. pg_stat_wal_receiver has LSN and TLI information, but probably won't help except when WAL receiver is actually active. pg_last_wal_receive_lsn() and pg_last_wal_replay_lsn() will give the LSN at any point during recovery, but not the TLI. We might have some gaps in this area... -- Robert Haas EDB: http://www.enterprisedb.com
Re: why pg_walfile_name() cannot be executed during recovery?
From
Jehan-Guillaume de Rorthais
Date:
On Fri, 2 Apr 2021 08:22:09 -0400 Robert Haas <robertmhaas@gmail.com> wrote: > On Fri, Apr 2, 2021 at 4:23 AM SATYANARAYANA NARLAPURAM > <satyanarlapuram@gmail.com> wrote: > > Why pg_walfile_name() can't be executed under recovery? > > I believe the issue is that the backend executing the function might > not have an accurate idea about which TLI to use. But I don't > understand why we can't find some solution to that problem. > > > What is the best way for me to get the current timeline and/or the file > > being recovering on the standby using a postgres query? I know I can get it > > via process title but don't want to go that route. > > pg_stat_wal_receiver has LSN and TLI information, but probably won't > help except when WAL receiver is actually active. > pg_last_wal_receive_lsn() and pg_last_wal_replay_lsn() will give the > LSN at any point during recovery, but not the TLI. We might have some > gaps in this area... Yep, see previous discussion: https://www.postgresql.org/message-id/flat/20190723180518.635ac554%40firost The status by the time was to consider a new view eg. pg_stat_recovery, to report various recovery stats. But maybe the best place now would be to include it in the new pg_stat_wal view? As I'm interesting with this feature as well, I volunteer to work on it as author or reviewer. Regards,
On Fri, Apr 2, 2021 at 5:52 PM Robert Haas <robertmhaas@gmail.com> wrote: > > On Fri, Apr 2, 2021 at 4:23 AM SATYANARAYANA NARLAPURAM > <satyanarlapuram@gmail.com> wrote: > > Why pg_walfile_name() can't be executed under recovery? > > I believe the issue is that the backend executing the function might > not have an accurate idea about which TLI to use. But I don't > understand why we can't find some solution to that problem. > > > What is the best way for me to get the current timeline and/or the file being recovering on the standby using a postgresquery? I know I can get it via process title but don't want to go that route. > > pg_stat_wal_receiver has LSN and TLI information, but probably won't > help except when WAL receiver is actually active. > pg_last_wal_receive_lsn() and pg_last_wal_replay_lsn() will give the > LSN at any point during recovery, but not the TLI. We might have some > gaps in this area... I spent some time today to allow pg_walfile_{name, name_offset} run in recovery. Timeline ID is computed while in recovery as follows - WAL receiver's last received and flushed WAL record's TLI if it's streaming, otherwise the last replayed WAL record's TLI. This way, these functions can be used on standby or PITR server or even in crash recovery if the server opens up for read-only connections. Please have a look at the attached patch. If the approach looks okay, I can add notes in the documentation. Regards, Bharath Rupireddy.
Attachment
On Thu, Apr 7, 2022 at 9:32 AM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > I spent some time today to allow pg_walfile_{name, name_offset} run in > recovery. Timeline ID is computed while in recovery as follows - WAL > receiver's last received and flushed WAL record's TLI if it's > streaming, otherwise the last replayed WAL record's TLI. This way, > these functions can be used on standby or PITR server or even in crash > recovery if the server opens up for read-only connections. I don't think this is a good definition. Suppose I ask for pg_walfile_name() using an older LSN. With this approach, we're going to get a filename based on the idea that the TLI that was in effect back then is the same one as the TLI that is in effect now, which might not be true. For example, suppose that the current TLI is 2 and it branched off of timeline 1 at 10/0. If I ask for pg_walfile_name('F/0'), it's going to give me the name of a WAL file that has never existed. That seems bad. It's also worth noting that there's a bit of a definitional problem here. If in the same situation, I ask for pg_walfile_name('11/0'), it's going to give me a filename based on TLI 2, but there's also a WAL file for that LSN with TLI 1. How do we know which one the user wants? Perhaps one idea would be to say that the relevant TLI is the one which was in effect at the time that LSN was replayed. If we do that, what about future LSNs? We could assume that for future LSNs, the TLI should be the same as the current TLI, but maybe that's also misleading, because recovery_target_timeline could be set. I think it's really important to start by being precise about the question that we think pg_walfile_name() ought to be answering. If we don't know that, then we really can't say what TLI it should be using. It's not hard to make the function return SOME answer using SOME TLI, but then it's not clear that the answer is the right one for any particular purpose. And in that case the function is more dangerous than useful, because people will write code that uses it to do stuff, and then that stuff won't actually work correctly under all circumstances. -- Robert Haas EDB: http://www.enterprisedb.com
On Thu, Apr 07, 2022 at 11:37:15AM -0400, Robert Haas wrote: > It's also worth noting that there's a bit of a definitional problem > here. If in the same situation, I ask for pg_walfile_name('11/0'), > it's going to give me a filename based on TLI 2, but there's also a > WAL file for that LSN with TLI 1. How do we know which one the user > wants? Perhaps one idea would be to say that the relevant TLI is the > one which was in effect at the time that LSN was replayed. If we do > that, what about future LSNs? We could assume that for future LSNs, > the TLI should be the same as the current TLI, but maybe that's also > misleading, because recovery_target_timeline could be set. FWIW, for future positions, I'd be rather on board with the concept of using the TLI currently being replayed, but as you say that comes down to the definition borders we want to use. Another possibility would be to return an error and kick the can down the road if we are unsure of what the right behavior is. For past positions, this should go through a lookup of the timeline history file (the patch does not do that at quick glance). > I think it's really important to start by being precise about the > question that we think pg_walfile_name() ought to be answering. If we > don't know that, then we really can't say what TLI it should be using. > It's not hard to make the function return SOME answer using SOME TLI, > but then it's not clear that the answer is the right one for any > particular purpose. And in that case the function is more dangerous > than useful, because people will write code that uses it to do stuff, > and then that stuff won't actually work correctly under all > circumstances. Agreed. -- Michael
Attachment
On Thu, Apr 7, 2022 at 9:07 PM Robert Haas <robertmhaas@gmail.com> wrote: > > On Thu, Apr 7, 2022 at 9:32 AM Bharath Rupireddy > <bharath.rupireddyforpostgres@gmail.com> wrote: > > I spent some time today to allow pg_walfile_{name, name_offset} run in > > recovery. Timeline ID is computed while in recovery as follows - WAL > > receiver's last received and flushed WAL record's TLI if it's > > streaming, otherwise the last replayed WAL record's TLI. This way, > > these functions can be used on standby or PITR server or even in crash > > recovery if the server opens up for read-only connections. > > I don't think this is a good definition. Suppose I ask for > pg_walfile_name() using an older LSN. With this approach, we're going > to get a filename based on the idea that the TLI that was in effect > back then is the same one as the TLI that is in effect now, which > might not be true. For example, suppose that the current TLI is 2 and > it branched off of timeline 1 at 10/0. If I ask for > pg_walfile_name('F/0'), it's going to give me the name of a WAL file > that has never existed. That seems bad. > > It's also worth noting that there's a bit of a definitional problem > here. If in the same situation, I ask for pg_walfile_name('11/0'), > it's going to give me a filename based on TLI 2, but there's also a > WAL file for that LSN with TLI 1. How do we know which one the user > wants? Perhaps one idea would be to say that the relevant TLI is the > one which was in effect at the time that LSN was replayed. If we do > that, what about future LSNs? We could assume that for future LSNs, > the TLI should be the same as the current TLI, but maybe that's also > misleading, because recovery_target_timeline could be set. Fundamental question - should the pg_walfile_{name, name_offset} check whether the file with the computed WAL file name exists on the server right now or ever existed earlier? Right now, they don't do that, see [1]. I think we can make the functions more robust: pg_walfile_{name, name_offset}(lsn, check_if_file_exists = false, tli = invalid_timelineid) - when check_if_file_exists is true checks for the computed WAL file existence and when a valid tli is provided uses it in computing the WAL file name. When tli isn't provided, it continues to use insert tli for primary, and in recovery it uses tli as proposed in my patch. Perhaps, it can also do (as Michael suggested) this - if check_if_file_exists is true and tli isn't provided and there's timeline history, then it can go look at all the timelines and whether the file exists with the computed name with history tli. > I think it's really important to start by being precise about the > question that we think pg_walfile_name() ought to be answering. If we > don't know that, then we really can't say what TLI it should be using. > It's not hard to make the function return SOME answer using SOME TLI, > but then it's not clear that the answer is the right one for any > particular purpose. And in that case the function is more dangerous > than useful, because people will write code that uses it to do stuff, > and then that stuff won't actually work correctly under all > circumstances. Yes, once we agree on the semantics of these functions, having better documentation will help. Thoughts? [1] postgres=# select * from pg_walfile_name('50000/dfdf'); pg_walfile_name -------------------------- 000000010005000000000000 (1 row) postgres=# select * from pg_walfile_name_offset('50000/dfdf'); file_name | file_offset --------------------------+------------- 000000010005000000000000 | 57311 (1 row) Regards, Bharath Rupireddy.
On Fri, Apr 8, 2022 at 9:31 AM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > Fundamental question - should the pg_walfile_{name, name_offset} check > whether the file with the computed WAL file name exists on the server > right now or ever existed earlier? Right now, they don't do that, see > [1]. I don't think that checking whether the file exists is the right approach. However, I do think that it's important to be precise about which TLI is going to be used. I think it would be reasonable to redefine this function (on both the primary and the standby) so that the TLI that is used is the one that was in effect at the time record at the given LSN was either written or replayed. Then, you could potentially use this function to figure out whether you still have the WAL files that are needed to replay up to some previous point in the WAL stream. However, what about the segments where we switched from one TLI to the next in the middle of the segment? There, you probably need both the old and the new segments, or maybe if you're trying to stream them you only need the new one because we have some weird special case that will send the segment from the new timeline when the segment from the old timeline is requested. So you couldn't just call this function on one LSN per segment and call it good, and it wouldn't necessarily be the case that the filenames you got back were exactly the ones you needed. So I'm not entirely sure this proposal is good enough, but it at least would have the advantage of meaning that the filename you get back is one that existed at some point in time and somebody used it for something. -- Robert Haas EDB: http://www.enterprisedb.com
On Fri, Apr 8, 2022 at 7:28 PM Robert Haas <robertmhaas@gmail.com> wrote: > > On Fri, Apr 8, 2022 at 9:31 AM Bharath Rupireddy > <bharath.rupireddyforpostgres@gmail.com> wrote: > > Fundamental question - should the pg_walfile_{name, name_offset} check > > whether the file with the computed WAL file name exists on the server > > right now or ever existed earlier? Right now, they don't do that, see > > [1]. > > I don't think that checking whether the file exists is the right > approach. However, I do think that it's important to be precise about > which TLI is going to be used. I think it would be reasonable to > redefine this function (on both the primary and the standby) so that > the TLI that is used is the one that was in effect at the time record > at the given LSN was either written or replayed. Then, you could > potentially use this function to figure out whether you still have the > WAL files that are needed to replay up to some previous point in the > WAL stream. However, what about the segments where we switched from > one TLI to the next in the middle of the segment? There, you probably > need both the old and the new segments, or maybe if you're trying to > stream them you only need the new one because we have some weird > special case that will send the segment from the new timeline when the > segment from the old timeline is requested. So you couldn't just call > this function on one LSN per segment and call it good, and it wouldn't > necessarily be the case that the filenames you got back were exactly > the ones you needed. > > So I'm not entirely sure this proposal is good enough, but it at least > would have the advantage of meaning that the filename you get back is > one that existed at some point in time and somebody used it for > something. Using insert tli when not in recovery and using tli of the last WAL replayed record in crash/archive/standby recovery, seems a reasonable choice to me. I've also added a note in the docs. Attaching v2 with the above change. Please review it further. Regards, Bharath Rupireddy.
Attachment
> On 9 Apr 2022, at 18:30, Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > Using insert tli when not in recovery and using tli of the last WAL > replayed record in crash/archive/standby recovery, seems a reasonable > choice to me. Please excuse me if I'm not attentive enough. I've read this thread. And I could not find what is the problem that you aresolving. What is the purpose of the WAL file name you want to obtain? pg_walfile_name() - is a formatting function. With TLI as an hidden argument. If we want it to work on Standby we shouldjust convert it to pure formatting function without access the the DB state, pass TLI as an argument. Making implicit TLI computation with certain expectations is not a good idea IMV. pg_walfile_name() could just read .history file, determine which TLI contains given LSN and format the name. And still there'sa tricky segments during TLI switch. Either way we can rename the function to pg_walfile_name_as_if_on_timeline_of_last_wal_replayed(). Thanks! Best regards, Andrey Borodin.
On Sat, Apr 9, 2022 at 12:25 PM Andrey Borodin <x4mmm@yandex-team.ru> wrote: > Please excuse me if I'm not attentive enough. I've read this thread. And I could not find what is the problem that youare solving. What is the purpose of the WAL file name you want to obtain? Yeah, I'd also like to know this. -- Robert Haas EDB: http://www.enterprisedb.com
On Sat, Apr 9, 2022 at 10:21 PM Robert Haas <robertmhaas@gmail.com> wrote: > > On Sat, Apr 9, 2022 at 12:25 PM Andrey Borodin <x4mmm@yandex-team.ru> wrote: > > Please excuse me if I'm not attentive enough. I've read this thread. And I could not find what is the problem that youare solving. What is the purpose of the WAL file name you want to obtain? > > Yeah, I'd also like to know this. IMO, uses of pg_walfile_{name, name_offset} are plenty. Say, I have LSNs (say, flush, insert, replayed or WAL receiver latest received) and I would like to know the WAL file name and offset in an app connecting to postgres or a control plane either for doing some reporting or figuring out whether a WAL file exists given an LSN or for some other reason. With these functions restricted when the server is in recovery mode, the apps or control plane code can't use them and they have to do if (!pg_is_in_recovery()) {select * from pg_walfile_{name, name_offset}. Am I missing any other important use-cases? Regards, Bharath Rupireddy.
> 22 апр. 2022 г., в 19:15, Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> написал(а): > > On Sat, Apr 9, 2022 at 10:21 PM Robert Haas <robertmhaas@gmail.com> wrote: >> >> On Sat, Apr 9, 2022 at 12:25 PM Andrey Borodin <x4mmm@yandex-team.ru> wrote: >>> Please excuse me if I'm not attentive enough. I've read this thread. And I could not find what is the problem that youare solving. What is the purpose of the WAL file name you want to obtain? >> >> Yeah, I'd also like to know this. > > IMO, uses of pg_walfile_{name, name_offset} are plenty. Say, I have > LSNs (say, flush, insert, replayed or WAL receiver latest received) AFAIK flush, receive and replay LSNs may be on 3 different timelines rendering two names incorrect. Actually, this provesthat pg_wal_filename() should not be called on Standby with a present function prototype. > and I would like to know the WAL file name and offset in an app > connecting to postgres or a control plane either for doing some > reporting What kind of reporting? > or figuring out whether a WAL file exists given an LSN or > for some other reason. There might me many WAL files on the same LSN. Please, specify more detailed scenario to use WAL file name. > With these functions restricted when the server > is in recovery mode, the apps or control plane code can't use them and > they have to do if (!pg_is_in_recovery()) {select * from > pg_walfile_{name, name_offset}. > > Am I missing any other important use-cases? I do not see correct use-case among these. You justify necessity to run pg_wal_filename() on Standby by having a LSN (nota problem), by doing some kind of reporting (to broad problem) and checking existence of some WAL file (more detailsneeded). What is the problem leading to checking the existence of the file? Thanks! Best regards, Andrey Borodin.