Thread: Tracking of page changes for backup purposes. PTRACK [POC]
In this thread I would like to raise the issue of incremental backups. What I suggest is to choose one direction, so we can concentrate our community efforts. There is already a number of tools that provide incremental backup, and we can see five principal techniques they use:

1. Use file modification time as a marker that the file has changed.
2. Compute file checksums and compare them.
3. LSN-based mechanisms. Backup pages with LSN >= last backup LSN.
4. Scan all WAL files in the archive since the previous backup and collect information about changed pages.
5. Track page changes on the fly. (ptrack)

They can also be combined to achieve better performance.

My personal candidate is the last one, since it provides page-level granularity, while most of the other approaches can only do file-level incremental backups or require additional reads or calculations.

In a nutshell, using the ptrack patch, PostgreSQL can track page changes on the fly. Each time a relation page is updated, this page is marked in a special PTRACK bitmap fork for this relation. As one page requires just one bit in the PTRACK fork, such bitmaps are quite small. Tracking implies some minor overhead on database server operation, but speeds up incremental backups significantly.

A detailed overview of the implementation with all pros and cons, patches, and links to the related threads can be found here:
https://wiki.postgresql.org/index.php?title=PTRACK_incremental_backups

Patches for v10.1 and v9.6 are attached. Since ptrack is basically just an API for use in backup tools, it is impossible to test the patch independently. For now it is integrated with our backup utility, called pg_probackup. You can find it here: https://github.com/postgrespro/pg_probackup

Let me know if you find the documentation too long and complicated, and I'll write a brief How-to for ptrack backups.

Spoiler: Please consider this patch and README as a proof of concept.
It can be improved in some ways, but in its current state PTRACK is a stable prototype, reviewed and tested well enough to find many non-trivial corner cases and subtle problems. Any discussion of a change-tracking algorithm must be aware of them. Feel free to share your concerns and point out any shortcomings of the idea or the implementation.

--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Attachment
Hello Anastasia,

Personally I'm very glad PTRACK is finally proposed to be ported to community PostgreSQL.

> Since ptrack is basically just an API for use in backup tools, it is
> impossible to test the patch independently.

I believe it's worth creating an extension that provides access to PTRACK's public API. Even if there are not many users interested in this extension, we should have at least some basic tests like "if we write to the table, its pages change", "if we update a tuple, the corresponding indexes change", "if we clean up a bitmap, PTRACK says there are no changed pages", etc.

--
Best regards,
Aleksander Alekseev
Hello Anastasia, > ptrack_9.6.6_v1.4.patch Also I'm surprised you proposed a patch for 9.6. Since PTRACK is a new feature I don't believe we are going to backport it. -- Best regards, Aleksander Alekseev
Hello! Thanks for sharing this work! I think this is an important feature to make backups more efficient.

> On 18 Dec 2017, at 15:18, Anastasia Lubennikova <a.lubennikova@postgrespro.ru> wrote:
>
> Patches for v 10.1 and v 9.6 are attached.
> Since ptrack is basically just an API for use in backup tools, it is
> impossible to test the patch independently.
> Now it is integrated with our backup utility, called pg_probackup. You can
> find it here: https://github.com/postgrespro/pg_probackup

I can add experimental support to WAL-G too. We have QA tools for delta backups as well.

> Spoiler: Please consider this patch and README as a proof of concept. It
> can be improved in some ways, but in its current state PTRACK is a
> stable prototype, reviewed and tested well enough to find many
> non-trivial corner cases and subtle problems. And any discussion of
> change track algorithm must be aware of them. Feel free to share your
> concerns and point out any shortcomings of the idea or the implementation.

This version of the patch is quite big - basically it accompanies most of the START_CRIT_SECTION() calls with PTRACK calls. I have two concerns about this:

1. Does this affect the performance of the database when PTRACK is not enabled?
2. What is the cost of having PTRACK enabled?

Best regards, Andrey Borodin.
Hello hackers,

> I have two concerns about this:
> 1. Does this affect the performance of the database when PTRACK is not enabled?
> 2. What is the cost of having PTRACK enabled?

I played with this patch a bit and did a simple benchmark on my laptop. Configuration:

```
make distclean && ./configure prefix=/some/path/ && make -s -j4
```

postgresql.conf:

```
max_prepared_transactions = 100
wal_level = logical
wal_keep_segments = 128
max_connections = 100
wal_log_hints = on
max_wal_senders = 8
wal_keep_segments = 64
listen_addresses = '*'
hot_standby = on
log_statement = all
max_locks_per_transaction = 256
shared_buffers = 1GB
```

The benchmark:

```
pgbench -i && pgbench -j 4 -c 4 -T 300 -P 10
```

Here are the results.

10.1, ptrack_enable=false

transaction type: <builtin: TPC-B (sort of)>
scaling factor: 1
query mode: simple
number of clients: 4
number of threads: 4
duration: 300 s
number of transactions actually processed: 28772
latency average = 41.705 ms
latency stddev = 18.274 ms
tps = 95.895605 (including connections establishing)
tps = 95.906434 (excluding connections establishing)

10.1, ptrack_enable=true

scaling factor: 1
query mode: simple
number of clients: 4
number of threads: 4
duration: 300 s
number of transactions actually processed: 28607
latency average = 41.951 ms
latency stddev = 18.454 ms
tps = 95.344363 (including connections establishing)
tps = 95.345290 (excluding connections establishing)

10.1, without ptrack

transaction type: <builtin: TPC-B (sort of)>
scaling factor: 1
query mode: simple
number of clients: 4
number of threads: 4
duration: 300 s
number of transactions actually processed: 28622
latency average = 41.928 ms
latency stddev = 18.238 ms
tps = 95.396155 (including connections establishing)
tps = 95.397406 (excluding connections establishing)

At first glance PTRACK doesn't seem to affect the overall performance significantly. There are a few minor issues with the patch.
There is a missing '/' symbol in the comment before the ptrack_get_and_clear procedure:

```
 * Get ptrack file as bytea and clear it */
bytea *
ptrack_get_and_clear(Oid tablespace_oid, Oid table_oid)
{
```

Also I believe the patch should include corresponding changes to postgresql.conf.sample.

I suggest adding this patch to the closest commitfest. Otherwise it can be lost.

--
Best regards,
Aleksander Alekseev
On Mon, Dec 18, 2017 at 8:34 AM, Aleksander Alekseev <a.alekseev@postgrespro.ru> wrote:
> 10.1, without ptrack
>
> transaction type: <builtin: TPC-B (sort of)>
> scaling factor: 1
> query mode: simple
> number of clients: 4
> number of threads: 4
> duration: 300 s
> number of transactions actually processed: 28622
> latency average = 41.928 ms
> latency stddev = 18.238 ms
> tps = 95.396155 (including connections establishing)
> tps = 95.397406 (excluding connections establishing)
>
> At first glance PTRACK doesn't seem to affect the overall performance
> significantly.

I think this doesn't really show much because it's apparently limited by the speed of fsync() on your filesystem. You might try running the test with synchronous_commit=off.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Mon, Dec 18, 2017 at 7:18 PM, Anastasia Lubennikova <a.lubennikova@postgrespro.ru> wrote:
> Patches for v 10.1 and v 9.6 are attached.

Why no patch for HEAD? If you are planning to show performance numbers of any kind, you had better run on the latest development version, which would also avoid interference with any load bottleneck that could have been removed during the development of v10 or v11. At a quick glance, the proposed patch does not interfere with such areas.

--
Michael
On Tue, Dec 19, 2017 at 2:57 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> I think this doesn't really show much because it's apparently limited
> by the speed of fsync() on your filesystem. You might try running the
> test with synchronous_commit=off.

You may want to run Postgres on scissors as much as possible by decreasing checkpoint frequency, etc.

+ /*
+  * Do not track changes for unlogged and temp relations,
+  * since we are not going to backup them anyway.
+  */
+ if (rel->rd_rel->relpersistence != RELPERSISTENCE_PERMANENT)
+     return;

This is not true for init forks of unlogged tables, which are logged and included in backups.

--
Michael
Hello Robert,

> I think this doesn't really show much because it's apparently limited
> by the speed of fsync() on your filesystem. You might try running the
> test with synchronous_commit=off.

You are right: synchronous_commit=off revealed a noticeable performance degradation. Also I realized that using log_statement=all was not very smart either. Here are the results.

10.1, ptrack_enable=false, synchronous_commit = off

transaction type: <builtin: TPC-B (sort of)>
scaling factor: 1
query mode: simple
number of clients: 4
number of threads: 4
duration: 300 s
number of transactions actually processed: 1713550
latency average = 0.700 ms
latency stddev = 0.434 ms
tps = 5711.822110 (including connections establishing)
tps = 5712.251807 (excluding connections establishing)

10.1, ptrack_enable=true, synchronous_commit = off

transaction type: <builtin: TPC-B (sort of)>
scaling factor: 1
query mode: simple
number of clients: 4
number of threads: 4
duration: 300 s
number of transactions actually processed: 1691011
latency average = 0.710 ms
latency stddev = 0.380 ms
tps = 5636.691378 (including connections establishing)
tps = 5636.730514 (excluding connections establishing)

10.1, without ptrack, synchronous_commit = off

transaction type: <builtin: TPC-B (sort of)>
scaling factor: 1
query mode: simple
number of clients: 4
number of threads: 4
duration: 300 s
number of transactions actually processed: 1843623
latency average = 0.651 ms
latency stddev = 0.589 ms
tps = 6145.395486 (including connections establishing)
tps = 6145.441431 (excluding connections establishing)

--
Best regards,
Aleksander Alekseev
Hi,

a couple of months ago there was a proposal / patch with similar goals from Andrey Borodin. See these two threads:

[1] https://www.postgresql.org/message-id/flat/843D96CC-7C55-4296-ADE0-622A7ACD4978%40yesql.se#843D96CC-7C55-4296-ADE0-622A7ACD4978@yesql.se
[2] https://www.postgresql.org/message-id/flat/449A7A9D-DB58-40F8-B80E-4C4EE7DB47FD%40yandex-team.ru#449A7A9D-DB58-40F8-B80E-4C4EE7DB47FD@yandex-team.ru

I recall there was a long discussion regarding which of the approaches is the *right* one (although that certainly depends on other factors).

On 12/18/2017 11:18 AM, Anastasia Lubennikova wrote:
> In this thread I would like to raise the issue of incremental backups.
> What I suggest in this thread, is to choose one direction, so we can
> concentrate our community efforts.
> There is already a number of tools, which provide incremental backup.
> And we can see five principle techniques they use:
>
> 1. Use file modification time as a marker that the file has changed.
> 2. Compute file checksums and compare them.
> 3. LSN-based mechanisms. Backup pages with LSN >= last backup LSN.
> 4. Scan all WAL files in the archive since the previous backup and
> collect information about changed pages.
> 5. Track page changes on the fly. (ptrack)
>
> They can also be combined to achieve better performance.
>
> My personal candidate is the last one, since it provides page-level
> granularity, while most of the others approaches can only do file-level
> incremental backups or require additional reads or calculations.

I share the opinion that options 1 and 2 are not particularly attractive, due to either unreliability, or not really saving that much CPU and I/O.

I'm not quite sure about 3, because it doesn't really explain how it would be done - it seems to assume we'd have to reread the files. I'll get back to this.

Option 4 has some very interesting features. Firstly, it relies on WAL and so should not require any new code (and it could, in theory, support even older PostgreSQL releases, for example). Secondly, it can be offloaded to a different machine. And it even supports additional workflows - e.g. "given these two full backups and the WAL, generate an incremental backup between them".

So I'm somewhat hesitant to proclaim option 5 as the clear winner here.

> In a nutshell, using ptrack patch, PostgreSQL can track page changes on
> the fly. Each time a relation page is updated, this page is marked in a
> special PTRACK bitmap fork for this relation. As one page requires just
> one bit in the PTRACK fork, such bitmaps are quite small. Tracking
> implies some minor overhead on the database server operation but speeds
> up incremental backups significantly.

That makes sense, I guess, although I find the "ptrack" acronym somewhat cryptic, and we should probably look for something more descriptive. But the naming can wait, I guess.

My main question is whether a bitmap is the right data type. It seems to cause a lot of complexity later, because it needs to be reset once in a while, you have to be careful about failed incremental backups, etc.

What if we tracked the LSN for each page instead? Sure, it'd require, say, 64x more space (1 bit -> 8 bytes per page), but it would not require resets, you could take incremental backups from an arbitrary point in time, and so on. That seems like a significant improvement to me, so perhaps the space requirements are justified (still just 1MB per 1GB segment, with the default 8kB pages).

> Detailed overview of the implementation with all pros and cons,
> patches and links to the related threads you can find here:
>
> https://wiki.postgresql.org/index.php?title=PTRACK_incremental_backups
>
> Patches for v 10.1 and v 9.6 are attached.
> Since ptrack is basically just an API for use in backup tools, it is
> impossible to test the patch independently.
> Now it is integrated with our backup utility, called pg_probackup. You can
> find it here: https://github.com/postgrespro/pg_probackup
> Let me know if you find the documentation too long and complicated, I'll
> write a brief How-to for ptrack backups.
>
> Spoiler: Please consider this patch and README as a proof of concept. It
> can be improved in some ways, but in its current state PTRACK is a
> stable prototype, reviewed and tested well enough to find many
> non-trivial corner cases and subtle problems. And any discussion of
> change track algorithm must be aware of them. Feel free to share your
> concerns and point out any shortcomings of the idea or the implementation.

Thanks for the proposal and patch!

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Tue, Dec 19, 2017 at 5:37 PM, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
> On 12/18/2017 11:18 AM, Anastasia Lubennikova wrote:
>> 1. Use file modification time as a marker that the file has changed.
>> 2. Compute file checksums and compare them.
>> 3. LSN-based mechanisms. Backup pages with LSN >= last backup LSN.
>> 4. Scan all WAL files in the archive since the previous backup and
>> collect information about changed pages.
>> 5. Track page changes on the fly. (ptrack)
>
> I share the opinion that options 1 and 2 are not particularly
> attractive, due to either unreliability, or not really saving that much
> CPU and I/O.
>
> I'm not quite sure about 3, because it doesn't really explain how would
> it be done - it seems to assume we'd have to reread the files. I'll get
> back to this.
>
> Option 4 has some very interesting features. Firstly, relies on WAL and
> so should not require any new code (and it could, in theory, support
> even older PostgreSQL releases, for example). Secondly, this can be
> offloaded to a different machine. And it does even support additional
> workflows - e.g. "given these two full backups and the WAL, generate an
> incremental backup between them".
>
> So I'm somewhat hesitant to proclaim option 5 as the clear winner, here.

I agree. I think (4) is better.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
2017-12-20 21:11 GMT+01:00 Robert Haas <robertmhaas@gmail.com>:
On Tue, Dec 19, 2017 at 5:37 PM, Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:
> On 12/18/2017 11:18 AM, Anastasia Lubennikova wrote:
>> 1. Use file modification time as a marker that the file has changed.
>> 2. Compute file checksums and compare them.
>> 3. LSN-based mechanisms. Backup pages with LSN >= last backup LSN.
>> 4. Scan all WAL files in the archive since the previous backup and
>> collect information about changed pages.
>> 5. Track page changes on the fly. (ptrack)
>
> I share the opinion that options 1 and 2 are not particularly
> attractive, due to either unreliability, or not really saving that much
> CPU and I/O.
>
> I'm not quite sure about 3, because it doesn't really explain how would
> it be done - it seems to assume we'd have to reread the files. I'll get
> back to this.
>
> Option 4 has some very interesting features. Firstly, relies on WAL and
> so should not require any new code (and it could, in theory, support
> even older PostgreSQL releases, for example). Secondly, this can be
> offloaded to a different machine. And it does even support additional
> workflows - e.g. "given these two full backups and the WAL, generate an
> incremental backup between them".
>
> So I'm somewhat hesitant to proclaim option 5 as the clear winner, here.
I agree. I think (4) is better.
Can it depend on load? For smaller, intensively updated databases (5) can be optimal; for large, less frequently updated databases (4) can be better.
Regards
Pavel
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Wed, Dec 20, 2017 at 3:15 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
>> > So I'm somewhat hesitant to proclaim option 5 as the clear winner, here.
>>
>> I agree. I think (4) is better.
>
> Can depends on load? For smaller intensive updated databases the 5 can be
> optimal, for large less updated databases the 4 can be better.

It seems to me that the difference is that (4) tracks which pages have changed in the background, and (5) does it in the foreground. Why would we want the latter?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
2017-12-20 21:18 GMT+01:00 Robert Haas <robertmhaas@gmail.com>:
On Wed, Dec 20, 2017 at 3:15 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
>> > So I'm somewhat hesitant to proclaim option 5 as the clear winner, here.
>>
>> I agree. I think (4) is better.
>
> Can depends on load? For smaller intensive updated databases the 5 can be
> optimal, for large less updated databases the 4 can be better.
It seems to me that the difference is that (4) tracks which pages have
changed in the background, and (5) does it in the foreground. Why
would we want the latter?
Isn't it more effective to hold this info in Postgres than in the backup software? Then any backup software can use this implementation.
On 12/20/2017 09:29 PM, Pavel Stehule wrote:
> 2017-12-20 21:18 GMT+01:00 Robert Haas <robertmhaas@gmail.com>:
>
>     On Wed, Dec 20, 2017 at 3:15 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
>     >> > So I'm somewhat hesitant to proclaim option 5 as the clear winner, here.
>     >>
>     >> I agree. I think (4) is better.
>     >
>     > Can depends on load? For smaller intensive updated databases the 5 can be
>     > optimal, for large less updated databases the 4 can be better.
>
>     It seems to me that the difference is that (4) tracks which pages have
>     changed in the background, and (5) does it in the foreground. Why
>     would we want the latter?
>
> Isn't more effective hold this info in Postgres than in backup sw?
> Then any backup sw can use this implementation.

I don't think it means it can't be implemented in Postgres, but does it need to be done in the backend?

For example, it might be a command-line tool similar to pg_waldump, which processes WAL segments and outputs a list of modified blocks, possibly with the matching LSN. Or perhaps something like pg_receivewal, doing that in streaming mode.

This part of the solution can still be part of the PostgreSQL codebase, and the rest has to be part of the backup solution anyway.

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Wed, Dec 20, 2017 at 3:45 PM, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
>> Isn't more effective hold this info in Postgres than in backup sw?
>> Then any backup sw can use this implementation.
>
> I don't think it means it can't be implemented in Postgres, but does it
> need to be done in backend?
>
> For example, it might be a command-line tool similar to pg_waldump,
> which processes WAL segments and outputs list of modified blocks,
> possibly with the matching LSN. Or perhaps something like pg_receivewal,
> doing that in streaming mode.
>
> This part of the solution can still be part of PostgreSQL codebase, and
> the rest has to be part of backup solution anyway.

I agree with all of that.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Thu, Dec 21, 2017 at 7:35 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Wed, Dec 20, 2017 at 3:45 PM, Tomas Vondra
> <tomas.vondra@2ndquadrant.com> wrote:
>>> Isn't more effective hold this info in Postgres than in backup sw?
>>> Then any backup sw can use this implementation.
>>
>> I don't think it means it can't be implemented in Postgres, but does it
>> need to be done in backend?
>>
>> For example, it might be a command-line tool similar to pg_waldump,
>> which processes WAL segments and outputs list of modified blocks,
>> possibly with the matching LSN. Or perhaps something like pg_receivewal,
>> doing that in streaming mode.
>>
>> This part of the solution can still be part of PostgreSQL codebase, and
>> the rest has to be part of backup solution anyway.
>
> I agree with all of that.

+1. This summarizes a bunch of concerns about all the kinds of backend implementations proposed. Scanning for a list of modified blocks via streaming gives more availability, but knowing that you will need to switch to a new segment anyway when finishing a backup, does it really matter? Doing it once a segment has finished would be cheap enough, and you can even do it in parallel over a range of segments.

Also, since 9.4 and the introduction of the new WAL API to track modified blocks, you don't need to know about the record types to know which blocks are being changed. Here is an example of a tool I hacked up in a couple of hours that does exactly what you are looking for, aka a scanner of the blocks modified per record for a given WAL segment, using xlogreader.c:
https://github.com/michaelpq/pg_plugins/tree/master/pg_wal_blocks

You could just use that, shape the data the way you want, and you would be good to go.

--
Michael
Hi!

> On 21 Dec 2017, at 5:51, Michael Paquier <michael.paquier@gmail.com> wrote:
>
> On Thu, Dec 21, 2017 at 7:35 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>> On Wed, Dec 20, 2017 at 3:45 PM, Tomas Vondra
>> <tomas.vondra@2ndquadrant.com> wrote:
>>>> Isn't more effective hold this info in Postgres than in backup sw?
>>>> Then any backup sw can use this implementation.
>> [Skipped]
>> I agree with all of that.
>
> +1. This summarizes a bunch of concerns about all kinds of backend
> implementations proposed. Scanning for a list of blocks modified via
> streaming gives more availability, but knowing that you will need to
> switch to a new segment anyway when finishing a backup, does it really
> matter? Doing it once a segment has finished would be cheap enough,
> and you can even do it in parallel with a range of segments.
>
> Also, since 9.4 and the introduction of the new WAL API to track
> modified blocks, you don't need to know about the record types to know
> which blocks are being changed. Here is an example of tool I hacked up
> in a couple of hours that does actually what you are looking for, aka
> a scanner of the blocks modified per record for a given WAL segment
> using xlogreader.c:
> https://github.com/michaelpq/pg_plugins/tree/master/pg_wal_blocks
>
> You could just use that and shape the data in the way you want and you
> would be good to go.

Michael, that's almost what I want. I've even filed a GSoC proposal for this [0]. But can we have something like this in Postgres? The tool I'm hacking on is in Go; I cannot just embed a bunch of Postgres C into it. That is why an API like PTRACK suits my needs better. Not because it uses any superior mechanics, but because it is an API, ready for external 3rd-party backup software (having backup software in Pg would be even better).

Anastasia, I've implemented PTRACK support in WAL-G. First, there are a few minor issues with the patch:

1. There is a malformed comment.
2. The function pg_ptrack_version() is absent.

Then, I think the API is far from perfect: pg_ptrack_get_and_clear() changes the global ptrack_clear_lsn, which introduces some weakness (for paranoids). Maybe use something like pg_ptrack_get_and_clear(oid, oid, previous_lsn), which will fail if previous_lsn does not match? Also, pg_ptrack_get_and_clear() does not return errors when there is no table with the given oid. Finally, I had to interpret any empty map as the absence of a map. From my POV, the function must fail on errors like: invalid oid passed, no table found, no PTRACK map exists, etc.

I use external file-tracking mechanics, so the function pg_ptrack_init_get_and_clear() was of no use to me.

Last, but most important for me: my tests showed lost page updates. Probably it is a bug or paranoia in my test software. But may I ask you to check this [1] code, which converts a PTRACK map to a list of block numbers? Do I get the meaning of the PTRACK map right?

Thank you very much.

[0] https://wiki.postgresql.org/index.php?title=GSoC_2018#WAL-G_delta_backups_with_WAL_scanning_.282018.29
[1] https://github.com/wal-g/wal-g/blob/ptrack/pagefile.go#L167-L173
On Thu, Dec 21, 2017 at 1:51 AM, Michael Paquier <michael.paquier@gmail.com> wrote:
On Thu, Dec 21, 2017 at 7:35 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Wed, Dec 20, 2017 at 3:45 PM, Tomas Vondra
> <tomas.vondra@2ndquadrant.com> wrote:
>>> Isn't more effective hold this info in Postgres than in backup sw?
>>> Then any backup sw can use this implementation.
>>
>> I don't think it means it can't be implemented in Postgres, but does it
>> need to be done in backend?
>>
>> For example, it might be a command-line tool similar to pg_waldump,
>> which processes WAL segments and outputs list of modified blocks,
>> possibly with the matching LSN. Or perhaps something like pg_receivewal,
>> doing that in streaming mode.
>>
>> This part of the solution can still be part of PostgreSQL codebase, and
>> the rest has to be part of backup solution anyway.
>
> I agree with all of that.
+1. This summarizes a bunch of concerns about all kinds of backend
implementations proposed. Scanning for a list of blocks modified via
streaming gives more availability, but knowing that you will need to
switch to a new segment anyway when finishing a backup, does it really
matter? Doing it once a segment has finished would be cheap enough,
and you can even do it in parallel with a range of segments.
There's definitely a lot of value to that, in particular being able to do it entirely outside the backend, making it possible to extract the data from an existing log archive.
Just to throw another option out there, it could also be implemented at least partially as a walsender command. That way you can get it out through a replication connection and piggyback on things like replication slots to make sure you have the data you need, without having to send the full volume of data. And it would make it possible to do incremental/differential *without* having a WAL archive in the first place.