Re: parallelizing the archiver - Mailing list pgsql-hackers
From | Bossart, Nathan |
---|---|
Subject | Re: parallelizing the archiver |
Date | |
Msg-id | 350D9FFB-A933-4EEE-8E21-746A21492623@amazon.com |
In response to | Re: parallelizing the archiver (Magnus Hagander <magnus@hagander.net>) |
List | pgsql-hackers |
On 10/19/21, 6:39 AM, "David Steele" <david@pgmasters.net> wrote:
> On 10/19/21 8:50 AM, Robert Haas wrote:
>> I am not quite sure why we wouldn't just compile the functions into
>> the server. Functions pointers can point to core functions as surely
>> as loadable modules. The present design isn't too congenial to that
>> because it's relying on the shared library loading mechanism to wire
>> the thing in place - but there's no reason it has to be that way.
>> Logical decoding plugins don't work that way, for example. We could
>> still have a GUC, say call it archive_method, that selects the module
>> -- with 'shell' being a builtin method, and others being loadable as
>> modules. If you set archive_method='shell' then you enable this
>> module, and it has its own GUC, say call it archive_command, to
>> configure the behavior.
>>
>> An advantage of this approach is that it's perfectly
>> backward-compatible. I understand that archive_command is a hateful
>> thing to many people here, but software has to serve the user base,
>> not just the developers. Lots of people use archive_command and rely
>> on it -- and are not interested in installing yet another piece of
>> out-of-core software to do what $OTHERDB has built in.
>
> +1 to all of this, certainly for the time being. The archive_command
> mechanism is not great, but it is simple, and this part is not really
> what makes writing a good archive command hard.
>
> I had also originally envisioned this a default extension in core, but
> having the default 'shell' method built-in is certainly simpler.

I have no problem building it this way. It's certainly better for
backward compatibility, which I think everyone here feels is important.

Robert's proposed design is a bit more like my original
proof-of-concept [0]. There, I added an archive_library GUC which was
basically an extension of shared_preload_libraries (which creates some
interesting problems in the library loading logic). You could only set
one of archive_command or archive_library at any given time. When the
archive_library was set, we ran that library's _PG_init() just like we
do for any other library, and then we set the archiver function pointer
to the library's _PG_archive() function.

IIUC the main difference between this design and what Robert proposes
is that we'd also move the existing archive_command stuff somewhere
else and then access it via the archiver function pointer. I think
that is clearly better than branching based on whether archive_command
or archive_library is set. (BTW I'm not wedded to these GUCs. If folks
would rather create something like the archive_method GUC, I think that
would work just as well.)

My original proof-of-concept also attempted to handle a bunch of other
shell command GUCs, but perhaps I'd better keep this focused on
archive_command for now. What we do here could serve as an example of
how to adjust the other shell command GUCs later on. I'll go ahead and
rework my patch to look more like what is being discussed here,
although I expect the exact design for the interface will continue to
evolve based on the feedback in this thread.

Nathan

[0] https://postgr.es/m/E9035E94-EC76-436E-B6C9-1C03FBD8EF54%40amazon.com
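
For illustration only, the dispatch being discussed (a single archiver function pointer that refers either to a built-in 'shell' method or to a callback supplied by a loadable module) might look roughly like the standalone sketch below. Every name in it (archive_file_cb, ArchiveMethod, lookup_archive_method, archive_one_file, and the "example_module" entry) is hypothetical and is not taken from the patch or from PostgreSQL. The module callback is registered in a static table here purely to keep the sketch self-contained; in the server it would come from a dynamically loaded library.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical callback type: archive one WAL file, return true on success. */
typedef bool (*archive_file_cb) (const char *file, const char *path);

/*
 * Stand-in for the built-in 'shell' method. A real version would expand
 * archive_command and execute it; this one only reports what it was asked.
 */
static bool
shell_archive_file(const char *file, const char *path)
{
    printf("shell: would archive %s from %s\n", file, path);
    return true;
}

/* Stand-in for a callback that a loadable module would provide. */
static bool
module_archive_file(const char *file, const char *path)
{
    printf("module: would archive %s from %s\n", file, path);
    return true;
}

/* Hypothetical table mapping an archive_method value to its callback. */
typedef struct ArchiveMethod
{
    const char *name;
    archive_file_cb archive_file;
} ArchiveMethod;

static const ArchiveMethod archive_methods[] = {
    {"shell", shell_archive_file},
    {"example_module", module_archive_file},
};

/* Resolve the configured method name; returns NULL if it is unknown. */
static archive_file_cb
lookup_archive_method(const char *archive_method)
{
    size_t      n = sizeof(archive_methods) / sizeof(archive_methods[0]);

    for (size_t i = 0; i < n; i++)
    {
        if (strcmp(archive_methods[i].name, archive_method) == 0)
            return archive_methods[i].archive_file;
    }
    return NULL;
}

/*
 * The archiver calls through the pointer and never branches on which
 * setting was used to select the method.
 */
static bool
archive_one_file(archive_file_cb cb, const char *file, const char *path)
{
    assert(cb != NULL);
    return cb(file, path);
}
```

A caller would resolve the pointer once, for example with lookup_archive_method("shell"), and then pass it to archive_one_file() for each file, which is the property that avoids branching on whether archive_command or archive_library is set.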