Re: pg_receivelog completion command - Mailing list pgsql-hackers

From Magnus Hagander
Subject Re: pg_receivelog completion command
Date
Msg-id CABUevEyV9c0SBEzWRS7miC5n9YbS+jc-JnQosHX7pBOhsV6Bqw@mail.gmail.com
In response to Re: pg_receivelog completion command  (Andres Freund <andres@2ndquadrant.com>)
Responses Re: pg_receivelog completion command  (Andres Freund <andres@2ndquadrant.com>)
List pgsql-hackers
On Sun, Nov 2, 2014 at 2:31 PM, Andres Freund <andres@2ndquadrant.com> wrote:
> On 2014-11-02 14:26:04 +0100, Magnus Hagander wrote:
>> I had a discussion with a few people recently about a hack I wrote for
>> pg_receivexlog at some point, but never ended up submitting, and in
>> cleaning that up realized I had an open item on it.
>>
>> The idea is to add a switch to pg_receivexlog (in this case, -a, but
>> that can always be bikeshedded of course) that acts somewhat like
>> archive_command on the backend. The idea is to have pg_receivexlog
>> fire off an external command at the end of each segment - for example
>> a command to gzip the file, or to archive it off into a Magic Cloud
>> (TM) or something like that.
>
> I can see that to be useful.
>
>> My current hack just fires off this command using system(). That will
>> block the pg_receivexlog process, obviously. The question is, if we
>> want this, what kind of behaviour would we want here? One option is to
>> do just that, which should be safe enough for something like gzip but
>> might cause trouble if the external command blocks on network for
>> example. Another option would be to just fork() and run it in the
>> background, which could in theory lead to an unlimited number of
>> processes if they all hang. Or of course we could have a background
>> process that queues them up - much like we do in the main backend,
>> which is definitely more complicated.
>
> How about a middle ground: Fork the command off, but wait() on it
> before you start the next command?
>
> This will need some persistent state about the command's success -
> similar to the current archive status stuff. Given retries and
> everything it might end up to be easier to have a separate process.
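
To make the "fork it off, but wait() before the next one" idea concrete, here is a rough sketch. It is not taken from any patch; pending_child, reap_pending_command() and start_completion_command() are made-up names, and the only state kept is in memory, so it does not cover the persistent status or retries mentioned above.

```c
#include <errno.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* PID of the completion command still running, or -1 if none (hypothetical). */
static pid_t pending_child = -1;

/*
 * Wait for the previously started command, if any.
 * Returns 0 if there was none or it exited with status 0, -1 otherwise.
 */
static int
reap_pending_command(void)
{
    int         status;

    if (pending_child == -1)
        return 0;

    /* Retry waitpid() if it is interrupted by a signal. */
    while (waitpid(pending_child, &status, 0) < 0)
    {
        if (errno != EINTR)
        {
            pending_child = -1;
            return -1;
        }
    }
    pending_child = -1;

    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}

/*
 * Start "command" through /bin/sh without waiting for it to finish.
 * Blocks first until the previous command is done, so at most one child
 * exists at any time. Returns -1 (and starts nothing) if the previous
 * command failed or fork() fails, else 0.
 */
static int
start_completion_command(const char *command)
{
    pid_t       pid;

    if (reap_pending_command() != 0)
        return -1;              /* caller decides whether to retry or give up */

    pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0)
    {
        /* Child: run the command; only reached past execl() on failure. */
        execl("/bin/sh", "sh", "-c", command, (char *) NULL);
        _exit(127);
    }

    pending_child = pid;
    return 0;
}
```

The receiver would call start_completion_command() at each segment switch and reap_pending_command() once more at shutdown. A slow command then delays streaming by at most one segment's worth of work instead of piling up processes.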

That is mostly what I meant with my third option, the "background
process". But I guess we can of course do the actual queueing in the
main process. But yeah, it comes down to whether we want to deal with
retries and such at all, or just leave that up to the external
command. We could for example say that if you specify -a, we just stop
doing the rename() in pg_receivexlog and *instead* run the archive
command, making it that command's responsibility to move the file
"from .partial". That might make things simpler.
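
Roughly, the "command instead of rename()" variant could look like the sketch below. This is only an illustration: finish_segment() and build_command() are invented names, and the %p placeholder is borrowed from archive_command as an assumption, not something the hack necessarily supports. The path is pasted into the shell command without quoting, which is only acceptable for plain WAL file names.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define PARTIAL_SUFFIX ".partial"

/*
 * Copy tmpl into out, replacing each "%p" with path.
 * Returns 0 on success, -1 if the result does not fit in outlen bytes.
 */
static int
build_command(const char *tmpl, const char *path, char *out, size_t outlen)
{
    size_t      used = 0;
    size_t      pathlen = strlen(path);
    const char *p;

    if (outlen == 0)
        return -1;

    for (p = tmpl; *p != '\0'; p++)
    {
        if (p[0] == '%' && p[1] == 'p')
        {
            if (used + pathlen >= outlen)
                return -1;
            memcpy(out + used, path, pathlen);
            used += pathlen;
            p++;                /* skip the 'p' as well */
        }
        else
        {
            if (used + 1 >= outlen)
                return -1;
            out[used++] = *p;
        }
    }
    out[used] = '\0';
    return 0;
}

/*
 * Called when a segment is complete. Without a command, strip ".partial"
 * with rename(). With a command, skip the rename and run the command,
 * which is then responsible for moving the file. Returns 0 or -1.
 */
static int
finish_segment(const char *partial_path, const char *command_tmpl)
{
    char        buf[1024];
    size_t      len = strlen(partial_path);
    size_t      suffixlen = strlen(PARTIAL_SUFFIX);

    if (command_tmpl != NULL)
    {
        if (build_command(command_tmpl, partial_path, buf, sizeof(buf)) != 0)
            return -1;
        return (system(buf) == 0) ? 0 : -1;    /* blocks until done */
    }

    /* No command given: the path must end in ".partial". */
    if (len <= suffixlen || len - suffixlen >= sizeof(buf) ||
        strcmp(partial_path + len - suffixlen, PARTIAL_SUFFIX) != 0)
        return -1;

    memcpy(buf, partial_path, len - suffixlen);
    buf[len - suffixlen] = '\0';
    return (rename(partial_path, buf) == 0) ? 0 : -1;
}
```

With that split, a failed command simply leaves the .partial file in place, so any retry logic lives entirely in the external command.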

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


