Thanks for your response.
None of the data rows are wide (as far as I can remember). We don't have
any blob data, and any text fields only contain several hundred bytes at
most (and even those would be rare).
Just stopping and starting the slon process on the slave node doesn't
seem to help much. Stopping postgres on the slave itself also seems to
be required.
I'm wondering if this requirement is due to the continued running of the
slon process on the master.
Does it make sense that shutting down the slave postgres db is
necessary? Or would stopping and restarting ALL slon processes on all
nodes mean that I wouldn't have to stop and restart the slave postgres DB?
Thanks
John
Ian Burrell wrote:
> On 12/22/05, John Sidney-Woollett <johnsw@wardbrook.com> wrote:
>
>>In trying to investigate a possible memory issue that affects only one
>>of our servers, I have been logging the process list for postgres
>>related items 4 times a day for the past few days.
>>
>>This server uses postgres 7.4.6 + slon 1.1.0 on Debian i686 (Linux
>>server2 2.6.8.1-4-686-smp) and is a slon slave in a two server
>>replicated cluster. Our master DB (similar setup) does not exhibit this
>>problem at all - only the subscriber node...
>>
>>The load average starts to go mental once the machine has to start
>>swapping (i.e. starts running out of physical RAM). The solution so far is
>>to stop and restart both slon and postgres and things return to normal
>>for another 2 weeks.
>>
>>I know other people have reported similar things but there doesn't seem
>>to be an explanation or solution (other than stopping and starting the
>>two processes).
>>
>>Can anyone suggest what else to look at on the server to see what might
>>be going on?
>>
>>Appreciate any help or advice anyone can offer. I'm not a C programmer
>>nor a unix sysadmin, so any advice needs to be simple to understand.
>>
>
>
>
> The memory usage growth is caused by the buffers in the slave slon
> daemon growing when long rows go through them. The buffers never
> shrink while the slon daemon is running. How big are the largest rows
> that slon replicates?
>
> One suggestion I have seen is to recompile slon to use fewer buffers.
> Another is to set a ulimit for memory size to automatically kill the
> slon daemons when they get too big. The watchdog will then restart
> them. Alternatively, your strategy of restarting the slon daemons
> each week will work (you don't need to restart postgres).
>
> I came up with a patch which shrinks the buffers when they go above a
> certain size. This doesn't fix the problem of lots of big rows
> happening at once but it fixes the gradual growth.
>
> - Ian
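
To make the buffer behaviour Ian describes a bit more concrete, here is a
rough toy model in Python. It is only a sketch: it is not slon's code, and
the names (RowBuffer, shrink_above, total_capacity) and the sizes used are
made up for illustration. It models a small pool of reusable buffers that
grow to fit the largest row they have seen, with an optional threshold above
which a buffer is reset to its initial size after use.

```python
class RowBuffer:
    """Toy model of a reusable row buffer (hypothetical, not slon's structure)."""

    def __init__(self, shrink_above=None, initial=256):
        self.initial = initial
        # shrink_above=None models "buffers never shrink".
        self.shrink_above = shrink_above
        self.capacity = initial

    def load(self, row: bytes) -> None:
        # Grow to fit the row, as a realloc-style buffer would.
        if len(row) > self.capacity:
            self.capacity = len(row)

    def release(self) -> None:
        # Called once the row has been applied. If a threshold is set and the
        # buffer has grown past it, drop back to the initial size.
        if self.shrink_above is not None and self.capacity > self.shrink_above:
            self.capacity = self.initial


def total_capacity(buffers, rows):
    """Push rows through the buffers round-robin and return the memory held."""
    for i, row in enumerate(rows):
        buf = buffers[i % len(buffers)]
        buf.load(row)
        buf.release()
    return sum(b.capacity for b in buffers)


# Mostly small rows, plus one rare wide row (sizes are assumptions).
rows = [b"x" * 300] * 7 + [b"x" * 100_000]

never_shrink = total_capacity([RowBuffer() for _ in range(4)], rows)
with_shrink = total_capacity([RowBuffer(shrink_above=8192) for _ in range(4)], rows)
```

In this model a single wide row leaves the never-shrinking pool holding about
100 KB indefinitely, while the thresholded pool falls back to roughly 1 KB.
As Ian notes, a threshold like this would not help if many big rows arrive at
once, since every buffer would be large at the same moment.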