Thread: Why does slony use a cursor? Anyone know?
Hey everyone,

Frankly, I'm shocked at what I just found.

We did a delete last night of a few million rows, and came back this morning to find that slony is 9 hours behind. After some investigation, it became apparent that slony opens up a cursor and orders it by the log_actionseq column. Then it fetches 500 rows, and closes the cursor. So it's fetching several million rows into a cursor, to fetch 500, and then throw the rest away.

That is quite possibly the least efficient manner I could think of to build a sync system, so maybe someone knows why they did it that way?

At least with a regular query, it could sort by the column it wanted, and put a nifty index on it for those 500-row chunks it's grabbing. I must be missing something...

Yeah, I know... millions of rows + slony = bad. But it doesn't have to be *this* bad. Even a few hundred thousand rows would be terrible with this design.

I plan on asking the slony guys too, but I figured someone on this list might happen to know. Looks like I might have to look into Bucardo or Londiste.

Thanks!

-- 
Shaun Thomas
OptionsHouse | 141 W. Jackson Blvd. | Suite 500 | Chicago IL, 60604
312-676-8870
sthomas@optionshouse.com

______________________________________________
See http://www.peak6.com/email_disclaimer/ for terms and conditions related to this email
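[The open-cursor-then-fetch-500 pattern described above can be sketched with SQLite as a stand-in. This is an illustrative analogy, not Slony's actual query: the table name, payload column, and 10,000-row size are made up, and only the log_actionseq column name comes from the post.]

```python
import sqlite3

# Toy stand-in for the replication log table described in the post.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE log_table (log_actionseq INTEGER, payload TEXT)")
conn.executemany(
    "INSERT INTO log_table VALUES (?, ?)",
    ((i, "row %d" % i) for i in range(10000)),
)

# The pattern described: order the whole backlog, take 500 rows, close.
# With no index on log_actionseq, the ORDER BY must sort the entire
# result set before the first row can be returned.
cur = conn.execute("SELECT * FROM log_table ORDER BY log_actionseq")
batch = cur.fetchmany(500)   # only 500 rows are ever consumed...
cur.close()                  # ...and the rest of the sorted set is discarded

print(len(batch), batch[0][0], batch[-1][0])  # → 500 0 499
```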
On Tue, Mar 05, 2013 at 08:51:11AM -0600, Shaun Thomas wrote:
> Hey everyone,
>
> Frankly, I'm shocked at what I just found.
>
> We did a delete last night of a few million rows, and come back this
> morning to find that slony is 9-hours behind. After some
> investigation, it became apparent that slony opens up a cursor and
> orders it by the log_actionseq column. Then it fetches 500 rows, and
> closes the cursor. So it's fetching several million rows into a
> cursor, to fetch 500, and then throw the rest away.

I once had a similar problem and it looked like they were missing an index. I asked about it and it turns out that I was running a somewhat old version, and it was fixed in later versions. Check that first.

But ask anyway, I've always found the Slony guys very helpful.

Have a nice day,

-- 
Martijn van Oosterhout <kleptog@svana.org>   http://svana.org/kleptog/
> He who writes carelessly confesses thereby at the very outset that he does
> not attach much importance to his own thoughts.
>   -- Arthur Schopenhauer
> From: Shaun Thomas <sthomas@optionshouse.com>
> To: PostgreSQL General <pgsql-general@postgresql.org>
> Cc:
> Sent: Tuesday, 5 March 2013, 14:51
> Subject: [GENERAL] Why does slony use a cursor? Anyone know?
>
> Hey everyone,
>
> Frankly, I'm shocked at what I just found.
>
> We did a delete last night of a few million rows, and come back this morning to
> find that slony is 9-hours behind. After some investigation, it became apparent
> that slony opens up a cursor and orders it by the log_actionseq column. Then it
> fetches 500 rows, and closes the cursor. So it's fetching several million
> rows into a cursor, to fetch 500, and then throw the rest away.
>
> That is quite possibly the least efficient manner I could think of to build a
> sync system, so maybe someone knows why they did it that way?
>
> At least with a regular query, it could sort by the column it wanted, and put a
> nifty index on it for those 500-row chunks it's grabbing. I must be missing
> something...

What version of slony are you on? The specifics of what you mention don't sound quite right, but it sounds very much like bug 167, which was fixed in 2.1.2 if I remember correctly.

Glyn
On 03/06/2013 04:49 AM, Glyn Astill wrote:
> What version of slony are you on? The specifics of what you mention
> don't sound quite right, but it sounds very much like bug 167 which
> was fixed in 2.1.2 if I remember correctly.

We're on 2.1.2. Presumably, anyway. I didn't encounter the problem in stage when I set up a testbed. But it also might not be related. The problem, as far as I can tell from the logs, is that it was closing the cursor pretty much as soon as it got the results. 75 seconds to set up a cursor of that size and then an hour to sync all the data isn't a problem. 75 seconds for every 500 rows *is*.

The stage test I did didn't do that when I deleted 20M rows from a 50M row table, but I also only set it up with a single replication set. My next test will be to test with two or three replication sets that all get big deletes like that. My guess is that it can't adequately swap between them on SYNC events, so it has to rebuild the cursor every time.

Either way, we're likely to be switching to an ETL system because we need to start scaling horizontally soon. Unless I want to set up a bunch of partition targets, we'll pretty much have to drop Slony then. I just want to keep it working until then. :)

Thanks for the info!

-- 
Shaun Thomas
On 6 March 2013 14:35, Shaun Thomas <sthomas@optionshouse.com> wrote:
> On 03/06/2013 04:49 AM, Glyn Astill wrote:
>
>> What version of slony are you on? The specifics of what you mention
>> don't sound quite right, but it sounds very much like bug 167 which
>> was fixed in 2.1.2 if I remember correctly.
>
> We're on 2.1.2. Presumably, anyway. I didn't encounter the problem in stage
> when I set up a testbed. But it also might not be related. The problem I can
> tell from the logs, is that it was closing the cursor pretty much right as
> soon as it got the results. 75 seconds to set up a cursor of that size and
> then an hour to sync all the data isn't a problem. 75 seconds for every 500
> rows *is*.
>
> The stage test I did didn't do that when I deleted 20M rows from a 50M row
> table, but I also only set it up with a single replication set. My next test
> will be to test with two or three replication sets that all get big deletes
> like that. My guess is that it can't adequately swap between them on SYNC
> events, so it has to rebuild the cursor every time.
>
> Either way, we're likely to be switching to an ETL system because we need to
> start scaling horizontally soon. Unless I want to set up a bunch of
> partition targets, we'll pretty much have to drop Slony then. I just want to
> keep it working until then. :)

A cursor can make use of indexes for sorting, so an index on sl_log_1/2(log_actionseq) may help.

Regards,
Dean
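[Dean's point can be demonstrated with SQLite as a rough stand-in. Its planner output differs from PostgreSQL's, and the sl_log_1/log_actionseq names are just taken from the thread, but the principle is the same: without an index the ORDER BY needs a sort step before the first row can come back, while with one the engine can walk the index in order.]

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sl_log_1 (log_actionseq INTEGER, log_cmddata TEXT)")

def order_by_plan(conn):
    # The human-readable plan detail is the last column of each
    # EXPLAIN QUERY PLAN row.
    rows = conn.execute(
        "EXPLAIN QUERY PLAN SELECT * FROM sl_log_1 ORDER BY log_actionseq"
    ).fetchall()
    return " | ".join(r[-1] for r in rows)

before = order_by_plan(conn)   # no index yet: a sort step is required

conn.execute("CREATE INDEX sl_log_1_idx ON sl_log_1 (log_actionseq)")
after = order_by_plan(conn)    # index present: it supplies the ordering

print(before)   # mentions a temp b-tree used for the ORDER BY
print(after)    # mentions a scan using the new index
```

The "USE TEMP B-TREE FOR ORDER BY" step in the first plan is SQLite's analogue of a full sort before any row is returned; the second plan avoids it entirely.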
> From: Shaun Thomas <sthomas@optionshouse.com>
> To: Glyn Astill <glynastill@yahoo.co.uk>
> Cc: PostgreSQL General <pgsql-general@postgresql.org>
> Sent: Wednesday, 6 March 2013, 14:35
> Subject: Re: [GENERAL] Why does slony use a cursor? Anyone know?
>
> On 03/06/2013 04:49 AM, Glyn Astill wrote:
>
>> What version of slony are you on? The specifics of what you mention
>> don't sound quite right, but it sounds very much like bug 167 which
>> was fixed in 2.1.2 if I remember correctly.
>
> We're on 2.1.2. Presumably, anyway. I didn't encounter the problem in
> stage when I set up a testbed. But it also might not be related. The problem I
> can tell from the logs, is that it was closing the cursor pretty much right as
> soon as it got the results. 75 seconds to set up a cursor of that size and then
> an hour to sync all the data isn't a problem. 75 seconds for every 500 rows
> *is*.
>
> The stage test I did didn't do that when I deleted 20M rows from a 50M row
> table, but I also only set it up with a single replication set. My next test
> will be to test with two or three replication sets that all get big deletes like
> that. My guess is that it can't adequately swap between them on SYNC events,
> so it has to rebuild the cursor every time.
>
Yeah, you'd expect the reason for using the cursor would be to pull those 500 lines into memory, process them, and then get the next 500, etc. I've not seen any such lags on our systems, though that doesn't mean it's not happening with much milder symptoms.
You say it happened on your production setup but not when you tried to reproduce it in your test environment, so is there anything useful in the slony logs to suggest things were not quite right at the time? I'm guessing your slons were running and generating syncs.
I'd definitely be asking on the slony lists about this, either something isn't right with your setup or it's something they can resolve.
On Tue, Mar 5, 2013 at 3:51 PM, Shaun Thomas <sthomas@optionshouse.com> wrote:
> Hey everyone,
>
> Frankly, I'm shocked at what I just found.
>
> We did a delete last night of a few million rows, and come back this morning to find that slony is 9-hours behind. After some investigation, it became apparent that slony opens up a cursor and orders it by the log_actionseq column. Then it fetches 500 rows, and closes the cursor. So it's fetching several million rows into a cursor, to fetch 500, and then throw the rest away.
Why do you assume that opening a cursor with ORDER BY is equivalent to fetching all rows? It is not.
Just curious what made you think so,
thanks
On 03/07/2013 07:40 AM, Filip Rembiałkowski wrote:
> Why do you assume that opening a cursor with ORDER BY is equivalent
> to fetching all rows? It is not.

It is when 90% of a table's 65M rows are part of the result set. Calculating the cursor puts the retrieved rows in a temporary space until the cursor is freed. Calculating that many rows with that order by, on our particular system, took about 75 seconds. The problem wasn't really that it was fetching the rows, but that it was closing the cursor *right* afterwards. Like this:

2013-03-04 07:46:06 CST DEBUG1 remoteHelperThread_1_1: 78.010 seconds until close cursor

We saw that in the slave log about every 80 seconds. When I set up slony on our staging system, it only closed the cursor once all 20M rows were fetched and deleted from the subscriber node. Clearly whatever was happening in our production environment wasn't normal. But at the time, it *looked* like slony was forcing it to calculate and fill the cursor, getting the top 500 rows, and closing it right away on purpose. Turns out, that was abnormal. :)

So, I'm not shocked anymore. A one-time cursor calculation is fine. Now I'm confused as to what broke it or why it was acting that way when that clearly isn't the design goal. Ah well.

-- 
Shaun Thomas
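[The "temporary space" effect described above — paying the whole sort cost before the first row comes back — can be reproduced in miniature with SQLite. This is an analogy only: the table is synthetic, and actual Slony/PostgreSQL behavior depends on the chosen plan.]

```python
import sqlite3
import time

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE big_log (log_actionseq INTEGER)")
conn.executemany(
    "INSERT INTO big_log VALUES (?)", ((i,) for i in range(500000))
)

# Without an index, the entire result set is sorted before row 1 arrives.
t0 = time.perf_counter()
cur = conn.execute("SELECT * FROM big_log ORDER BY log_actionseq")
first_sorted = cur.fetchone()
t_no_index = time.perf_counter() - t0
cur.close()

# With an index, the first row comes straight off an ordered index scan.
conn.execute("CREATE INDEX big_log_idx ON big_log (log_actionseq)")
t0 = time.perf_counter()
cur = conn.execute("SELECT * FROM big_log ORDER BY log_actionseq")
first_indexed = cur.fetchone()
t_with_index = time.perf_counter() - t0
cur.close()

print("first row, no index:   %.4fs" % t_no_index)
print("first row, with index: %.4fs" % t_with_index)
```

On this toy table the gap is only milliseconds versus microseconds, but the same up-front sort is what turns into the ~75-second cursor setup described above at tens of millions of rows.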
Shaun Thomas <sthomas@optionshouse.com> wrote:
> It is when 90% of a table's 65M rows are part of the result set.
> Calculating the cursor puts the retrieved rows in a temporary space
> until the cursor is freed. Calculating that many rows with that order
> by, on our particular system took about 75 seconds. The problem wasn't
> really that it was fetching the rows, but that it was closing the cursor
> *right* afterwards. Like this:
>
> 2013-03-04 07:46:06 CST DEBUG1 remoteHelperThread_1_1: 78.010 seconds
> until close cursor
>
> We saw that in the slave log about every 80 seconds. When I set up slony
> on our staging system, it only closed the cursor once all 20M rows were
> fetched and deleted from the subscriber node. Clearly whatever was
> happening in our production environment wasn't normal. But at the time,
> it *looked* like slony was forcing it to calculate and fill the cursor,
> getting the top 500 rows, and closing it right away on purpose. Turns
> out, that was abnormal. :)
>
> So, I'm not shocked anymore. A one time cursor calculation is fine. Now
> I'm confused as to what broke it or why it was acting that way when that
> clearly isn't the design goal.

Exactly what version was this? Just about any queuing system (and this definitely includes Slony and some JMS implementations) was vulnerable to an autovacuum bug fixed with this patch from Jan Wieck:

http://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=b19e4250b45e91c9cbdd18d35ea6391ab5961c8d

This fix was backpatched as far as 9.0 and is present in the latest minor releases, but not earlier ones. It can cause symptoms such as you describe.

-- 
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 03/07/2013 08:23 AM, Kevin Grittner wrote:
> Exactly what version was this? Just about any queuing system (and
> this definitely includes Slony and some JMS implementations) were
> vulnerable to an autovacuum bug fixed with this patch from Jan
> Wieck:

Hmm. It's a 9.1.7 release, so if the latest is 9.1.8, that might be it. It would explain why my contrived scenario in stage couldn't replicate it, too. I'm going to see if I can trigger this behavior in stage by creating a bunch of dead tuples in the slony log table while it's working.

Thanks, Kevin!

-- 
Shaun Thomas