Thread: 9.2 streaming replication issue and solution strategy
Hello,

We are running 9.2 w/ streaming replication.

The slave is used for heavy tsearch-based data mining.

Apparently, depending on how busy the master is, the slave queries fail with varying frequency with the following message:

ERROR: canceling statement due to conflict with recovery
DETAIL: User query might have needed to see row versions that must be removed.

Initially we resolved the problem by introducing a retry (wait 1 second, then try again), which works in some cases, but fails in others because too many retries are necessary (10+).

After that we introduced a second slave and told the retry logic to switch slaves (try here; if that fails, wait 1 second and try there).

Which works almost every time.

So apparently the conflicts are not synchronized across slaves (which makes sense, as streaming replication is asynchronous).

Has anybody else experienced such an issue, and are there different approaches to it?

Thank you,
T.E.L.
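The retry-and-failover strategy described above can be sketched in a few lines. This is a minimal illustration, not the poster's actual code: `RecoveryConflict` stands in for the recovery-conflict error PostgreSQL raises on a hot standby, and each entry in `queries` is assumed to be a callable that runs the statement against one slave.

```python
import time

class RecoveryConflict(Exception):
    """Stand-in for a hot-standby recovery-conflict error
    ("canceling statement due to conflict with recovery")."""

def run_with_failover(queries, max_rounds=10, wait_seconds=1.0, sleep=time.sleep):
    """Try each slave in turn; on a conflict, move to the next slave.
    If every slave conflicts, wait and start another round."""
    for _ in range(max_rounds):
        for query in queries:
            try:
                return query()
            except RecoveryConflict:
                continue  # conflict on this slave: try the next one
        sleep(wait_seconds)  # all slaves conflicted: back off, then retry
    raise RecoveryConflict("query failed on all slaves after %d rounds" % max_rounds)
```

In practice each callable would wrap a cursor.execute() on a connection to one slave, and the exception handler would match on the driver's conflict error rather than a custom class.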
On 11/17/2012 06:08 AM, T. E. Lawrence wrote:
> Hello,
>
> We are running 9.2 w/ streaming replication.
>
> The slave is used for heavy tsearch-based data mining.
>
> Apparently, depending on how busy the master is, the slave queries fail with varying frequency with the following message:
>
> ERROR: canceling statement due to conflict with recovery
> DETAIL: User query might have needed to see row versions that must be removed.
>
> Initially we resolved the problem by introducing a retry (wait 1 second, then try again), which works in some cases, but fails in others because too many retries are necessary (10+).
>
> After that we introduced a second slave and told the retry logic to switch slaves (try here; if that fails, wait 1 second and try there).
>
> Which works almost every time.
>
> So apparently the conflicts are not synchronized across slaves (which makes sense, as streaming replication is asynchronous).
>
> Has anybody else experienced such an issue, and are there different approaches to it?

Have you looked at the below?:

http://www.postgresql.org/docs/9.2/interactive/hot-standby.html#HOT-STANDBY-CONFLICT

25.5.2. Handling Query Conflicts

> Thank you,
> T.E.L.

--
Adrian Klaver
adrian.klaver@gmail.com
> Have you looked at the below?:
>
> http://www.postgresql.org/docs/9.2/interactive/hot-standby.html#HOT-STANDBY-CONFLICT
>
> 25.5.2. Handling Query Conflicts

Yes, thank you!

I am hoping to hear more from people who are running 9.2 systems w/ between 100m and 1b records, w/ streaming replication and heavy data mining on the slaves (5-50m records read per hour by multiple parallel processes), while from time to time (2-3 times/week) between 20 and 50m records are inserted/updated within 24 hours.

How do they resolve this situation?

For us, retry + switch slave works quite well right now, without touching the db configuration in this respect yet.

But maybe there are different approaches.
On 11/17/2012 07:33 AM, T. E. Lawrence wrote:
>> Have you looked at the below?:
>>
>> http://www.postgresql.org/docs/9.2/interactive/hot-standby.html#HOT-STANDBY-CONFLICT
>>
>> 25.5.2. Handling Query Conflicts
>
> Yes, thank you!
>
> I am hoping to hear more from people who are running 9.2 systems w/ between 100m and 1b records, w/ streaming replication and heavy data mining on the slaves (5-50m records read per hour by multiple parallel processes), while from time to time (2-3 times/week) between 20 and 50m records are inserted/updated within 24 hours.
>
> How do they resolve this situation?
>
> For us, retry + switch slave works quite well right now, without touching the db configuration in this respect yet.
>
> But maybe there are different approaches.

The conditions you cite above are outside my experience, so I will leave the specific suggestions to others. On a more conceptual level, assuming asynchronous replication, I see the following:

1) In a given database, data changes are like wave fronts flowing across a sea of data.

2) Replication introduces another wave front in the movement of data from parent to child.

3) Querying that wave front in the child becomes a timing issue.

4) There are settings to mitigate that timing issue but not eliminate it. To do so would require more information exchange between the parent and child than takes place currently.

5) If it is essential to work the wave front, then turbulence is to be expected and dealt with. See your solution. I too would be interested in a method that is not some variation of what you do.

6) If working the wave front is not essential, then other strategies come into play. For instance partitioning, where older, more 'settled' data can be segregated and worked on.

--
Adrian Klaver
adrian.klaver@gmail.com
Adrian Klaver wrote:
> I am hoping to hear more from people who have running 9.2 systems
> w/ between 100m and 1b records, w/ streaming replication and heavy
> data mining on the slaves (5-50m records read per hour by multiple
> parallel processes), while from time to time (2-3 times/week)
> between 20 and 50m records are inserted/updated within 24 hours.

I've run replication on that general scale. IMV, when you are using PostgreSQL hot standby and streaming replication you need to decide whether a particular replica is primarily for recovery purposes, in which case you can't run queries which take very long without getting canceled for this reason, or primarily for reporting, in which case long-running queries can finish, but the data in the database may get relatively stale while they run.

If you have multiple replicas, you probably want to configure them differently in this regard.

-Kevin
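Kevin's distinction maps onto the hot-standby settings in postgresql.conf. A sketch of the two profiles (the specific values here are illustrative, not recommendations):

```
# Replica kept primarily for recovery: cancel conflicting queries
# quickly so WAL replay never falls far behind the master.
max_standby_streaming_delay = 30s
max_standby_archive_delay = 30s
hot_standby_feedback = off

# Replica kept primarily for reporting: let long-running queries
# finish, accepting that replay (and thus the data) lags meanwhile.
max_standby_streaming_delay = -1   # never cancel queries in favor of replay
max_standby_archive_delay = -1
hot_standby_feedback = on          # ask the master to defer vacuum cleanup
                                   # of rows the standby may still need
```

Note that hot_standby_feedback trades standby query safety for some extra bloat on the master, since vacuum holds back cleanup while standby queries run.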