Nick Rupley wrote:
> Hey guys, so we applied that patch, and it *appears* to have fixed the
> issue! Through our application, we basically have it to the point where we
> are able to reliably reproduce the issue within 5 minutes or so. However, we
> applied the patch, ran the same tests, and it no longer happened at all,
> even after an hour of testing.
>
> We attempted to reproduce the issue in a standalone way, doing all the same
> inserts/updates in all the same transactions, but unfortunately we haven't
> yet been able to reproduce it there. I'm thinking it's likely a very
> timing-sensitive issue, and it just happens to manifest for our application
> because of race conditions, etc.

Yes, it's extremely timing-sensitive.

> Not sure if this is relevant or not, but it looks like the duplicate rows
> continue to be inserted here and there on our production box (to which we
> haven't yet applied the hotfix). As I stated before, that production box did
> have some server crashes before, but actually it hasn't had any recently
> (in the past week), and yet the duplicate rows continue to happen.

This bug is not dependent on a crash; the corruption occurs in the live
data. Only the previous bug mentioned by Tom manifested itself during
crash recovery.

> At one point we did identify and reindex the tables that were needed,
> which worked great. But then *after* that, new duplicate rows cropped
> up, even without the server having crashed. Does that still make sense
> within the context of this bug?

Yes. Upgrading to a fixed binary is, of course, strongly recommended.

> If we're able to create that self-contained test case (we're trying) we'll
> be sure to let you know.

Be sure to let us know if you find other bugs, too!

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services