Re: BUG #10189: Limit in 9.3.4 no longer works when ordering using a composite multi-type index - Mailing list pgsql-bugs

From Nick Rupley
Subject Re: BUG #10189: Limit in 9.3.4 no longer works when ordering using a composite multi-type index
Msg-id CAMi1eSFUoh+Qd2-U_i=W366w7D5xyH_vAhyKTnHmCZzepGu8Bg@mail.gmail.com
In response to Re: BUG #10189: Limit in 9.3.4 no longer works when ordering using a composite multi-type index  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-bugs
Hey guys, we applied that patch, and it *appears* to have fixed the issue!
With our application, we had reached the point where we could reliably
reproduce the issue within 5 minutes or so. After applying the patch we ran
the same tests, and it no longer happened at all, even after an hour of
testing.

We attempted to reproduce the issue in a standalone way, doing all the same
inserts/updates in all the same transactions, but unfortunately we haven't
yet been able to reproduce it there. I'm thinking it's likely a very
timing-sensitive issue, and it just happens to manifest for our application
because of race conditions, etc.

Not sure if this is relevant, but duplicate rows continue to be inserted
here and there on our production box (to which we haven't yet applied the
hotfix). As I stated before, that production box did have some server
crashes earlier, but it hasn't had any in the past week, and yet the
duplicate rows continue to appear. At one point we did identify and reindex
the affected tables, which worked great. But *after* that, new duplicate
rows cropped up, even without the server having crashed. Does that still
make sense within the context of this bug?

If we're able to create that self-contained test case (we're trying), we'll
be sure to let you know.

-Nick


On Fri, May 2, 2014 at 12:08 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

> Andres Freund <andres@2ndquadrant.com> writes:
> > On 2014-05-02 14:23:50 -0400, Tom Lane wrote:
> >> There's been one post-9.3.4 fix in this same general area:
> >>
> >> http://git.postgresql.org/gitweb/?p=postgresql.git&a=commitdiff&h=c0bd128c8
> >> but according to the commit message, at least, that bug would not have led
> >> to the symptom you're seeing, namely rows disappearing from indexes while
> >> they're still visible to seqscans.
>
> > Hm. With a bit of bad luck it might. The bug essentially has the
> > consequence that two updates might succeed for the same row. Consider
> > what happens if the row gets hot updated and then a second hot update,
> > due to the bug, also succeeds. The second update will change t_ctid of
> > the old tuple to point to the second version. If the transaction that
> > did the second update then aborts, an index search starting at the root
> > of the hot chain won't find any surviving tuple. But a seqscan will. :(.
>
> Hm, good point.  Nick, if you're up for applying a hotfix you could try
> grabbing the aforesaid patch and seeing if it makes things better.
> If not, we're probably gonna need that test case to figure out where
> things are still going wrong.
>
>                         regards, tom lane
>
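
For anyone following along, the scenario Andres describes in the quoted
text can be sketched as a toy Python model. This is only an illustration,
not how PostgreSQL implements it: the HeapTuple class, the `visible` flag,
and the `next_tid` field are hypothetical stand-ins (real visibility is
derived from xmin/xmax and transaction status, and the real link is
t_ctid), and hint bits, multixacts, and locking are not modeled at all.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class HeapTuple:
    value: str
    visible: bool                   # assumption: stand-in for the xmin/xmax visibility check
    next_tid: Optional[int] = None  # assumption: stand-in for t_ctid (next version in chain)


def index_search(heap: Dict[int, HeapTuple], root_tid: int) -> Optional[str]:
    """Follow the chain from its root, roughly as an index lookup would."""
    tid: Optional[int] = root_tid
    while tid is not None:
        tup = heap[tid]
        if tup.visible:
            return tup.value
        tid = tup.next_tid
    return None


def seq_scan(heap: Dict[int, HeapTuple]) -> List[str]:
    """Visit every tuple on the page, ignoring chain links."""
    return [t.value for t in heap.values() if t.visible]


def build_scenario() -> Dict[int, HeapTuple]:
    heap = {1: HeapTuple("v1", visible=True)}
    # First hot update commits: v1 is superseded by v2.
    heap[2] = HeapTuple("v2", visible=True)
    heap[1].visible = False
    heap[1].next_tid = 2
    # Second update wrongly succeeds on v1 and repoints its link to v3,
    # so v2 is no longer reachable from the chain root.
    heap[3] = HeapTuple("v3", visible=True)
    heap[1].next_tid = 3
    # The second updater aborts: v3 never becomes visible.
    heap[3].visible = False
    return heap


heap = build_scenario()
print(index_search(heap, root_tid=1))  # None
print(seq_scan(heap))                  # ['v2']
```

In this sketch the index-style lookup walks v1 -> v3 and finds nothing,
while the scan over the whole page still sees v2, which matches the
"missing from the index but visible to a seqscan" symptom.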
