> The good news: after much pain, I was able to develop a semi-repeatable
> test for that duplicate-tuple problem. (It's a script of about 6300 SQL
> commands for five concurrent backends, which takes about 10 minutes to
> run and produces a corrupted table about one run out of two...) And
> this test indicates that the current CVS sources don't have the bug.
> So, rather than expending a lot of effort trying to figure out just
> what the bug is in 6.3.2, we are going to cross our fingers and put our
> production application on 6.4beta.
That's good news.
> The bad news: this same script exposes a different bug in the current
> sources (and perhaps older releases too). *Very* rarely, like less
> than one run out of ten, the test driver gets wedged or fails with an
> "out of memory" error. I just traced this to its cause, and the cause
> is that a SELECT reply coming from the backend is corrupt. In fact,
> what I see in libpq's input buffer is that a "NOTIFY" message has been
> inserted into the middle of the tuple data :-(. So the interlock that
> supposedly prevents Async_NotifyFrontEnd() from being invoked during
> another command does not work reliably.
>
> I will look into this, but I could use advice from anyone who
> understands how that stuff is supposed to work.
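For anyone following the thread, here is a rough sketch of the kind of interlock described above: a notify that arrives while a reply is being sent is deferred until the command finishes. It is an illustration only, not the backend's actual code. Every name in it (in_command, notify_pending, on_notify_signal, send_to_frontend, run_select_with_interrupt) is made up for the example, and the "signal" is simulated with an ordinary function call.

```c
#include <signal.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical names throughout; this is NOT the PostgreSQL source. */

static char out_buf[256];                       /* stands in for the socket to the frontend */
static volatile sig_atomic_t in_command = 0;    /* nonzero while a reply is being sent */
static volatile sig_atomic_t notify_pending = 0;

/* Append one protocol message to the simulated output stream. */
static void send_to_frontend(const char *msg)
{
    strncat(out_buf, msg, sizeof(out_buf) - strlen(out_buf) - 1);
}

/* Called when another backend signals that a notify has arrived. */
static void on_notify_signal(void)
{
    if (in_command)
        notify_pending = 1;             /* defer: leave the output stream alone */
    else
        send_to_frontend("[NOTIFY]");   /* idle, so it is safe to send now */
}

static void begin_command(void)
{
    in_command = 1;
}

/* Leave the command and flush any notify that was deferred meanwhile. */
static void end_command(void)
{
    in_command = 0;
    if (notify_pending) {
        notify_pending = 0;
        send_to_frontend("[NOTIFY]");
    }
}

/* Simulate a SELECT reply during which a notify signal arrives. */
static const char *run_select_with_interrupt(void)
{
    out_buf[0] = '\0';
    begin_command();
    send_to_frontend("[TUPLE 1]");
    on_notify_signal();                 /* arrives mid-reply */
    send_to_frontend("[TUPLE 2]");
    end_command();
    return out_buf;
}
```

With the flag check in place, run_select_with_interrupt() leaves the notify after the tuple data. The corruption reported above is what the frontend would see if the notify were written while the reply was still in progress.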
Tom, I can't think of anyone who understands it better than you.
However, if you find something in the backend and need help, let me
know.
I will be on vacation from Sunday until Wednesday.
--
Bruce Momjian | http://www.op.net/~candle
maillist@candle.pha.pa.us | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026