Stuart Bishop <stuart@stuartbishop.net> writes:
> I've tracked down some crashes we are having and produced a test case using
> the nasty data. The sample data used to trigger the crash is 6MB in size. It
> doesn't crash immediately, instead chewing up CPU for several minutes before
> crashing.
> http://launchpadlibrarian.net/9501485/crashme.sql
This example crashes CVS HEAD as well. It appears that pg_qsort is
going into infinite recursion, which surely ought to be impossible ...
unless the comparison function is giving self-inconsistent results.
Which it looks like it is: compareWORD() is written in such a way that
it will never return zero, which cannot be right. Given two inputs
that are in fact equal, the result will depend on the order in which
they are presented, which is sufficient to confuse any sort algorithm.
And sure enough, that's what is being compared:
(gdb) f 0
#0  0x00000000005d9c37 in compareWORD (a=0x2aaaaeec2048, b=0x2aaaaeec69c8)
    at to_tsany.c:37
37              int res = strncmp(
(gdb) p *(ParsedWord *) a
$1 = {len = 1, nvariant = 0, pos = {pos = 16383, apos = 0x3fff},
  word = 0x2aaab547da78 "0", alen = 0}
(gdb) p *(ParsedWord *) b
$2 = {len = 1, nvariant = 0, pos = {pos = 16383, apos = 0x3fff},
  word = 0x2aaab988c710 "0", alen = 0}
(gdb)
It may be that this is a "not supposed to happen" case because there
shouldn't be any equal items in the sort input; in which case there
is some other bug involved too. But I would say that compareWORD
is broken nonetheless.
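To illustrate the failure mode, here is a minimal sketch, not the actual
to_tsany.c code. The Word struct and the names cmp_never_zero,
cmp_consistent and is_antisymmetric are hypothetical, made up for this
example; the struct carries only the fields needed to show the problem.
The first comparator breaks every tie with "? 1 : -1" and so can never
return zero; the second returns zero when all keys match.

```c
#include <string.h>

/* Hypothetical stand-in for ParsedWord; only the fields used below. */
typedef struct
{
	int			len;
	const char *word;
	int			pos;
} Word;

/*
 * Inconsistent pattern (assumed for illustration): every tie is broken
 * with "? 1 : -1", so two equal inputs compare as -1 in both orders.
 */
static int
cmp_never_zero(const void *a, const void *b)
{
	const Word *wa = (const Word *) a;
	const Word *wb = (const Word *) b;

	if (wa->len == wb->len)
	{
		int			res = strncmp(wa->word, wb->word, (size_t) wa->len);

		if (res == 0)
			return (wa->pos > wb->pos) ? 1 : -1;
		return res;
	}
	return (wa->len > wb->len) ? 1 : -1;
}

/* Consistent variant: returns 0 when len, word and pos are all equal. */
static int
cmp_consistent(const void *a, const void *b)
{
	const Word *wa = (const Word *) a;
	const Word *wb = (const Word *) b;
	int			res;

	if (wa->len != wb->len)
		return (wa->len > wb->len) ? 1 : -1;
	res = strncmp(wa->word, wb->word, (size_t) wa->len);
	if (res != 0)
		return res;
	if (wa->pos != wb->pos)
		return (wa->pos > wb->pos) ? 1 : -1;
	return 0;
}

/* Returns 1 if cmp(a,b) and cmp(b,a) have opposite signs or are both 0. */
static int
is_antisymmetric(int (*cmp) (const void *, const void *),
				 const void *a, const void *b)
{
	int			ab = cmp(a, b);
	int			ba = cmp(b, a);

	return ((ab > 0) - (ab < 0)) == -((ba > 0) - (ba < 0));
}
```

With two entries like the ones in the gdb output (len 1, word "0",
pos 16383), the first comparator claims a < b and b < a at the same
time, while the second reports them as equal.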
Teodor, Oleg, your thoughts?
regards, tom lane