Thread: Odd 9.4, 9.3 buildfarm failure on s390x
Hi, It seems Mark started a new buildfarm animal on s390x. It shows a pretty odd failure on 9.3 and 9.4, but *not* on newer animals: https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=lumpsucker&dt=2018-09-26%2020%3A30%3A58 ================== pgsql.build/src/test/regress/regression.diffs =================== *** /home/linux1/build-farm-8-clang/buildroot/REL9_4_STABLE/pgsql.build/src/test/regress/expected/uuid.out Mon Sep 2417:49:30 2018 --- /home/linux1/build-farm-8-clang/buildroot/REL9_4_STABLE/pgsql.build/src/test/regress/results/uuid.out Wed Sep 26 16:31:312018 *************** *** 64,72 **** SELECT guid_field FROM guid1 ORDER BY guid_field DESC; guid_field -------------------------------------- - 3f3e3c3b-3a30-3938-3736-353433a2313e - 22222222-2222-2222-2222-222222222222 11111111-1111-1111-1111-111111111111 (3 rows) -- = operator test --- 64,72 ---- SELECT guid_field FROM guid1 ORDER BY guid_field DESC; guid_field -------------------------------------- 11111111-1111-1111-1111-111111111111 + 22222222-2222-2222-2222-222222222222 + 3f3e3c3b-3a30-3938-3736-353433a2313e (3 rows) -- = operator test ====================================================================== Mark, is there anything odd for specific branches? I don't see anything immediately suspicious in the relevant comparator code... Greetings, Andres Freund
Andres Freund <andres@anarazel.de> writes: > It seems Mark started a new buildfarm animal on s390x. It shows a pretty > odd failure on 9.3 and 9.4, but *not* on newer animals: No, lumpsucker is showing the same failure on 9.5 as well. I suspect that the reason 9.6 and up are OK is that 9.6 is where we introduced the abbreviated-sort-key machinery. IOW, the problem exists in the old-style UUID sort comparator but not the new one. Which is pretty darn odd, because the old-style comparator is just memcmp(). How could that be broken without causing lots more issues? regards, tom lane
On Fri, Sep 28, 2018 at 11:52:15AM -0700, Andres Freund wrote: > Mark, is there anything odd for specific branches? No... I don't have anything in the config that would be applied to specific branches... Regards, Mark -- Mark Wong http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, RemoteDBA, Training & Services
Hi, On 2018-09-28 15:22:23 -0700, Mark Wong wrote: > On Fri, Sep 28, 2018 at 11:52:15AM -0700, Andres Freund wrote: > > Mark, is there anything odd for specific branches? > > No... I don't have anything in the config that would be applied to > specific branches... Could you perhaps do some manual debugging on that machine? Maybe starting with manually running something like: SELECT uuid_cmp('11111111-1111-1111-1111-111111111111'::uuid, '22222222-2222-2222-2222-222222222222'::uuid); SELECT uuid_cmp('11111111-1111-1111-1111-111111111111'::uuid, '11111111-1111-1111-1111-111111111111'::uuid); SELECT uuid_cmp('11111111-1111-1111-1111-111111111111'::uuid, '11111113-1111-1111-1111-111111111111'::uuid); SELECT uuid_cmp('11111111-1111-1111-1111-111111111111'::uuid, '11111110-1111-1111-1111-111111111111'::uuid); on both master and one of the failing branches? Greetings, Andres Freund
Hi Andres, On Fri, Sep 28, 2018 at 03:41:27PM -0700, Andres Freund wrote: > On 2018-09-28 15:22:23 -0700, Mark Wong wrote: > > On Fri, Sep 28, 2018 at 11:52:15AM -0700, Andres Freund wrote: > > > Mark, is there anything odd for specific branches? > > > > No... I don't have anything in the config that would be applied to > > specific branches... > > Could you perhaps do some manual debugging on that machine? > > Maybe starting with manually running something like: > > SELECT uuid_cmp('11111111-1111-1111-1111-111111111111'::uuid, '22222222-2222-2222-2222-222222222222'::uuid); > SELECT uuid_cmp('11111111-1111-1111-1111-111111111111'::uuid, '11111111-1111-1111-1111-111111111111'::uuid); > SELECT uuid_cmp('11111111-1111-1111-1111-111111111111'::uuid, '11111113-1111-1111-1111-111111111111'::uuid); > SELECT uuid_cmp('11111111-1111-1111-1111-111111111111'::uuid, '11111110-1111-1111-1111-111111111111'::uuid); > > on both master and one of the failing branches? I've attached the output for head and the 9.4 stable branch. It appears they are returning the same results. I built them both by: CC=/usr/bin/clang ./configure --enable-cassert --enable-debug \ --enable-nls --with-perl --with-python --with-tcl \ --with-tclconfig=/usr/lib64 --with-gssapi --with-openssl \ --with-ldap --with-libxml --with-libxslt What should I try next? Regards, Mark -- Mark Wong http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, RemoteDBA, Training & Services
Attachment
>>>>> "Mark" == Mark Wong <mark@2ndQuadrant.com> writes: Mark> What should I try next? What is the size of a C "int" on this platform? -- Andrew (irc:RhodiumToad)
On 09/29/2018 01:36 AM, Andrew Gierth wrote: > Mark> What should I try next? > > What is the size of a C "int" on this platform? > 4. cheers andrew -- Andrew Dunstan https://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>>>>> "Andrew" == Andrew Dunstan <andrew.dunstan@2ndquadrant.com> writes: >> What is the size of a C "int" on this platform? Andrew> 4. Hmm. Because int being more than 32 bits is the simplest explanation for this difference. How about the output of this query: with d(a) as (values ('11111111-1111-1111-1111-111111111111'::uuid), ('22222222-2222-2222-2222-222222222222'::uuid), ('3f3e3c3b-3a30-3938-3736-353433a2313e'::uuid)) select d1.a, d2.a, uuid_cmp(d1.a,d2.a) from d d1, d d2 order by d1.a asc, d2.a desc; -- Andrew (irc:RhodiumToad)
Andrew Gierth <andrew@tao11.riddles.org.uk> writes: > Because int being more than 32 bits is the simplest explanation for this > difference. Curious to hear your reasoning behind that statement? I hadn't gotten further than "memcmp is broken" ... and neither of those theories is tenable, because if they were true then a lot more things besides uuid sorting would be falling over. regards, tom lane
>>>>> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes: > Andrew Gierth <andrew@tao11.riddles.org.uk> writes: >> Because int being more than 32 bits is the simplest explanation for >> this difference. Tom> Curious to hear your reasoning behind that statement? I hadn't Tom> gotten further than "memcmp is broken" ... and neither of those Tom> theories is tenable, because if they were true then a lot more Tom> things besides uuid sorting would be falling over. memcmp() returns an int, and guarantees only the sign of the result, so ((int32) memcmp()) may have the wrong value if int is wider than int32. But yeah, it seems unlikely that it would break for uuid but not bytea (or text in collate C). -- Andrew (irc:RhodiumToad)
On Sun, Sep 30, 2018 at 12:38:46AM +0100, Andrew Gierth wrote: > >>>>> "Andrew" == Andrew Dunstan <andrew.dunstan@2ndquadrant.com> writes: > > >> What is the size of a C "int" on this platform? > > Andrew> 4. > > Hmm. > > Because int being more than 32 bits is the simplest explanation for this > difference. > > How about the output of this query: > > with d(a) as (values ('11111111-1111-1111-1111-111111111111'::uuid), > ('22222222-2222-2222-2222-222222222222'::uuid), > ('3f3e3c3b-3a30-3938-3736-353433a2313e'::uuid)) > select d1.a, d2.a, uuid_cmp(d1.a,d2.a) from d d1, d d2 > order by d1.a asc, d2.a desc; That also appears to produce the same results: With 9.4: postgres=# select version(); version ------------------------------------------------------------------------------------------------------------------- PostgreSQL 9.4.19 on s390x-ibm-linux-gnu, compiled by clang version 5.0.1 (tags/RELEASE_501/final 312548), 64-bit (1 row) ... a | a | uuid_cmp --------------------------------------+--------------------------------------+------------- 11111111-1111-1111-1111-111111111111 | 11111111-1111-1111-1111-111111111111 | 0 11111111-1111-1111-1111-111111111111 | 22222222-2222-2222-2222-222222222222 | -2147483648 11111111-1111-1111-1111-111111111111 | 3f3e3c3b-3a30-3938-3736-353433a2313e | -2147483648 22222222-2222-2222-2222-222222222222 | 11111111-1111-1111-1111-111111111111 | 1 22222222-2222-2222-2222-222222222222 | 22222222-2222-2222-2222-222222222222 | 0 22222222-2222-2222-2222-222222222222 | 3f3e3c3b-3a30-3938-3736-353433a2313e | -2147483648 3f3e3c3b-3a30-3938-3736-353433a2313e | 11111111-1111-1111-1111-111111111111 | 1 3f3e3c3b-3a30-3938-3736-353433a2313e | 22222222-2222-2222-2222-222222222222 | 1 3f3e3c3b-3a30-3938-3736-353433a2313e | 3f3e3c3b-3a30-3938-3736-353433a2313e | 0 (9 rows) Then with HEAD: postgres=# select version(); version -------------------------------------------------------------------------------------------------------------------- PostgreSQL 12devel on s390x-ibm-linux-gnu, compiled by clang version 5.0.1 (tags/RELEASE_501/final 312548), 64-bit (1 row) ... a | a | uuid_cmp --------------------------------------+--------------------------------------+------------- 11111111-1111-1111-1111-111111111111 | 11111111-1111-1111-1111-111111111111 | 0 11111111-1111-1111-1111-111111111111 | 22222222-2222-2222-2222-222222222222 | -2147483648 11111111-1111-1111-1111-111111111111 | 3f3e3c3b-3a30-3938-3736-353433a2313e | -2147483648 22222222-2222-2222-2222-222222222222 | 11111111-1111-1111-1111-111111111111 | 1 22222222-2222-2222-2222-222222222222 | 22222222-2222-2222-2222-222222222222 | 0 22222222-2222-2222-2222-222222222222 | 3f3e3c3b-3a30-3938-3736-353433a2313e | -2147483648 3f3e3c3b-3a30-3938-3736-353433a2313e | 11111111-1111-1111-1111-111111111111 | 1 3f3e3c3b-3a30-3938-3736-353433a2313e | 22222222-2222-2222-2222-222222222222 | 1 3f3e3c3b-3a30-3938-3736-353433a2313e | 3f3e3c3b-3a30-3938-3736-353433a2313e | 0 (9 rows) Regards, Mark -- Mark Wong http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, RemoteDBA, Training & Services
Mark Wong <mark@2ndQuadrant.com> writes: > a | a | uuid_cmp > --------------------------------------+--------------------------------------+------------- > 11111111-1111-1111-1111-111111111111 | 11111111-1111-1111-1111-111111111111 | 0 > 11111111-1111-1111-1111-111111111111 | 22222222-2222-2222-2222-222222222222 | -2147483648 > 11111111-1111-1111-1111-111111111111 | 3f3e3c3b-3a30-3938-3736-353433a2313e | -2147483648 > 22222222-2222-2222-2222-222222222222 | 11111111-1111-1111-1111-111111111111 | 1 > 22222222-2222-2222-2222-222222222222 | 22222222-2222-2222-2222-222222222222 | 0 > 22222222-2222-2222-2222-222222222222 | 3f3e3c3b-3a30-3938-3736-353433a2313e | -2147483648 > 3f3e3c3b-3a30-3938-3736-353433a2313e | 11111111-1111-1111-1111-111111111111 | 1 > 3f3e3c3b-3a30-3938-3736-353433a2313e | 22222222-2222-2222-2222-222222222222 | 1 > 3f3e3c3b-3a30-3938-3736-353433a2313e | 3f3e3c3b-3a30-3938-3736-353433a2313e | 0 > (9 rows) Oooh ... apparently, on that platform, memcmp() is willing to produce INT_MIN in some cases. That's not a safe value for a sort comparator to produce --- we explicitly say that somewhere, IIRC. I think we implement DESC by negating the comparator's result, which explains why only the DESC case fails. regards, tom lane
On 2018-10-01 11:58:51 -0400, Tom Lane wrote: > Mark Wong <mark@2ndQuadrant.com> writes: > > a | a | uuid_cmp > > --------------------------------------+--------------------------------------+------------- > > 11111111-1111-1111-1111-111111111111 | 11111111-1111-1111-1111-111111111111 | 0 > > 11111111-1111-1111-1111-111111111111 | 22222222-2222-2222-2222-222222222222 | -2147483648 > > 11111111-1111-1111-1111-111111111111 | 3f3e3c3b-3a30-3938-3736-353433a2313e | -2147483648 > > 22222222-2222-2222-2222-222222222222 | 11111111-1111-1111-1111-111111111111 | 1 > > 22222222-2222-2222-2222-222222222222 | 22222222-2222-2222-2222-222222222222 | 0 > > 22222222-2222-2222-2222-222222222222 | 3f3e3c3b-3a30-3938-3736-353433a2313e | -2147483648 > > 3f3e3c3b-3a30-3938-3736-353433a2313e | 11111111-1111-1111-1111-111111111111 | 1 > > 3f3e3c3b-3a30-3938-3736-353433a2313e | 22222222-2222-2222-2222-222222222222 | 1 > > 3f3e3c3b-3a30-3938-3736-353433a2313e | 3f3e3c3b-3a30-3938-3736-353433a2313e | 0 > > (9 rows) > > > Oooh ... apparently, on that platform, memcmp() is willing to produce > INT_MIN in some cases. That's not a safe value for a sort comparator > to produce --- we explicitly say that somewhere, IIRC. Hm, that'd be pretty painful - memcmp() isn't guaranteed to return anything smaller. And we use memcmp in a fair number of comparators. > I think we implement DESC by negating the comparator's result, which > explains why only the DESC case fails. That makes sense. Greetings, Andres Freund
Andres Freund <andres@anarazel.de> writes: > On 2018-10-01 11:58:51 -0400, Tom Lane wrote: >> Oooh ... apparently, on that platform, memcmp() is willing to produce >> INT_MIN in some cases. That's not a safe value for a sort comparator >> to produce --- we explicitly say that somewhere, IIRC. > Hm, that'd be pretty painful - memcmp() isn't guaranteed to return > anything smaller. And we use memcmp in a fair number of comparators. Yeah. So our choices are (1) Retain the current restriction on what sort comparators can produce. Find all the places where memcmp's result is returned directly, and fix them. (I wonder if strcmp has same issue.) (2) Drop the restriction. This'd require at least changing the DESC correction, and maybe other things. I'm not sure what the odds would be of finding everyplace we need to check. Neither one is sounding very pleasant, or maintainable. regards, tom lane
On 2018-10-01 12:13:57 -0400, Tom Lane wrote: > Andres Freund <andres@anarazel.de> writes: > > On 2018-10-01 11:58:51 -0400, Tom Lane wrote: > >> Oooh ... apparently, on that platform, memcmp() is willing to produce > >> INT_MIN in some cases. That's not a safe value for a sort comparator > >> to produce --- we explicitly say that somewhere, IIRC. > > > Hm, that'd be pretty painful - memcmp() isn't guaranteed to return > > anything smaller. And we use memcmp in a fair number of comparators. > > Yeah. So our choices are > > (1) Retain the current restriction on what sort comparators can > produce. Find all the places where memcmp's result is returned > directly, and fix them. (I wonder if strcmp has same issue.) > > (2) Drop the restriction. This'd require at least changing the > DESC correction, and maybe other things. I'm not sure what the > odds would be of finding everyplace we need to check. > > Neither one is sounding very pleasant, or maintainable. (2) seems more maintainable to me (or perhaps less unmaintainable). It's infrastructure, rather than every datatype + support out there... Greetings, Andres Freund
Andres Freund <andres@anarazel.de> writes: > On 2018-10-01 12:13:57 -0400, Tom Lane wrote: >> Yeah. So our choices are >> >> (1) Retain the current restriction on what sort comparators can >> produce. Find all the places where memcmp's result is returned >> directly, and fix them. (I wonder if strcmp has same issue.) >> >> (2) Drop the restriction. This'd require at least changing the >> DESC correction, and maybe other things. I'm not sure what the >> odds would be of finding everyplace we need to check. >> >> Neither one is sounding very pleasant, or maintainable. > (2) seems more maintainable to me (or perhaps less unmaintainable). It's > infrastructure, rather than every datatype + support out there... I guess we could set up some testing infrastructure: hack int4cmp and/or a couple other popular comparators so that they *always* return INT_MIN, 0, or INT_MAX, and then see what falls over. I'm fairly sure that btree, as well as the sort code proper, has got an issue here. regards, tom lane
On 10/01/2018 11:58 AM, Tom Lane wrote: > Mark Wong <mark@2ndQuadrant.com> writes: >> a | a | uuid_cmp >> --------------------------------------+--------------------------------------+------------- >> 11111111-1111-1111-1111-111111111111 | 11111111-1111-1111-1111-111111111111 | 0 >> 11111111-1111-1111-1111-111111111111 | 22222222-2222-2222-2222-222222222222 | -2147483648 >> 11111111-1111-1111-1111-111111111111 | 3f3e3c3b-3a30-3938-3736-353433a2313e | -2147483648 >> 22222222-2222-2222-2222-222222222222 | 11111111-1111-1111-1111-111111111111 | 1 >> 22222222-2222-2222-2222-222222222222 | 22222222-2222-2222-2222-222222222222 | 0 >> 22222222-2222-2222-2222-222222222222 | 3f3e3c3b-3a30-3938-3736-353433a2313e | -2147483648 >> 3f3e3c3b-3a30-3938-3736-353433a2313e | 11111111-1111-1111-1111-111111111111 | 1 >> 3f3e3c3b-3a30-3938-3736-353433a2313e | 22222222-2222-2222-2222-222222222222 | 1 >> 3f3e3c3b-3a30-3938-3736-353433a2313e | 3f3e3c3b-3a30-3938-3736-353433a2313e | 0 >> (9 rows) > > Oooh ... apparently, on that platform, memcmp() is willing to produce > INT_MIN in some cases. That's not a safe value for a sort comparator > to produce --- we explicitly say that somewhere, IIRC. I think we > implement DESC by negating the comparator's result, which explains > why only the DESC case fails. > > Is there a standard that forbids this, or have we just been lucky up to now? cheers andrew -- Andrew Dunstan https://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Andrew Dunstan <andrew.dunstan@2ndquadrant.com> writes: > On 10/01/2018 11:58 AM, Tom Lane wrote: >> Oooh ... apparently, on that platform, memcmp() is willing to produce >> INT_MIN in some cases. That's not a safe value for a sort comparator >> to produce --- we explicitly say that somewhere, IIRC. I think we >> implement DESC by negating the comparator's result, which explains >> why only the DESC case fails. > Is there a standard that forbids this, or have we just been lucky up to now? We've been lucky; POSIX just says the value is less than, equal to, or greater than zero. In practice, a memcmp that operates byte-at-a-time would not likely return anything outside +-255. But on a big-endian machine you could easily optimize to use word-wide operations to compare 4 bytes at a time, and I suspect that's what's happening here. Or maybe there's just some weird architecture-specific reason that makes it cheap for them to return INT_MIN rather than some other value? regards, tom lane
On Mon, Oct 01, 2018 at 05:11:02PM -0400, Tom Lane wrote: > Andrew Dunstan <andrew.dunstan@2ndquadrant.com> writes: > > On 10/01/2018 11:58 AM, Tom Lane wrote: > >> Oooh ... apparently, on that platform, memcmp() is willing to produce > >> INT_MIN in some cases. That's not a safe value for a sort comparator > >> to produce --- we explicitly say that somewhere, IIRC. I think we > >> implement DESC by negating the comparator's result, which explains > >> why only the DESC case fails. > > > Is there a standard that forbids this, or have we just been lucky up to now? > > We've been lucky; POSIX just says the value is less than, equal to, > or greater than zero. > > In practice, a memcmp that operates byte-at-a-time would not likely > return anything outside +-255. But on a big-endian machine you could > easily optimize to use word-wide operations to compare 4 bytes at a > time, and I suspect that's what's happening here. Or maybe there's > just some weird architecture-specific reason that makes it cheap > for them to return INT_MIN rather than some other value? > as a former S3[79]x assembler programmer, they probably do it in registers or using TRT. All of which could be word wise. > regards, tom lane > -- Larry Rosenman http://www.lerctr.org/~ler Phone: +1 214-642-9640 E-Mail: ler@lerctr.org US Mail: 5708 Sabbia Drive, Round Rock, TX 78665-2106
Attachment
On 10/01/2018 12:50 PM, Tom Lane wrote: > Andres Freund <andres@anarazel.de> writes: >> On 2018-10-01 12:13:57 -0400, Tom Lane wrote: >>> Yeah. So our choices are >>> >>> (1) Retain the current restriction on what sort comparators can >>> produce. Find all the places where memcmp's result is returned >>> directly, and fix them. (I wonder if strcmp has same issue.) >>> >>> (2) Drop the restriction. This'd require at least changing the >>> DESC correction, and maybe other things. I'm not sure what the >>> odds would be of finding everyplace we need to check. >>> >>> Neither one is sounding very pleasant, or maintainable. >> (2) seems more maintainable to me (or perhaps less unmaintainable). It's >> infrastructure, rather than every datatype + support out there... > I guess we could set up some testing infrastructure: hack int4cmp > and/or a couple other popular comparators so that they *always* > return INT_MIN, 0, or INT_MAX, and then see what falls over. > > I'm fairly sure that btree, as well as the sort code proper, > has got an issue here. > > I agree option 2 seems less unmaintainable. (Nice use of litotes there?) cheers andrew -- Andrew Dunstan https://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Tue, Oct 2, 2018 at 10:55 AM Andrew Dunstan <andrew.dunstan@2ndquadrant.com> wrote: > On 10/01/2018 12:50 PM, Tom Lane wrote: > > Andres Freund <andres@anarazel.de> writes: > >> On 2018-10-01 12:13:57 -0400, Tom Lane wrote: > >>> Yeah. So our choices are > >>> > >>> (1) Retain the current restriction on what sort comparators can > >>> produce. Find all the places where memcmp's result is returned > >>> directly, and fix them. (I wonder if strcmp has same issue.) > >>> > >>> (2) Drop the restriction. This'd require at least changing the > >>> DESC correction, and maybe other things. I'm not sure what the > >>> odds would be of finding everyplace we need to check. > >>> > >>> Neither one is sounding very pleasant, or maintainable. > >> (2) seems more maintainable to me (or perhaps less unmaintainable). It's > >> infrastructure, rather than every datatype + support out there... > > I guess we could set up some testing infrastructure: hack int4cmp > > and/or a couple other popular comparators so that they *always* > > return INT_MIN, 0, or INT_MAX, and then see what falls over. > > > > I'm fairly sure that btree, as well as the sort code proper, > > has got an issue here. > > > > > > > I agree option 2 seems less unmaintainable. (Nice use of litotes there?) +1 for option 2. It seems to me that it should ideally be the job of the code that is negating the value to deal with this edge case. I see that the restriction is clearly documented at the top of src/backend/access/nbtree/nbtcompare.c even though it directly returns strncmp() results. This was quite a surprising thread. -- Thomas Munro http://www.enterprisedb.com
I wrote: > Andres Freund <andres@anarazel.de> writes: >> On 2018-10-01 12:13:57 -0400, Tom Lane wrote: >>> (2) Drop the restriction. This'd require at least changing the >>> DESC correction, and maybe other things. I'm not sure what the >>> odds would be of finding everyplace we need to check. >> (2) seems more maintainable to me (or perhaps less unmaintainable). It's >> infrastructure, rather than every datatype + support out there... > I guess we could set up some testing infrastructure: hack int4cmp > and/or a couple other popular comparators so that they *always* > return INT_MIN, 0, or INT_MAX, and then see what falls over. Here's a draft patch against HEAD for this. I looked for problem spots by (a) testing with the STRESS_SORT_INT_MIN option I added in nbtcompare.c, (b) grepping for "x = -x" type code, and (c) grepping for "return -x" type code. (b) and (c) found several places that (a) didn't, which does not give me a warm feeling about whether I have found quite everything. I changed a couple of places where things might've been safe but I didn't feel like chasing the calls to prove it (e.g. imath.c), and contrariwise I left a *very* small number of places alone because they were inverting the result of a specific function that is defined to return 1/0/-1 and nothing else. regards, tom lane diff --git a/contrib/pgcrypto/imath.c b/contrib/pgcrypto/imath.c index cd528bf..ca4995a 100644 --- a/contrib/pgcrypto/imath.c +++ b/contrib/pgcrypto/imath.c @@ -1254,11 +1254,9 @@ mp_int_compare(mp_int a, mp_int b) * If they're both zero or positive, the normal comparison applies; if * both negative, the sense is reversed. */ - if (sa == MP_ZPOS) - return cmp; - else - return -cmp; - + if (sa != MP_ZPOS) + INVERT_SIGN(cmp); + return cmp; } else { @@ -1314,10 +1312,9 @@ mp_int_compare_value(mp_int z, int value) { cmp = s_vcmp(z, value); - if (vsign == MP_ZPOS) - return cmp; - else - return -cmp; + if (vsign != MP_ZPOS) + INVERT_SIGN(cmp); + return cmp; } else { diff --git a/doc/src/sgml/btree.sgml b/doc/src/sgml/btree.sgml index 8bd0bad..c16825e 100644 --- a/doc/src/sgml/btree.sgml +++ b/doc/src/sgml/btree.sgml @@ -228,11 +228,8 @@ <replaceable>B</replaceable>, <replaceable>A</replaceable> <literal>=</literal> <replaceable>B</replaceable>, or <replaceable>A</replaceable> <literal>></literal> - <replaceable>B</replaceable>, respectively. The function must not - return <literal>INT_MIN</literal> for the <replaceable>A</replaceable> - <literal><</literal> <replaceable>B</replaceable> case, - since the value may be negated before being tested for sign. A null - result is disallowed, too. + <replaceable>B</replaceable>, respectively. + A null result is disallowed: all values of the data type must be comparable. See <filename>src/backend/access/nbtree/nbtcompare.c</filename> for examples. </para> diff --git a/src/backend/access/nbtree/nbtcompare.c b/src/backend/access/nbtree/nbtcompare.c index b1855e8..6f2ad23 100644 --- a/src/backend/access/nbtree/nbtcompare.c +++ b/src/backend/access/nbtree/nbtcompare.c @@ -22,11 +22,10 @@ * * The result is always an int32 regardless of the input datatype. * - * Although any negative int32 (except INT_MIN) is acceptable for reporting - * "<", and any positive int32 is acceptable for reporting ">", routines + * Although any negative int32 is acceptable for reporting "<", + * and any positive int32 is acceptable for reporting ">", routines * that work on 32-bit or wider datatypes can't just return "a - b". - * That could overflow and give the wrong answer. Also, one must not - * return INT_MIN to report "<", since some callers will negate the result. + * That could overflow and give the wrong answer. * * NOTE: it is critical that the comparison function impose a total order * on all non-NULL values of the data type, and that the datatype's @@ -44,13 +43,31 @@ * during an index access won't be recovered till end of query. This * primarily affects comparison routines for toastable datatypes; * they have to be careful to free any detoasted copy of an input datum. + * + * NOTE: we used to forbid comparison functions from returning INT_MIN, + * but that proves to be too error-prone because some platforms' versions + * of memcmp() etc can return INT_MIN. As a means of stress-testing + * callers, this file can be compiled with STRESS_SORT_INT_MIN defined + * to cause many of these functions to return INT_MIN or INT_MAX instead of + * their customary -1/+1. For production, though, that's not a good idea + * since users or third-party code might expect the traditional results. *------------------------------------------------------------------------- */ #include "postgres.h" +#include <limits.h> + #include "utils/builtins.h" #include "utils/sortsupport.h" +#ifdef STRESS_SORT_INT_MIN +#define A_LESS_THAN_B INT_MIN +#define A_GREATER_THAN_B INT_MAX +#else +#define A_LESS_THAN_B (-1) +#define A_GREATER_THAN_B 1 +#endif + Datum btboolcmp(PG_FUNCTION_ARGS) @@ -95,11 +112,11 @@ btint4cmp(PG_FUNCTION_ARGS) int32 b = PG_GETARG_INT32(1); if (a > b) - PG_RETURN_INT32(1); + PG_RETURN_INT32(A_GREATER_THAN_B); else if (a == b) PG_RETURN_INT32(0); else - PG_RETURN_INT32(-1); + PG_RETURN_INT32(A_LESS_THAN_B); } static int @@ -109,11 +126,11 @@ btint4fastcmp(Datum x, Datum y, SortSupport ssup) int32 b = DatumGetInt32(y); if (a > b) - return 1; + return A_GREATER_THAN_B; else if (a == b) return 0; else - return -1; + return A_LESS_THAN_B; } Datum @@ -132,11 +149,11 @@ btint8cmp(PG_FUNCTION_ARGS) int64 b = PG_GETARG_INT64(1); if (a > b) - PG_RETURN_INT32(1); + PG_RETURN_INT32(A_GREATER_THAN_B); else if (a == b) PG_RETURN_INT32(0); else - PG_RETURN_INT32(-1); + PG_RETURN_INT32(A_LESS_THAN_B); } static int @@ -146,11 +163,11 @@ btint8fastcmp(Datum x, Datum y, SortSupport ssup) int64 b = DatumGetInt64(y); if (a > b) - return 1; + return A_GREATER_THAN_B; else if (a == b) return 0; else - return -1; + return A_LESS_THAN_B; } Datum @@ -169,11 +186,11 @@ btint48cmp(PG_FUNCTION_ARGS) int64 b = PG_GETARG_INT64(1); if (a > b) - PG_RETURN_INT32(1); + PG_RETURN_INT32(A_GREATER_THAN_B); else if (a == b) PG_RETURN_INT32(0); else - PG_RETURN_INT32(-1); + PG_RETURN_INT32(A_LESS_THAN_B); } Datum @@ -183,11 +200,11 @@ btint84cmp(PG_FUNCTION_ARGS) int32 b = PG_GETARG_INT32(1); if (a > b) - PG_RETURN_INT32(1); + PG_RETURN_INT32(A_GREATER_THAN_B); else if (a == b) PG_RETURN_INT32(0); else - PG_RETURN_INT32(-1); + PG_RETURN_INT32(A_LESS_THAN_B); } Datum @@ -197,11 +214,11 @@ btint24cmp(PG_FUNCTION_ARGS) int32 b = PG_GETARG_INT32(1); if (a > b) - PG_RETURN_INT32(1); + PG_RETURN_INT32(A_GREATER_THAN_B); else if (a == b) PG_RETURN_INT32(0); else - PG_RETURN_INT32(-1); + PG_RETURN_INT32(A_LESS_THAN_B); } Datum @@ -211,11 +228,11 @@ btint42cmp(PG_FUNCTION_ARGS) int16 b = PG_GETARG_INT16(1); if (a > b) - PG_RETURN_INT32(1); + PG_RETURN_INT32(A_GREATER_THAN_B); else if (a == b) PG_RETURN_INT32(0); else - PG_RETURN_INT32(-1); + PG_RETURN_INT32(A_LESS_THAN_B); } Datum @@ -225,11 +242,11 @@ btint28cmp(PG_FUNCTION_ARGS) int64 b = PG_GETARG_INT64(1); if (a > b) - PG_RETURN_INT32(1); + PG_RETURN_INT32(A_GREATER_THAN_B); else if (a == b) PG_RETURN_INT32(0); else - PG_RETURN_INT32(-1); + PG_RETURN_INT32(A_LESS_THAN_B); } Datum @@ -239,11 +256,11 @@ btint82cmp(PG_FUNCTION_ARGS) int16 b = PG_GETARG_INT16(1); if (a > b) - PG_RETURN_INT32(1); + PG_RETURN_INT32(A_GREATER_THAN_B); else if (a == b) PG_RETURN_INT32(0); else - PG_RETURN_INT32(-1); + PG_RETURN_INT32(A_LESS_THAN_B); } Datum @@ -253,11 +270,11 @@ btoidcmp(PG_FUNCTION_ARGS) Oid b = PG_GETARG_OID(1); if (a > b) - PG_RETURN_INT32(1); + PG_RETURN_INT32(A_GREATER_THAN_B); else if (a == b) PG_RETURN_INT32(0); else - PG_RETURN_INT32(-1); + PG_RETURN_INT32(A_LESS_THAN_B); } static int @@ -267,11 +284,11 @@ btoidfastcmp(Datum x, Datum y, SortSupport ssup) Oid b = DatumGetObjectId(y); if (a > b) - return 1; + return A_GREATER_THAN_B; else if (a == b) return 0; else - return -1; + return A_LESS_THAN_B; } Datum @@ -299,9 +316,9 @@ btoidvectorcmp(PG_FUNCTION_ARGS) if (a->values[i] != b->values[i]) { if (a->values[i] > b->values[i]) - PG_RETURN_INT32(1); + PG_RETURN_INT32(A_GREATER_THAN_B); else - PG_RETURN_INT32(-1); + PG_RETURN_INT32(A_LESS_THAN_B); } } PG_RETURN_INT32(0); diff --git a/src/backend/access/nbtree/nbtsearch.c b/src/backend/access/nbtree/nbtsearch.c index d3700bd..8091db3 100644 --- a/src/backend/access/nbtree/nbtsearch.c +++ b/src/backend/access/nbtree/nbtsearch.c @@ -530,7 +530,7 @@ _bt_compare(Relation rel, scankey->sk_argument)); if (!(scankey->sk_flags & SK_BT_DESC)) - result = -result; + INVERT_SIGN(result); } /* if the keys are unequal, return the difference */ diff --git a/src/backend/access/nbtree/nbtutils.c b/src/backend/access/nbtree/nbtutils.c index 4528e87..365153b 100644 --- a/src/backend/access/nbtree/nbtutils.c +++ b/src/backend/access/nbtree/nbtutils.c @@ -518,7 +518,7 @@ _bt_compare_array_elements(const void *a, const void *b, void *arg) cxt->collation, da, db)); if (cxt->reverse) - compare = -compare; + INVERT_SIGN(compare); return compare; } @@ -1652,7 +1652,7 @@ _bt_check_rowcompare(ScanKey skey, IndexTuple tuple, TupleDesc tupdesc, subkey->sk_argument)); if (subkey->sk_flags & SK_BT_DESC) - cmpresult = -cmpresult; + INVERT_SIGN(cmpresult); /* Done comparing if unequal, else advance to next column */ if (cmpresult != 0) diff --git a/src/backend/executor/nodeGatherMerge.c b/src/backend/executor/nodeGatherMerge.c index a1a11df..32f67f7 100644 --- a/src/backend/executor/nodeGatherMerge.c +++ b/src/backend/executor/nodeGatherMerge.c @@ -755,7 +755,10 @@ heap_compare_slots(Datum a, Datum b, void *arg) datum2, isNull2, sortKey); if (compare != 0) - return -compare; + { + INVERT_SIGN(compare); + return compare; + } } return 0; } diff --git a/src/backend/executor/nodeIndexscan.c b/src/backend/executor/nodeIndexscan.c index d74499f..d952766 100644 --- a/src/backend/executor/nodeIndexscan.c +++ b/src/backend/executor/nodeIndexscan.c @@ -467,9 +467,10 @@ reorderqueue_cmp(const pairingheap_node *a, const pairingheap_node *b, ReorderTuple *rtb = (ReorderTuple *) b; IndexScanState *node = (IndexScanState *) arg; - return -cmp_orderbyvals(rta->orderbyvals, rta->orderbynulls, - rtb->orderbyvals, rtb->orderbynulls, - node); + /* exchange argument order to invert the sort order */ + return cmp_orderbyvals(rtb->orderbyvals, rtb->orderbynulls, + rta->orderbyvals, rta->orderbynulls, + node); } /* diff --git a/src/backend/executor/nodeMergeAppend.c b/src/backend/executor/nodeMergeAppend.c index 6daf60a..a2b3856 100644 --- a/src/backend/executor/nodeMergeAppend.c +++ b/src/backend/executor/nodeMergeAppend.c @@ -338,7 +338,10 @@ heap_compare_slots(Datum a, Datum b, void *arg) datum2, isNull2, sortKey); if (compare != 0) - return -compare; + { + INVERT_SIGN(compare); + return compare; + } } return 0; } diff --git a/src/bin/pg_rewind/filemap.c b/src/bin/pg_rewind/filemap.c index 4ad7b2f..222b56f 100644 --- a/src/bin/pg_rewind/filemap.c +++ b/src/bin/pg_rewind/filemap.c @@ -810,7 +810,7 @@ final_filemap_cmp(const void *a, const void *b) return -1; if (fa->action == FILE_ACTION_REMOVE) - return -strcmp(fa->path, fb->path); + return strcmp(fb->path, fa->path); else return strcmp(fa->path, fb->path); } diff --git a/src/include/access/nbtree.h b/src/include/access/nbtree.h index 04ecb4c..ea495f1 100644 --- a/src/include/access/nbtree.h +++ b/src/include/access/nbtree.h @@ -274,8 +274,7 @@ typedef struct BTMetaPageData * When a new operator class is declared, we require that the user * supply us with an amproc procedure (BTORDER_PROC) for determining * whether, for two keys a and b, a < b, a = b, or a > b. This routine - * must return < 0, 0, > 0, respectively, in these three cases. (It must - * not return INT_MIN, since we may negate the result before using it.) + * must return < 0, 0, > 0, respectively, in these three cases. * * To facilitate accelerated sorting, an operator class may choose to * offer a second procedure (BTSORTSUPPORT_PROC). For full details, see diff --git a/src/include/c.h b/src/include/c.h index 25d7d60..a3bce9c 100644 --- a/src/include/c.h +++ b/src/include/c.h @@ -1023,6 +1023,14 @@ extern void ExceptionalCondition(const char *conditionName, */ /* + * Invert the sign of a qsort-style comparison result, ie, exchange negative + * and positive integer values, being careful not to get the wrong answer + * for INT_MIN. The argument should be an integral variable. + */ +#define INVERT_SIGN(var) \ + ((var) = ((var) < 0) ? 1 : -(var)) + +/* * Use this, not "char buf[BLCKSZ]", to declare a field or local variable * holding a page buffer, if that page might be accessed as a page and not * just a string of bytes. Otherwise the variable might be under-aligned, diff --git a/src/include/utils/sortsupport.h b/src/include/utils/sortsupport.h index 53b692e..4166fb9 100644 --- a/src/include/utils/sortsupport.h +++ b/src/include/utils/sortsupport.h @@ -96,8 +96,7 @@ typedef struct SortSupportData * Comparator function has the same API as the traditional btree * comparison function, ie, return <0, 0, or >0 according as x is less * than, equal to, or greater than y. Note that x and y are guaranteed - * not null, and there is no way to return null either. Do not return - * INT_MIN, as callers are allowed to negate the result before using it. + * not null, and there is no way to return null either. * * This may be either the authoritative comparator, or the abbreviated * comparator. Core code may switch this over the initial preference of @@ -224,7 +223,7 @@ ApplySortComparator(Datum datum1, bool isNull1, { compare = ssup->comparator(datum1, datum2, ssup); if (ssup->ssup_reverse) - compare = -compare; + INVERT_SIGN(compare); } return compare; @@ -262,7 +261,7 @@ ApplySortAbbrevFullComparator(Datum datum1, bool isNull1, { compare = ssup->abbrev_full_comparator(datum1, datum2, ssup); if (ssup->ssup_reverse) - compare = -compare; + INVERT_SIGN(compare); } return compare;
On Fri, Oct 5, 2018 at 3:12 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > Here's a draft patch against HEAD for this. + * Invert the sign of a qsort-style comparison result, ie, exchange negative + * and positive integer values, being careful not to get the wrong answer + * for INT_MIN. The argument should be an integral variable. + */ +#define INVERT_SIGN(var) \ + ((var) = ((var) < 0) ? 1 : -(var)) I suppose someone might mistake this for a function that converts -42 to 42... would something like INVERT_COMPARE_RESULT() be better? -- Thomas Munro http://www.enterprisedb.com
Thomas Munro <thomas.munro@enterprisedb.com> writes: > I suppose someone might mistake this for a function that converts -42 > to 42... would something like INVERT_COMPARE_RESULT() be better? I have no particular allegiance to the macro name; it's just the first idea that came to mind. regards, tom lane
I wrote: > Here's a draft patch against HEAD for this. > I looked for problem spots by (a) testing with the STRESS_SORT_INT_MIN > option I added in nbtcompare.c, (b) grepping for "x = -x" type code, > and (c) grepping for "return -x" type code. (b) and (c) found several > places that (a) didn't, which does not give me a warm feeling about > whether I have found quite everything. I thought of another, uglier way to stress things: make wrappers around memcmp, strcmp, and strncmp to force those to return INT_MIN/0/INT_MAX, thereby modeling what we see is happening on Mark's machine. This successfully exposed the bug I'd already found by grepping in pg_rewind/filemap.c, as well as some astonishingly unportable code in contrib/ltree. The attached update incorporates Thomas' suggestion for the macro name, as well as the ltree fix. For completeness, I also show the very quick-hacky way I changed memcmp et al, but I don't propose committing that. I'm inclined to just go ahead and apply/backpatch this. It's certainly possible that more bugs remain to be found, but I have no good ideas about how to search for them, and in any case that wouldn't invalidate the patch as it stands. regards, tom lane diff --git a/contrib/ltree/ltree_op.c b/contrib/ltree/ltree_op.c index 759f1f8..df61c63 100644 --- a/contrib/ltree/ltree_op.c +++ b/contrib/ltree/ltree_op.c @@ -45,17 +45,24 @@ ltree_compare(const ltree *a, const ltree *b) ltree_level *bl = LTREE_FIRST(b); int an = a->numlevel; int bn = b->numlevel; - int res = 0; while (an > 0 && bn > 0) { + int res; + if ((res = memcmp(al->name, bl->name, Min(al->len, bl->len))) == 0) { if (al->len != bl->len) return (al->len - bl->len) * 10 * (an + 1); } else + { + if (res < 0) + res = -1; + else + res = 1; return res * 10 * (an + 1); + } an--; bn--; @@ -146,7 +153,7 @@ inner_isparent(const ltree *c, const ltree *p) { if (cl->len != pl->len) return false; - if (memcmp(cl->name, pl->name, cl->len)) + if (memcmp(cl->name, pl->name, cl->len) != 0) return false; pn--; diff --git a/contrib/pgcrypto/imath.c b/contrib/pgcrypto/imath.c index cd528bf..b94a51b 100644 --- a/contrib/pgcrypto/imath.c +++ b/contrib/pgcrypto/imath.c @@ -1254,11 +1254,9 @@ mp_int_compare(mp_int a, mp_int b) * If they're both zero or positive, the normal comparison applies; if * both negative, the sense is reversed. */ - if (sa == MP_ZPOS) - return cmp; - else - return -cmp; - + if (sa != MP_ZPOS) + INVERT_COMPARE_RESULT(cmp); + return cmp; } else { @@ -1314,10 +1312,9 @@ mp_int_compare_value(mp_int z, int value) { cmp = s_vcmp(z, value); - if (vsign == MP_ZPOS) - return cmp; - else - return -cmp; + if (vsign != MP_ZPOS) + INVERT_COMPARE_RESULT(cmp); + return cmp; } else { diff --git a/doc/src/sgml/btree.sgml b/doc/src/sgml/btree.sgml index 8bd0bad..c16825e 100644 --- a/doc/src/sgml/btree.sgml +++ b/doc/src/sgml/btree.sgml @@ -228,11 +228,8 @@ <replaceable>B</replaceable>, <replaceable>A</replaceable> <literal>=</literal> <replaceable>B</replaceable>, or <replaceable>A</replaceable> <literal>></literal> - <replaceable>B</replaceable>, respectively. The function must not - return <literal>INT_MIN</literal> for the <replaceable>A</replaceable> - <literal><</literal> <replaceable>B</replaceable> case, - since the value may be negated before being tested for sign. A null - result is disallowed, too. + <replaceable>B</replaceable>, respectively. + A null result is disallowed: all values of the data type must be comparable. See <filename>src/backend/access/nbtree/nbtcompare.c</filename> for examples. </para> diff --git a/src/backend/access/nbtree/nbtcompare.c b/src/backend/access/nbtree/nbtcompare.c index b1855e8..6f2ad23 100644 --- a/src/backend/access/nbtree/nbtcompare.c +++ b/src/backend/access/nbtree/nbtcompare.c @@ -22,11 +22,10 @@ * * The result is always an int32 regardless of the input datatype. * - * Although any negative int32 (except INT_MIN) is acceptable for reporting - * "<", and any positive int32 is acceptable for reporting ">", routines + * Although any negative int32 is acceptable for reporting "<", + * and any positive int32 is acceptable for reporting ">", routines * that work on 32-bit or wider datatypes can't just return "a - b". - * That could overflow and give the wrong answer. Also, one must not - * return INT_MIN to report "<", since some callers will negate the result. + * That could overflow and give the wrong answer. * * NOTE: it is critical that the comparison function impose a total order * on all non-NULL values of the data type, and that the datatype's @@ -44,13 +43,31 @@ * during an index access won't be recovered till end of query. This * primarily affects comparison routines for toastable datatypes; * they have to be careful to free any detoasted copy of an input datum. + * + * NOTE: we used to forbid comparison functions from returning INT_MIN, + * but that proves to be too error-prone because some platforms' versions + * of memcmp() etc can return INT_MIN. As a means of stress-testing + * callers, this file can be compiled with STRESS_SORT_INT_MIN defined + * to cause many of these functions to return INT_MIN or INT_MAX instead of + * their customary -1/+1. For production, though, that's not a good idea + * since users or third-party code might expect the traditional results. *------------------------------------------------------------------------- */ #include "postgres.h" +#include <limits.h> + #include "utils/builtins.h" #include "utils/sortsupport.h" +#ifdef STRESS_SORT_INT_MIN +#define A_LESS_THAN_B INT_MIN +#define A_GREATER_THAN_B INT_MAX +#else +#define A_LESS_THAN_B (-1) +#define A_GREATER_THAN_B 1 +#endif + Datum btboolcmp(PG_FUNCTION_ARGS) @@ -95,11 +112,11 @@ btint4cmp(PG_FUNCTION_ARGS) int32 b = PG_GETARG_INT32(1); if (a > b) - PG_RETURN_INT32(1); + PG_RETURN_INT32(A_GREATER_THAN_B); else if (a == b) PG_RETURN_INT32(0); else - PG_RETURN_INT32(-1); + PG_RETURN_INT32(A_LESS_THAN_B); } static int @@ -109,11 +126,11 @@ btint4fastcmp(Datum x, Datum y, SortSupport ssup) int32 b = DatumGetInt32(y); if (a > b) - return 1; + return A_GREATER_THAN_B; else if (a == b) return 0; else - return -1; + return A_LESS_THAN_B; } Datum @@ -132,11 +149,11 @@ btint8cmp(PG_FUNCTION_ARGS) int64 b = PG_GETARG_INT64(1); if (a > b) - PG_RETURN_INT32(1); + PG_RETURN_INT32(A_GREATER_THAN_B); else if (a == b) PG_RETURN_INT32(0); else - PG_RETURN_INT32(-1); + PG_RETURN_INT32(A_LESS_THAN_B); } static int @@ -146,11 +163,11 @@ btint8fastcmp(Datum x, Datum y, SortSupport ssup) int64 b = DatumGetInt64(y); if (a > b) - return 1; + return A_GREATER_THAN_B; else if (a == b) return 0; else - return -1; + return A_LESS_THAN_B; } Datum @@ -169,11 +186,11 @@ btint48cmp(PG_FUNCTION_ARGS) int64 b = PG_GETARG_INT64(1); if (a > b) - PG_RETURN_INT32(1); + PG_RETURN_INT32(A_GREATER_THAN_B); else if (a == b) PG_RETURN_INT32(0); else - PG_RETURN_INT32(-1); + PG_RETURN_INT32(A_LESS_THAN_B); } Datum @@ -183,11 +200,11 @@ btint84cmp(PG_FUNCTION_ARGS) int32 b = PG_GETARG_INT32(1); if (a > b) - PG_RETURN_INT32(1); + PG_RETURN_INT32(A_GREATER_THAN_B); else if (a == b) PG_RETURN_INT32(0); else - PG_RETURN_INT32(-1); + PG_RETURN_INT32(A_LESS_THAN_B); } Datum @@ -197,11 +214,11 @@ btint24cmp(PG_FUNCTION_ARGS) int32 b = PG_GETARG_INT32(1); if (a > b) - PG_RETURN_INT32(1); + PG_RETURN_INT32(A_GREATER_THAN_B); else if (a == b) PG_RETURN_INT32(0); else - PG_RETURN_INT32(-1); + PG_RETURN_INT32(A_LESS_THAN_B); } Datum @@ -211,11 +228,11 @@ btint42cmp(PG_FUNCTION_ARGS) int16 b = PG_GETARG_INT16(1); if (a > b) - PG_RETURN_INT32(1); + PG_RETURN_INT32(A_GREATER_THAN_B); else if (a == b) PG_RETURN_INT32(0); else - PG_RETURN_INT32(-1); + PG_RETURN_INT32(A_LESS_THAN_B); } Datum @@ -225,11 +242,11 @@ btint28cmp(PG_FUNCTION_ARGS) int64 b = PG_GETARG_INT64(1); if (a > b) - PG_RETURN_INT32(1); + PG_RETURN_INT32(A_GREATER_THAN_B); else if (a == b) PG_RETURN_INT32(0); else - PG_RETURN_INT32(-1); + PG_RETURN_INT32(A_LESS_THAN_B); } Datum @@ -239,11 +256,11 @@ btint82cmp(PG_FUNCTION_ARGS) int16 b = PG_GETARG_INT16(1); if (a > b) - PG_RETURN_INT32(1); + PG_RETURN_INT32(A_GREATER_THAN_B); else if (a == b) PG_RETURN_INT32(0); else - PG_RETURN_INT32(-1); + PG_RETURN_INT32(A_LESS_THAN_B); } Datum @@ -253,11 +270,11 @@ btoidcmp(PG_FUNCTION_ARGS) Oid b = PG_GETARG_OID(1); if (a > b) - PG_RETURN_INT32(1); + PG_RETURN_INT32(A_GREATER_THAN_B); else if (a == b) PG_RETURN_INT32(0); else - PG_RETURN_INT32(-1); + PG_RETURN_INT32(A_LESS_THAN_B); } static int @@ -267,11 +284,11 @@ btoidfastcmp(Datum x, Datum y, SortSupport ssup) Oid b = DatumGetObjectId(y); if (a > b) - return 1; + return A_GREATER_THAN_B; else if (a == b) return 0; else - return -1; + return A_LESS_THAN_B; } Datum @@ -299,9 +316,9 @@ btoidvectorcmp(PG_FUNCTION_ARGS) if (a->values[i] != b->values[i]) { if (a->values[i] > b->values[i]) - PG_RETURN_INT32(1); + PG_RETURN_INT32(A_GREATER_THAN_B); else - PG_RETURN_INT32(-1); + PG_RETURN_INT32(A_LESS_THAN_B); } } PG_RETURN_INT32(0); diff --git a/src/backend/access/nbtree/nbtsearch.c b/src/backend/access/nbtree/nbtsearch.c index d3700bd..8b2772c 100644 --- a/src/backend/access/nbtree/nbtsearch.c +++ b/src/backend/access/nbtree/nbtsearch.c @@ -530,7 +530,7 @@ _bt_compare(Relation rel, scankey->sk_argument)); if (!(scankey->sk_flags & SK_BT_DESC)) - result = -result; + INVERT_COMPARE_RESULT(result); } /* if the keys are unequal, return the difference */ diff --git a/src/backend/access/nbtree/nbtutils.c b/src/backend/access/nbtree/nbtutils.c index 4528e87..205457e 100644 --- a/src/backend/access/nbtree/nbtutils.c +++ b/src/backend/access/nbtree/nbtutils.c @@ -518,7 +518,7 @@ _bt_compare_array_elements(const void *a, const void *b, void *arg) cxt->collation, da, db)); if (cxt->reverse) - compare = -compare; + INVERT_COMPARE_RESULT(compare); return compare; } @@ -1652,7 +1652,7 @@ _bt_check_rowcompare(ScanKey skey, IndexTuple tuple, TupleDesc tupdesc, subkey->sk_argument)); if (subkey->sk_flags & SK_BT_DESC) - cmpresult = -cmpresult; + INVERT_COMPARE_RESULT(cmpresult); /* Done comparing if unequal, else advance to next column */ if (cmpresult != 0) diff --git a/src/backend/executor/nodeGatherMerge.c b/src/backend/executor/nodeGatherMerge.c index a1a11df..6362e44 100644 --- a/src/backend/executor/nodeGatherMerge.c +++ b/src/backend/executor/nodeGatherMerge.c @@ -755,7 +755,10 @@ heap_compare_slots(Datum a, Datum b, void *arg) datum2, isNull2, sortKey); if (compare != 0) - return -compare; + { + INVERT_COMPARE_RESULT(compare); + return compare; + } } return 0; } diff --git a/src/backend/executor/nodeIndexscan.c b/src/backend/executor/nodeIndexscan.c index d74499f..d952766 100644 --- a/src/backend/executor/nodeIndexscan.c +++ b/src/backend/executor/nodeIndexscan.c @@ -467,9 +467,10 @@ reorderqueue_cmp(const pairingheap_node *a, const pairingheap_node *b, ReorderTuple *rtb = (ReorderTuple *) b; IndexScanState *node = (IndexScanState *) arg; - return -cmp_orderbyvals(rta->orderbyvals, rta->orderbynulls, - rtb->orderbyvals, rtb->orderbynulls, - node); + /* exchange argument order to invert the sort order */ + return cmp_orderbyvals(rtb->orderbyvals, rtb->orderbynulls, + rta->orderbyvals, rta->orderbynulls, + node); } /* diff --git a/src/backend/executor/nodeMergeAppend.c b/src/backend/executor/nodeMergeAppend.c index 6daf60a..f8aa50f 100644 --- a/src/backend/executor/nodeMergeAppend.c +++ b/src/backend/executor/nodeMergeAppend.c @@ -338,7 +338,10 @@ heap_compare_slots(Datum a, Datum b, void *arg) datum2, isNull2, sortKey); if (compare != 0) - return -compare; + { + INVERT_COMPARE_RESULT(compare); + return compare; + } } return 0; } diff --git a/src/bin/pg_rewind/filemap.c b/src/bin/pg_rewind/filemap.c index 4ad7b2f..222b56f 100644 --- a/src/bin/pg_rewind/filemap.c +++ b/src/bin/pg_rewind/filemap.c @@ -810,7 +810,7 @@ final_filemap_cmp(const void *a, const void *b) return -1; if (fa->action == FILE_ACTION_REMOVE) - return -strcmp(fa->path, fb->path); + return strcmp(fb->path, fa->path); else return strcmp(fa->path, fb->path); } diff --git a/src/include/access/nbtree.h b/src/include/access/nbtree.h index 04ecb4c..ea495f1 100644 --- a/src/include/access/nbtree.h +++ b/src/include/access/nbtree.h @@ -274,8 +274,7 @@ typedef struct BTMetaPageData * When a new operator class is declared, we require that the user * supply us with an amproc procedure (BTORDER_PROC) for determining * whether, for two keys a and b, a < b, a = b, or a > b. This routine - * must return < 0, 0, > 0, respectively, in these three cases. (It must - * not return INT_MIN, since we may negate the result before using it.) + * must return < 0, 0, > 0, respectively, in these three cases. * * To facilitate accelerated sorting, an operator class may choose to * offer a second procedure (BTSORTSUPPORT_PROC). For full details, see diff --git a/src/include/c.h b/src/include/c.h index 25d7d60..86fbb63 100644 --- a/src/include/c.h +++ b/src/include/c.h @@ -1023,6 +1023,14 @@ extern void ExceptionalCondition(const char *conditionName, */ /* + * Invert the sign of a qsort-style comparison result, ie, exchange negative + * and positive integer values, being careful not to get the wrong answer + * for INT_MIN. The argument should be an integral variable. + */ +#define INVERT_COMPARE_RESULT(var) \ + ((var) = ((var) < 0) ? 1 : -(var)) + +/* * Use this, not "char buf[BLCKSZ]", to declare a field or local variable * holding a page buffer, if that page might be accessed as a page and not * just a string of bytes. Otherwise the variable might be under-aligned, diff --git a/src/include/utils/sortsupport.h b/src/include/utils/sortsupport.h index 53b692e..818e0b1 100644 --- a/src/include/utils/sortsupport.h +++ b/src/include/utils/sortsupport.h @@ -96,8 +96,7 @@ typedef struct SortSupportData * Comparator function has the same API as the traditional btree * comparison function, ie, return <0, 0, or >0 according as x is less * than, equal to, or greater than y. Note that x and y are guaranteed - * not null, and there is no way to return null either. Do not return - * INT_MIN, as callers are allowed to negate the result before using it. + * not null, and there is no way to return null either. * * This may be either the authoritative comparator, or the abbreviated * comparator. Core code may switch this over the initial preference of @@ -224,7 +223,7 @@ ApplySortComparator(Datum datum1, bool isNull1, { compare = ssup->comparator(datum1, datum2, ssup); if (ssup->ssup_reverse) - compare = -compare; + INVERT_COMPARE_RESULT(compare); } return compare; @@ -262,7 +261,7 @@ ApplySortAbbrevFullComparator(Datum datum1, bool isNull1, { compare = ssup->abbrev_full_comparator(datum1, datum2, ssup); if (ssup->ssup_reverse) - compare = -compare; + INVERT_COMPARE_RESULT(compare); } return compare; diff --git a/src/include/port.h b/src/include/port.h index e654d5c..e5eda63 100644 --- a/src/include/port.h +++ b/src/include/port.h @@ -196,6 +196,17 @@ extern char *pg_strerror_r(int errnum, char *buf, size_t buflen); #define strerror_r pg_strerror_r #define PG_STRERROR_R_BUFLEN 256 /* Recommended buffer size for strerror_r */ +/* Override these too for sort testing */ +extern int pg_memcmp(const void *s1, const void *s2, size_t n); +extern int pg_strcmp(const char *s1, const char *s2); +extern int pg_strncmp(const char *s1, const char *s2, size_t n); +#undef memcmp +#undef strcmp +#undef strncmp +#define memcmp pg_memcmp +#define strcmp pg_strcmp +#define strncmp pg_strncmp + /* Portable prompt handling */ extern void simple_prompt(const char *prompt, char *destination, size_t destlen, bool echo); diff --git a/src/port/snprintf.c b/src/port/snprintf.c index ef496fa..be404b1 100644 --- a/src/port/snprintf.c +++ b/src/port/snprintf.c @@ -1357,3 +1357,40 @@ trailing_pad(int padlen, PrintfTarget *target) if (padlen < 0) dopr_outchmulti(' ', -padlen, target); } + +#undef memcmp +#undef strcmp +#undef strncmp + +int pg_memcmp(const void *s1, const void *s2, size_t n) +{ + int res = memcmp(s1, s2, n); + + if (res < 0) + return INT_MIN; + if (res > 0) + return INT_MAX; + return 0; +} + +int pg_strcmp(const char *s1, const char *s2) +{ + int res = strcmp(s1, s2); + + if (res < 0) + return INT_MIN; + if (res > 0) + return INT_MAX; + return 0; +} + +int pg_strncmp(const char *s1, const char *s2, size_t n) +{ + int res = strncmp(s1, s2, n); + + if (res < 0) + return INT_MIN; + if (res > 0) + return INT_MAX; + return 0; +}
I wrote: > I'm inclined to just go ahead and apply/backpatch this. It's certainly > possible that more bugs remain to be found, but I have no good ideas > about how to search for them, and in any case that wouldn't invalidate > the patch as it stands. And done. If anyone can think of additional ways to test or search for more instances of the same bug, please do. In the meantime, I've configured buildfarm member longfin to define STRESS_SORT_INT_MIN, so it should help prevent introduction of new instances. regards, tom lane