Re: machine-dependent hash_any vs the regression tests - Mailing list pgsql-hackers

From Kenneth Marshall
Subject Re: machine-dependent hash_any vs the regression tests
Date
Msg-id 20080406154514.GA21544@it.is.rice.edu
In response to machine-dependent hash_any vs the regression tests  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Sat, Apr 05, 2008 at 05:57:35PM -0400, Tom Lane wrote:
> So the proposed changes in hash_any make its hash values different
> between big-endian and little-endian machines (at least for string keys;
> for keys that are really arrays of int, I think the changes will
> unify the behavior).  This means that the hash_seq_search traversal
> order for an internal hash table changes, and it turns out this breaks
> at least two regression tests: portals and dblink.  The portals test
> is easy to fix by adding a couple of ORDER BYs, but the problem with
> dblink is here:
> 
>   SELECT dblink_get_connections();
>    dblink_get_connections 
>   ------------------------
> !  {dtest1,dtest2,dtest3}
>   (1 row)
>   
>   SELECT dblink_is_busy('dtest1');
> --- 714,720 ----
>   SELECT dblink_get_connections();
>    dblink_get_connections 
>   ------------------------
> !  {dtest1,dtest3,dtest2}
>   (1 row)
>   
>   SELECT dblink_is_busy('dtest1');
> 
> and right offhand I can't think of a simple way to force those array
> elements into a consistent order.
> 
> No doubt that can be worked around, but does anyone wish to argue that
> this whole thing is a bad path to be headed down?  We're not going to
> gain a *whole* lot of speedup from the word-wide-hashing change, and
> so maybe this type of headache isn't worth the trouble.
> 
>             regards, tom lane
> 
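To make the quoted endianness point concrete, here is a minimal sketch of why a word-at-a-time hash can give different values on big-endian and little-endian machines for string keys. Everything in it is an assumption for illustration: `toy_word_hash`, `toy_byte_hash`, and the multiplier are hypothetical and are not the actual hash_any code or the proposed patch.

```python
MASK32 = 0xFFFFFFFF
MULT = 0x9E3779B1  # arbitrary odd constant, chosen only for this toy example


def toy_byte_hash(data: bytes) -> int:
    """Mix one byte at a time; the result does not depend on byte order."""
    h = 0
    for b in data:
        h = ((h ^ b) * MULT) & MASK32
    return h


def toy_word_hash(data: bytes, byteorder: str) -> int:
    """Mix one 32-bit word at a time.

    byteorder ("little" or "big") stands in for the machine's native
    byte order: the same four key bytes load as different integers.
    """
    padded = data + b"\x00" * (-len(data) % 4)  # zero-pad to a word boundary
    h = 0
    for i in range(0, len(padded), 4):
        word = int.from_bytes(padded[i:i + 4], byteorder)
        h = ((h ^ word) * MULT) & MASK32
    return h


if __name__ == "__main__":
    key = b"dtest1"
    print(hex(toy_byte_hash(key)))
    print(hex(toy_word_hash(key, "little")))
    print(hex(toy_word_hash(key, "big")))
```

The byte-wise variant sees the same input everywhere, while the word-wise variant sees a different integer per word depending on byte order, so its bucket assignments (and hence the hash_seq_search order) can differ between platforms.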
It may be just me, but it is a little surprising that the order of a
sequential hash table traversal should matter. It resembles relying on the
row order of a query that has no "order by", which is non-deterministic.
As long as all of the values are returned, it would make sense for the
regression tests not to depend on that ordering. That would also make it
easier to evaluate new hash functions, since they would no longer break the
regression tests in such an unintuitive way. Just my two cents.
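As a rough illustration of that idea, the sketch below models a hash table whose sequential scan walks the buckets in order, and compares results as sorted lists so the check passes regardless of traversal order. The names `seq_scan`, `same_members`, `hash_a`, and `hash_b` are hypothetical and exist only for this example; this is not how the PostgreSQL regression tests are actually driven.

```python
def seq_scan(keys, hash_fn, nbuckets=8):
    """Insert keys into a toy chained hash table, then walk bucket by bucket."""
    buckets = [[] for _ in range(nbuckets)]
    for key in keys:
        buckets[hash_fn(key) % nbuckets].append(key)
    return [key for bucket in buckets for key in bucket]


def same_members(actual, expected):
    """Order-independent comparison: equal once both sides are sorted."""
    return sorted(actual) == sorted(expected)


# Two toy hash functions that place the same keys in different buckets.
def hash_a(key):
    return ord(key[-1])


def hash_b(key):
    return 7 * ord(key[-1])


if __name__ == "__main__":
    names = ["dtest1", "dtest2", "dtest3"]
    order_a = seq_scan(names, hash_a)
    order_b = seq_scan(names, hash_b)
    print(order_a, order_b)
    print(order_a == order_b, same_members(order_a, order_b))
```

Comparing the sorted results checks that every value is returned without pinning the test to one hash function's bucket layout.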

Regards,
Ken Marshall

