On 14 June 2013 16:09, Hannu Krosing <hannu@2ndquadrant.com> wrote:
> What skytools/pgq/londiste uses for comparing tables on master
> and slave is query like this
>
> select sum(hashtext(t.*::text)) from <yourtable> t;
>
> This is non-modulo sum and does not use md5 but relies on
> whatever the hashtext() du jour is :)
>
> So it is not comparable to anything external (like the md5sum
> compatible idea above) but is usually good enough for fast
> checks of compatible tables.
>
> As tables are unordered by definition anyway, this should be
> good enough for most SQL.
>
> The speed comes from both fast(er) hashtext() function and
> avoiding the sort.
>
That sounds like a pretty good approach. We could do that if we had a
version of md5() that returned numeric. My impression is that numeric
computations are pretty fast compared to the sorting overhead.
On the other hand, if there is a usable index, select md5_agg(..) from
(sub-query) will do and index scan rather than a sort, making it much
faster than using an ORDER BY in the aggregate.
Regards,
Dean