Re: MD5 aggregate - Mailing list pgsql-hackers

From	Benedikt Grundmann
Subject	Re: MD5 aggregate
Date	June 14, 2013 16:23:00
Msg-id	CADbMkNPFHbevxzfDN5iCnR+pF18cPTbf0SC+GcVC0+7epZO3fw@mail.gmail.com
In response to	Re: MD5 aggregate (Tom Lane <tgl@sss.pgh.pa.us>)
List	pgsql-hackers

On Fri, Jun 14, 2013 at 2:14 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Marko Kreen <markokr@gmail.com> writes:
> On Thu, Jun 13, 2013 at 12:35 PM, Dean Rasheed <dean.a.rasheed@gmail.com> wrote:
>> Attached is a patch implementing a new aggregate function md5_agg() to
>> compute the aggregate MD5 sum across a number of rows.

> It's more efficient to calculate per-row md5, and then sum() them.
> This avoids the need for ORDER BY.

Good point. The aggregate md5 function also fails to distinguish the
case where we have 'xyzzy' followed by 'xyz' in two adjacent rows
from the case where they contain 'xyz' followed by 'zyxyz'.

Now, as against that, you lose any sensitivity to the ordering of the
values.

Personally I'd be a bit inclined to xor the per-row md5's rather than
sum them, but that's a small matter.

regards, tom lane
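
The collision Tom describes can be reproduced with a minimal sketch. It assumes the aggregate effectively hashes the plain concatenation of the row values with no separator or length prefix; that is an assumption made for illustration, not a statement about the patch's internals. Python's hashlib stands in for the server-side MD5, and `md5_concat` is a hypothetical helper name.

```python
import hashlib

def md5_concat(rows):
    """MD5 over the plain concatenation of the rows, with no separator.

    Hypothetical stand-in for an order-dependent aggregate; not the patch's code.
    """
    h = hashlib.md5()
    for value in rows:
        h.update(value.encode("utf-8"))
    return h.hexdigest()

a = md5_concat(["xyzzy", "xyz"])
b = md5_concat(["xyz", "zyxyz"])

# Both inputs feed the same byte string "xyzzyxyz" to the hash.
print(a == b)

# Hashing each row separately keeps the two cases apart,
# because the individual row values differ.
per_row_a = sorted(hashlib.md5(v.encode("utf-8")).hexdigest() for v in ["xyzzy", "xyz"])
per_row_b = sorted(hashlib.md5(v.encode("utf-8")).hexdigest() for v in ["xyz", "zyxyz"])
print(per_row_a == per_row_b)
```

The first comparison is true because row boundaries are lost in the concatenation; the second is false because per-row digests preserve them.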

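xor works, but only if every row is distinct (e.g. at the very least, all columns together form a unique key). Identical rows cancel each other out under xor, so an even number of duplicates contributes nothing to the result.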
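
A small sketch of the difference between combining per-row MD5s with xor and with a sum. The helper names (`row_md5`, `xor_agg`, `sum_agg`) are hypothetical, and the sum is assumed to wrap at 128 bits; neither detail comes from the thread.

```python
import hashlib

MASK = (1 << 128) - 1  # MD5 digests are 128 bits wide

def row_md5(value):
    """Per-row MD5 digest as an unsigned 128-bit integer."""
    return int.from_bytes(hashlib.md5(value.encode("utf-8")).digest(), "big")

def xor_agg(rows):
    """Combine per-row digests with xor (order-insensitive)."""
    acc = 0
    for value in rows:
        acc ^= row_md5(value)
    return acc

def sum_agg(rows):
    """Combine per-row digests with addition modulo 2**128 (assumed wrap-around)."""
    acc = 0
    for value in rows:
        acc = (acc + row_md5(value)) & MASK
    return acc

# Neither combination depends on row order.
print(xor_agg(["a", "b"]) == xor_agg(["b", "a"]))
print(sum_agg(["a", "b"]) == sum_agg(["b", "a"]))

# A duplicated row vanishes under xor, so these two row sets look identical.
print(xor_agg(["a", "a", "b"]) == xor_agg(["b"]))

# The sum still sees the duplicate.
print(sum_agg(["a", "a", "b"]) == sum_agg(["b"]))
```

Under these assumptions the first three comparisons are true and the last is false, which is why xor is only safe when the rows are known to be unique.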
