Re: cyclical redundancy checksum algorithm(s)? - Mailing list pgsql-general

From Karen Hill
Subject Re: cyclical redundancy checksum algorithm(s)?
Date
Msg-id 1159390971.380102.283840@i3g2000cwc.googlegroups.com
In response to cyclical redundancy checksum algorithm(s)?  ("Karen Hill" <karen_hill22@yahoo.com>)
Responses Re: cyclical redundancy checksum algorithm(s)?  (Ron Johnson <ron.l.johnson@cox.net>)
Re: cyclical redundancy checksum algorithm(s)?  ("Marshall" <marshall.spight@gmail.com>)
List pgsql-general
Gene Wirchenko wrote:

> >I just finished reading one of Ralph Kimball's books.  In it he
> >mentions something called a cyclical redundancy checksum (crc)
> >function.  A crc function is a hash function that generates a checksum.
> >
> >I am wondering a few things.  A crc function would be extremely useful
> >and time saving in determining if a row needs to be updated or not (are
> >the values the same, if yes don't update, if not update).  In fact
> >Ralph Kimball states that this is a way to check for changes.  You just
> >have an extra column for the crc checksum.  When you go to update data,
> >generate a crc checksum and compare it to the one in the crc column.
> >If they are same, your data has not changed.
> >
> >Yet what happens if there is a collision of the checksum for a row?
>
>      Then you get told that no change has occurred when one has.  I
> would call this an error.

That's exactly what I thought when I read that in his book.  I was
thinking back to the sha1 and md5 algorithms; maybe a special crc
algorithm is safe from collisions.
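
For a sense of scale, here is a rough Python sketch (not from Kimball's book) of how often a checksum comparison would miss a real change. It assumes the checksum behaves like a uniformly random function of the row, which is only an approximation; the function name expected_missed_changes and the sample row strings are made up for illustration. No fixed-width checksum can be free of collisions, since there are more possible rows than checksum values. A wider checksum only makes a miss rarer.

```python
import hashlib
import zlib


def expected_missed_changes(changed_updates: int, bits: int) -> float:
    """Expected number of real changes that go unnoticed.

    Assumption (hypothetical): the checksum is uniformly distributed, so a
    changed row keeps the same checksum with probability 2**-bits.
    """
    return changed_updates / 2 ** bits


# Two versions of the same illustrative row, differing in one character.
row_old = "42|Karen|2006-09-27"
row_new = "42|Karen|2006-09-28"

# zlib.crc32 returns an unsigned 32-bit integer in Python 3.
crc_old = zlib.crc32(row_old.encode("utf-8"))
crc_new = zlib.crc32(row_new.encode("utf-8"))

# md5 yields 128 bits, so an accidental match is far less likely.
md5_old = hashlib.md5(row_old.encode("utf-8")).hexdigest()
md5_new = hashlib.md5(row_new.encode("utf-8")).hexdigest()

print(crc_old != crc_new, md5_old != md5_new)
print(expected_missed_changes(10**9, 32))   # roughly 0.23 for a 32-bit crc
print(expected_missed_changes(10**9, 128))  # vanishingly small for md5
```

Under that assumption, a 32-bit crc would be expected to miss a change about once in every four billion changed rows, which may or may not be acceptable depending on the data.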

> >Ralph Kimball did not mention which algorithm to use, nor how to create
> >a crc function that would not have collisions.  He does have a PhD
> >and is a leader in the OLAP data warehouse world, so I assume there is a
> >good solution.
>
>      Your error.  Having a Ph.D. does not stop someone from being
> wrong.

> >Is there a crc function in postgresql?  If not what algorithm would I
> >need to use to create one in pl/pgsql?
>
>      I think you are focusing on irrelevant minutiae.  Is the
> performance really so bad that you have to go to odd lengths to improve it?

It is not for performance.  It is to save time writing a lot of stored
procedure code.  When you have an updatable view with 70 values that
need to be checked for changes, a checksum starts to sound very
appealing.
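
To make the idea concrete, here is a minimal Python sketch of comparing one stored checksum instead of 70 individual values. It uses md5 rather than a crc, to make collisions less likely. The helper names row_checksum and needs_update, the length-prefix encoding, and the sample data are all assumptions made for illustration, not an established recipe.

```python
import hashlib


def row_checksum(values) -> str:
    """md5 hex digest over all column values of one row.

    Each value is length-prefixed so that ('ab', 'c') and ('a', 'bc') hash
    differently, and None (SQL NULL) is encoded apart from the empty string.
    """
    h = hashlib.md5()
    for v in values:
        if v is None:
            h.update(b"N;")
        else:
            data = str(v).encode("utf-8")
            h.update(b"V%d:" % len(data))
            h.update(data)
    return h.hexdigest()


def needs_update(stored_checksum: str, incoming_values) -> bool:
    """True when the incoming row hashes differently from the stored one."""
    return stored_checksum != row_checksum(incoming_values)


# Illustrative 70-column row: the checksum is stored alongside the row.
current_row = ["Karen", 42, None] + ["x"] * 67
stored = row_checksum(current_row)

unchanged = list(current_row)
changed = list(current_row)
changed[1] = 43

print(needs_update(stored, unchanged))  # False
print(needs_update(stored, changed))
```

The same comparison could be done inside the database with PostgreSQL's built-in md5() function applied to the concatenated column values, with the same care taken over separators and NULLs.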

