Gene Wirchenko wrote:
> >I just finished reading one of Ralph Kimball's books. In it he
> >mentions something called a cyclic redundancy check (crc)
> >function. A crc function is a hash function that generates a checksum.
> >
> >I am wondering a few things. A crc function would be extremely useful
> >and time saving in determining if a row needs to be updated or not (are
> >the values the same, if yes don't update, if not update). In fact
> >Ralph Kimball states that this is a way to check for changes. You just
> >have an extra column for the crc checksum. When you go to update data,
> >generate a crc checksum and compare it to the one in the crc column.
> >If they are the same, your data has not changed.
> >
> >Yet what happens if there is a collision of the checksum for a row?
>
> Then you get told that no change has occurred when one has. I
> would call this an error.
That's exactly what I thought when I read that in his book. I was
thinking back to the SHA-1 and MD5 algorithms; maybe a special crc
algorithm is safe from this.
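
For a sense of scale, here is a rough Python sketch of the odds. It assumes the checksum behaves like an idealized uniform hash, which a real crc only approximates, and the function names are made up for illustration. No fixed-width checksum can be free of collisions, since there are more possible rows than checksum values; a wider checksum only makes a missed change less likely.

```python
from fractions import Fraction


def missed_change_probability(bits: int) -> Fraction:
    """Chance that a changed row produces the same checksum as the old row.

    Assumption: the checksum is an idealized uniform hash of `bits` bits.
    """
    return Fraction(1, 2 ** bits)


def expected_missed_changes(changed_rows: int, bits: int) -> float:
    """Expected number of changed rows wrongly reported as unchanged."""
    return float(changed_rows * missed_change_probability(bits))


if __name__ == "__main__":
    # Checksum widths: crc32 = 32 bits, md5 = 128 bits, sha1 = 160 bits.
    for name, bits in [("crc32", 32), ("md5", 128), ("sha1", 160)]:
        print(name, expected_missed_changes(1_000_000_000, bits))
```

Under that assumption a 32-bit crc misses about one change in every 2**32 changed rows, so the risk is small but not zero, and it shrinks quickly with a wider hash.
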
> >Ralph Kimball did not mention which algorithm to use, nor how to create
> >a crc function that would not have collisions. He does have a PhD
> >and is a leader in the OLAP data warehouse world, so I assume there
> >is a good solution.
>
> Your error. Having a Ph.D. does not stop someone from being
> wrong.
> >Is there a crc function in postgresql? If not what algorithm would I
> >need to use to create one in pl/pgsql?
>
> I think you are focusing on irrelevant minutiae. Is the
> performance really so bad that you have to go to odd lengths to up it?
It is not for performance. It is to save time writing a lot of stored
procedure code. When you have an updatable view with 70 values that
need to be checked for changes, a checksum starts to sound very
appealing.
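
To make the idea concrete, here is a minimal Python sketch of the checksum-column approach. The helper names (serialize_row, row_crc32, row_md5, needs_update) are hypothetical, and the serialization format is my own assumption, not something from Kimball's book. Each value is length-prefixed so that ("ab", "c") and ("a", "bc") do not serialize to the same bytes, and NULL is kept distinct from the empty string.

```python
import hashlib
import zlib


def serialize_row(values):
    """Turn a row into bytes in an unambiguous way.

    Each value is written as "<length>:<text>;" and NULL (None) as "N;",
    so neighbouring columns cannot run together.
    """
    parts = []
    for v in values:
        if v is None:
            parts.append("N;")
        else:
            s = str(v)
            parts.append(f"{len(s)}:{s};")
    return "".join(parts).encode("utf-8")


def row_crc32(values):
    """32-bit crc of the row; cheap, but collisions are plausible."""
    return zlib.crc32(serialize_row(values))


def row_md5(values):
    """128-bit MD5 of the row as hex; far fewer accidental collisions."""
    return hashlib.md5(serialize_row(values)).hexdigest()


def needs_update(stored_checksum, new_values, checksum=row_md5):
    """True when the incoming row's checksum differs from the stored one."""
    return checksum(new_values) != stored_checksum


if __name__ == "__main__":
    old_row = ["Smith", None, 42]
    stored = row_md5(old_row)
    print(needs_update(stored, ["Smith", None, 42]))
    print(needs_update(stored, ["Smith", "", 42]))
```

This replaces 70 column comparisons with one, at the cost of a small chance that a real change goes unnoticed. Note that str() makes the number 1 and the text "1" look alike, which is harmless only if each column keeps a fixed type. PostgreSQL has a built-in md5() function, so something similar could probably be done on the server side.
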