I just finished reading one of Ralph Kimball's books. In it he
mentions something called a cyclic redundancy check (CRC)
function. A CRC function is a hash function that generates a checksum.
I am wondering about a few things. A CRC function would be extremely useful
and time-saving for determining whether a row needs to be updated (if
the values are the same, don't update; if not, update). In fact,
Ralph Kimball states that this is a way to check for changes. You just
add an extra column for the CRC checksum. When you go to update the data,
generate a CRC checksum for the incoming row and compare it to the one
in the CRC column. If they are the same, your data has not changed.
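
To make sure I understand the idea, here is a rough sketch of the comparison as I picture it, written in Python rather than pl/pgsql. It assumes CRC-32 (via Python's zlib module) as the algorithm, which the book does not specify, and the function names and sample values are my own made-up illustrations.

```python
import zlib

# Assumption: CRC-32 is used; the book does not name an algorithm.
# row_crc and needs_update are hypothetical helper names.

def row_crc(values):
    """Return a CRC-32 checksum computed over a row's column values."""
    # Join with a unit-separator character so that ("ab", "c") and
    # ("a", "bc") do not serialize to the same string.
    # Note: None and the empty string are treated alike here.
    payload = "\x1f".join("" if v is None else str(v) for v in values)
    return zlib.crc32(payload.encode("utf-8"))

def needs_update(stored_crc, incoming_values):
    """True when the incoming row's checksum differs from the stored one."""
    return row_crc(incoming_values) != stored_crc

# Checksum that would be kept in the extra crc column.
stored = row_crc(["Smith", "Austin", 42])

same_row = needs_update(stored, ["Smith", "Austin", 42])     # identical values
changed_row = needs_update(stored, ["Smith", "Austin", 43])  # one value differs
```

With this sketch, a matching checksum only means the row is very probably unchanged, since two different rows can still map to the same 32-bit value.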
Yet what happens if two different versions of a row produce the same
checksum (a collision)? Ralph Kimball did not mention which algorithm
to use, nor how to create a CRC function that would not have collisions.
He has a PhD and is a leader in the OLAP data warehouse world, so I
assume there is a good solution.
Is there a CRC function in PostgreSQL? If not, what algorithm would I
need to use to create one in PL/pgSQL?
regards,
karen