Thread: How to properly hash a row or set of columns.

How to properly hash a row or set of columns.

From
Klaudie Willis
Date:
Hashing a row or a set of columns is useful in some circumstances where you need to compare the row with, for instance, incoming data. This is most relevant when you do not control the source data yourself; otherwise you would usually solve it by other means.

Still, it would be great if you could do it like this:
create table t (
  a text,
  b text,
  hashid varchar not NULL GENERATED ALWAYS AS (sha256(row(a,b))) stored
);

But row is not immutable, so you are not allowed to do this. Instead, you need to start concatenating columns. To do that correctly, you also need separator symbols between the columns, which then need to be escaped in the individual column values, and you have to handle NULL properly as well. The first example would handle all of this, if only row were immutable.
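
For reference, the manual concatenation described above could be sketched roughly as follows. This is only an illustration, not a tested recommendation. It length-prefixes each value instead of escaping a separator, encodes NULL as a bare '-', and uses md5() rather than sha256() because md5() accepts text directly. All three choices are assumptions of this sketch.

```sql
-- Sketch only: each non-NULL value is written as <length>:<value>,
-- and NULL is written as '-', so no separator needs escaping.
create table t (
  a text,
  b text,
  -- md5() is swapped in here because it takes text; it returns a hex string.
  hashid text not null generated always as (
    md5(
      coalesce(length(a)::text || ':' || a, '-') ||
      coalesce(length(b)::text || ':' || b, '-')
    )
  ) stored
);
```

With this encoding, ('ab', 'c') is hashed as 2:ab1:c and ('a', 'bc') as 1:a2:bc, while (NULL, '') is hashed as -0: and ('', NULL) as 0:-, so shifted column boundaries and NULL versus empty string do not collide before hashing.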

Is there a better way of doing this? Is there a simple way to create my own row2() constructor that IS immutable?

best regards
Klaudie



Re: How to properly hash a row or set of columns.

From
Klaudie Willis
Date:
A minor correction, casting the row a little:
create table t (
  a text,
  b text,
  hashid varchar not NULL GENERATED ALWAYS AS (sha256(row(a,b)::text::bytea)) stored
);
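
One possible workaround, sketched under assumptions: wrap the expression in an SQL function that is declared IMMUTABLE and call that function from the generated column. The name row_hash is hypothetical, and hashid is typed bytea here because sha256() returns bytea. Declaring the function IMMUTABLE is a promise that PostgreSQL does not verify. It seems reasonable for text columns, but the text form of other types (timestamps, for example) can depend on session settings, in which case the stored hash would not be reproducible.

```sql
-- Hypothetical helper: hashes the text form of row(a, b).
-- The IMMUTABLE label is our own assertion, not something the server checks.
create function row_hash(a text, b text) returns bytea
language sql immutable
as $$
  select sha256(row(a, b)::text::bytea)
$$;

create table t (
  a text,
  b text,
  -- The generated column only sees a call to a function declared immutable.
  hashid bytea not null generated always as (row_hash(a, b)) stored
);
```

The row text form already quotes and escapes commas, parentheses, quotes, and backslashes, and it writes NULL as an empty field but an empty string as "", so the function body needs no manual separator handling.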



‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Tuesday, January 18th, 2022 at 14:34, Klaudie Willis <Klaudie.Willis@protonmail.com> wrote: