Enforce Symmetric Matrix - Mailing list pgsql-general

From Randy Westlund
Subject Enforce Symmetric Matrix
Date
Msg-id 20140507203618.GC20967@legion
Whole thread Raw
Responses Re: Enforce Symmetric Matrix  (David G Johnston <david.g.johnston@gmail.com>)
List pgsql-general
I have a table with N (over 35k) rows.  I need to compare each of
those elements to every other element in the list, which is the
handshake problem.

This results in a symmetric matrix (an adjacency matrix for an
undirected graph).  Because all relations are symmetric, I only need to
store the upper triangle of this matrix.  A ~ B and B ~ A are the same.

I need some advice on how to implement this.  At first, I thought that
I'd only store the upper triangle, resulting in N^2 / 2 rows of the
form:

id1   id2   value
---- ----   -----
A     B      1
A     C      2
A     D      3
B     C      4
B     D      5
C     D      6

Where value is the result of an expensive computation, and with the
constraint that id1 < id2.

But there are problems.  To get a list of all things compared to B, I
have to look in both columns.  Complicated queries with selects nested
inside updates become very difficult or impossible to do in one go.  I'd
also need to index both columns, which seems like a waste.

So I thought maybe I'd just store the full matrix, with A,B and B,A
having the same data

id1   id2   value
---- ----   -----
A     B       1
A     C       2
A     D       3
B     A       1
B     C       4
B     D       5
C     A       2
C     B       4
C     D       6

With the constraint that id1 != id2 because I don't need the diagonal.

The table is twice as big, but still O(n^2).  This would let me get the
list of all things compared to B with SELECT ... WHERE id1 = B, which is
super easy.

My problem here is that I'm not sure how to enforce the rule that A,B
and B,A have the same value.  I want to use a rule or trigger such that
when row A,B is updated or inserted, row B,A is updated or inserted with
the same date.  But then it would get called recursively, and that
doesn't work.

For my application, the total number of items I have (N) will be growing
over time and this table will need to be updated accordingly.

This seems like a problem that many people must have come across before,
but I've been strangely unable to find advice online.  Any
recommendations?

Attachment

pgsql-general by date:

Previous
From: Oleg Bartunov
Date:
Subject: Re: Full text: Ispell dictionary
Next
From: "Andrus"
Date:
Subject: Re: How to fix lost synchronization with server