[SQL] death of array? - Mailing list pgsql-sql

From Rob Sargent
Subject [SQL] death of array?
Date
Msg-id 73B8E1AF-BD4A-4E6A-B192-8394D59EF47C@gmail.com
Whole thread Raw
Responses Re: [SQL] death of array?  ("David G. Johnston" <david.g.johnston@gmail.com>)
List pgsql-sql
I believe I have an appropriate use[1] for an array column, but I’m having a hard time using that array in a join
clause.
The SQL question is how to use an array value in the join clause?  I’m using postgres 9.6 on ubuntu 16.04


create table probandset (id UUID, probands UUID[])
create table segment(id uuid, chr int, sbp int, epb int, probandset_id)
create table people_member(people_id uuid, person_id uuid)
create table people(id uuid, name text)
create table person(id uuid, name text)

probandset.probands is a set of person.id

I need to gather all segments whose probandset is within in a specified people.
select s.* from segment s join probandset ps on s.probandset_id = ps.id
--PROBLEM: WOULD LIKE SOMETHING BETTER THAN THE FOLLOWING: join (select id, unnest(probands) as proband from probandset
asl) as pu on s.probandset_id = pu.id join people_member pm on pu.proband = pm.person_id join people pl on pm.people_id
=pl.id 
where pl.name = ‘target population name’

The query I have works (showing only half of it here) and I’m not pushing the performance of it too much right now as
I’mmore interested in the SQL problem. However, I am getting a seq scan on people_member, not surprisingly.
People_memberwill be blocks of people, 50 to 1000 per block and each block loaded in a single transaction, no editing:
shouldthis be clustered (and reclustered)? Over time would the seq scan go away as people_member.people_id becomes more
discriminating?(There is an index on people_member.people_id). A people has a know set of probands (and we need all
subsetsof those probands) 

Current discussions at our end on whether or not a probandset may be filled with members of more than one people.  If
notI might be able to add people_id to probandset and I am home free. 
But that still doesn’t answer the SQL question.

Thanks for reading, sorry it’s a tad wordy.
rjs

[1] I’m dealing with power sets. A given set of N element has 2**N subsets and the number of subset grows exponentially
(obviously). But to model this ‘normally’ would require an even larger number of subset member records. (That summation
leftto the student[2]).  My choice was to list each subset once with and array of elements.  Still an exponential
problem,but only one exponential problem. 
It all breaks down somewhere between 20 and 30 elements but we'll burn that bridge when we get there.
[2] Something like the sum over i=0..N of ((N choose i) * i maybe?


pgsql-sql by date:

Previous
From: Adrian Klaver
Date:
Subject: Re: Crosstab function
Next
From: "David G. Johnston"
Date:
Subject: Re: [SQL] death of array?