Set Operators and Arrays Question - Mailing list pgsql-general

From Geoff Russell
Subject Set Operators and Arrays Question
Date
Msg-id Pine.GSO.4.05.10102280857000.16678-100000@slayer
Whole thread Raw
Responses Re: Set Operators and Arrays Question  (Peter Eisentraut <peter_e@gmx.net>)
Re: Set Operators and Arrays Question  (Stephan Szabo <sszabo@megazone23.bigpanda.com>)
List pgsql-general
Hello readers and writers,

I have a concrete problem which I think is similar to a large class of
problems. I want to propose a better solution for my concrete problem,
which, if implemented, would be generally useful. Then again someone
may come up with a better method again...

My concrete problem is with keeping track of fundraising letters at
the RSPCA where I do volunteer programing. There are 100,000 people on
the mailing list, and each Fundraising letter or reminder has an id and
I need to keep track of who has got what so that people don't get too many
letters, don't get reminders for fundraising appeals they weren't involved in,
etc.

For example there are (over say 5 years) perhaps 200 letters and each person
receives a very small subset of these.

       Person| Letters received
        John : 3,35,45,27
        Mary : 1,35,4,
        Ian  : 1,35,4,12,45,63
        ...

Current Solution:

My current tables look (in simplified form) like:

        create table member       (  member_id serial, ...);
        create table mailing      (  mailing_id serial, ... );
        create table memb_mailing (  member_id int,mailing_id int);

This all works fine and I can define indexes to speed up queries like
"tell me all the people who got mailing 23?", "did person X get mailing
Y?". But I'm not quite happy. Its a bit slow and takes up more disk than
I think it should.


2nd attempt:

        create table member       (  member_id serial, ...);
        create table mailing      (  mailing_id serial, ...);
        create table memb_mailing (  member_id int,mailings bool[]);

Here each of the members of the mailings boolean array represents one letter.
You initialise it with '{f,f,f,f,f.....}'. A disadvantage is that you need
to guestimate the largest letter number to build the initial string.

On the other hand you can do great things like:

        update memb_mailing set mailings[15]='t'
             where member_id=templist.member_id;

where you have previously built a temporary table templist of members.

This is little better. It uses a bit less disk, and answers queries a little
faster (even without indexes).

Wish List Solution:

Heres what I think would be really the best way to solve the problem.

        create table member       (  member_id serial, data text);
        create table mailing      (  mailing_id serial, name text);
        create table memb_mailing (  member_id int,mailings int[]);

Here the initial mailings array is empty and I say things like:

        update memb_mailing add 27 into mailings
             where member_id=templist.member_id;

This just inserts the letter number 27 into the mailings array. And we could
also say things like:

        select M.member_id from member M,memb_mailing MM
             where M.member_id=MM.member_id and
             MM.mailings contains 27;

Summary:

I'm really just suggesting a couple of set operators for arrays.

Perhaps there is already a better way of doing this? I'm open to suggestions.

Cheers
Geoff

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
6 Fifth Ave           +618-8332-5069 (Home)          |
St Morris SA 5068     +618-8207-2029 (Work)          | geoff@austrics.com.au
Adelaide, AUSTRALIA   +618-8364-1543 (Fax-Home)      |


pgsql-general by date:

Previous
From: Brian Kurt Fujikawa
Date:
Subject: insert into table from C/C++ application
Next
From: Peter Eisentraut
Date:
Subject: Re: Postgres eats up memory when using cursors