On Mon, 2006-06-12 at 21:58 -0500, Aaron Bono wrote:
> I agree with Tom. Personally I cannot think of a time I would use an
> array column over a child table. Maybe someone can enlighten me on
> when an array column would be a good choice.
Arrays are a good choice when the data comes naturally segmented.
I regularly store sentences or sentence fragments as an array of words
when I don't care about whitespace (whitespace should be regenerated on
output). Consider meta-keywords for a webpage, for example.
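A minimal sketch of what that might look like (the table and column
names here are made up for illustration):

    CREATE TABLE webpage (
        id       serial PRIMARY KEY,
        url      text NOT NULL,
        keywords text[]   -- meta-keywords, one word per element
    );

    INSERT INTO webpage (url, keywords)
    VALUES ('http://example.com/', ARRAY['postgres', 'arrays', 'faq']);

    -- find pages tagged with a given keyword
    SELECT url FROM webpage WHERE 'arrays' = ANY (keywords);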
This is a similar process to removing all formatting from phone numbers
before storage and reformatting them for display to the user:
(123)456-7890 might be stored as '1234567890'.
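In SQL the stripping can be done with regexp_replace (available in
recent PostgreSQL releases):

    -- keep only the digits; '(123)456-7890' becomes '1234567890'
    SELECT regexp_replace('(123)456-7890', '[^0-9]', '', 'g');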
"A\tcat in the\nhat" might be stored as ARRAY['A', 'cat', 'in', 'the',
'hat'].
Comparisons and uniqueness checks become much easier to deal with if
you remove such garbage from the data prior to storage.
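For example, once normalized, two arrays compare equal no matter how
the original text was formatted, and the whitespace is trivially
regenerated on output with array_to_string:

    -- equality is insensitive to the original formatting
    SELECT ARRAY['A','cat','in','the','hat']
         = ARRAY['A','cat','in','the','hat'];      -- true

    -- regenerate the whitespace for display
    SELECT array_to_string(ARRAY['A','cat','in','the','hat'], ' ');
    -- 'A cat in the hat'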
> On 6/12/06, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> "Phillip Smith" <phillips@weatherbeeta.com.au> writes:
> > The whole sys file is variable length records like this - they range
> > from 1 to over 17,000 fields per record.
>
> 17000? I think you really need to rethink your schema. While you
> could theoretically drop 17000 elements into a PG array column, you
> wouldn't like the performance --- it'd be almost unsearchable for
> instance.
>
> I'd think about two tables, one with a single row for each SYS record
> from the original, and one with one row for each detail item (the
> invoice numbers in this case). With suitable indexes and a foreign key
> constraint, this will perform a lot better than an array-based
> translation.
>
> And no, in neither case will you be able to import that file without
> massaging it first.
>
> regards, tom lane
>
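For what it's worth, the two-table layout Tom describes might look
something like this (names invented, just a sketch):

    CREATE TABLE sys_record (
        sys_id serial PRIMARY KEY
        -- ... plus the fixed, per-record fields from the SYS file ...
    );

    CREATE TABLE sys_detail (
        sys_id     integer NOT NULL REFERENCES sys_record,
        invoice_no text    NOT NULL
    );

    CREATE INDEX sys_detail_sys_id_idx  ON sys_detail (sys_id);
    CREATE INDEX sys_detail_invoice_idx ON sys_detail (invoice_no);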