Thread: Design question

Design question

From
Mike Diehl
Date:
Hi all,

I've got a design question that I need to ask before I go too far down what
might be the wrong road.

I've got a customer, who has multiple customers, who need to be able to upload
an excel spreadsheet into Postgres.  Then they want to be able to slice and
dice that data.

The problem is that probably none of these spreadsheets will have the same
fields in them.

There are two ways to do this, that I can think of...

1.  Create a table for each spreadsheet, using column headings as field names.
Every field would be a char/varchar.  We might have a table to track which
client owns which table.  This could amount to 10's of tables being added to
the db.

2.  Create a table in which we store individual cells and associate them with
an owner.  Then each client would essentially have one (huge?) table that
they can work with.

Design #1 is easy to implement, but might make management more difficult.
Design #2 is easy to manage, but the SQL needed to generate reports would
be "tricky."  I'm intending to provide a report generator, so the complexity
of the reporting SQL can be mitigated.

So, which road should I travel down?

TIA,

--
Mike Diehl

Re: Design question

From
James Strater
Date:
Have you considered one large table with all of the columns from the various spreadsheets, then a separate view  for each customer?

----- Original Message ----
From: Mike Diehl <mdiehl@diehlnet.com>
To: pgsql-general@postgresql.org
Sent: Wednesday, September 17, 2008 12:29:15 PM
Subject: [GENERAL] Design question

Hi all,

I've got a design question that I need to ask before I go too far down what
might be the wrong road.

I've got a customer, who has multiple customers, who need to be able to upload
an excel spreadsheet into Postgres.  Then they want to be able to slice and
dice that data.

The problem is that probably none of these spreadsheets will have the same
fields in them.

There are two ways to do this, that I can think of...

1.  Create a table for each spreadsheet, using column headings as field names. 
Every field would be a char/varchar.  We might have a table to track which
client owns which table.  This could amount to 10's of tables being added to
the db.

2.  Create a table in which we store individual cells and associate them with
an owner.  Then each client would essentially have one (huge?) table that
they can work with.

Design #1 is easy to implement, but might make management more difficult. 
Design #2 is easy to manage, but the SQL needed to generate reports would
be "tricky."  I'm intending to provide a report generator, so the complexity
of the reporting SQL can be mitigated.

So, which road should I travel down?

TIA,

--
Mike Diehl

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: Design question

From
"Scott Marlowe"
Date:
On Wed, Sep 17, 2008 at 11:29 AM, Mike Diehl <mdiehl@diehlnet.com> wrote:
> Hi all,
>
> I've got a design question that I need to ask before I go too far down what
> might be the wrong road.
>
> I've got a customer, who has multiple customers, who need to be able to upload
> an excel spreadsheet into Postgres.  Then they want to be able to slice and
> dice that data.
>
> The problem is that probably none of these spreadsheets will have the same
> fields in them.
>
> There are two ways to do this, that I can think of...
>
> 1.  Create a table for each spreadsheet, using column headings as field names.
> Every field would be a char/varchar.  We might have a table to track which
> client owns which table.  This could amount to 10's of tables being added to
> the db.

If you choose this method, you might want to split out customers by
schema, to make it easier to manage their tables.  Then, all you have
to do is either prefix the customer name in front of the table
reference or set it in your search_path after connecting.

> 2.  Create a table in which we store individual cells and associate them with
> an owner.  Then each client would essentially have one (huge?) table that
> they can work with.

This is basically going to be an EAV (entity, attribute, value) type
setup.  They are notoriously hard to write useful queries against, and
generally a poor performer.

I'd go with option 1 myself.

Re: Design question

From
Richard Huxton
Date:
Mike Diehl wrote:
> 1.  Create a table for each spreadsheet, using column headings as field names.
> Every field would be a char/varchar.  We might have a table to track which
> client owns which table.  This could amount to 10's of tables being added to
> the db.

Give each client their own schema. Set permissions on the whole schema.
You should be fine with a few dozen or a few hundred tables. You can
also dump/restore a whole schema then.

--
  Richard Huxton
  Archonet Ltd