Thread: One big table or several smaller tables?

One big table or several smaller tables?

From: Aarni Ruuhimäki / Megative Tmi / KYMI.com
Date:
Hi!

I have a general design question:

We plan to collect sales events/data from approximately 50 outlets (shops).

We are expecting 200,000 events annually.

Would it be wiser to have a table for each outlet or one big table for
everything? (A dozen fields or so in either case.)

I.e., which one is faster to pull reports from? E.g., after five years, if we
want to know the number of refrigerators sold between May and August for each
year and for each outlet, is it better to go through one table with a million
rows or 50 tables with 20,000 rows each?

'Query-wise' it would be much simpler to have one big table, but what about
the performance?

TIA, and keep on keeping on!

aarni

Re: One big table or several smaller tables?

From: Jason Earl
Date:
Aarni Ruuhimäki / Megative Tmi / KYMI.com <aarni.ruuhimaki@kymi.com> writes:

> Hi !
>
> I have a general design question:
>
> We plan to collect sales events / data from approx. 50 outlets
> (shops).
>
> We are expecting 200 000 events annually.
>
> Would it be wiser to have a table for each outlet or one big table
> for everything ? (a dozen fields or so in both cases)
>
> I.e. which one is faster to pull reports from ? E.g. after five
> years, if we want to know the number of refrigerators sold between
> may and august for each year and for each outlet, go through one
> table with million rows or 50 tables with 20 000 rows ?
>
> 'Query-wise' it would be much simpler to have one big table, but how
> about the performance ?
>
> TIA and keep on keeping on !
>
> aarni

My suggestion would be to put all of the data in one table, especially
since that would make the system easier to query.  Believe it or not,
a million records isn't really that big a deal.  I have tables with
17 million rows, and I generally use cheap commodity hardware.

You can always create summary tables later if querying the entire
dataset isn't fast enough.  And keep in mind that querying 50
different tables will require 50 different queries.  It is better to
have one query with one big table scan than fifty queries that each
scan one of 50 smaller tables.
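With everything in one table, the refrigerator report from the original
question becomes a single grouped query instead of fifty.  A sketch,
assuming hypothetical table and column names (sales, outlet_id, product,
sold_at) that are not from the thread:

```sql
-- Refrigerators sold May through August, per year and per outlet.
-- Table and column names are illustrative.
SELECT outlet_id,
       EXTRACT(YEAR FROM sold_at) AS sale_year,
       COUNT(*)                   AS refrigerators_sold
FROM sales
WHERE product = 'refrigerator'
  AND EXTRACT(MONTH FROM sold_at) BETWEEN 5 AND 8
GROUP BY outlet_id, EXTRACT(YEAR FROM sold_at)
ORDER BY outlet_id, sale_year;
```

An index on (product, sold_at) would give the planner a chance to avoid a
full table scan for this kind of query.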

In either case, five years from now, when you finally have a million
records, you can always buy more powerful hardware.  Even now, you can
buy enough memory to cache your entire dataset in RAM for a
ridiculously small amount of money.  In five years, the computers you
can buy today will be toys.

Jason

Re: One big table or several smaller tables?

From: "Josh Berkus"
Date:
Aarni, Jason,

> My suggestion would be to put all of the data in one table.
> Especially if that would make the system easier to query.  Believe it
> or not but 1 million records isn't really that big of a deal.  I have
> tables with 17 million rows, and I generally use cheap commodity
> hardware.

You may want expert help or advice on performance tweaks for your
server, though.  The default settings in postgresql.conf are not
intended for a million-row table on a serious server.
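As a rough illustration of the kind of postgresql.conf changes Josh is
alluding to (the right parameters and values depend heavily on the
PostgreSQL version and the machine's RAM; these numbers are placeholders,
not recommendations):

```
# Fragment of postgresql.conf -- illustrative values only.
shared_buffers = 256MB        # default is far too small for a busy server
work_mem = 16MB               # per-sort/per-hash memory for large queries
effective_cache_size = 1GB    # tell the planner how much OS cache exists
```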

-Josh Berkus