Aarni Ruuhimäki / Megative Tmi / KYMI.com <aarni.ruuhimaki@kymi.com> writes:
> Hi !
>
> I have a general design question:
>
> We plan to collect sales events / data from approx. 50 outlets
> (shops).
>
> We are expecting 200 000 events annually.
>
> Would it be wiser to have a table for each outlet or one big table
> for everything ? (a dozen fields or so in both cases)
>
> I.e. which one is faster to pull reports from? E.g. after five
> years, if we want to know the number of refrigerators sold between
> May and August for each year and for each outlet, is it better to go
> through one table with a million rows or 50 tables with 20 000 rows each?
>
> 'Query-wise' it would be much simpler to have one big table, but how
> about the performance?
>
> TIA and keep on keeping on !
>
> aarni
My suggestion would be to put all of the data in one table, especially
since that makes the system easier to query. Believe it or not, a
million records isn't really that big a deal. I have tables with 17
million rows, and I generally use cheap commodity hardware.
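As a rough sketch of what the single-table layout might look like (the
column names here are just placeholders for your dozen or so fields):

    CREATE TABLE sales (
        sale_id    serial  PRIMARY KEY,
        outlet_id  integer NOT NULL,            -- which of the ~50 shops
        product    text    NOT NULL,            -- e.g. 'refrigerator'
        quantity   integer NOT NULL DEFAULT 1,
        sold_at    date    NOT NULL
        -- ...plus the rest of your fields
    );

    -- An index on (outlet_id, sold_at) keeps per-outlet, per-period
    -- reports from having to scan the whole table:
    CREATE INDEX sales_outlet_sold_at_idx ON sales (outlet_id, sold_at);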
You can always create summary tables later if querying the entire
dataset isn't fast enough. Keep in mind that with 50 separate tables,
every report means 50 separate queries (or a 50-way UNION). It is
better to have one query doing one big table scan than fifty queries
each doing a table scan over 50 smaller tables.
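For instance, the refrigerator report you describe collapses into a
single GROUP BY against the one table (again using the hypothetical
columns from the sketch above):

    SELECT outlet_id,
           extract(year FROM sold_at) AS sale_year,
           sum(quantity)              AS refrigerators_sold
      FROM sales
     WHERE product = 'refrigerator'
       AND extract(month FROM sold_at) BETWEEN 5 AND 8
     GROUP BY outlet_id, extract(year FROM sold_at)
     ORDER BY outlet_id, sale_year;

With per-outlet tables you would need to run essentially the same query
50 times, or stitch the tables together with UNION ALL.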
In either case, five years from now, when you finally have 1 million
records, you can always buy more powerful hardware. Even now, you can
buy enough memory to cache your entire dataset in RAM for a
ridiculously small amount of money. In five years the computers you
can buy now will be toys.
Jason