Re: One large v. many small - Mailing list pgsql-performance

From Chad Thompson
Subject Re: One large v. many small
Date
Msg-id 011301c2c897$d22719f0$32021aac@chad
In response to Re: One large v. many small  (Noah Silverman <noah@allresearch.com>)
List pgsql-performance

I have a database running on PostgreSQL w/ close to 7 million records in
one table and ~3 million in another, along w/ various smaller supportive
tables.
Before I started here everything was run out of small tables, one for each
client, similar (I think) to what you are doing now.
We submit ~50-100K records each week.  Before we moved to one table,
our company had no idea how it was doing on a daily, weekly or monthly
basis.  Now that we have moved to one large structure, new ideas about
reporting functions and added services we can give to our clients are popping
up all the time.

There are MANY benefits to following Josh's advice and putting all your
information in one table.  Beyond those already given, what if you wanted to
offer an added service where your clients are made aware of similar
postings by your other clients?  Running that kind of report would be a
nightmare in your current situation.
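To make that concrete, here is a minimal sketch (using SQLite in Python for illustration; the `postings` table and its column names are made up, not from the thread) of the kind of cross-client report that becomes a single query once everything lives in one table:

```python
import sqlite3

# Hypothetical schema: one consolidated "postings" table with a client_id
# column, as suggested in the thread.  Names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE postings (id INTEGER PRIMARY KEY, client_id TEXT, title TEXT)"
)
conn.executemany(
    "INSERT INTO postings (client_id, title) VALUES (?, ?)",
    [("acme", "Widget recall"),
     ("globex", "Widget recall"),
     ("acme", "Quarterly report")],
)

# Cross-client report: titles that appear for more than one client.
# With one table per client this would mean UNIONing every client table.
rows = conn.execute(
    """
    SELECT title, COUNT(DISTINCT client_id) AS n_clients
    FROM postings
    GROUP BY title
    HAVING n_clients > 1
    """
).fetchall()
print(rows)  # -> [('Widget recall', 2)]
```

With per-client tables, the same report would need a UNION (or a join) over every client's table, rebuilt each time a client is added.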

As far as performance goes, I am able to join these 2 tables along w/ others
and get the information, counts, etc. that I need, using some rather
complicated queries, in about 2-3 seconds per query.  While this may sound
awful, realize that I'm running on a standard workstation PIII 700, and for
the money, it's a dream!

More importantly you need to realize, as my coworkers now have, that
anything you can do w/ a small table, you can do w/ one big table and
an extra line in the where clause (e.g. WHERE client_id = 'blah').
PostgreSQL also has a wonderful community with many excellent DBAs who
are very supportive in helping solve a problem if you post your SQL.
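The point above can be sketched in a few lines (SQLite via Python here purely for illustration; the `records` table and its columns are hypothetical): the query you would have run against a small per-client table becomes the same query against the big table plus one predicate.

```python
import sqlite3

# Illustrative consolidated table; names are made up for this sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (client_id TEXT, payload TEXT)")
conn.executemany(
    "INSERT INTO records VALUES (?, ?)",
    [("blah", "a"), ("blah", "b"), ("other", "c")],
)

# Per-client view of the big table: just add WHERE client_id = 'blah'.
(count,) = conn.execute(
    "SELECT COUNT(*) FROM records WHERE client_id = ?", ("blah",)
).fetchone()
print(count)  # -> 2
```

In PostgreSQL an index on `client_id` (or a composite index led by it) keeps such per-client predicates cheap even as the table grows.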

I hope this helps make your decision.
Thanks
Chad


----- Original Message -----
From: "Noah Silverman" <noah@allresearch.com>
To: <pgsql-performance@postgresql.org>
Cc: <pgsql-performance@postgresql.org>
Sent: Thursday, January 30, 2003 11:24 AM
Subject: Re: [PERFORM] One large v. many small


> OK,
>
> Thanks for the quick responses.
>
> A bit more information.
>
> We are in the business of gathering data for our clients. (We're a news
> service).  Subsequently, we do a lot of inserting and very rarely do
> any deleting. (We periodically clear out results that are over 6 months
> old.)
>
> On a given day, we will insert around 100,000 records in total.
> (Currently split across all the client tables).
>
> A challenging part of the process is that we have to keep track of
> previous content that may be similar.  We CAN'T do this with a unique
> index (don't ask, it would take too long to explain, but trust me, it
> isn't possible).  So, we have to query the table first and then compare
> the results of that query to what we are inserting.  SO, we probably do
> close to 1 million queries, but then only make about 100,000 inserts.
> The basic flow is 1) our system finds something it likes, 2) query the
> table to see if something similar already exists, 3) if nothing similar
> exists, insert.
>
> While all this is going on, our clients are accessing our online
> reporting system.  This system makes a variety of count and record
> requests from the database.
>
> As I mentioned in our earlier post, we are attempting to decide if
> Postgres will run faster/better/ with one big table, or a bunch of
> smaller ones.  It really doesn't make much difference to us, we just
> want whatever structure will be faster.
>
> Thanks,
>
> -N
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org
>

