Re: One large v. many small - Mailing list pgsql-performance
| From | Chad Thompson |
|---|---|
| Subject | Re: One large v. many small |
| Date | |
| Msg-id | 011301c2c897$d22719f0$32021aac@chad |
| In response to | Re: One large v. many small (Noah Silverman <noah@allresearch.com>) |
| List | pgsql-performance |
I have a database running on PostgreSQL with close to 7 million records in one table and roughly 3 million in another, along with various smaller supporting tables. Before I started here, everything was run out of small tables, one for each client, similar (I think) to what you are doing now. We submit roughly 50-100K records each week. Before we moved to one table, our company had no idea how it was doing on a daily, weekly, or monthly basis. Now that we have moved to one large structure, new ideas about reporting functions and added services we can offer our clients are popping up all the time.

There are MANY benefits to following Josh's advice and putting all your information in one table. Beyond those already given: what if you wanted to offer an added service where your clients are made aware of similar postings by your other clients? Running that kind of report would be a nightmare in your current situation.

As far as performance goes, I am able to join these two tables along with others and get the information, counts, etc. that I need, using some rather complicated queries, in about 2-3 seconds per query. While this may sound awful, realize that I'm running on a standard workstation (PIII 700), and for the money, it's a dream! More importantly, you need to realize, as my coworkers now have, that anything you can do with a small table, you can do with one big table and an extra line in the WHERE clause (e.g. WHERE client_id = 'blah').

PostgreSQL has wonderful support and many excellent DBAs; if you post a SQL problem, they are very supportive in helping solve it.

I hope this helps with your decision.

Thanks
Chad

----- Original Message -----
From: "Noah Silverman" <noah@allresearch.com>
To: <pgsql-performance@postgresql.org>
Cc: <pgsql-performance@postgresql.org>
Sent: Thursday, January 30, 2003 11:24 AM
Subject: Re: [PERFORM] One large v. many small

> OK,
>
> Thanks for the quick responses.
>
> A bit more information.
> We are in the business of gathering data for our clients. (We're a news
> service.) Subsequently, we do a lot of inserting and very rarely do any
> deleting. (We periodically clear out results that are over 6 months old.)
>
> On a given day, we will insert around 100,000 records in total.
> (Currently split across all the client tables.)
>
> A challenging part of the process is that we have to keep track of
> previous content that may be similar. We CAN'T do this with a unique
> index (don't ask, it would take too long to explain, but trust me, it
> isn't possible). So, we have to query the table first and then compare
> the results of that query to what we are inserting. So, we probably do
> close to 1 million queries, but then only make about 100,000 inserts.
> The basic flow is: 1) our system finds something it likes, 2) query the
> table to see if something similar already exists, 3) if nothing similar
> exists, insert.
>
> While all this is going on, our clients are accessing our online
> reporting system. This system makes a variety of count and record
> requests from the database.
>
> As I mentioned in our earlier post, we are attempting to decide whether
> Postgres will run faster/better with one big table or a bunch of
> smaller ones. It really doesn't make much difference to us; we just
> want whatever structure will be faster.
>
> Thanks,
>
> -N
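The two ideas in this thread — one shared table filtered by a client_id column, and Noah's query-for-similar-then-insert flow — can be sketched together. This is a minimal illustration using Python's sqlite3 stand-in for Postgres; the `records` schema, the `insert_if_new` helper, and the exact-title notion of "similar" are all hypothetical, not from the thread:

```python
import sqlite3

# Hypothetical schema: one shared "records" table with a client_id
# column, replacing the one-table-per-client layout discussed above.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE records (
        id        INTEGER PRIMARY KEY,
        client_id TEXT NOT NULL,
        title     TEXT NOT NULL
    )
""")
# Index on client_id so per-client queries stay fast in the big table.
conn.execute("CREATE INDEX idx_records_client ON records (client_id)")

def insert_if_new(client_id, title):
    """Noah's flow: query for similar content first, insert only
    when nothing comparable exists. 'Similar' is reduced to an
    exact title match here purely for illustration."""
    cur = conn.execute(
        "SELECT 1 FROM records WHERE client_id = ? AND title = ?",
        (client_id, title),
    )
    if cur.fetchone() is None:
        conn.execute(
            "INSERT INTO records (client_id, title) VALUES (?, ?)",
            (client_id, title),
        )
        return True
    return False

insert_if_new("acme", "Widget prices rise")
inserted_again = insert_if_new("acme", "Widget prices rise")  # duplicate, skipped

# Chad's point: per-client reporting is one WHERE clause, not one table.
count = conn.execute(
    "SELECT COUNT(*) FROM records WHERE client_id = ?", ("acme",)
).fetchone()[0]
```

Any per-client report in the old layout becomes the same query against `records` with `WHERE client_id = ?` appended, while cross-client reports (such as the similar-postings service mentioned above) need no change at all.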