Re: pgsql and large tables - Mailing list pgsql-general
From | Francisco Reyes |
---|---|
Subject | Re: pgsql and large tables |
Date | |
Msg-id | 20011115115301.C60187-100000@zoraida.natserv.net |
In response to | pgsql and large tables ("Gurupartap Davis" <partap@yahoo.com>) |
List | pgsql-general |
On Tue, 13 Nov 2001, Gurupartap Davis wrote:

I am somewhat new to PostgreSQL, but since I didn't see a reply to your
message, I will tell you what I know about your questions.

> I've got a large database, currently about 40 million rows in the
> biggest table.
...
> I initially created the table with a primary key (5 fields: char(4),
> varchar(32), date, int, int)

Out of curiosity, why are you using this as your primary key and not a
serial? Will your tables be normalized?

> and a couple additional indexes, one that just changes the order of the
> primary key, and one that sorts on the date.

From what I have read so far, your primary key doesn't sound like what
you would want to have as a primary key... but I guess the design of
your table is a totally different issue.

> Then I imported 40 million rows and tried some queries, but discovered
> that none of my queries were using indexes, causing them to take
> forever.

It is difficult for anyone to reply to generic questions. How about
showing us the query, an EXPLAIN plan, and the index definitions? Are
any of the indexes based on a function?

> So I read somewhere in the archives that you need to VACUUM a table
> regularly for indexes to work properly.

Sort of. Let's say they work better. It has to do with what the
optimizer believes your data distribution is, which determines when it
chooses to use an index.

> I tried that, but aborted after about 5 hours.

You have told us nothing about your hardware, your OS, or which version
of PostgreSQL you are running (unless I missed it).

> I can't use pgsql if I have to take the db down for more than 10-15
> minutes a day.

Then you may not be able to use it right now. I believe the current
VACUUM ANALYZE, and maybe plain VACUUM too, needs to lock the table
while it runs. Hopefully others will explain this better. This will
also change in the soon-to-be-released 7.2.

> Then I read somewhere else that you should drop your indexes before
> VACUUMing and re-create them afterwards.

That makes little sense to me, and I don't recall reading it either.
What I do recall is that before a bulk load you want to drop your
indexes. My understanding was that you want to have your indexes in
place when you do VACUUM ANALYZE.

> I tried that, and VACUUM finished in about 10 minutes. I've been
> trying to recreate my primary key for the last 18 hours... not so
> good.

Again, we need more info about your hardware, OS, version of pgsql,
etc...

> Should I have dropped all indexes *except* for the primary?

My understanding is that you may want to drop your indexes when doing a
big load, not when doing a VACUUM.

> Should I make an artificial primary key with a serial type to simplify
> things?

I recommend you do this. Not only because it is more efficient, but
because the type of key you selected has "issues", for lack of a better
term. You can of course add a "unique" index so you don't get dups.

I think that when you are dealing with a 40-million-row table you need
to consider your hardware. You also didn't tell us how big the rows
are. The more info you give us, the more others will be able to help.
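For what it's worth, here is a rough sketch of what I mean by a serial
surrogate key. The table and column names are invented, since you
haven't shown us your schema; the idea is that the SERIAL column becomes
the primary key and your old five-column key becomes a plain unique
index:

    -- hypothetical table; substitute your real names and columns
    CREATE TABLE measurements (
        id       SERIAL PRIMARY KEY,  -- small surrogate key, cheap to index
        code     char(4),
        name     varchar(32),
        obs_date date,
        field_a  int,
        field_b  int
    );

    -- the old composite key survives as a unique index, so duplicate
    -- rows are still rejected without it being the primary key
    CREATE UNIQUE INDEX measurements_natural_key
        ON measurements (code, name, obs_date, field_a, field_b);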
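And to see whether the optimizer is using an index at all, EXPLAIN will
show you the plan it picks (an Index Scan versus a Seq Scan). Again,
the names below are made up, just to show the shape of it:

    -- show the plan; look for Index Scan vs. Seq Scan
    EXPLAIN SELECT * FROM measurements WHERE obs_date = '2001-11-13';

    -- refresh the optimizer's idea of your data distribution
    VACUUM ANALYZE measurements;

    -- for a big bulk load: drop secondary indexes, load with COPY,
    -- then rebuild the indexes in one pass at the end
    DROP INDEX measurements_natural_key;
    COPY measurements FROM '/tmp/measurements.dat';
    CREATE UNIQUE INDEX measurements_natural_key
        ON measurements (code, name, obs_date, field_a, field_b);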