Re: index creation order? - Mailing list pgsql-performance
From | Rod Taylor |
---|---|
Subject | Re: index creation order? |
Date | |
Msg-id | 1067617409.17097.7.camel@jester Whole thread Raw |
In response to | index creation order? (Allen Landsidel <all@biosys.net>) |
Responses |
Re: index creation order?
|
List | pgsql-performance |
If it is 7.4 beta 5 or later, I would definitely go with A. Adding indexes after the fact seems to be much quicker. Foreign keys use the same algorithm prior to beta 5 regardless of timing. A primary key and unique index will have approx the same performance (a check for NULL isn't very costly). On Fri, 2003-10-31 at 11:02, Allen Landsidel wrote: > Yet another question.. thanks to everyone responding to all these so far.. ;) > > This one is basically.. given I have a big table already in COPY format, > about 28 million rows, all keys guaranteed to be unique, I'm trying to find > out which of the following will get the import finished the fastest: > > a) CREATE TABLE with no indexes or keys. Run the COPY (fast, ~30min), then > CREATE INDEX on each column it's needed on, and ALTER TABLE for the pk and > each fk needed. > > b) Same as above, but instead of ALTER TABLE -- ditch the FK, and CREATE > UNIQUE INDEX on the PK. > > c) CREATE TABLE with the PK/FK's in the table structure, CREATE INDEX on > needed columns, then run the COPY. > > d) .. is to c as b is to a .. Don't create PK/FK's, just CREATE UNIQUE > INDEX after table creation, then run the COPY. > > My gut instinct tells me that in order, fastest to slowest, it's going to > be d,b,c,a; this is what I've experienced on other DBs such as MSSQL and > Oracle. > > If there isn't a significant difference between all of them, performance > wise, I think something is dreadfully wrong here. Running "a", the ALTER > TABLE to add the PK ran for 17 hours and still wasn't finished. > > The table without indexes or keys is: > CREATE TABLE foo ( > id BIGINT NOT NULL DEFAULT nextval('foo_id_sequence'), > master_id BIGINT NOT NULL, > other_id INTEGER NOT NULL, > status INTEGER NOT NULL, > addtime TIMESTAMP WITH TIME ZONE DEFAULT now() > ); > > Details on machine and configuration are: > > The machine is the same one I've mentioned before.. SMP AthlonMP 2800+ > (2.1GHz), 4x18GB 15krpm SCSI RAID-0 with 256MB onboard cache on a > quad-channel ICP-Vortex controller, 2GB physical memory. Running FreeBSD > RELENG_4, relevant filesystems with softupdates enabled and mounted noatime. > > kernel options are: > maxusers 0 > > options MAXDSIZ="(1536UL*1024*1024)" # maximum limit > options MAXSSIZ="(512UL*1024*1024)" # maximum stack > options DFLDSIZ="(512UL*1024*1024)" # default limit > options VM_BCACHE_SIZE_MAX="(384UL*1024*1024)" # cache size upped > from default 200MB > options SYSVSHM #SYSV-style shared memory > options SYSVMSG #SYSV-style message queues > options SYSVSEM #SYSV-style semaphores > options SHMMAXPGS=262144 > options SHMALL=262144 > options SHMSEG=256 > options SEMMNI=384 > options SEMMNS=768 > options SEMMNU=384 > options SEMMAP=384 > > postgresql.conf settings are: > > shared_buffers = 30000 > max_fsm_relations = 10000 > max_fsm_pages = 2000000 > max_locks_per_transaction = 64 > wal_buffers = 128 > sort_mem = 1310720 (1.2GB) > vacuum_mem = 262144 (256MB) > checkpoint_segments = 64 > checkpoint_timeout = 1200 > commit_delay = 20000 > commit_siblings = 2 > fsync=true > random_page_cost = 1.7 > cpu_tuple_cost = 0.005 > cpu_index_tuple_cost = 0.005 > cpu_operator_cost = 0.0012 > > stats_start_collector = true > stats_command_string = true > stats_row_level = true > stats_block_level = true > > > ---------------------------(end of broadcast)--------------------------- > TIP 2: you can get off all lists at once with the unregister command > (send "unregister YourEmailAddressHere" to majordomo@postgresql.org) >
Attachment
pgsql-performance by date: