index creation order? - Mailing list pgsql-performance
From | Allen Landsidel |
---|---|
Subject | index creation order? |
Date | |
Msg-id | 6.0.0.22.0.20031031104214.024853b0@pop.hotpop.com Whole thread Raw |
Responses |
Re: index creation order?
Re: index creation order? |
List | pgsql-performance |
Yet another question.. thanks to everyone responding to all these so far.. ;) This one is basically.. given I have a big table already in COPY format, about 28 million rows, all keys guaranteed to be unique, I'm trying to find out which of the following will get the import finished the fastest: a) CREATE TABLE with no indexes or keys. Run the COPY (fast, ~30min), then CREATE INDEX on each column it's needed on, and ALTER TABLE for the pk and each fk needed. b) Same as above, but instead of ALTER TABLE -- ditch the FK, and CREATE UNIQUE INDEX on the PK. c) CREATE TABLE with the PK/FK's in the table structure, CREATE INDEX on needed columns, then run the COPY. d) .. is to c as b is to a .. Don't create PK/FK's, just CREATE UNIQUE INDEX after table creation, then run the COPY. My gut instinct tells me that in order, fastest to slowest, it's going to be d,b,c,a; this is what I've experienced on other DBs such as MSSQL and Oracle. If there isn't a significant difference between all of them, performance wise, I think something is dreadfully wrong here. Running "a", the ALTER TABLE to add the PK ran for 17 hours and still wasn't finished. The table without indexes or keys is: CREATE TABLE foo ( id BIGINT NOT NULL DEFAULT nextval('foo_id_sequence'), master_id BIGINT NOT NULL, other_id INTEGER NOT NULL, status INTEGER NOT NULL, addtime TIMESTAMP WITH TIME ZONE DEFAULT now() ); Details on machine and configuration are: The machine is the same one I've mentioned before.. SMP AthlonMP 2800+ (2.1GHz), 4x18GB 15krpm SCSI RAID-0 with 256MB onboard cache on a quad-channel ICP-Vortex controller, 2GB physical memory. Running FreeBSD RELENG_4, relevant filesystems with softupdates enabled and mounted noatime. kernel options are: maxusers 0 options MAXDSIZ="(1536UL*1024*1024)" # maximum limit options MAXSSIZ="(512UL*1024*1024)" # maximum stack options DFLDSIZ="(512UL*1024*1024)" # default limit options VM_BCACHE_SIZE_MAX="(384UL*1024*1024)" # cache size upped from default 200MB options SYSVSHM #SYSV-style shared memory options SYSVMSG #SYSV-style message queues options SYSVSEM #SYSV-style semaphores options SHMMAXPGS=262144 options SHMALL=262144 options SHMSEG=256 options SEMMNI=384 options SEMMNS=768 options SEMMNU=384 options SEMMAP=384 postgresql.conf settings are: shared_buffers = 30000 max_fsm_relations = 10000 max_fsm_pages = 2000000 max_locks_per_transaction = 64 wal_buffers = 128 sort_mem = 1310720 (1.2GB) vacuum_mem = 262144 (256MB) checkpoint_segments = 64 checkpoint_timeout = 1200 commit_delay = 20000 commit_siblings = 2 fsync=true random_page_cost = 1.7 cpu_tuple_cost = 0.005 cpu_index_tuple_cost = 0.005 cpu_operator_cost = 0.0012 stats_start_collector = true stats_command_string = true stats_row_level = true stats_block_level = true
pgsql-performance by date: