Re: Inserting 26 million rows takes 8 hours, how to - Mailing list pgsql-admin
From | Jose Vicente Nunez Z |
---|---|
Subject | Re: Inserting 26 million rows takes 8 hours, how to |
Date | |
Msg-id | 1063726066.5407.46.camel@linux0037 Whole thread Raw |
In response to | Re: Inserting 26 million rows takes 8 hours, how to improve those times? (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: Inserting 26 million rows takes 8 hours, how to improve those times?
Re: Inserting 26 million rows takes 8 hours, how to |
List | pgsql-admin |
Tom, On Tue, 2003-09-16 at 10:44, Tom Lane wrote: > Jose Vicente Nunez Z <josevnz@newbreak.com> writes: > > For that i'm using the PostgreSQL copy tool, but the problem is that is > > taking almost 9 hours to load the data: > > Hmm, 889 rows/second doesn't sound that bad, considering that you've > given us no data about the hardware you're running on. In particular, > what's your disk setup? My fault, i should provide more information about my setup. I got two ATA disks (one for PostgreSQL, the other for the data files). The box is a Pentium III 1.3Ghz, dual CPU with 2 GB of RAM. One disk is 30 GB, the other is 40GB. As you can see is not state of the art harware, but for development purposes i should get a decent performance (only one or two applications will be accessing the server to do some data mining). > > But at any rate, the only bottlenecks I could think of for a COPY > command are: > > 1. Data conversion. This would depend on the column datatypes of the > table, which you didn't tell us. > create table mytable ( a varchar(20) not null, b varchar(20) not null, c varchar(20) not null, d char(6), f int null, g float not null, h float not null ) Data conversions shouldn't be that complicated, isn't it? > 2. Index updating. This would depend on the indexes on the table, > which you didn't tell us. I had one index and dropped it for the load. Also i turn of the 'fsync' flag (fsync=false) and now the load time is only 15 minutes!. But then recreating the original index takes forever: create index myindex on mytable ( a, b, c ); For some reason i don't see much I/O but a lot of CPU ussage: 11:18:05 up 3 days, 20:55, 3 users, load average: 1.00, 1.00, 0.92 51 processes: 49 sleeping, 2 running, 0 zombie, 0 stopped CPU0 states: 96.0% user 3.0% system 0.0% nice 0.0% iowait 0.0% idle CPU1 states: 0.1% user 0.0% system 0.0% nice 0.0% iowait 99.0% idle Mem: 2064404k av, 2049312k used, 15092k free, 0k shrd, 149544k buff 406064k actv, 1076876k in_d, 345612k in_c Swap: 2047744k av, 4252k used, 2043492k free 1809160k cached PID USER PRI NI SIZE RSS SHARE STAT %CPU %MEM TIME CPU COMMAND 1882 postgres 25 0 41492 40M 36080 R 98.6 2.0 87:34 0 postmaster 16998 root 15 0 0 0 0 SW 1.9 0.0 4:30 0 kjournald 1956 root 19 0 1152 1152 848 R 0.9 0.0 0:00 1 top 1 root 15 0 476 448 420 S 0.0 0.0 0:07 1 init 2 root RT 0 0 0 0 SW 0.0 0.0 0:00 0 migration/0 3 root RT 0 0 0 0 SW 0.0 0.0 0:00 1 migration/1 4 root 15 0 0 0 0 SW 0.0 0.0 0:00 1 keventd 5 root 34 19 0 0 0 SWN 0.0 0.0 0:00 0 ksoftirqd_CPU0 6 root 34 19 0 0 0 SWN 0.0 0.0 0:00 1 ksoftirqd_CPU1 11 root 19 0 0 0 0 SW 0.0 0.0 0:00 0 bdflush 7 root 15 0 0 0 0 SW 0.0 0.0 0:22 0 kswapd 8 root 15 0 0 0 0 SW 0.0 0.0 0:00 1 kscand/DMA 9 root 15 0 0 0 0 SW 0.0 0.0 1:57 0 kscand/Normal 10 root 15 0 0 0 0 SW 0.0 0.0 2:28 0 kscand/HighMem 12 root 15 0 0 0 0 SW 0.0 0.0 0:04 0 kupdated Also seems to be than memory usage is ok, i don't see any process blocked or waiting for resources, nor swapped out due lack of memory: [root@linux0105 data]# vmstat 10 10 procs memory swap io system cpu r b w swpd free buff cache si so bi bo in cs us sy id 1 0 0 4252 15092 149564 1809160 0 0 60 43 1 16 2 1 33 1 0 0 4252 15092 149564 1809160 0 0 0 1820 134 12 49 1 50 1 0 0 4252 15092 149564 1809160 0 0 0 1282 125 8 49 1 50 1 0 0 4252 15092 149568 1809160 0 0 0 1202 124 9 50 1 50 1 0 0 4252 15092 149572 1809160 0 0 0 1216 123 9 50 1 50 1 0 0 4252 15092 149576 1809160 0 0 0 1224 125 8 50 0 50 1 0 0 4252 15092 149580 1809160 0 0 0 1212 124 10 50 1 50 1 0 0 4252 15092 149584 1809160 0 0 0 1206 123 8 50 1 50 1 0 0 4252 15092 149588 1809160 0 0 0 1212 125 8 49 1 50 1 0 0 4252 15092 149592 1809160 0 0 0 1208 125 9 49 1 50 Any ideas about what i can do to speed up the Index creation? Why is using so much CPU (doesn't seem to be writting much information to disk though). > > 3. Foreign-key constraint checking. This would depend on the foreign > keys the table has, which you didn't tell us. > None so far, is just one table. > In short ... you'd better provide a lot more detail if you want > useful commentary. > > > I apologize if this not the proper place to post this questions. > > pgsql-performance would be more appropriate, probably. I'm reading some documentation (checking the list archives): http://www.varlena.com/varlena/GeneralBits/Tidbits/perf.html http://www.varlena.com/varlena/GeneralBits/Tidbits/annotated_conf_e.html Hopefully this would help me to fix the problem. Thanks for the help, JV. > > regards, tom lane -- Jose Vicente Nunez Zuleta (josevnz at newbreak dot com) Newbreak LLC System Administrator http://www.newbreak.com RHCE, SCJD, SCJP
pgsql-admin by date: