Disk Performance Problem on Large DB - Mailing list pgsql-admin

From Jonathan Hoover
Subject Disk Performance Problem on Large DB
Date
Msg-id 68C107706843A54BA7D9CAEF123ADF05010346C85A@SP2-EX07VS03.ds.corp.yahoo.com
Whole thread Raw
Responses Re: Disk Performance Problem on Large DB  (Kenneth Marshall <ktm@rice.edu>)
Re: Disk Performance Problem on Large DB  ("Kevin Grittner" <Kevin.Grittner@wicourts.gov>)
List pgsql-admin

Hello,

 

I have a RHEL 5 box, 4 GB RAM, single hard drive SATA, Intel Core 2 Duo 2.4. A basic workstation.

 

I have a simple database, with one table for now. It has 4 columns:

 

anid serial primary key unique,

time timestamp,

source varchar(5),

unitid varchar(15),

guid varchar(32)

 

There is a btree index on each.

 

I am loading data 1,000,000 (1M) rows at a time using psql and a COPY command. Once I hit 2M rows, my performance just drops out, and the next 1M never finishes. It takes 7 minutes for 1M rows to load. Once 2M are in there, I’ve waited an hour, and nothing. It doesn’t seem to matter which 1M rows I try to load next, none ever finish. Each 1M rows is about 70MB on disk in the raw input file.

 

I have “atop” installed, and it reports the drives at 100%, which it reports for the first 1M rows too. The MBw/s goes from 20+ on the first 2M rows, down to about 4 MBw/s or less now. The processor usage is at about 2 to 8% at this time (used by postgres).

 

I have even waited for 1M rows to load, then done a vacuum for no good reason, then even restarted postgresql. I’ve made sure no disk or proc activity is happening before I start the next 1M rows. None of that seems to matter.

 

I have a total of about 70M rows to load, but am at a standstill. I’ve read up on whatever performance docs I can find online, but I am not getting anywhere.

 

I’ve increased shared_buffers to 256MB, and I’ve tried it with fsync commented out as per the default config. I’ve also tried it with fsync=off. No difference.

 

Ideas? Thanks in advance,

Jon

pgsql-admin by date:

Previous
From: Lou Picciano
Date:
Subject: Re: 2 PGSQL instances in the same server
Next
From: Kenneth Marshall
Date:
Subject: Re: Disk Performance Problem on Large DB