Re: Bunching "transactions" - Mailing list pgsql-performance

From Jean-David Beyer
Subject Re: Bunching "transactions"
Date
Msg-id 4721E5EB.5060907@verizon.net
In response to Bunching "transactions"  (Jean-David Beyer <jeandavid8@verizon.net>)
List pgsql-performance
Heikki Linnakangas wrote:
> Jean-David Beyer wrote:
>
>> My IO system has two Ultra/320 LVD SCSI controllers and 6 10,000rpm SCSI
>> hard drives. The dual SCSI controller is on its own PCI-X bus (the machine
>> has 5 independent PCI-X busses). Two hard drives are on one SCSI controller
>> and the other four are on the other. The WAL is on the first controller,
>> most of the rest is on the other controller. Once in a while, I get 144
>> Megabytes/sec transfers for a few seconds at a time to the hard drive
>> system, where I have an advertised maximum of 640 Megabytes/second. Each
>> hard drive claims to take a sustained data rate of about 80
>> Megabytes/second. When I test it, I can get 55 and sometimes a little more
>> for a single drive.
>
> You might find that you get better performance by just putting all the
> drives on a single RAID array. Or not :-). I'm not a hardware guy
> myself, but having read this mailing list for some time, I've seen
> different people come to different conclusions on that one. I guess it
> depends on the hardware and the application.

In the old days, I was a "hardware guy." But not in the last 15 years or so
(although I did put this machine together from parts). Right now, I do not
think I would get more performance with a single RAID array. Certainly not
if it were software RAID. Right now, I have the WAL on one drive that is not
heavily used when doing bulk loading of the database, and the main data on
the other 4 drives on a different SCSI controller. Measurements revealed
that THE bottleneck was the writing to the WAL.

The indices for any one table are on a different drive from the data itself
to minimize seek contention (and IO transmission contention, too, but that
does not seem to be an issue). Note that now the machine is only in IO-WAIT
state less than 1% of the time, and I no longer notice the main postgres
server process in D state. It used to be in D state a lot of the time before
I started bunching transactions. The IO to the drive with the WAL dropped
from a little over 3000 sectors per second to about 700 sectors per second,
for one thing. And the IO bandwidth consumed dropped, IIRC, about 50%.
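The effect described above comes from committing once per batch of inserts instead of once per row, so the WAL gets one flush per batch rather than per row. A minimal sketch of that pattern, using Python's built-in sqlite3 purely as a stand-in for a PostgreSQL connection (the table and column names here are made up, not from the original workload):

```python
import sqlite3

def load_rows(conn, rows, batch_size=1000):
    """Insert rows, committing once per batch_size rows rather than per row."""
    cur = conn.cursor()
    for i, row in enumerate(rows, start=1):
        cur.execute("INSERT INTO quotes (symbol, price) VALUES (?, ?)", row)
        if i % batch_size == 0:
            conn.commit()  # one durable log flush per batch, not per row
    conn.commit()          # commit any trailing partial batch

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE quotes (symbol TEXT, price REAL)")
load_rows(conn, [("IBM", 104.5), ("GE", 39.2), ("T", 41.0)], batch_size=2)
print(conn.execute("SELECT COUNT(*) FROM quotes").fetchone()[0])  # 3
```

The same structure applies unchanged with a real PostgreSQL driver; the batch size trades off memory and rollback granularity against the per-commit flush cost.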
>
>> Likewise, I seem to have enough processing power.
>>
>> top - 12:47:22 up 2 days,  5:06,  4 users,  load average: 1.40, 3.13, 4.20
>> Tasks: 168 total,   3 running, 165 sleeping,   0 stopped,   0 zombie
>> Cpu0  : 29.5%us,  3.3%sy,  0.0%ni, 67.0%id,  0.2%wa,  0.0%hi,  0.0%si,
>> Cpu1  : 21.8%us,  3.1%sy,  0.0%ni, 73.7%id,  1.4%wa,  0.0%hi,  0.0%si,
>> Cpu2  : 24.6%us,  3.6%sy,  0.0%ni, 71.7%id,  0.1%wa,  0.0%hi,  0.0%si,
>> Cpu3  : 23.1%us,  2.7%sy,  0.0%ni, 74.0%id,  0.1%wa,  0.1%hi,  0.0%si,
>> Mem:   8185340k total,  5112656k used,  3072684k free,    32916k buffers
>> Swap:  4096496k total,      384k used,  4096112k free,  4549536k cached
>>
> Actually it looks like you've saturated the CPU.

How do you figure that? There are two or four (depending on how you count
them) CPUs. The CPUs seem to be running at 75% idle. If I let BOINC
processes run (nice 19), I can soak up most of this idle time. I turned them
off for the purpose of these measurements because they hide the io-wait times.

> Postgres backends are
> single-threaded, so a single bulk load like that won't use more than one
> CPU at a time. If you add up the usr percentages above, it's ~100%.

If you add up the idle percentages, it is about 300%. Recall that there are
two hyperthreaded processors here. That is more than two processors (but
less than four). If I examine the postgres processes, one of them used to
get to 100% once in a while when I did things like DELETE FROM tablename;
but I do a TRUNCATE now and it is much faster. Now any single process peaks
at 80% of a CPU and usually runs at less than 50%. The postgres processes
run on multiple CPUs. Looking at the top command, normally my client runs at
around 20% on one CPU, the main postgres server runs on a second at between
30% and 80% (depends on which tables I am loading), and the writer runs on
yet another. The two loggers wander around more. But these last three run at
around 1% each. In fact, the writer is idle much of the time.
>
> You should switch to using COPY if you can.
>
Someone else posted that I should not get neurotic about squeezing the last
little bit out of this (not his exact words), and I agree. This is only for
doing an initial load of the database after all. And as long as the total
time is acceptable, that is good enough. When I first started this (using
DB2), one of the loads used to take something like 10 hours. Redesigning my
basic approach got that time down to about 2 hours without too much
programming effort. As the amount of data has increased, that started
creeping up, and one of the tables, that has about 6,000,000 entries at the
moment, took overnight to load. That is why I looked into bunching these
transactions, with gratifying results.

To use COPY, I would have to write a bunch of special-purpose programs to
convert the data, as I receive them, into a form that COPY could handle. (I
imagine pg_dump and pg_restore use COPY.) And running those would take time
too. There ought to be a law against making huge spreadsheets for data, but
people who should be using a relational database for things seem more
comfortable with spreadsheets. So that is the form in which I get these data.
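For what it's worth, a converter of that sort need not be elaborate. A minimal sketch, assuming the spreadsheet has been exported to CSV and that empty cells should become NULLs (the sample data and column layout are hypothetical; \N is COPY's default NULL marker for text-format input):

```python
import csv
import io

def csv_to_copy(csv_text):
    """Turn CSV text into tab-separated lines suitable for COPY ... FROM,
    writing \\N for empty cells so they load as NULL."""
    out = io.StringIO()
    for row in csv.reader(io.StringIO(csv_text)):
        fields = [cell if cell != "" else r"\N" for cell in row]
        out.write("\t".join(fields) + "\n")
    return out.getvalue()

sample = "IBM,104.50,\nGE,,39.20\n"
print(csv_to_copy(sample), end="")
```

The output can then be fed to `COPY tablename FROM STDIN` or psql's `\copy`; real spreadsheet exports would also need attention to quoting, embedded tabs, and header rows.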

From a programming point of view, I hate spreadsheets because the
calculations and the data are intermixed, and you cannot see what the
calculations are unless you specifically look for them. And the way people
design (if that is the proper term) a spreadsheet results in something that
could not be considered normalized in any sense of that term.
One of these tables has columns from A all the way to KC (I guess that is
over 300 columns), and I would not be able to use such a table even if
postgres would accept one. IIRC, DB2 would not take such a wide one, but I
am not sure about that anymore. Anyhow, I believe in separation of concerns,
and mixing programs and data as in spreadsheets is a step in the wrong
direction.

--
  .~.  Jean-David Beyer          Registered Linux User 85642.
  /V\  PGP-Key: 9A2FC99A         Registered Machine   241939.
 /( )\ Shrewsbury, New Jersey    http://counter.li.org
 ^^-^^ 08:15:01 up 3 days, 33 min, 0 users, load average: 4.09, 4.15, 4.20
