Re: How to improve db performance with $7K? - Mailing list pgsql-performance

From Mohan, Ross
Subject Re: How to improve db performance with $7K?
Date
Msg-id CC74E7E10A8A054798B6611BD1FEF4D30625DA58@vamail01.thexchange.com
In response to How to improve db performance with $7K?  (Steve Poe <spoe@sfnet.cc>)
List pgsql-performance
Imagine a system in "furious activity" with two (2) processes regularly occurring:

Process One:  Looooong read (or write). Takes 20ms to do seek, latency, and
                stream off. Runs over and over.
Process Two:  Single block read (or write). Typical database row access.
                Optimally, could be submillisecond. Happens more or less randomly.


Let's say process one starts, and then process two. Assume, for sake of this discussion,
that P2's block lies w/in P1's swath. (But doesn't have to...)

Now, every time, process two has to wait at LEAST 20ms to complete. In a queue-reordering
system, it could be a lot faster. And me, looking for disk service times on P2, keep
wondering "why does a single diskblock read keep taking >20ms?"
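
A toy model of the scenario above, as a rough sketch only. It assumes both requests are already queued when the drive picks what to service next, that P2's single-block read costs 0.5ms (my stand-in for "submillisecond"), and that "reordering" simply means shortest request first. Real drive firmware uses its own policy; completion_times is a made-up helper, not anything from a real driver.

```python
def completion_times(requests, reorder):
    """Return {name: completion time in ms} for requests queued at t=0.

    requests: list of (name, service_ms) in arrival order.
    reorder:  if True, service the shortest request first (an assumed
              stand-in for whatever a queue-reordering drive really does).
    """
    order = sorted(requests, key=lambda r: r[1]) if reorder else list(requests)
    clock = 0.0
    done = {}
    for name, service_ms in order:
        clock += service_ms          # requests are serviced one at a time
        done[name] = clock
    return done


# P1: the long 20ms transfer. P2: an assumed 0.5ms single-block read.
queue = [("P1", 20.0), ("P2", 0.5)]
fifo = completion_times(queue, reorder=False)
reordered = completion_times(queue, reorder=True)

print("FIFO:     ", fifo)
print("Reordered:", reordered)
```

In this model P2 finishes 20ms sooner with reordering, and P1 finishes only 0.5ms later.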


Soooo....it doesn't need to be "a read" or "a write". It doesn't need to be "furious activity"
(two processes is not furious, even for a single-user desktop.)  This is not a "corner case",
and while it doesn't take into account kernel/drivecache/UBC buffering issues, I think it
shines a light on why command re-ordering might be useful. <shrug>

YMMV.



-----Original Message-----
From: pgsql-performance-owner@postgresql.org [mailto:pgsql-performance-owner@postgresql.org] On Behalf Of Kevin Brown
Sent: Thursday, April 14, 2005 4:36 AM
To: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] How to improve db performance with $7K?


Greg Stark wrote:


> I think you're being misled by analyzing the write case.
>
> Consider the read case. When a user process requests a block and that
> read makes its way down to the driver level, the driver can't just put
> it aside and wait until it's convenient. It has to go ahead and issue
> the read right away.

Well, strictly speaking it doesn't *have* to.  It could delay for a couple of milliseconds to see if other requests
come in, and then issue the read if none do.  If there are already other requests being fulfilled, then it'll schedule
the request in question just like the rest.

> In the 10ms or so that it takes to seek to perform that read
> *nothing* gets done. If the driver receives more read or write
> requests it just has to sit on them and wait. 10ms is a lifetime for a
> computer. In that time dozens of other processes could have been
> scheduled and issued reads of their own.

This is true, but now you're talking about a situation where the system goes from an essentially idle state to one of
furious activity.  In other words, it's a corner case that I strongly suspect isn't typical in situations where SCSI has
historically made a big difference.

Once the first request has been fulfilled, the driver can now schedule the rest of the queued-up requests in
disk-layout order.
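
As a rough illustration of scheduling queued requests in disk-layout order, here is a minimal elevator-style sketch. It assumes block numbers map linearly onto head position and that seek cost is proportional to distance, neither of which is exactly true of real drives. The function names and block numbers are made up for the example.

```python
def elevator_order(head, blocks):
    """Order pending blocks as one upward sweep from the head, then back down."""
    ahead = sorted(b for b in blocks if b >= head)
    behind = sorted((b for b in blocks if b < head), reverse=True)
    return ahead + behind


def seek_distance(head, order):
    """Total head travel (in blocks) when servicing requests in the given order."""
    total = 0
    for block in order:
        total += abs(block - head)
        head = block
    return total


pending = [900, 120, 640, 130, 880]   # hypothetical queue, in arrival order
head = 500                            # hypothetical current head position

print("arrival order:", pending, seek_distance(head, pending))
ordered = elevator_order(head, pending)
print("layout order: ", ordered, seek_distance(head, ordered))
```

The same sort works whether it runs in the kernel or on the drive; the sketch says nothing about which of the two has better information about the real disk layout.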


I really don't see how this is any different between a system that has tagged queueing to the disks and one that
doesn't. The only difference is where the queueing happens.  In the case of SCSI, the queueing happens on the disks (or
at least on the controller).  In the case of SATA, the queueing happens in the kernel.

I suppose the tagged queueing setup could begin the head movement and, if another request comes in that requests a
block on a cylinder between where the head currently is and where it's going, go ahead and read the block in question.
But is that *really* what happens in a tagged queueing system?  It's the only major advantage I can see it having.


> The same thing would happen if you had lots of processes issuing lots
> of small fsynced writes all over the place. Postgres doesn't really do
> that though. It sort of does with the WAL logs, but that shouldn't
> cause a lot of seeking.  Perhaps it would mean that having your WAL
> share a spindle with other parts of the OS would have a bigger penalty
> on IDE drives than on SCSI drives though?

Perhaps.

But I rather doubt that has to be a huge penalty, if any.  When a process issues an fsync (or even a sync), the kernel
doesn't *have* to drop everything it's doing and get to work on it immediately.  It could easily gather a few more
requests, bundle them up, and then issue them.  If there's a lot of disk activity, it's probably smart to do just that.
All fsync and sync require is that the caller block until the data hits the disk (from the point of view of the kernel).
The specification doesn't require that the kernel act on the calls immediately or write only the blocks referred to by
the call in question.
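
The "gather a few more requests, bundle them up, and then issue them" idea can be sketched in user space, purely as an analogy. This is not how any particular kernel implements fsync; BatchedSync and its methods are hypothetical names, and the only real calls used are os.write, os.fsync and tempfile.mkstemp.

```python
import os
import tempfile


class BatchedSync:
    """Collect writes from several callers and cover them with one fsync."""

    def __init__(self, fd):
        self.fd = fd
        self.pending = []   # data from callers still waiting to be made durable
        self.flushes = 0    # number of fsync calls actually issued

    def request_sync(self, data):
        # A caller asks for durability; it stays "blocked" until flush().
        self.pending.append(data)

    def flush(self):
        # One write plus one fsync releases every waiting caller at once.
        if not self.pending:
            return 0
        os.write(self.fd, b"".join(self.pending))
        os.fsync(self.fd)
        self.flushes += 1
        released = len(self.pending)
        self.pending.clear()
        return released


fd, path = tempfile.mkstemp()
try:
    log = BatchedSync(fd)
    for record in (b"row 1\n", b"row 2\n", b"row 3\n"):
        log.request_sync(record)
    released = log.flush()
    size = os.fstat(fd).st_size
finally:
    os.close(fd)
    os.remove(path)

print(released, "callers released by", log.flushes, "fsync; bytes on disk:", size)
```

Each caller still waits until its data is on disk, but three requests cost one flush rather than three.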


--
Kevin Brown                          kevin@sysexperts.com
