Re: a heavy duty operation on an "unused" table kills my server - Mailing list pgsql-performance

From Eduardo Piombino
Subject Re: a heavy duty operation on an "unused" table kills my server
Date
Msg-id e24c1d9d1001122303g1d4eea48h18bd73467d1fa8ec@mail.gmail.com
In response to Re: a heavy duty operation on an "unused" table kills my server  (Craig Ringer <craig@postnewspapers.com.au>)
Responses Re: a heavy duty operation on an "unused" table kills my server  (Craig Ringer <craig@postnewspapers.com.au>)
Re: a heavy duty operation on an "unused" table kills my server  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-performance
Excellent, lots of useful information in your message.
I will follow your advice and keep you posted on any progress. I still have to confirm some technical details of my setup with you, but I'm pretty sure you hit the nail on the head in any case.

One last question about this IO issue I'm facing: do you think it is just a matter of raw RAID speed, or a matter of queue gluttony (the heavy operation not leaving room for other processes to get into the IO queue in a reasonable time)?

Because if it were just a matter of speed, ok: with my current RAID configuration, let's say it takes 10 minutes to process the ALTER TABLE (leaving no room for other IOs until the ALTER TABLE is done). Suppose I then put in the fastest possible RAID setup, or even drop RAID altogether for the sake of speed, and it completes in, let's say, 10 seconds (an unrealistic assumption, but that would be a 60x speedup). If my table then grows 60 times, I would be facing the very same problem again, even with the best RAID configuration.

The problem, then, would seem to be in the way the OS (or the hardware, or someone else, or all of them) inserts the IO requests into the queue.
What can I do to control the order in which these IO requests are finally entered into the queue?
I mean, what I would like to obtain is this:

Consider the ALTER TABLE as a sequence of 100,000 READ/WRITE operations.
Consider the SELECT * FROM xxx as a sequence of 100 READ operations (on totally unrelated parts of the disk).

First I run the ALTER TABLE in one thread...
Let's say that by the time it has generated 1,000 READ/WRITE operations, another thread starts the SELECT * FROM xxx ...
I would expect the IO system to give those 100 READ operations a chance to execute almost immediately (with no need to wait for the remaining 99,000 READ/WRITE operations to finish), that is, to enter the queue at *almost* the very same moment the IO requests were issued.

If I cannot guarantee that, I'm kind of doomed, because the larger the number of IO operations requested by a "heavy duty operation", the longer it will take for any other thread to start doing anything.
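Just to illustrate what I mean (a rough sketch only, with made-up names), I suppose I could approximate this interleaving myself by replacing the single giant ALTER TABLE with a copy done in small slices, so the disk queue gets a chance to drain between slices and other requests can slip in:

    -- Rough sketch only: big_table, big_table_new and the integer id column are made-up names.
    -- Create an empty table with the structure I ultimately want (here just a copy),
    CREATE TABLE big_table_new (LIKE big_table INCLUDING DEFAULTS);

    -- then copy rows over in small batches, so each batch issues only a short burst
    -- of READ/WRITE operations and the queue drains in between:
    INSERT INTO big_table_new SELECT * FROM big_table WHERE id >= 0     AND id < 10000;
    INSERT INTO big_table_new SELECT * FROM big_table WHERE id >= 10000 AND id < 20000;
    -- ... and so on, swapping the tables once the copy is complete.

But that feels like working around the queue rather than controlling it, which brings me back to my questions: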

What cards do I have to play to manipulate the order in which the IO requests are entered into the "queue"?
Can I disable this queue?
Should I turn the disks' IO operation caches off?
Avoid some specific disk/RAID vendor, for instance?

I think I have some serious reading to do on this matter, and Google will help of course, but as always, every piece of advice, however small it may seem, will be very much appreciated.

Nonetheless, thanks a lot for all the light you have already shed on this matter.
I really appreciate it.

Eduardo.



On Wed, Jan 13, 2010 at 3:02 AM, Craig Ringer <craig@postnewspapers.com.au> wrote:
On 13/01/2010 1:47 PM, Eduardo Piombino wrote:
I'm sorry.

The server is a production HP ProLiant server; I don't remember the
exact model, but the key features were:
4 cores, over 2GHz each (I'm sorry, I don't remember the actual specs); I
think it had 16GB of RAM (if that is possible?)
It has two 320GB disks in RAID (mirrored).

Plain ol' SATA disks in RAID-1?

Hardware RAID (and if so, controller model)? With battery backup? Write cache on or off?

Or software RAID? If so, Windows built-in software RAID, or some vendor's fakeraid (Highpoint, Promise, Adaptec, etc)?

Anyway, with two disks in RAID-1 I'm not surprised you're seeing some performance issues with heavy writes, especially since it seems unlikely that you have a BBU hardware RAID controller. In RAID-1 a write must hit both disks, so a 1MB write effectively costs twice as much as a 1MB read. Since many controllers try for high throughput (because it looks good in benchmarks) at the expense of latency, they also tend to batch writes into long blocks, which keeps the disks busy in extended bursts. That slaughters read latencies.

I had this sort of issue with a 3Ware 8500-8, and landed up modifying and recompiling the driver to reduce its built-in queue depth. I also increased readahead. It was still pretty awful as I was working with RAID 5 on SATA disks, but it made a big difference and more importantly meant that my Linux server was able to honour `ionice' priorities and feed more important requests to the controller first.

On windows, I really don't know what to do about it beyond getting a better I/O subsystem. Google may help - look into I/O priorities, queue depths, reducing read latencies, etc.


I don't even have the emails with the specs here, but I can give you the
exact configuration by tomorrow.

Operating system: Windows 2003 server, with latest patches.
Postgres version: 8.2.4, with all defaults, except DateStyle and TimeZone.

Urk. 8.2 ?

Pg on Windows improves a lot with each release, and that's an old buggy version of 8.2 at that. Looking into an upgrade would be a really, REALLY good idea.


Please try interpreting my original mail again, considering that when I
said "high CPU usage" it might very well have been "high IO usage".

The final effect was that the server went non-responsive for all practical
purposes: not even the Task Manager would come up when I hit CTRL-ALT-DEL,
and of course every client suffered horrific delays (+20 secs) for the
simplest operations like SELECT NOW();

That sounds a LOT like horrible read latencies caused by total I/O overload. It could also be running out of memory and swapping heavily, so do keep an eye out for that, but I wouldn't expect to see that with an ALTER TABLE - especially on a 16GB server.

   My question then is: is there a way to limit the CPU or IO USAGE assigned to a specific connection?

In win32 you can set CPU priorities manually in Task Manager, but only once you already know the process ID of the Pg backend that's going to be hammering the machine. Not helpful.
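That said, you can at least grab the backend's PID from SQL before you kick off the heavy statement - from memory (untested here against 8.2), something along these lines:

    -- Run this in the session that is about to execute the heavy ALTER TABLE:
    SELECT pg_backend_pid();

    -- Or, from another session, list all backends and what they are currently running
    -- (on 8.2 the columns are procpid and current_query):
    SELECT procpid, usename, current_query FROM pg_stat_activity;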

I don't know of any way to do per-process I/O priorities in Win32, but I only use win32 reluctantly and don't use it for anything I care about (like a production Pg server) so I'm far from a definitive source.

--
Craig Ringer
