Re: Is there anything special about pg_dump's compression? - Mailing list pgsql-sql
From:           Jean-David Beyer
Subject:        Re: Is there anything special about pg_dump's compression?
Msg-id:         473DFF54.1030605@verizon.net
In response to: Re: Is there anything special about pg_dump's compression? (Shane Ambler <pgsql@Sheeky.Biz>)
List:           pgsql-sql
Shane Ambler wrote:
> Jean-David Beyer wrote:
>>> The physical tape speed is surely the real bottleneck here, and the
>>> fact that the total elapsed time is about the same both ways proves
>>> that about the same number of bits went onto tape both ways.
>>
>> I do not get that. If the physical tape speed is the bottleneck, why is
>> it only about 242 kB/s in the software-compressed case, and 4.2 MB/s in
>> the hardware-uncompressed case? The tape drive usually gives over 6
>> MB/s rates when running a BRU (similar to find > cpio) when doing a
>> backup of the rest.
>
> It would really depend on where the speed measurement comes from and how
> they are calculated. Is it data going to the drive controller or is it
> data going to tape? Is it the uncompressed size of data going to tape?

I imagine it is the speed measured by the CPU of data going into the
(Linux) operating system's write() calls.

> My guess is that it is calculated as the uncompressed size going to
> tape. In the two examples you give similar times for the same original
> uncompressed data.

True. But that tells me that it is the CPU that is the limiting factor. In
other words, I send 30 megabytes of compressed data in about the same time
that it takes to send the uncompressed data (for the tape drive hardware
to compress -- the SCSI controller driving the tape drive surely does not
compress anything much).

I originally started this thread because I wanted to know if the
compression in pg_dump was anything special, and I was told that it was
probably not. That seems to be the case, since it takes about the same
amount of time to dump the database whether I compress it in pg_dump or in
the tape drive. But then it seemed, and still seems to me, that instead of
being limited by the tape speed, it is limited by the CPU speed of the CPU
running the postgres server -- and that confuses me, since intuitively it
is not doing much.
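For concreteness, the two paths being compared look roughly like this. A
sketch only: the database name (mydb) and tape device (/dev/nst0) are
assumptions, and the exact flags used in the original runs are not stated
in the thread.

```shell
# Path 1: software compression - pg_dump compresses the dump itself,
# so the drive receives an already-small stream.
pg_dump -Fc -Z 6 mydb > /dev/nst0

# Path 2: hardware compression - send the raw dump and let the drive's
# built-in compression do the work; 64 KB blocks keep the drive streaming.
pg_dump -Fc -Z 0 mydb | dd of=/dev/nst0 bs=64k
```

Either way the same data has to come out of the postgres server first,
which is the point at issue.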
> I would say that both methods send 30MB to tape which takes around 124
> seconds.

You are right about this. In other words, sending the data to the tape
drive, whether it is 30 megabytes (compressed by the program) or 524
megabytes (compressed by the drive), will put about the same number of
bytes onto the tape. I.e., the tape head sees (about) the same number of
bytes either way. This means the transmission speed of the SCSI controller
is certainly fast enough to handle what is going on (though I do not think
there was any questioning of that). But since the tape drive can take 6
uncompressed megabytes per second (and it does -- this is not advertising
hype: I get that when doing normal backups of my system), and is getting
only 4.2, the bottleneck must be _before_ the SCSI controller.

Here is a typical example. Bru does a backup of my entire system (except
for the postgres stuff), rewinds the tape, and reads it all back in,
verifying the checksum of every block on the tape. It does not (although
it could) do any compression.

    **** bru: execution summary ****
    Started:        Wed Nov 14 01:04:16 2007
    Completed:      Wed Nov 14 02:02:56 2007
    Archive id:     473a8fe017a4
    Messages:       0 warnings, 0 errors
    Archive I/O:    5588128 blocks (11176256Kb) written
    Archive I/O:    5588128 blocks (11176256Kb) read
    Files written:  202527 files (170332 regular, 32195 other)

So we wrote 11.176 GB, rewound the tape, read 11.176 GB back, and rewound
the tape again, all in about an hour. Ignoring rewind times, this says it
wrote or read 6.2 uncompressed megabytes per second -- a little faster,
really, since the rewind times are not relevant to this discussion. This
is the rate of stuff going to the interface, and it shows that the 6
megabytes/second claimed by the manufacturer is realistic: you actually
get it in a real application.

Now what is on this machine? A lot of binary program files that probably
do not compress much.
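(As an aside, the rates quoted so far can be reproduced from the figures
already given -- the bru summary and the 524MB/30MB, 4.2MB/s, 242kB/s
numbers. A quick awk check, using only numbers from this thread:

```shell
# Sanity-check the transfer-rate arithmetic quoted in this thread.
awk 'BEGIN {
    # bru summary: 11,176,256 Kb written plus the same amount read back,
    # between 01:04:16 and 02:02:56 (58 min 40 s = 3520 s)
    printf "bru throughput:      %.1f MB/s\n", 11176256 * 2 / 1024 / 3520

    # pg_dump run: 524 MB of raw dump compresses to 30 MB either way
    ratio = 524 / 30
    printf "compression ratio:   %.1f:1\n", ratio

    # 242 KB/s of compressed data corresponds to this raw-data rate
    printf "equivalent raw rate: %.0f KB/s\n", 242 * ratio
}'
```

That gives about 6.2 MB/s for bru, a 17.5:1 compression ratio, and an
equivalent raw rate of roughly 4.2 MB/s -- consistent with the figures
above.)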
Quite a bunch of .jpeg files that are already compressed, so they probably
do not compress much. Some .mp3 files: I do not know how much they
compress. Program source files (but not lots of them). _Lots_ of files
that have been zipped, so they probably do not compress much; 1,347,184
blocks worth of that stuff.

> The first example states 4.2MB/s - calculated from the uncompressed
> size of 524MB, yet the drive compresses that to 30MB which is written
> to tape. So it is saying it got 524MB and saved it to tape in 125
> seconds (4.2MB/s), but it still only put 30MB on the tape.
>
> 524MB/125 seconds = 4.192MB per second
>
> The second example states 242KB/s - calculated from the size sent to
> the drive - as the data the drive gets is compressed it can't compress
> it any smaller - the data received is the same size as the data written
> to tape. This would indicate your tape speed.
>
> 30MB/123 seconds = 243KB/s
>
> To verify this -
>
> 524/30=17 - the compressed data is 1/17 the original size.
>
> 242*17=4114 - that's almost the 4.2MB/s that you get sending
> uncompressed data. I would say you get a little more compression from
> the tape hardware that gives you the slightly better transfer rate. Or
> sending compressed data to the drive with it set to compress incoming
> data is causing a delay as the drive tries to compress the data without
> reducing the size sent to tape. (My guess is that if you disabled the
> drive compression and sent the compressed pg_dump to the drive you
> would get about 247KB/s.)

I suppose so too, but it is too much bother to turn the compression off,
so I do not propose to test that.

> I would also say the 6MB/s from a drive backup would come about from -
>
> 1. Less overhead as data is sent directly from disk to tape. (DMA
> should reduce the software overhead as well.)
> (pg_dump formats the data it gets and waits for responses from postgres
> - no DMA)

Well, the tape drive is run by an Ultra/320 LVD SCSI controller on a
dedicated PCI-X bus, and the data is sent in 65536-byte blocks, so the
software overhead is minimal -- as can be seen from the fact that the
client (pg_dump) runs at only 12% of a CPU when writing uncompressed. It
is the postgres _server_ that runs at 100% of a CPU, so it is the
bottleneck. The question is: what is the postgres server doing that needs
100% of a 3.06 GHz Xeon processor? Recall that this database is by no
means fully loaded and everything is in RAM.

> And maybe - 2. A variety of file contents would also offer different
> rates of compression - some of your file system contents can be
> compressed more than pg_dump output.

If I am getting 17:1 compression on the database, I would say that is far
more compression than I get on the average files in my system.
Manufacturers normally claim you get about 2:1 compression on what they
consider typical data. Some claim even that is optimistic. Some vendors
claim 2.5:1 compression, and maybe for their data, whatever it is, they
do get that.

> 3. Streamed as one lot to the drive it may also allow it to treat your
> entire drive contents as one file - allowing duplicates in different
> files to be compressed the way the above example does.

I am not sure what you mean by streamed in this case. My tape drive can
start and stop while running if required. In fact, if the computer fails
to keep up, it slows down the writing speed so that the tape need not
start and stop -- it never has to backspace and restart when the computer
has trouble keeping up, as the old DDS-2 tapes had to do all the time.
The tape drive writes one block at a time. It also compresses (or
decompresses) one block at a time.

--
 .~.  Jean-David Beyer          Registered Linux User 85642.
 /V\  PGP-Key: 9A2FC99A         Registered Machine   241939.
/( )\ Shrewsbury, New Jersey    http://counter.li.org
^^-^^ 14:45:01 up 24 days, 8:03, 0 users, load average: 4.67, 4.43, 4.18