Thread: some PITR performance data with DBT-2
Hi Simon,

Sorry it has taken so long. Among other things, I doubled the controllers and drives on the system I was testing this on. But now I have some data against PostgreSQL-8.0beta2.

Here is the test run with archiving enabled:
http://www.osdl.org/projects/dbt2dev/results/dev4-010/158/

Here is the test run with archiving disabled:
http://www.osdl.org/projects/dbt2dev/results/dev4-010/159/

Here is sar/iostat/vmstat and oprofile data during the first hour of recovery. Total recovery time took about 6.5 hours:
http://www.developer.osdl.org/markw/pitr/

The overall throughput difference between the two runs with archiving enabled/disabled was within 1%.

I ran the test over a duration of 3 hours (including a 2 hour rampup of the driver), as opposed to the 6 hours you originally requested. I hope that is ok.

System details, which you may be interested in:

4 x 1.5 GHz Itanium 2
16GB RAM
6 x Compaq Computer Corporation Smart Array 64xx
6 x 14 disk 15K RPM drives (split bus)

The database and archive directory were put onto a single LVM volume across all 84 drives.

Let me know if I left anything out.

--
Mark Wong - - markw@osdl.org
Open Source Development Lab Inc - A non-profit corporation
12725 SW Millikan Way - Suite 400 - Beaverton, OR 97005
(503) 626-2455 x 32 (office)
(503) 626-2436 (fax)
http://developer.osdl.org/markw/
>Mark Wong wrote
> Hi Simon,
>
> Sorry it has taken so long. Among other things, I doubled the controllers
> and drives on the system I was testing this on. But now I have some data
> against PostgreSQL-8.0beta2.
>

Thanks very much.

> Here is the test run with archiving enabled:
> http://www.osdl.org/projects/dbt2dev/results/dev4-010/158/
>
> Here is the test run with archiving disabled:
> http://www.osdl.org/projects/dbt2dev/results/dev4-010/159/
>
> The overall throughput difference between the two runs with archiving
> enabled/disabled was within 1%.
>

Excellent. I hoped it was that low - my target was < 5%.

Stats check out with no weirdness in the results. TGFT.

Also, I notice the tpm figures have gone up some more - have you got new hardware, or has the PostgreSQL setup been tuned more? Or can it be that rel8.0 really is that much faster??

> Here is sar/iostat/vmstat and oprofile data during the first hour of
> recovery. Total recovery time took about 6.5 hours:
> http://www.developer.osdl.org/markw/pitr/
>

That's bad news. My own recovery performance estimates would lead me to hope that it's possible to get the recovery to be quicker than the processes that wrote the logs, even on a very quick 4 CPU system. I'd be hoping for ~1 hour, or at least <= 4 hours.

> I ran the test over a duration of 3 hours (including a 2 hour rampup of
> the driver), as opposed to the 6 hours you originally requested. I
> hope that is ok.
>
> System details, which you may be interested in:
>
> 4 x 1.5 GHz Itanium 2
> 16GB RAM
> 6 x Compaq Computer Corporation Smart Array 64xx
> 6 x 14 disk 15K RPM drives (split bus)
>
> The database and archive directory were put onto a single LVM volume
> across all 84 drives.
>
> Let me know if I left anything out.
>

First off, thank you again.

I've had a look at all the results, but I found a few things:

- couldn't find postgresql.conf or recovery.conf anywhere, so not sure what OS command you are using
- log files were very large indeed due to the SPI error messages, so I haven't been able to download those properly for analysis - any chance you could grep out the SPI stuff, so I can see the archive and restore commands?

Stats I'd be interested in for analysing recovery performance would be:
- how many log files in total were archived/restored
- where were they archived to
- what was the archive/recovery command?

Best Regards,
Simon Riggs
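
For reference, the "grep out the SPI stuff" request above can be sketched in a few lines of Python. This is only an illustration: it assumes the unwanted messages can be recognised by the substring "SPI" somewhere in the line, which the thread does not confirm, and the function names and file paths are hypothetical.

```python
import gzip


def strip_spi_lines(lines, marker="SPI"):
    """Yield only the lines that do not contain the marker substring.

    The marker "SPI" is an assumption about how the noisy messages
    can be recognised; adjust it to match the real log format.
    """
    for line in lines:
        if marker not in line:
            yield line


def write_filtered(src_path, dst_path, marker="SPI"):
    """Copy a plain-text log to a gzip-compressed file, dropping marker lines.

    Both paths are placeholders chosen by the caller.
    """
    with open(src_path, "r", errors="replace") as fin, \
            gzip.open(dst_path, "wt") as fout:
        fout.writelines(strip_spi_lines(fin, marker))
```

Calling `write_filtered("server.log", "log-sans-spi.txt.gz")` would produce a smaller log that still contains the archive and restore messages, provided none of those lines happen to mention the marker.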
On Wed, Sep 15, 2004 at 09:50:17PM +0100, Simon Riggs wrote:
> >Mark Wong wrote
> > Hi Simon,
> >
> > Sorry it has taken so long. Among other things, I doubled the controllers
> > and drives on the system I was testing this on. But now I have some data
> > against PostgreSQL-8.0beta2.
> >
>
> Thanks very much.
>
> > Here is the test run with archiving enabled:
> > http://www.osdl.org/projects/dbt2dev/results/dev4-010/158/
> >
> > Here is the test run with archiving disabled:
> > http://www.osdl.org/projects/dbt2dev/results/dev4-010/159/
> >
> > The overall throughput difference between the two runs with archiving
> > enabled/disabled was within 1%.
> >
>
> Excellent. I hoped it was that low - my target was < 5%.
>
> Stats check out with no weirdness in the results. TGFT.
>
> Also, I notice the tpm figures have gone up some more - have you got new
> hardware, or has the PostgreSQL setup been tuned more? Or can it be that
> rel8.0 really is that much faster??

It's actually lower than where I was when I started breaking tables out onto separate volumes. I suspect you may be looking at data from a different (and slower) system. Slightly old data from the same system are here:
http://www.osdl.org/projects/dbt2dev/results/fs-64bit.html

> > Here is sar/iostat/vmstat and oprofile data during the first hour of
> > recovery. Total recovery time took about 6.5 hours:
> > http://www.developer.osdl.org/markw/pitr/
> >
>
> That's bad news. My own recovery performance estimates would lead me to hope
> that it's possible to get the recovery to be quicker than the processes that
> wrote the logs, even on a very quick 4 CPU system. I'd be hoping for ~1
> hour, or at least <= 4 hours.
>
> > I ran the test over a duration of 3 hours (including a 2 hour rampup of
> > the driver), as opposed to the 6 hours you originally requested. I
> > hope that is ok.
> >
> > System details, which you may be interested in:
> >
> > 4 x 1.5 GHz Itanium 2
> > 16GB RAM
> > 6 x Compaq Computer Corporation Smart Array 64xx
> > 6 x 14 disk 15K RPM drives (split bus)
> >
> > The database and archive directory were put onto a single LVM volume
> > across all 84 drives.
> >
> > Let me know if I left anything out.
> >
>
> First off, thank you again.
>
> I've had a look at all the results, but I found a few things:
>
> - couldn't find postgresql.conf or recovery.conf anywhere, so not sure what
> OS command you are using

For postgresql.conf parameters, I added a "database parameters" link to a "SHOW ALL" command a little late, but it's there now and shows:

archive_command | cp %p /opt/misc/archive/%f

I've already lost the recovery.done file but I used the command:

restore_command = 'cp /opt/misc/archive/%f %p'

> - log files were very large indeed due to the SPI error messages, so I
> haven't been able to download those properly for analysis - any chance you
> could grep out the SPI stuff, so I can see the archive and restore commands?

Ok, there should be a log-sans-spi.txt.gz available now.

> Stats I'd be interested in for analysing recovery performance would be:
> - how many log files in total were archived/restored

I did a line count of "archived transaction log file" and got 7604. Unfortunately I don't have the output for the restore anymore.

> - where were they archived to

Into a separate directory on the same volume with the rest of the database. I'm starting to break things out into separate volumes again.

Mark
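
As a rough back-of-envelope check on the figures above, the sketch below counts archive messages in a log and estimates the average replay rate. It rests on two assumptions not confirmed in the thread: that the WAL segment size was PostgreSQL's default of 16 MB, and that all 7604 archived segments were replayed during the roughly 6.5 hours of recovery. The function names are illustrative only.

```python
# Assumed: PostgreSQL's default WAL segment size (16 MiB); the thread does not
# state whether the test build used a different value.
WAL_SEGMENT_MIB = 16


def count_archived(lines, phrase="archived transaction log file"):
    """Count log lines containing the archive message quoted in the thread."""
    return sum(1 for line in lines if phrase in line)


def replay_rate_mib_per_s(segments, hours, segment_mib=WAL_SEGMENT_MIB):
    """Average WAL replay rate, assuming every segment was replayed."""
    return segments * segment_mib / (hours * 3600)


segments = 7604          # line count reported in the thread
recovery_hours = 6.5     # approximate total recovery time from the thread

total_gib = segments * WAL_SEGMENT_MIB / 1024
print(f"WAL volume: {total_gib:.1f} GiB")                                        # 118.8 GiB
print(f"Replay rate: {replay_rate_mib_per_s(segments, recovery_hours):.1f} MiB/s")  # 5.2 MiB/s
```

Under those assumptions the recovery averaged only about 5 MiB/s, which is consistent with the concern that replay was slower than hoped; the real figure depends on the actual segment size and on how many segments were restored.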