Re: [PATCHES] WAL Performance Improvements - Mailing list pgsql-hackers

From Janardhana Reddy
Subject Re: [PATCHES] WAL Performance Improvements
Date
Msg-id 3C7B80A7.B9CF859C@mediaring.com.sg
Responses Re: [PATCHES] WAL Performance Improvements
List pgsql-hackers
Helge Bahmann wrote:

> On Tue, 26 Feb 2002, Janardhana Reddy wrote:
> > Test results with latest patch:
> >   Environment: Intel PC, IDE hard disk, Linux kernel 2.4.0. A single
> >   connection is connected to the database and continuously pumping
> >   insert statements. Each insert generates 160 bytes of WAL log.
>
> 8192:
> > Transaction Per Second :     332 TPS
> > Time Taken by fdatasync :  2160
>
> 4096:
> > Transaction Per Second : 435 TPS
> > Time Taken by fdatasync :  512
>
> Unfortunately your timings are meaningless. Assuming you have a
> 10000rpm drive (that is, 166 rounds per second), it is physically
> impossible to write 332 or 435 times per second to the same location
> on the disk.
>
> So I guess your disk is performing write-caching and not really writing
> the data back when requested by fsync(). You may try to disable
> write caching and see if it makes a difference:
>
>   hdparm -W 0 /dev/hda
>
> But note that most (or even all) modern IDE drives will not disable write
> caching even when instructed to do so. You should try to repeat the timings
> using SCSI drives -- I guess you will not see any improvement here.
>
> Regards
> --
> Helge Bahmann <bahmann@math.tu-freiberg.de>             /| \__
> Network admin, systems programmer                      /_|____\
>                                                      _/\ |   __)
> $ ./configure                                        \\ \|__/__|
> checking whether build environment is sane... yes     \\/___/ |
> checking for AIX... no (we already did this)            |

I have tested again, but it gives the same result.
I have now tested with a small program that just does write and fdatasync,
changing the size of the data in the write call. There is a big difference
in fdatasync time.
The test program looks as below:
------------------------------------
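#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>

int main(void)
{
  int fd;
  int i = 0;                  /* iteration counter */
  int data_size = 8 * 1024;   /* bytes per write; set to 160 for Test2 */
  char buf[20000] = {0};      /* payload; the contents do not matter */

  /* O_CREAT requires a mode argument */
  fd = open("testdata", O_CREAT | O_WRONLY, 0644);
  if (fd < 0)
  {
    perror("open");
    return 1;
  }
  while (1)
  {
    i++;
    /* always overwrite the same location, then force it to disk */
    lseek(fd, 0, SEEK_SET);
    write(fd, buf, data_size);
    fdatasync(fd);
    if (i % 10000 == 0)
    {
      printf("---------------\nindex: %d size: %d :\n ", i, data_size);
      fflush(stdout);         /* keep our output ahead of date's output */
      system("date");
    }
  }
}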
=========================================================
Test1 : with test program   data_size= 8*1024
  the output looks as below:
  ./a.out
---------------
index: 134520000 size: 8192 :
Tue Feb 26 19:46:12 SGT 2002
 ---------------
index: 134530000 size: 8192 :
Tue Feb 26 19:46:51 SGT 2002
 ---------------
index: 134540000 size: 8192 :
Tue Feb 26 19:47:27 SGT 2002
 ---------------
index: 134550000 size: 8192 :
Tue Feb 26 19:48:04 SGT 2002
 ---------------
strace output:
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 98.36   29.354861        3141      9347           fdatasync
  1.45    0.432686          46      9350           write
  0.15    0.044835           5      9348           lseek
  0.03    0.009375        9375         1           wait4
  0.00    0.001114        1114         1           vfork
  0.00    0.000140          35         4           rt_sigaction
  0.00    0.000007           4         2           rt_sigprocmask
------ ----------- ----------- --------- --------- ----------------
100.00   29.843018                 28053           total
======================================================================
Test2 :  with test program  , data_size=160
 the output looks as below:
---------------
index: 134520000 size: 160 :
Tue Feb 26 19:44:41 SGT 2002
 ---------------
index: 134530000 size: 160 :
Tue Feb 26 19:44:44 SGT 2002
 ---------------
index: 134540000 size: 160 :
Tue Feb 26 19:44:48 SGT 2002
 ---------------
index: 134550000 size: 160 :
Tue Feb 26 19:44:52 SGT 2002
 ---------------
index: 134560000 size: 160 :
Tue Feb 26 19:44:56 SGT 2002

strace output :
  strace -c -p 4741
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 95.54    5.672195         396     14328           fdatasync
  3.12    0.185227          13     14330           write
  1.16    0.069158           5     14329           lseek
  0.12    0.006927        6927         1           vfork
  0.05    0.003146        3146         1           wait4
  0.00    0.000020           5         4           rt_sigaction
  0.00    0.000007           4         2           rt_sigprocmask
------ ----------- ----------- --------- --------- ----------------
100.00    5.936680                 42995           total
======================================================================================

   SUMMARY :

  Test1: (data_size = 8192, with test program)
         fdatasync time + write time: 3141 + 46 = 3187 usec/call
         Time taken for 10000 iterations: nearly 40 seconds
  Test2: (data_size = 160, with test program)
         fdatasync time + write time: 396 + 13 = 409 usec/call
         Time taken for 10000 iterations: nearly 4 seconds

When I test with the database by doing 10000 inserts, each of which generates
160 bytes of WAL log:
  Test3: (without applying the patch, with the existing database)
         10000 inserts = 30 seconds
         fdatasync time = 2160 usec
  Test4: (with the patch applied, with database)
         10000 inserts = 23 seconds
         fdatasync time = 512 usec
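
The figures above can be cross-checked with a little arithmetic. The sketch
below is not part of the original test program, and all helper names in it are
made up for illustration. It converts a per-call cost in microseconds into a
total run time and a call rate, converts a run time into transactions per
second, and gives the time of one platter revolution for a given spindle
speed. The 10000 rpm figure mentioned in the note after the code is the
assumption made in the quoted reply, not a measured property of the test
drive.

```c
/* Hypothetical helpers for checking the numbers in this message. */

/* Total seconds spent if each call costs usec_per_call microseconds. */
double total_seconds(double usec_per_call, long calls)
{
    return usec_per_call * (double) calls / 1e6;
}

/* Calls per second that a given per-call cost allows. */
double calls_per_second(double usec_per_call)
{
    return 1e6 / usec_per_call;
}

/* Transactions per second for `count` transactions in `seconds`. */
double tps(long count, double seconds)
{
    return (double) count / seconds;
}

/* Microseconds per platter revolution at the given spindle speed. */
double usec_per_revolution(double rpm)
{
    return 60.0 * 1e6 / rpm;
}
```

With the reported values, total_seconds(3187, 10000) is about 31.9 seconds
and total_seconds(409, 10000) is about 4.1 seconds. tps(10000, 30) is about
333 and tps(10000, 23) is about 435, in line with the 332 and 435 TPS quoted
earlier. usec_per_revolution(10000) is 6000 usec.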

What I don't understand is that in Test3, with the existing postgres database,
fdatasync takes 2160 usec, but in Test1 it takes 3141 usec, even though the
size of the data written is 8192 bytes in both cases. This causes the
difference in the results.
The hard disk is not doing any write caching. While doing the fdatasync, the
Linux OS writes only the dirty buffers (size = 512 bytes), which causes the big
difference in fdatasync time, from 396 usec to 3141 usec.
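
One way to compare write sizes without relying on strace is to time the
write + fdatasync pair directly. The following is only a sketch under stated
assumptions: the function name avg_sync_usec is hypothetical, it assumes a
Linux/POSIX system that provides fdatasync() and clock_gettime(), and it
measures wall-clock time for the whole loop rather than fdatasync alone.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <fcntl.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper: overwrite the first `size` bytes of `path`
 * `iters` times, calling fdatasync() after each write, and return the
 * average microseconds per write + fdatasync pair, or -1.0 on error.
 */
double avg_sync_usec(const char *path, size_t size, int iters)
{
    struct timespec start, end;
    char *buf;
    int fd, i;

    if (iters <= 0)
        return -1.0;
    buf = calloc(size ? size : 1, 1);   /* zero-filled payload */
    if (buf == NULL)
        return -1.0;
    fd = open(path, O_CREAT | O_WRONLY, 0644);
    if (fd < 0)
    {
        free(buf);
        return -1.0;
    }

    clock_gettime(CLOCK_MONOTONIC, &start);
    for (i = 0; i < iters; i++)
    {
        /* same pattern as the test program: rewind, write, sync */
        if (lseek(fd, 0, SEEK_SET) < 0 ||
            write(fd, buf, size) != (ssize_t) size ||
            fdatasync(fd) != 0)
        {
            close(fd);
            free(buf);
            return -1.0;
        }
    }
    clock_gettime(CLOCK_MONOTONIC, &end);

    close(fd);
    free(buf);
    return ((end.tv_sec - start.tv_sec) * 1e6 +
            (end.tv_nsec - start.tv_nsec) / 1e3) / iters;
}
```

Calling it on the same file with sizes such as 160, 512, 4096 and 8192 gives
one average per size. The results depend on the drive, the filesystem and
the drive's write-cache setting, so no expected values are given here.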

Regards
jana
