Re: Column storage positions - Mailing list pgsql-hackers

From Sergey E. Koposov
Subject Re: Column storage positions
Date
Msg-id Pine.LNX.4.64.0702210042030.3994@lnfm1.sai.msu.ru
Whole thread Raw
In response to Column storage positions  ("Phil Currier" <pcurrier@gmail.com>)
List pgsql-hackers
Just as my 2 cents to the proposed idea. 
I want to demonstrate that the proposed idea is very relevant for the
performance.

I recently did an migration from PG 8.1 to PG 8.2. During that time I was 
dumping the 2TB database with several very wide tables (having ~ 200 
columns). And I saw that on my pretty powerful server with 8Gb 
RAM, Itanium2 procesor,large RAID which can do I/O at 100Mb/sec the 
performance of pg_dump was CPU limited, and the read speed of the tables 
was 1-1.5mb/sec (leading to 2 week dumping time).

I was very surprised by these times, and profiled postgres to check the 
reason of that:
here is the top of gprof:  %   cumulative   self              self     total time   seconds   seconds    calls   s/call
 s/call  name 60.72     13.52    13.52  6769826     0.00     0.00  nocachegetattr 10.58     15.88     2.36  9035566
0.00    0.00  CopyAttributeOutText  7.22     17.49     1.61 65009457     0.00     0.00  CopySendData  6.34     18.90
1.41        1     1.41    22.21  CopyTo
 

So the main slow-down of the process was all this code recomputing the 
boundaries of the columns.... I checked that by removing one tiny varchar 
column and COALESCING all NULLs, and after that the performance of 
pg_dumping increased by more than a factor of 2!

I should have reported that experience earlier... but I hope that my 
observations can be useful in the context of the Phil's idea.

regards,    Sergey

*******************************************************************
Sergey E. Koposov
Max Planck Institute for Astronomy/Cambridge Institute for Astronomy/Sternberg Astronomical Institute
Tel: +49-6221-528-349
Web: http://lnfm1.sai.msu.ru/~math
E-mail: math@sai.msu.ru


pgsql-hackers by date:

Previous
From: "Nikolay Samokhvalov"
Date:
Subject: --enable-xml instead of --with-libxml?
Next
From: Robert Treat
Date:
Subject: Re: Column storage positions