Thread: Performance Bottleneck
I run a Perl/CGI driven website that makes extensive use of PostgreSQL (7.4.3) for everything from user information to formatting and display of specific sections of the site. The server itself is a dual-processor AMD Opteron 1.4GHz with 2GB RAM and 2 x 120GB hard drives mirrored for redundancy, running FreeBSD 5.2.1 (AMD64).

Recently, load on the site has increased during peak hours to the point of showing considerable loss in performance. This can be observed when connections move from about 120 concurrent connections to PostgreSQL to roughly 175 or more. Essentially, the machine seems to struggle to keep up with continual requests and slows down correspondingly as resources are tied up.

Code changes have been made to the scripts to back off in high-load environments, which has worked to an extent. However, as load continues to increase, the database itself is not taking well to the added traffic.

I have taken a look at 'Tuning PostgreSQL for Performance' (http://www.varlena.com/GeneralBits/Tidbits/perf.html) and used it as best I could to set my settings. However, even with statistics disabled and every setting tweaked, things still continue to deteriorate.

Is there anything anyone can recommend to give the system a necessary speed boost? It would seem to me that a modest dataset of roughly a gigabyte combined with that type of hardware should be able to handle substantially more load than it does. Can anyone provide me with clues as to where to pursue this? Would disabling 'fsync' provide more performance if I accept that information may be lost in case of a crash?

If anyone needs access to logs, settings, et cetera, please ask; I simply wish to test the waters first on what is needed. Thanks!

Martin Foster
martin@ethereal-realms.org
Scott Marlowe wrote:
> On Tue, 2004-08-03 at 12:05, Martin Foster wrote:
>
>> <<<snip: original post quoted in full>>>
> Are you properly vacuuming the database and have you set your fsm
> settings high enough? You may need to do a vacuum full first to reclaim
> lost space.
>
> Post your postgresql.conf settings if you can.

Actually, I have it do a VACUUM FULL ANALYZE nightly now. When it ran weekly, the maintenance process took about five minutes; now it takes less than a minute. I was also using the pg_autovacuum daemon, with little to no performance improvement overall when loads were light, but heavy penalties during high load averages. It has since been disabled in favour of the script-based maintenance I already had in place.

Martin Foster
Creator/Designer Ethereal Realms
martin@ethereal-realms.org
On Tue, 2004-08-03 at 12:05, Martin Foster wrote:
> <<<snip: original post quoted in full>>>

Are you properly vacuuming the database and have you set your fsm settings high enough?
You may need to do a vacuum full first to reclaim lost space. Post your postgresql.conf settings if you can.
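For reference, a minimal sketch of what is being suggested here; the FSM values are illustrative guesses, not numbers tuned for this workload (syntax as of PostgreSQL 7.4):

```
-- Reclaim dead space and refresh planner statistics.
-- VACUUM FULL takes exclusive locks, so run it off-peak.
VACUUM FULL ANALYZE;

-- Then size the free space map in postgresql.conf so routine
-- (non-FULL) vacuums can keep track of reusable pages.
-- Changing these requires a server restart in 7.4:
--   max_fsm_pages     = 50000   -- pages with free space to remember
--   max_fsm_relations = 1000    -- number of relations tracked
```

If the FSM is too small for the volume of updates between vacuums, dead space accumulates and tables bloat even though VACUUM runs regularly.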
Martin Foster wrote:
> <<<snip: original post quoted in full>>>

There are several aspects that are performance-related.

First, try to find out exactly where the bottleneck is. Are you sure that it's the database server backends? (I.e. not the web server, not the Perl CGIs or anything else? I assume you use mod_perl, do you?) Are you running out of processor time, or out of RAM so the machine starts swapping/paging, or is there an I/O bandwidth bottleneck, or maybe it's even the network? Commands like "top", "vmstat", "iostat", "netstat" etc. will help find out. If "top" displays nearly 100% "user CPU state", then processor time is your problem; otherwise it's more likely that I/O is the bottleneck.

As long as you have enough RAM available, you can try increasing the FSM values etc. in your postgresql.conf (you might need to increase the kernel limits on SysV shared memory, too). Disabling fsync, as you suggested, will improve performance somewhat (especially if you have a lot of insert and update commands), at the cost of an increased risk of inconsistencies and data loss in the case of a crash. If you're willing to go that far, you might as well mount your data filesystem async (or with soft-updates), and with the "noatime" flag.
Second, there are several ways to tune FreeBSD's settings for your particular application. I suggest that you study the tuning(7) manual page. It covers a lot of topics, ranging from optimized newfs filesystem parameters to things like network buffers, VFS settings and VM settings. It also contains pointers to several other related manual pages.

Third, there might be room for optimizing your database queries. Use the EXPLAIN ANALYZE command to find out which query plans are chosen for your SELECT statements. Maybe there are ways you can improve on them, e.g. by creating an appropriate index, by re-formulating the queries in a different way, or by adapting your database design in a way that suits your queries better and allows for improved performance. (If in doubt, you can post your queries on this list, and I'm sure someone will have good advice for you.)

Best regards
Oliver

--
Oliver Fromme, secnetix GmbH & Co KG, Oettingenstr. 2, 80538 München

Any opinions expressed in this message may be personal to the author and may not necessarily reflect the opinions of secnetix in any way.

"To this day, many C programmers believe that 'strong typing' just means pounding extra hard on the keyboard." -- Peter van der Linden
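To make the third point concrete, a small sketch with an invented table and column (not taken from Martin's actual schema):

```
-- See the real plan and per-node timings for a slow query:
EXPLAIN ANALYZE
SELECT * FROM users WHERE username = 'example';

-- If the plan shows a sequential scan over a large table for an
-- equality filter, an index on that column may help:
CREATE INDEX users_username_idx ON users (username);
ANALYZE users;  -- refresh statistics so the planner considers the index
```

Re-running the EXPLAIN ANALYZE afterwards shows whether the planner actually switched to an index scan and how much time it saved.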
I have tried to COPY the contents of the .csv file into my database called TEST_DB, but I'm still doing something incorrectly. Currently, database TEST_DB has one table called "matters".

In my attempts listed below, I specified the name of each column/field in my database and specified the delimiter used in the .csv file, but still no go:

TEST_DB-# copy matters (client_1_lastname, client_1_firstname, client_1_address, client_1_phone, client_2_lastname, client_2_firstname, client_2_address, client_2_phone, carrier_name, claim_rep, claim_num, responsible, supervisory, second, client_num, matter_num, file_name, adverse_1_lastname, adverse_1_firstname, adverse_2_lastname, adverse_2_firstname, adverse_3_lastname, adverse_3_firstname, other_party_lastname, other_party_firstname, case_type, billing_type, date_of_loss, judicial_district, date_open, date_close) from '/home/floog/TEST_DB.csv' DELIMITER ',';
ERROR: syntax error at or near "client_1_lastname" at character 2

TEST_DB=# copy matters client_1_lastname, client_1_firstname, client_1_address, client_1_phone, client_2_lastname, client_2_firstname, client_2_address, client_2_phone, carrier_name, claim_rep, claim_num, responsible, supervisory, second, client_num, matter_num, file_name, adverse_1_lastname, adverse_1_firstname, adverse_2_lastname, adverse_2_firstname, adverse_3_lastname, adverse_3_firstname, other_party_lastname, other_party_firstname, case_type, billing_type, date_of_loss, judicial_district, date_open, date_close from '/home/floog/TEST_DB.csv' DELIMITER ',';
ERROR: syntax error at or near "client_1_lastname" at character 14

What did I do incorrectly? Thank you for your time and assistance.

Mike
I have tried again to copy using a suggestion sent to me by another new user on the list:

Attempt 1 --

TEST_DB=# copy matters from '/home/floog/TEST_DB.csv' delimiter ',';
ERROR: could not open file "/home/floog/TEST_DB.csv" for reading: Permission denied

Okay, so I chmod the file to be read/write by anyone (chmod 0777 /home/floog/TEST_DB.csv):

TEST_DB=# copy matters from '/home/floog/TEST_DB.csv' delimiter ',';
ERROR: could not open file "/home/floog/TEST_DB.csv" for reading: Permission denied

Still no go, so I copied the file into my PGDATA directory, /usr/local/postgres:

TEST_DB=# copy matters from '/usr/local/postgres/TEST_DB.csv' delimiter ',';
ERROR: invalid input syntax for integer: ""CFW""
CONTEXT: COPY matters, line 1, column matter_num: ""CFW""
TEST_DB=# select * from matters;
 client_1_lastname | client_1_firstname | client_1_address | <<<snip: remaining column headers and separator line>>>
(0 rows)
TEST_DB=#

I understand the first error about the "CFW" being in a field designated in an integer format. That's okay.
CFW are the initials of a person at my office and that's just a data entry error -- wrong data in the wrong column. But I don't understand why none of the data from the .csv file was copied to the database. SELECT * FROM matters results in nothing but a line of hyphens.

Yikes, I can't get off the ground. I can't imagine manually inserting 2 megs of stuff one entry at a time. Please help if you can. I appreciate your time and patience.

Regards,
Mike
On Wed, 2004-08-04 at 16:29, michael@floog.net wrote:
> TEST_DB=# copy matters from '/home/floog/TEST_DB.csv' delimiter ',';
> ERROR: could not open file "/home/floog/TEST_DB.csv" for reading:
> Permission denied
>
> Okay, so I chmod the file to be read/write by anyone -- (chmod 0777
> /home/floog/TEST_DB.csv)
>
> TEST_DB=# copy matters from '/home/floog/TEST_DB.csv' delimiter ',';
> ERROR: could not open file "/home/floog/TEST_DB.csv" for reading:
> Permission denied

You also have to make the whole path accessible; presumably the directory /home/floog is not accessible to the postgres user. (The backend runs with the permissions of the user who starts the postmaster, which is commonly postgres.)

> Still no go so I copied the file into my PGDATA directory -
> /usr/local/postgres
>
> TEST_DB=# copy matters from '/usr/local/postgres/TEST_DB.csv' delimiter ',';
> ERROR: invalid input syntax for integer: ""CFW""
> CONTEXT: COPY matters, line 1, column matter_num: ""CFW""
> TEST_DB=# select * from matters;
> <<<snip>>>
> (0 rows)
>
> But I don't understand why none of the data from the .csv file was
> copied to the database.

COPY happens in a transaction; either all of the data goes in or none of it does.

--
Oliver Elphick olly@lfix.co.uk
Isle of Wight http://www.lfix.co.uk/oliver
GPG: 1024D/A54310EA 92C8 39E7 280E 3631 3F0E 1EC0 5664 7A2F A543 10EA
========================================
"And not only so, but we glory in tribulations also; knowing that tribulation worketh patience; And patience, experience; and experience, hope." Romans 5:3,4
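A quick illustration of that all-or-nothing behaviour, using a made-up one-column table and a two-line input file where only the second line is bad:

```
-- /tmp/bad.txt contains two lines:
--   1
--   oops
CREATE TABLE t (n integer);
COPY t FROM '/tmp/bad.txt';   -- fails on line 2: invalid input for integer
SELECT count(*) FROM t;       -- 0: the valid first row was rolled back too
```

Because the failure aborts the whole COPY, the valid rows that preceded the bad one never become visible.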
On Wed, 2004-08-04 at 15:40, michael@floog.net wrote:
> I have tried to COPY the contents of the .csv file into my database
> called TEST_DB, but I'm still doing something incorrectly.
>
> TEST_DB-# copy matters (client_1_lastname, client_1_firstname, ...
> date_close) from '/home/floog/TEST_DB.csv' DELIMITER ',';
>
> ERROR: syntax error at or near "client_1_lastname" at character 2
>
> What did I do incorrectly?

You had a previous line without a terminating semi-colon. Notice that the prompt (TEST_DB-# rather than TEST_DB=#) indicates that this is a continuation line.

--
Oliver Elphick olly@lfix.co.uk
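To make the prompt behaviour concrete, here is a minimal psql exchange (with a trivial statement in place of the real COPY) showing how a missing semicolon leaves psql waiting, and how the `-#` prompt marks the continuation:

```
TEST_DB=# select 1        -- no semicolon: the statement has not been sent yet
TEST_DB-# ;               -- '-#' is the continuation prompt; ';' finishes it
 ?column?
----------
        1
(1 row)
```

Whatever is typed at the `-#` prompt is appended to the unfinished statement, which is why Mike's column list produced a syntax error at character 2: it was parsed as the tail end of the previous incomplete command.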
Hi Oliver,

This time the copy command appeared to work, except for the "CFW" entry in the integer-formatted column, but then trying to view the data shows the long hyphenated line again and no readable data. What do you think is causing this?

Thank you very much for hanging in there with me on this issue.

Mike

TEST_DB=# copy matters (client_1_lastname, client_1_firstname, <<<snip>>>
TEST_DB(# case_type, billing_type, date_of_loss, judicial_district, date_open,
TEST_DB(# date_close) from '/usr/local/postgres/TEST_DB.csv' DELIMITER ',';
ERROR: invalid input syntax for integer: ""CFW""
CONTEXT: COPY matters, line 1, column matter_num: ""CFW""
TEST_DB=# select * from matters;
 client_1_lastname | client_1_firstname | client_1_address | <<<snip: remaining column headers and separator line>>>
(0 rows)
TEST_DB=#

Quoting Oliver Elphick <olly@lfix.co.uk>:
> You had a previous line without a terminating semi-colon. Notice that
> the prompt (TEST_DB-# rather than TEST_DB=#) indicates that this is a
> continuation line.
Ahhhh, I see. The light is slowly beginning to filter through the cobwebs. :-) Let's hope the first steps are the toughest and most confusing.

Mike

Quoting Oliver Elphick <olly@lfix.co.uk>:
> COPY happens in a transaction; either all of the data goes in or none
> of it does.
On Wed, 2004-08-04 at 17:16, michael@floog.net wrote:
> This time the copy command appeared to work, except for the "CFW" entry
> in the integer-formatted column, but then trying to view the data shows
> the long hyphenated line again and no readable data.
>
> What do you think is causing this?

Precisely that. Either all the data goes in or none of it does. Clean up your input file and it will work.

--
Oliver Elphick olly@lfix.co.uk
Definitely will do this when I get home from work tonight. Right now I'm ssh'ing into my box at home, but I don't have an X Windows server installed on my Win-bloze machine here at work. It's too difficult to edit the .csv file from a console with vim; I need X so I can run the OpenOffice spreadsheet to see the file better.

I'm wondering if I may need to reconfigure the fields using CREATE TABLE and add NULL to each field. I know that the .csv file definitely has blank cells scattered throughout that were never filled in with text or integers. I wonder if subsequent cell data is getting shifted because COPY won't accept empty space between comma-separated values. Or is this not an issue?

Mike

Quoting Oliver Elphick <olly@lfix.co.uk>:
> Precisely that. Either all the data goes in or none of it does.
> Clean up your input file and it will work.
On Wed, 2004-08-04 at 19:15, michael@floog.net wrote:
> I'm wondering If I may need to reconfigure the fields using CREATE
> TABLE, and add NULL to each field. I know that the .csv file
> definitely has blank cells scattered throughout that were never filled
> in with text or integers. I wonder if subsequent cell data is getting
> moved because COPY won't accept empty space between comma separated
> values. Or is this not an issue.

COPY will accept empty space, but it will treat an empty space as an empty string, which is not the same as NULL. If that is meant to go in an integer or date field, it will be invalid. Use \N to represent nulls in the input file (or use some other combination and specify it when invoking COPY).

--
Oliver Elphick olly@lfix.co.uk
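Since the file comes from a spreadsheet export, the empty cells will arrive as empty strings between commas. Rather than editing the file to insert \N markers, the null string can be redefined when invoking COPY (7.4 syntax, using the file path from earlier in the thread):

```
-- Treat empty fields in the comma-delimited file as NULL
-- instead of as empty strings:
COPY matters FROM '/usr/local/postgres/TEST_DB.csv'
    WITH DELIMITER ',' NULL AS '';
```

With NULL AS '', an empty cell loads as NULL into integer and date columns instead of raising an "invalid input syntax" error. Note this only works if the columns are nullable, i.e. not declared NOT NULL.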
On Tue, Aug 03, 2004 at 06:05:23PM +0000, Martin Foster wrote:
> <<<snip: original post quoted in full>>>
> Martin Foster
> martin@ethereal-realms.org

Martin,

Maybe a connection pool middleware like pgpool could help. This would allow a lower number of backends to serve the same frontends more effectively and reduce the connection startup costs.

--Ken