Thread: Performance Bottleneck
I run a Perl/CGI driven website that makes extensive use of PostgreSQL (7.4.3) for everything from user information to formatting and display of specific sections of the site. The server itself is a dual-processor AMD Opteron 1.4GHz with 2GB RAM and 2 x 120GB hard drives mirrored for redundancy, running FreeBSD 5.2.1 (AMD64).

Recently, load on the site has increased during peak hours to the point of showing considerable loss in performance. This can be observed when connections move from about 120 concurrent connections to PostgreSQL to roughly 175 or more. Essentially, the machine seems to struggle to keep up with continual requests and slows down correspondingly as resources are tied up.

Code changes have been made to the scripts to back off in high-load environments, which has worked to an extent. However, as load continues to increase, the database itself is not taking well to the added traffic.

I have taken a look at 'Tuning PostgreSQL for Performance' (http://www.varlena.com/GeneralBits/Tidbits/perf.html) and used it as best I could to set my settings. However, even with statistics disabled and every setting tweaked, things still continue to deteriorate.

Is there anything anyone can recommend to give the system a necessary speed boost? It would seem to me that a modest dataset of roughly a gigabyte combined with that type of hardware should be able to handle substantially more load than it does. Can anyone provide me with clues as to where to pursue this? Would disabling 'fsync' provide more performance if I accept that information may be lost in case of a crash?

If anyone needs access to logs, settings, et cetera, please ask; I simply wish to test the waters first on what is needed. Thanks!

Martin Foster
martin@ethereal-realms.org
Scott Marlowe wrote:
> On Tue, 2004-08-03 at 12:05, Martin Foster wrote:
>
>> <<<snip: original post quoted in full>>>
> Are you properly vacuuming the database and have you set your fsm
> settings high enough? You may need to do a vacuum full first to reclaim
> lost space.
>
> Post your postgresql.conf settings if you can.

Actually, I have it do a VACUUM FULL ANALYZE nightly now. When it ran weekly, the maintenance process took about five minutes; now it takes less than a minute. I was also using the pg_autovacuum daemon, with little to no performance improvement overall when loads were light, but heavy penalties during high load averages. It has since been disabled in favour of the script-based maintenance I already had in place.

Martin Foster
Creator/Designer Ethereal Realms
martin@ethereal-realms.org
On Tue, 2004-08-03 at 12:05, Martin Foster wrote:
> <<<snip: original post quoted in full>>>

Are you properly vacuuming the database and have you set your fsm settings high enough?
You may need to do a vacuum full first to reclaim lost space. Post your postgresql.conf settings if you can.
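For reference, a minimal sketch of what is being suggested here; the FSM values are illustrative guesses, not numbers tuned for this workload (syntax as of PostgreSQL 7.4):

```
-- Reclaim dead space and refresh planner statistics.
-- VACUUM FULL takes exclusive locks, so run it off-peak.
VACUUM FULL ANALYZE;

-- Then size the free space map in postgresql.conf so routine
-- (non-FULL) vacuums can keep track of reusable pages.
-- Changing these requires a server restart in 7.4:
--   max_fsm_pages     = 50000   -- pages with free space to remember
--   max_fsm_relations = 1000    -- number of relations tracked
```

If the FSM is too small for the volume of updates between vacuums, dead space accumulates and tables bloat even though VACUUM runs regularly.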
Martin Foster wrote:
> <<<snip: original post quoted in full>>>

There are several aspects that are performance-related.

First, try to find out exactly where the bottleneck is. Are you sure that it's the database server backends? (I.e. not the web server, not the Perl CGIs or anything else? I assume you use mod_perl, do you?) Are you running out of processor time, or out of RAM so the machine starts swapping/paging, or is there an I/O bandwidth bottleneck, or maybe it's even the network? Commands like "top", "vmstat", "iostat", "netstat" etc. will help find out. If "top" displays nearly 100% "user CPU state", then processor time is your problem; otherwise it's more likely that I/O is the bottleneck.

As long as you have enough RAM available, you can try increasing the FSM values etc. in your postgresql.conf (you might need to increase the kernel limits on SysV shared memory, too). Disabling fsync, as you suggested, will improve performance somewhat (especially if you have a lot of insert and update commands), at the cost of an increased risk of inconsistencies and data loss in the case of a crash. If you're willing to go that far, you might as well mount your data filesystem async (or with soft-updates), and with the "noatime" flag.
Second, there are several ways to tune FreeBSD's settings for your particular application. I suggest that you study the tuning(7) manual page. It covers a lot of topics, ranging from optimized newfs filesystem parameters to things like network buffers, VFS settings and VM settings. It also contains pointers to several other related manual pages.

Third, there might be room for optimizing your database queries. Use the EXPLAIN ANALYZE command to find out which query plans are chosen for your SELECT statements. Maybe there are ways you can improve on them, e.g. by creating an appropriate index, by re-formulating the queries in a different way, or by adapting your database design in a way that suits your queries better and allows for improved performance. (If in doubt, you can post your queries on this list, and I'm sure someone will have good advice for you.)

Best regards
Oliver

--
Oliver Fromme, secnetix GmbH & Co KG, Oettingenstr. 2, 80538 München

Any opinions expressed in this message may be personal to the author and may not necessarily reflect the opinions of secnetix in any way.

"To this day, many C programmers believe that 'strong typing' just means pounding extra hard on the keyboard." -- Peter van der Linden
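To make the third point concrete, a small sketch with an invented table and column (not taken from Martin's actual schema):

```
-- See the real plan and per-node timings for a slow query:
EXPLAIN ANALYZE
SELECT * FROM users WHERE username = 'example';

-- If the plan shows a sequential scan over a large table for an
-- equality filter, an index on that column may help:
CREATE INDEX users_username_idx ON users (username);
ANALYZE users;  -- refresh statistics so the planner considers the index
```

Re-running the EXPLAIN ANALYZE afterwards shows whether the planner actually switched to an index scan and how much time it saved.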
I have tried to COPY the contents of the .csv file into my database called TEST_DB, but I'm still doing something incorrectly. Currently, database TEST_DB has one table called "matters".

In my attempts listed below, I specified the name of each column/field in my database and specified the delimiter used in the .csv file, but still no go:

TEST_DB-# copy matters (client_1_lastname, client_1_firstname, client_1_address, client_1_phone, client_2_lastname, client_2_firstname, client_2_address, client_2_phone, carrier_name, claim_rep, claim_num, responsible, supervisory, second, client_num, matter_num, file_name, adverse_1_lastname, adverse_1_firstname, adverse_2_lastname, adverse_2_firstname, adverse_3_lastname, adverse_3_firstname, other_party_lastname, other_party_firstname, case_type, billing_type, date_of_loss, judicial_district, date_open, date_close) from '/home/floog/TEST_DB.csv' DELIMITER ',';
ERROR: syntax error at or near "client_1_lastname" at character 2

TEST_DB=# copy matters client_1_lastname, client_1_firstname, client_1_address, client_1_phone, client_2_lastname, client_2_firstname, client_2_address, client_2_phone, carrier_name, claim_rep, claim_num, responsible, supervisory, second, client_num, matter_num, file_name, adverse_1_lastname, adverse_1_firstname, adverse_2_lastname, adverse_2_firstname, adverse_3_lastname, adverse_3_firstname, other_party_lastname, other_party_firstname, case_type, billing_type, date_of_loss, judicial_district, date_open, date_close from '/home/floog/TEST_DB.csv' DELIMITER ',';
ERROR: syntax error at or near "client_1_lastname" at character 14

What did I do incorrectly? Thank you for your time and assistance.

Mike
I have tried again to copy using a suggestion sent to me by another new user on the list:

Attempt 1 --

TEST_DB=# copy matters from '/home/floog/TEST_DB.csv' delimiter ',';
ERROR: could not open file "/home/floog/TEST_DB.csv" for reading: Permission denied

Okay, so I chmod the file to be read/write by anyone (chmod 0777 /home/floog/TEST_DB.csv):

TEST_DB=# copy matters from '/home/floog/TEST_DB.csv' delimiter ',';
ERROR: could not open file "/home/floog/TEST_DB.csv" for reading: Permission denied

Still no go, so I copied the file into my PGDATA directory, /usr/local/postgres:

TEST_DB=# copy matters from '/usr/local/postgres/TEST_DB.csv' delimiter ',';
ERROR: invalid input syntax for integer: ""CFW""
CONTEXT: COPY matters, line 1, column matter_num: ""CFW""
TEST_DB=# select * from matters;
 client_1_lastname | client_1_firstname | client_1_address | <<<snip: remaining column headers and separator line>>>
(0 rows)
TEST_DB=#

I understand the first error about the "CFW" being in a field designated in an integer format. That's okay.
CFW are the initials of a person at my office and that's just a data entry error -- wrong data in the wrong column. But I don't understand why none of the data from the .csv file was copied to the database. SELECT * FROM matters results in nothing but a line of hyphens.

Yikes, I can't get off the ground. I can't imagine manually inserting 2 megs of stuff one entry at a time. Please help if you can. I appreciate your time and patience.

Regards,
Mike
On Wed, 2004-08-04 at 16:29, michael@floog.net wrote:
> TEST_DB=# copy matters from '/home/floog/TEST_DB.csv' delimiter ',';
> ERROR: could not open file "/home/floog/TEST_DB.csv" for reading:
> Permission denied
>
> Okay, so I chmod the file to be read/write by anyone -- (chmod 0777
> /home/floog/TEST_DB.csv)
>
> TEST_DB=# copy matters from '/home/floog/TEST_DB.csv' delimiter ',';
> ERROR: could not open file "/home/floog/TEST_DB.csv" for reading:
> Permission denied

You also have to make the whole path accessible; presumably the directory /home/floog is not accessible to the postgres user. (The backend runs with the permissions of the user who starts the postmaster, which is commonly postgres.)

> Still no go so I copied the file into my PGDATA directory -
> /usr/local/postgres
>
> TEST_DB=# copy matters from '/usr/local/postgres/TEST_DB.csv' delimiter ',';
> ERROR: invalid input syntax for integer: ""CFW""
> CONTEXT: COPY matters, line 1, column matter_num: ""CFW""
> TEST_DB=# select * from matters;
> <<<snip>>>
> (0 rows)
>
> But I don't understand why none of the data from the .csv file was
> copied to the database.

COPY happens in a transaction; either all of the data goes in or none of it does.

--
Oliver Elphick olly@lfix.co.uk
Isle of Wight http://www.lfix.co.uk/oliver
GPG: 1024D/A54310EA 92C8 39E7 280E 3631 3F0E 1EC0 5664 7A2F A543 10EA
========================================
"And not only so, but we glory in tribulations also; knowing that tribulation worketh patience; And patience, experience; and experience, hope." Romans 5:3,4
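A quick illustration of that all-or-nothing behaviour, using a made-up one-column table and a two-line input file where only the second line is bad:

```
-- /tmp/bad.txt contains two lines:
--   1
--   oops
CREATE TABLE t (n integer);
COPY t FROM '/tmp/bad.txt';   -- fails on line 2: invalid input for integer
SELECT count(*) FROM t;       -- 0: the valid first row was rolled back too
```

Because the failure aborts the whole COPY, the valid rows that preceded the bad one never become visible.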
On Wed, 2004-08-04 at 15:40, michael@floog.net wrote:
> I have tried to COPY the contents of the .csv file into my database
> called TEST_DB, but I'm still doing something incorrectly.
>
> TEST_DB-# copy matters (client_1_lastname, client_1_firstname, ...
> date_close) from '/home/floog/TEST_DB.csv' DELIMITER ',';
>
> ERROR: syntax error at or near "client_1_lastname" at character 2
>
> What did I do incorrectly?

You had a previous line without a terminating semi-colon. Notice that the prompt (TEST_DB-# rather than TEST_DB=#) indicates that this is a continuation line.

--
Oliver Elphick olly@lfix.co.uk
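To make the prompt behaviour concrete, here is a minimal psql exchange (with a trivial statement in place of the real COPY) showing how a missing semicolon leaves psql waiting, and how the `-#` prompt marks the continuation:

```
TEST_DB=# select 1        -- no semicolon: the statement has not been sent yet
TEST_DB-# ;               -- '-#' is the continuation prompt; ';' finishes it
 ?column?
----------
        1
(1 row)
```

Whatever is typed at the `-#` prompt is appended to the unfinished statement, which is why Mike's column list produced a syntax error at character 2: it was parsed as the tail end of the previous incomplete command.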
Hi Oliver,

This time the copy command appeared to work, except for the "CFW" entry in the integer-formatted column, but then trying to view the data shows the long hyphenated line again and no readable data. What do you think is causing this?

Thank you very much for hanging in there with me on this issue.

Mike

TEST_DB=# copy matters (client_1_lastname, client_1_firstname, <<<snip>>>
TEST_DB(# case_type, billing_type, date_of_loss, judicial_district, date_open,
TEST_DB(# date_close) from '/usr/local/postgres/TEST_DB.csv' DELIMITER ',';
ERROR: invalid input syntax for integer: ""CFW""
CONTEXT: COPY matters, line 1, column matter_num: ""CFW""
TEST_DB=# select * from matters;
 client_1_lastname | client_1_firstname | client_1_address | <<<snip: remaining column headers and separator line>>>
(0 rows)
TEST_DB=#

Quoting Oliver Elphick <olly@lfix.co.uk>:
> You had a previous line without a terminating semi-colon. Notice that
> the prompt (TEST_DB-# rather than TEST_DB=#) indicates that this is a
> continuation line.
Ahhhh, I see. The light is slowly beginning to filter through the cobwebs. :-) Let's hope the first steps are the toughest and most confusing.

Mike

Quoting Oliver Elphick <olly@lfix.co.uk>:
> COPY happens in a transaction; either all of the data goes in or none
> of it does.
On Wed, 2004-08-04 at 17:16, michael@floog.net wrote:
> This time the copy command appeared to work, except for the "CFW" entry
> in the integer-formatted column, but then trying to view the data shows
> the long hyphenated line again and no readable data.
>
> What do you think is causing this?

Precisely that. Either all the data goes in or none of it does. Clean up your input file and it will work.

--
Oliver Elphick olly@lfix.co.uk
Definitely will do this when I get home from work tonight. Right now I'm ssh'ing into my box at home, but I don't have an X Windows server installed on my Win-bloze machine here at work. It's too difficult to edit the .csv file from a console with vim; I need X so I can run the OpenOffice spreadsheet to see the file better.

I'm wondering if I may need to reconfigure the fields using CREATE TABLE and add NULL to each field. I know that the .csv file definitely has blank cells scattered throughout that were never filled in with text or integers. I wonder if subsequent cell data is getting shifted because COPY won't accept empty space between comma-separated values. Or is this not an issue?

Mike

Quoting Oliver Elphick <olly@lfix.co.uk>:
> Precisely that. Either all the data goes in or none of it does.
> Clean up your input file and it will work.
On Wed, 2004-08-04 at 19:15, michael@floog.net wrote:
> I'm wondering If I may need to reconfigure the fields using CREATE
> TABLE, and add NULL to each field. I know that the .csv file
> definitely has blank cells scattered throughout that were never filled
> in with text or integers. I wonder if subsequent cell data is getting
> moved because COPY won't accept empty space between comma separated
> values. Or is this not an issue.

COPY will accept empty space, but it will treat an empty space as an empty string, which is not the same as NULL. If that is meant to go in an integer or date field, it will be invalid. Use \N to represent nulls in the input file (or use some other combination and specify it when invoking COPY).

--
Oliver Elphick olly@lfix.co.uk
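Since the file comes from a spreadsheet export, the empty cells will arrive as empty strings between commas. Rather than editing the file to insert \N markers, the null string can be redefined when invoking COPY (7.4 syntax, using the file path from earlier in the thread):

```
-- Treat empty fields in the comma-delimited file as NULL
-- instead of as empty strings:
COPY matters FROM '/usr/local/postgres/TEST_DB.csv'
    WITH DELIMITER ',' NULL AS '';
```

With NULL AS '', an empty cell loads as NULL into integer and date columns instead of raising an "invalid input syntax" error. Note this only works if the columns are nullable, i.e. not declared NOT NULL.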
On Tue, Aug 03, 2004 at 06:05:23PM +0000, Martin Foster wrote:
> <<<snip: original post quoted in full>>>
> Martin Foster
> martin@ethereal-realms.org

Martin,

Maybe a connection pool middleware like pgpool could help. This would allow a lower number of backends to serve the same frontends more effectively and reduce the connection startup costs.

--Ken