Thread: Database performs massive reads when I'm doing writes.
Hello: I have a PostgreSQL 9.0.3 database that has suddenly started exhibiting odd read behavior. The version number is: "PostgreSQL 9.0.3 on x86_64-unknown-linux-gnu, compiled by GCC gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-50), 64-bit" This was compiled from source about a year ago and has been running just fine. The database feeds a very low-volume web site (a few hits per minute). Starting a few days ago we started to see a strange behavior where writing to the database causes massive read operations. For example, I have a table that needs to be updated every night (about 20,000 rows). Using Django ORM, we update them one item at a time. Usually the overall process takes a few minutes; it is now taking hours (like over 15 hours). Running atop, we're seeing Read Disk values in the range of 147.2M/10s and Write Disk values in the range of 16K/10s. Together, the Disk throughput is in the high 90% and frequently hits 100%. When I'm not writing to the database, it becomes quiet without any disk activity. This is a 256 MByte system and atop shows we're not swapping. There's nothing unusual in the PostgreSQL logs. Appreciate any pointers as to how I go about identifying the root cause of this problem. Thanks. --Ray ---------------------------------------------- Ray Cote, President Appropriate Solutions, Inc. We Build Software www.AppropriateSolutions.com 603.924.6079
On Thu, Jun 6, 2013 at 9:53 AM, Ray Cote <rgacote@appropriatesolutions.com> wrote:
Starting a few days ago we started to see a strange behavior where writing to the database causes massive read operations.
For example, I have a table that needs to be updated every night (about 20,000 rows).
Using Django ORM, we update them one item at a time.
Usually the overall process takes a few minutes; it is now taking hours (like over 15 hours).
Running atop, we're seeing Read Disk values in the range of 147.2M/10s and Write Disk values in the range of 16K/10s.
Together, the Disk throughput is in the high 90% and frequently hits 100%.
When I'm not writing to the database, it becomes quiet without any disk activity.
This is a 256 MByte system and atop shows we're not swapping
Is autovacuum set to 'on'?
Possibly there is bloat in your table and indexes; what happens to the system after you issue a manual "VACUUM ANALYZE VERBOSE"? If that helps, you might need to make your autovacuum settings more aggressive.
From: "bricklen" <bricklen@gmail.com>Yes Autovacuum is set to 'on'.
To: "Ray Cote" <rgacote@appropriatesolutions.com>
Cc: pgsql-general@postgresql.org
Sent: Thursday, June 6, 2013 3:44:04 PM
Subject: Re: [GENERAL] Database performs massive reads when I'm doing writes.On Thu, Jun 6, 2013 at 9:53 AM, Ray Cote <rgacote@appropriatesolutions.com> wrote:Starting a few days ago we started to see a strange behavior where writing to the database causes massive read operations.
For example, I have a table that needs to be updated every night (about 20,000 rows).
Using Django ORM, we update them one item at a time.
Usually the overall process takes a few minutes; it is now taking hours (like over 15 hours).
Running atop, we're seeing Read Disk values in the range of 147.2M/10s and Write Disk values in the range of 16K/10s.
Together, the Disk throughput is in the high 90% and frequently hits 100%.
When I'm not writing to the database, it becomes quiet without any disk activity.
This is a 256 MByte system and atop shows we're not swappingIs autovacuum set to 'on'?Possibly there is bloat in your table and indexes; what happens to the system after you issue a manual "VACUUM ANALYZE VERBOSE"? If that helps, you might need to make your autovacuum settings more aggressive.
I'll run a Vacuum Analyze Verbose after the data load finally completes (which could easily be another two hours).
Don't want to touch the system until this data finally gets loaded.
Thanks for the recommendation.
--Ray
--
Ray Cote, President Appropriate Solutions, Inc.
We Build Software
www.AppropriateSolutions.com 603.924.6079
We Build Software
www.AppropriateSolutions.com 603.924.6079
On Thu, Jun 6, 2013 at 9:53 AM, Ray Cote <rgacote@appropriatesolutions.com> wrote:
Hello:
I have a PostgreSQL 9.0.3 database that has suddenly started exhibiting odd read behavior.
The version number is:
"PostgreSQL 9.0.3 on x86_64-unknown-linux-gnu, compiled by GCC gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-50), 64-bit"
This was compiled from source about a year ago and has been running just fine.
The database feeds a very low-volume web site (a few hits per minute).
Starting a few days ago we started to see a strange behavior where writing to the database causes massive read operations.
For example, I have a table that needs to be updated every night (about 20,000 rows).
Using Django ORM, we update them one item at a time.
Usually the overall process takes a few minutes; it is now taking hours (like over 15 hours).
Running atop, we're seeing Read Disk values in the range of 147.2M/10s and Write Disk values in the range of 16K/10s.
Together, the Disk throughput is in the high 90% and frequently hits 100%.
Which specific processes are using the disk? (on my atop, if I hit 'c' it will change mode to show me the entire so-called "command line", which will let you know if the culprit is a vacuum worker, the checkpointer, or someone else).
Cheers,
Jeff
It is Postmaster itself:
11068 - 315.9M 136K 0K 85% postmaster
11000 - 56808K 8K 0K 15% postmaster
11003 - 0K 80K 0K 0% postmaster
11004 - 0K 24K 0K 0% postmaster
11067 - 0K 0K 0K 0% load_rets.py
The above are the atop lines for just postmaster. This is a 10s snapshot so you can see lots of read activity.
The load_rets.py task is the Python script loading the database.
--Ray
--
11068 - 315.9M 136K 0K 85% postmaster
11000 - 56808K 8K 0K 15% postmaster
11003 - 0K 80K 0K 0% postmaster
11004 - 0K 24K 0K 0% postmaster
11067 - 0K 0K 0K 0% load_rets.py
The above are the atop lines for just postmaster. This is a 10s snapshot so you can see lots of read activity.
The load_rets.py task is the Python script loading the database.
--Ray
From: "Jeff Janes" <jeff.janes@gmail.com>
To: "Ray Cote" <rgacote@appropriatesolutions.com>
Cc: pgsql-general@postgresql.org
Sent: Thursday, June 6, 2013 4:05:28 PM
Subject: Re: [GENERAL] Database performs massive reads when I'm doing writes.On Thu, Jun 6, 2013 at 9:53 AM, Ray Cote <rgacote@appropriatesolutions.com> wrote:Hello:
I have a PostgreSQL 9.0.3 database that has suddenly started exhibiting odd read behavior.
The version number is:
"PostgreSQL 9.0.3 on x86_64-unknown-linux-gnu, compiled by GCC gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-50), 64-bit"
This was compiled from source about a year ago and has been running just fine.
The database feeds a very low-volume web site (a few hits per minute).
Starting a few days ago we started to see a strange behavior where writing to the database causes massive read operations.
For example, I have a table that needs to be updated every night (about 20,000 rows).
Using Django ORM, we update them one item at a time.
Usually the overall process takes a few minutes; it is now taking hours (like over 15 hours).
Running atop, we're seeing Read Disk values in the range of 147.2M/10s and Write Disk values in the range of 16K/10s.
Together, the Disk throughput is in the high 90% and frequently hits 100%.Which specific processes are using the disk? (on my atop, if I hit 'c' it will change mode to show me the entire so-called "command line", which will let you know if the culprit is a vacuum worker, the checkpointer, or someone else).Cheers,Jeff
--
Ray Cote, President Appropriate Solutions, Inc.
We Build Software
www.AppropriateSolutions.com 603.924.6079
We Build Software
www.AppropriateSolutions.com 603.924.6079
And... it was a rookie mistake.
In answering the last question I once again looked at the database and realized there was an index missing!
Swear it has been there before (and I should have double-checked sooner).
Added the index and now we're moving right along.
Thanks for all leads everyone sent me.
Learned a lot about Postgresql today.
--Ray
--
In answering the last question I once again looked at the database and realized there was an index missing!
Swear it has been there before (and I should have double-checked sooner).
Added the index and now we're moving right along.
Thanks for all leads everyone sent me.
Learned a lot about Postgresql today.
--Ray
From: "Ray Cote" <rgacote@appropriatesolutions.com>
To: "Jeff Janes" <jeff.janes@gmail.com>
Cc: pgsql-general@postgresql.org
Sent: Thursday, June 6, 2013 5:12:51 PM
Subject: Re: [GENERAL] Database performs massive reads when I'm doing writes.p { margin: 0; } It is Postmaster itself:
11068 - 315.9M 136K 0K 85% postmaster
11000 - 56808K 8K 0K 15% postmaster
11003 - 0K 80K 0K 0% postmaster
11004 - 0K 24K 0K 0% postmaster
11067 - 0K 0K 0K 0% load_rets.py
The above are the atop lines for just postmaster. This is a 10s snapshot so you can see lots of read activity.
The load_rets.py task is the Python script loading the database.
--RayFrom: "Jeff Janes" <jeff.janes@gmail.com>
To: "Ray Cote" <rgacote@appropriatesolutions.com>
Cc: pgsql-general@postgresql.org
Sent: Thursday, June 6, 2013 4:05:28 PM
Subject: Re: [GENERAL] Database performs massive reads when I'm doing writes.On Thu, Jun 6, 2013 at 9:53 AM, Ray Cote <rgacote@appropriatesolutions.com> wrote:Hello:
I have a PostgreSQL 9.0.3 database that has suddenly started exhibiting odd read behavior.
The version number is:
"PostgreSQL 9.0.3 on x86_64-unknown-linux-gnu, compiled by GCC gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-50), 64-bit"
This was compiled from source about a year ago and has been running just fine.
The database feeds a very low-volume web site (a few hits per minute).
Starting a few days ago we started to see a strange behavior where writing to the database causes massive read operations.
For example, I have a table that needs to be updated every night (about 20,000 rows).
Using Django ORM, we update them one item at a time.
Usually the overall process takes a few minutes; it is now taking hours (like over 15 hours).
Running atop, we're seeing Read Disk values in the range of 147.2M/10s and Write Disk values in the range of 16K/10s.
Together, the Disk throughput is in the high 90% and frequently hits 100%.Which specific processes are using the disk? (on my atop, if I hit 'c' it will change mode to show me the entire so-called "command line", which will let you know if the culprit is a vacuum worker, the checkpointer, or someone else).Cheers,Jeff
--Ray Cote, President Appropriate Solutions, Inc.
We Build Software
www.AppropriateSolutions.com 603.924.6079
--
Ray Cote, President Appropriate Solutions, Inc.
We Build Software
www.AppropriateSolutions.com 603.924.6079
We Build Software
www.AppropriateSolutions.com 603.924.6079
On Thu, Jun 6, 2013 at 2:12 PM, Ray Cote <rgacote@appropriatesolutions.com> wrote:
It is Postmaster itself:
11068 - 315.9M 136K 0K 85% postmaster
11000 - 56808K 8K 0K 15% postmaster
11003 - 0K 80K 0K 0% postmaster
11004 - 0K 24K 0K 0% postmaster
11067 - 0K 0K 0K 0% load_rets.py
The above are the atop lines for just postmaster. This is a 10s snapshot so you can see lots of read activity.
The load_rets.py task is the Python script loading the database.
The postmaster has many children, each of which is responsible for something different. If you display the full command line rather than just the abbreviated one, it will give you more info on exactly which child is using the disk, for example:
PID DSK COMMAND-LINE
3950 16% postgres: jjanes jjanes [local] UPDATE
2978 0% postgres: checkpointer process
2982 0% postgres: stats collector process
Cheers,
Jeff
From: "Jeff Janes" <jeff.janes@gmail.com>Hi Jeff:
To: "Ray Cote" <rgacote@appropriatesolutions.com>
Cc: pgsql-general@postgresql.org
Sent: Thursday, June 6, 2013 5:41:00 PM
Subject: Re: [GENERAL] Database performs massive reads when I'm doing writes.On Thu, Jun 6, 2013 at 2:12 PM, Ray Cote <rgacote@appropriatesolutions.com> wrote:It is Postmaster itself:
11068 - 315.9M 136K 0K 85% postmaster
11000 - 56808K 8K 0K 15% postmaster
11003 - 0K 80K 0K 0% postmaster
11004 - 0K 24K 0K 0% postmaster
11067 - 0K 0K 0K 0% load_rets.py
The above are the atop lines for just postmaster. This is a 10s snapshot so you can see lots of read activity.
The load_rets.py task is the Python script loading the database.The postmaster has many children, each of which is responsible for something different. If you display the full command line rather than just the abbreviated one, it will give you more info on exactly which child is using the disk, for example:PID DSK COMMAND-LINE3950 16% postgres: jjanes jjanes [local] UPDATE2978 0% postgres: checkpointer process2982 0% postgres: stats collector processCheers,Jeff
I can see that now after pressing c in disk display mode:
19007 - D 86% postgres: postgres petersons 127.0.0.1(59702) UPDATE
11003 - S 12% postgres: writer process
11004 - S 2% postgres: wal writer process
This is how it looks when running properly (doing a big update at the moment).
Thanks for the lead. Learning much about atop today.
--Ray
--
Ray Cote, President Appropriate Solutions, Inc.
We Build Software
www.AppropriateSolutions.com 603.924.6079
We Build Software
www.AppropriateSolutions.com 603.924.6079