Thread: Checkpoint_segments optimal value
Hello All
I had a database load issue today, and when I was checking the logs I saw this message. I'm running PostgreSQL 8.3.
db::ts:2014-07-17 09:38:21 PDT::LOG: checkpoints are occurring too frequently (26 seconds apart)
db::ts:2014-07-17 09:38:21 PDT::HINT: Consider increasing the configuration parameter "checkpoint_segments".
db::ts:2014-07-17 09:38:21 PDT::LOG: checkpoint starting: xlog
Please suggest the optimal checkpoint_segments value I should set.
My current values are these:
checkpoint_segments = 32 # in logfile segments, min 1, 16MB each
#checkpoint_timeout = 5min # range 30s-1h
#checkpoint_completion_target = 0.5 # checkpoint target duration, 0.0 - 1.0
#checkpoint_warning = 30s # 0 is off
Current Memory values
# - Memory -
effective_cache_size = 8GB
shared_buffers = 1024MB # min 128kB or max_connections*16kB
# (change requires restart)
temp_buffers = 64MB # min 800kB
#max_prepared_transactions = 5 # can be 0 or more
# (change requires restart)
work_mem = 512MB # min 64kB
maintenance_work_mem = 2048MB # min 1MB !!! increased to 1GB 10/1/2010 mrostron (to assist autovacuum workers)
#max_stack_depth = 2MB # min 100kB
# - Free Space Map -
max_fsm_pages = 809600 # min max_fsm_relations*16, 6 bytes each
# (change requires restart)
synchronous_commit = off
#full_page_writes = on # recover from partial page writes
#wal_buffers = 64kB # min 32kB
# (change requires restart)
#wal_writer_delay = 200ms # 1-10000 milliseconds
#commit_delay = 0 # range 0-100000, in microseconds
#commit_siblings = 5 # range 1-1000
On Thu, 17 Jul 2014 10:44:32 -0700 Prabhjot Sheena <prabhjot.sheena@rivalwatch.com> wrote:
> I had a database load issue today, and when I was checking the logs I saw this message. I'm running PostgreSQL 8.3.
>
> db::ts:2014-07-17 09:38:21 PDT::LOG: checkpoints are occurring too frequently (26 seconds apart)
> db::ts:2014-07-17 09:38:21 PDT::HINT: Consider increasing the configuration parameter "checkpoint_segments".
> db::ts:2014-07-17 09:38:21 PDT::LOG: checkpoint starting: xlog

Is this the first time you've had the load problem? How often does it happen? How often is that message in the logs?

The downside (as described in the docs) is that increasing the value will cause PG to take longer to recover in the event of a crash. That, and you need enough disk space to hold the extra segments.

If the warning isn't happening too often, I would try increasing it only a little and see if it helps. If it's not enough you can then increase it some more. Various sources around the Internet suggest that you don't want to go much larger than 256 for this (if only because it's uncommon to do so and is probably indicative of other tuning that you need to do). Unfortunately, you need to restart PG for the change to take effect, so you have to balance experimenting with your tuning against how often you can get away with a server restart.

If it's just that one time that you got that message, then you might want to double checkpoint_segments to 64. Anything more than that seems unjustified, unless you're seeing the problem a lot more often than your email suggests.

If the problem is happening frequently, you're probably way better off organizing an upgrade to PG 9.3 than fiddling with a lot of tuning. The number of performance improvements from 8.3 to 9.3 is quite significant.

-- Potentialtech <wmoran@potentialtech.com>
Potentialtech <wmoran@potentialtech.com> writes:
> If the warning isn't happening too often, I would try increasing it only a
> little and see if it helps. If it's not enough you can then increase it some
> more. Various sources around the Internet suggest that you don't want to go
> much larger than 256 for this (if only because it's uncommon to do so and is
> probably indicative of other tuning that you need to do). Unfortunately, you
> need to restart PG for the change to take effect, so you have to balance
> experimenting with your tuning against how often you can get away with a
> server restart.

Huh? You don't need a restart, just a reload (SIGHUP) to change that.

regards, tom lane
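A minimal sketch of what "just a reload" looks like in practice on 8.3; the config file location and data directory below are assumptions for illustration, not details taken from this thread:

# in postgresql.conf, raise the segment count as discussed above
checkpoint_segments = 64            # was 32; each segment is 16MB

-- then ask the server to re-read its configuration, e.g. from psql as a superuser:
SELECT pg_reload_conf();
-- or from the shell: pg_ctl reload -D /var/lib/pgsql/data

-- confirm the running value:
SHOW checkpoint_segments;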
Thanks. I have changed that to 64 and reloaded it.
When I had the load issue today, there was this exact same query that hit the db 50 to 60 times from different machines in 3 to 4 minutes; it was taking a long time to execute and was holding up the database. I recreated an index and it started performing better. My question is: why is it not fetching the result from memory, since it is the same query that runs again and again?

This is the actual query I'm talking about:

SELECT account.id, account.organization_id, run.application_id,
       work_unit.script, work_unit.id, work_unit.start_time, run.id,
       work_unit.priority
FROM work_unit, run, account
WHERE work_unit.status = 3
  AND work_unit.run_id = run.id
  AND work_unit.type != 1
  AND run.status = 1
  AND run.account_id = account.id

Please suggest if I can do something to fix this.
On Thu, Jul 17, 2014 at 11:06 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Huh? You don't need a restart, just a reload (SIGHUP) to change that.
On Thu, 17 Jul 2014 14:06:28 -0400 Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Huh? You don't need a restart, just a reload (SIGHUP) to change that.

Really? Well, thanks for correcting me on that.

-- Potentialtech <wmoran@potentialtech.com>
On Thu, 17 Jul 2014 11:28:04 -0700 Prabhjot Sheena <prabhjot.sheena@rivalwatch.com> wrote:
> Thanks. I have changed that to 64 and reloaded it.
>
> When I had the load issue today, there was this exact same query that hit
> the db 50 to 60 times from different machines in 3 to 4 minutes; it was
> taking a long time to execute and was holding up the database. I recreated
> an index and it started performing better. My question is: why is it not
> fetching the result from memory, since it is the same query that runs
> again and again?
>
> Please suggest if I can do something to fix this.

Provide a lot more information if you want anyone on the list to be able to help: such as explain output while the problem is happening, and some information about the makeup of the tables (column types/indexes/# rows).

Guessing, based on the little information you've provided, it's likely that you have something else going on at the same time that you're not aware of, and this particular query is only a symptom. I'm saying that because SELECTs don't generally create any WAL traffic, so there were probably some INSERT/UPDATE/DELETE running at the same time that both pushed those 3 tables out of memory and/or saturated disk activity to the point that accessing everything becomes slow for a short while, and it's just those queries that you noticed.

Are you making the mistake where you set log_min_duration to 1s and only worry about queries that take longer than 1s? Because I've seen (on multiple occasions) where many 1000s of queries, each running less than 1s, are the actual cause of the problem. pgBadger is particularly helpful in tracking down situations like that.

-- Potentialtech <wmoran@potentialtech.com>
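A hedged sketch of the kind of 8.3 logging setup that makes that sort of analysis possible; the threshold and prefix are illustrative choices (the prefix is one commonly used with pgBadger), not settings taken from this server:

# log every statement with its duration; 0 logs everything, which can be verbose on a busy system
log_min_duration_statement = 0
# a prefix with timestamp, pid, user and database so a log parser can group entries
log_line_prefix = '%t [%p]: user=%u,db=%d '
# log each checkpoint, handy when chasing "checkpoints are occurring too frequently"
log_checkpoints = on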
Here is the explain plan. The query time went back up to 2 to 3 minutes from 3 seconds in just 2 hrs. Can anyone suggest something on how to fix this, or why it is happening?
explain SELECT account.id, account.organization_id, run.application_id, work_unit.script, work_unit.id, work_unit.start_time, run.id, work_unit.priority FROM work_unit, run, account WHERE work_unit.status = 3 AND work_unit.run_id = run.id AND work_unit.type != 1 AND run.status = 1 AND run.account_id = account.id;
QUERY PLAN
-----------------------------------------------------------------------------------------------------
Nested Loop (cost=0.00..24.84 rows=1 width=64)
Join Filter: (run.id = work_unit.run_id)
-> Nested Loop (cost=0.00..16.55 rows=1 width=16)
-> Index Scan using run_status_1 on run (cost=0.00..8.27 rows=1 width=12)
Index Cond: (status = 1)
-> Index Scan using account_pkey on account (cost=0.00..8.27 rows=1 width=8)
Index Cond: (account.id = run.account_id)
-> Index Scan using idx_work_unit_1_status_3_new on work_unit (cost=0.00..8.28 rows=1 width=52)
Index Cond: (work_unit.status = 3)
Table Structure
\d work_unit
Table "public.work_unit"
Column | Type | Modifiers
------------+-----------------------------+-----------
id | integer | not null
run_id | integer |
status | integer |
script | character varying |
parameters | character varying |
input | character varying |
start_time | timestamp without time zone |
stop_time | timestamp without time zone |
priority | integer |
type | integer |
lineitems | integer |
outputs | integer |
sub_type | integer |
site_code | text |
Indexes:
"work_unit_pkey" PRIMARY KEY, btree (id)
"idx_work_unit_1_partial" btree (run_id, start_time) WHERE status >= 3
"idx_work_unit_1_run_id" btree (run_id)
"idx_work_unit_1_script" btree (script)
"idx_work_unit_1_site_code" btree (site_code)
"idx_work_unit_1_starttime" btree (start_time)
"idx_work_unit_1_status_3_new" btree (status, type) WHERE status = 3 AND type <> 1
"idx_work_unit_1_status_5" btree (status) WHERE status >= 4
"idx_work_unit_1_status_part_new" btree (status) WHERE status < 4
"idx_work_unit_1_stop_time" btree (stop_time)
\d run
Table "public.run"
Column | Type | Modifiers
---------------------+--------------------------+---------------
id | integer | not null
job_id | integer |
start_time | timestamp with time zone | not null
status | integer | not null
job_name | character varying |
account_id | integer |
application_id | integer |
postproc_script | character varying |
upload_url | character varying |
complete_time | timestamp with time zone |
email | character varying |
size | integer |
errors | integer |
raw_count | integer |
munge | integer |
job_details | character varying |
user_settings | character varying |
run_type | integer |
stop_after | timestamp with time zone |
total_work_time | integer |
flags | integer |
cluster_info_id | integer |
skiplineitems_count | integer |
last_update_ts | timestamp with time zone |
result_exists | boolean | default false
abort_type | integer |
Indexes:
"run_pkey" PRIMARY KEY, btree (id)
"idx_run_acc_stat_comp" btree (account_id, status, complete_time)
"idx_run_app" btree (application_id)
"idx_run_complete_time_2" btree (complete_time)
"idx_run_job_name" btree (job_name)
"run_application_account_idx" btree (application_id, account_id, status)
"run_job" btree (job_id)
"run_result_exists" btree (result_exists)
"run_start_time" btree (start_time)
"run_status" btree (status)
"run_status_1" btree (status) WHERE status = 1
"run_status_part_idx" btree (status) WHERE status < 3
Triggers:
run_upd_ts BEFORE INSERT OR UPDATE ON run FOR EACH ROW EXECUTE PROCEDURE set_last_update_ts()
tr_proc_update_job_summary AFTER UPDATE ON run FOR EACH ROW EXECUTE PROCEDURE tr_proc_update_job_summary()
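One quick, hedged way to see how far the planner's rows=1 estimates above are from reality is to count the rows behind each indexed predicate and then refresh statistics; these statements are illustrative, not something actually run in the thread:

-- rows actually matching the work_unit partial index condition
SELECT count(*) FROM work_unit WHERE status = 3 AND type <> 1;
-- rows actually matching the run_status_1 partial index condition
SELECT count(*) FROM run WHERE status = 1;
-- refresh the statistics those estimates are based on
ANALYZE work_unit;
ANALYZE run;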
On Thu, Jul 17, 2014 at 12:00 PM, Potentialtech <wmoran@potentialtech.com> wrote:
> Provide a lot more information if you want anyone on the list to be able
> to help: such as explain output while the problem is happening, and some
> information about the makeup of the tables (column types/indexes/# rows).
Even though the explain plan suggests otherwise, it is taking a very long time.
On Thu, Jul 17, 2014 at 4:26 PM, Prabhjot Sheena <prabhjot.sheena@rivalwatch.com> wrote:
> Here is the explain plan. The query time went back up to 2 to 3 minutes
> from 3 seconds in just 2 hrs. Can anyone suggest something on how to fix
> this, or why it is happening?
On 7/17/2014 4:26 PM, Prabhjot Sheena wrote:
> Here is the explain plan. The query time went back up to 2 to 3 minutes from 3 seconds in just 2 hrs. Can anyone suggest something on how to fix this, or why it is happening?
>
> explain SELECT account.id, account.organization_id, run.application_id, work_unit.script, work_unit.id, work_unit.start_time, run.id, work_unit.priority FROM work_unit, run, account WHERE work_unit.status = 3 AND work_unit.run_id = run.id AND work_unit.type != 1 AND run.status = 1 AND run.account_id = account.id;

You need to use EXPLAIN ANALYZE to get accurate data.

Run it when the query is fast, and again when the query is slow, and paste both outputs here. Also, you can paste them to http://explain.depesz.com and that will give you a nice analysis of the timing data included in the EXPLAIN ANALYZE output.
-- john r pierce 37N 122W somewhere on the middle of the left coast
On 7/17/2014 11:28 AM, Prabhjot Sheena wrote:
> SELECT account.id, account.organization_id, run.application_id, work_unit.script, work_unit.id, work_unit.start_time, run.id, work_unit.priority FROM work_unit, run, account WHERE work_unit.status = 3 AND work_unit.run_id = run.id AND work_unit.type != 1 AND run.status = 1 AND run.account_id = account.id

That query is an implicit 3-way join. It would be clearer to write it like this:
SELECT account.id, account.organization_id, run.application_id, work_unit.script, work_unit.id, work_unit.start_time, run.id, work_unit.priority
FROM work_unit
JOIN run ON work_unit.run_id = run.id
JOIN account ON run.account_id = account.id
WHERE work_unit.status = 3 AND work_unit.type != 1 AND run.status = 1;
Postgres would (or at least should) treat this exactly the same, but it's clearer what the query is doing when it's written this way.
-- john r pierce 37N 122W somewhere on the middle of the left coast
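To confirm the planner really does treat the two spellings the same, one could, hypothetically, EXPLAIN the rewritten form and compare it with the plan already posted:

EXPLAIN SELECT account.id, account.organization_id, run.application_id,
       work_unit.script, work_unit.id, work_unit.start_time, run.id,
       work_unit.priority
FROM work_unit
JOIN run ON work_unit.run_id = run.id
JOIN account ON run.account_id = account.id
WHERE work_unit.status = 3 AND work_unit.type != 1 AND run.status = 1;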
I just did explain analyze, and currently the database is running slow because of this query.
explain ANALYZE SELECT account.id, account.organization_id, run.application_id, work_unit.script, work_unit.id, work_unit.start_time, run.id, work_unit.priority FROM work_unit, run, account WHERE work_unit.status = 3 AND work_unit.run_id = run.id AND work_unit.type != 1 AND run.status = 1 AND run.account_id = account.id;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------
Nested Loop (cost=0.00..24.84 rows=1 width=64) (actual time=71.824..69729.467 rows=1820 loops=1)
-> Nested Loop (cost=0.00..16.56 rows=1 width=60) (actual time=71.760..69628.874 rows=1820 loops=1)
Join Filter: (work_unit.run_id = run.id)
-> Index Scan using idx_work_unit_1_status_3_new on work_unit (cost=0.00..8.28 rows=1 width=52) (actual time=0.067..154.364 rows=1820 loops=1)
Index Cond: (status = 3)
-> Index Scan using run_status_1 on run (cost=0.00..8.27 rows=1 width=12) (actual time=0.081..34.338 rows=3138 loops=1820)
Index Cond: (run.status = 1)
-> Index Scan using account_pkey on account (cost=0.00..8.27 rows=1 width=8) (actual time=0.044..0.046 rows=1 loops=1820)
Index Cond: (account.id = run.account_id)
Total runtime: 69732.893 ms
On Thu, Jul 17, 2014 at 4:45 PM, John R Pierce <pierce@hogranch.com> wrote:
> You need to use EXPLAIN ANALYZE to get accurate data. Run it when the query
> is fast, and again when the query is slow, and paste both outputs here.
On 7/17/2014 5:01 PM, Prabhjot Sheena wrote:
> I just did explain analyze, and currently the database is running slow because of this query.
>
> explain ANALYZE SELECT account.id, account.organization_id, run.application_id, work_unit.script, work_unit.id, work_unit.start_time, run.id, work_unit.priority FROM work_unit, run, account WHERE work_unit.status = 3 AND work_unit.run_id = run.id AND work_unit.type != 1 AND run.status = 1 AND run.account_id = account.id;

What does the overall workload on the system look like when this is so slow? Taking over a minute to do 1820 iterations of a 3138-row index scan seems way out of bounds for anything other than a very heavily overloaded server.

Is this running on a virtual machine where there are other virtual servers contending for the same resources?

BTW, here's your EXPLAIN ANALYZE formatted pretty: http://explain.depesz.com/s/XRI
-- john r pierce 37N 122W somewhere on the middle of the left coast
The overall load of the system is good. It is a VM, but the other database is running normally and not doing much activity.
vmstat 5 10
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
r b swpd free buff cache si so bi bo in cs us sy id wa st
0 2 22048 296804 5016 10829360 0 0 1 3 0 0 5 1 75 18 1
9 1 22048 255184 5024 10868884 0 0 0 60 8955 10518 33 3 41 15 8
11 3 22048 197664 5024 10925944 0 0 0 0 7866 8767 37 2 44 12 6
12 1 22048 140000 5048 10976620 0 0 0 10 9181 10632 39 3 42 11 6
7 1 22048 148068 5076 10970104 0 0 1 33 7456 7715 34 2 46 8 10
11 3 22048 87408 5076 11027832 0 0 0 0 9846 12631 31 4 38 14 14
8 1 22048 48380 5084 11067456 0 0 0 35 7887 8902 33 3 43 11 10
On Thu, Jul 17, 2014 at 5:10 PM, John R Pierce <pierce@hogranch.com> wrote:
> What does the overall workload on the system look like when this is so slow?
> Is this running on a virtual machine where there are other virtual servers
> contending for the same resources?
On 7/17/2014 5:36 PM, Prabhjot Sheena wrote:
> The overall load of the system is good. It is a VM, but the other database
> is running normally and not doing much activity.

Other database? Is that on the same postgresql cluster? Or a separate instance of the postgres server on the same VM? Or a separate VM on the same host?

What does `iostat -x 5` show? Let it run a bunch of samples, then run this slow query and get more samples. Pay attention to the disk drive(s) the database is stored on. High %utilization or await times would be a bad thing.

-- john r pierce 37N 122W somewhere on the middle of the left coast
Prabhjot Sheena <prabhjot.sheena@rivalwatch.com> wrote:
> I m running postgresql 8.3

That is out of support. If you are at all concerned about performance, you would do well to upgrade to a recent and supported major release.

http://www.postgresql.org/support/versioning/

Anyway, it is always a good idea to show the output of:

SELECT version();

> temp_buffers = 64MB

You don't say what you have set for max_connections, but temp_buffers is allocated at need *separately for each connection* and once allocated to a connection it is never released until the connection is closed. So, hypothetically, if you had 100 connections open and they each had used temporary tables at some time or other, even when all are idle they could have 6.4GB of your 12GB of RAM tied up.

> work_mem = 512MB

Unlike temp_buffers, this one is released at the end of the command (or earlier), but it is possible to have multiple allocations for a single connection. If you had 100 queries active, each using one work_mem allocation, that would allocate about 51.2GB of RAM, which looks like it is more than you have. I generally recommend setting max_connections only to what is needed (with a small reserve) and setting work_mem to 25% of machine (or VM) RAM divided by max_connections. On the face of it, neither of the above seems to be causing problems at the time you ran vmstat, but they worry me.

> max_fsm_pages = 809600

Since the EXPLAIN ANALYZE output you later posted shows estimates which were off by several orders of magnitude, quite possibly leading to a suboptimal plan, you should probably run queries to check for bloat and update statistics. If you find badly bloated tables you should run CLUSTER or VACUUM FULL on them to eliminate bloat. Any bloated indexes, and any indexes on bloated tables you cleaned up using VACUUM FULL, should be fixed with REINDEX. (Once you are on 9.0 or later it is no longer necessary to REINDEX a table after using VACUUM FULL on it, although a regular VACUUM after the VACUUM FULL is still important.) If you find bloat you should make autovacuum more aggressive or add VACUUM ANALYZE crontab jobs to prevent a recurrence. You may want to look at increasing statistics targets, either globally or on specific columns used for selection criteria.

After doing the above, or if you decide not to do any or all of it, you should probably run VACUUM ANALYZE VERBOSE; (without specifying a table name and without the FULL option) and review the output. The most important part of the output is the last few lines, which in 8.3 might include warnings that suggest configuration changes.

The above might help, but I think the biggest problem may be your VM. You show very low disk I/O numbers, but a pretty significant fraction of the time waiting for I/O. The previously-suggested iostat output may help nail it down more specifically, but basically you seem to have a big problem with bandwidth for storage access. It's pretty disturbing to see lines in vmstat output which show zero disk in or out, but over 10% of CPU time waiting for storage?!?

-- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
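A rough, hedged sketch of the kind of checks being described here; a real bloat estimate is more involved than this, and the statistics target below is an arbitrary example value, not a recommendation from the thread:

-- dead-tuple counts and last (auto)vacuum times are a quick proxy for bloat on 8.3
SELECT relname, n_live_tup, n_dead_tup, last_vacuum, last_autovacuum
FROM pg_stat_user_tables
WHERE relname IN ('work_unit', 'run', 'account')
ORDER BY n_dead_tup DESC;

-- raise the statistics target on a selective column, then refresh planner statistics
ALTER TABLE work_unit ALTER COLUMN status SET STATISTICS 200;
ANALYZE work_unit;
ANALYZE run;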
On 7/21/2014 1:51 PM, Kevin Grittner wrote:
> The above might help, but I think the biggest problem may be your
> VM. You show very low disk I/O numbers, but a pretty significant
> fraction of the time waiting for I/O. The previously-suggested
> iostat output may help nail it down more specifically, but
> basically you seem to have a big problem with bandwidth for storage
> access. It's pretty disturbing to see lines in vmstat output which
> show zero disk in or out, but over 10% of CPU time waiting for
> storage?!?

That would suggest to me a large number of VMs sharing a single SATA drive, or similar.

-- john r pierce 37N 122W somewhere on the middle of the left coast
Thanks a lot, Kevin. This is what I did to improve query performance: I recreated all the indexes on the work_unit table and have been running VACUUM ANALYZE through a cron job 3 times a day on the two tables that are in the query. The query performance is between 2 and 3 seconds now. The strange thing I noticed is that just today, at one point, query performance came down to under 1 second and started using this query plan:

http://explain.depesz.com/s/h5q8

But then the cron job started to VACUUM ANALYZE the work_unit and run tables, and after the vacuum analyze completed, the planner switched to another query plan, which made the query slow again (back to 2 to 3 seconds). This is the query plan that is used now and is slower:

http://explain.depesz.com/s/AiG

We do have an I/O problem during heavy loads because we share a mount point from a NetApp across different machines, which we are getting rid of in a few weeks by moving to Amazon AWS; the I/O issues should go away with fast IOPS.

I have copied some more information below. Can you please recommend what values I should set for temp_buffers and work_mem, and also what query I should run to check for bloat?
caesius=# SELECT version();
version
----------------------------------------------------------------------------------------------------------
PostgreSQL 8.3.7 on x86_64-unknown-linux-gnu, compiled by GCC gcc (GCC) 3.4.6 20060404 (Red Hat 3.4.6-9)
(1 row)
caesius=# show max_connections ;
max_connections
-----------------
600
caesius=# select count(*) from pg_stat_activity;
count
-------
165
#autovacuum = on
log_autovacuum_min_duration = 10000
autovacuum_max_workers = 1
#autovacuum_naptime = 1min
#autovacuum_vacuum_threshold = 50
#autovacuum_analyze_threshold = 50
autovacuum_vacuum_scale_factor = 0.1
#autovacuum_analyze_scale_factor = 0.05
#autovacuum_freeze_max_age = 200000000
#autovacuum_vacuum_cost_delay = 20
#autovacuum_vacuum_cost_limit = -1
# - Cost-Based Vacuum Delay -
#vacuum_cost_delay = 0 # 0-1000 milliseconds
#vacuum_cost_page_hit = 1 # 0-10000 credits
#vacuum_cost_page_miss = 10 # 0-10000 credits
#vacuum_cost_page_dirty = 20 # 0-10000 credits
#vacuum_cost_limit = 200 # 1-10000 credits
[caesius@clientdb01 tmp]$ cat /proc/meminfo
MemTotal: 12582912 kB
MemFree: 204748 kB
Buffers: 4540 kB
Cached: 9541024 kB
SwapCached: 5324 kB
Active: 5218556 kB
Inactive: 6554684 kB
HighTotal: 0 kB
HighFree: 0 kB
LowTotal: 12582912 kB
LowFree: 204748 kB
SwapTotal: 16777208 kB
SwapFree: 16755516 kB
Dirty: 36584 kB
Writeback: 20 kB
AnonPages: 2227364 kB
Mapped: 1093452 kB
Slab: 101396 kB
PageTables: 206692 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
CommitLimit: 23068664 kB
Committed_AS: 3796932 kB
VmallocTotal: 34359738367 kB
VmallocUsed: 9196 kB
VmallocChunk: 34359729063 kB
iostat -d -x 5
Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util
xvda 0.00 7.40 0.20 1.20 4.80 68.80 52.57 0.01 4.57 2.29 0.32
xvda1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
xvda2 0.00 7.40 0.20 1.20 4.80 68.80 52.57 0.01 4.57 2.29 0.32
dm-0 0.00 0.00 0.20 8.60 4.80 68.80 8.36 0.04 4.09 0.36 0.32
dm-1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util
xvda 0.00 3.80 0.00 0.60 0.00 35.20 58.67 0.00 6.67 5.33 0.32
xvda1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
xvda2 0.00 3.80 0.00 0.60 0.00 35.20 58.67 0.00 6.67 5.33 0.32
dm-0 0.00 0.00 0.00 4.40 0.00 35.20 8.00 0.02 4.36 0.73 0.32
dm-1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util
xvda 0.00 9.40 0.00 1.80 0.00 89.60 49.78 0.00 2.22 0.44 0.08
xvda1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
xvda2 0.00 9.40 0.00 1.80 0.00 89.60 49.78 0.00 2.22 0.44 0.08
dm-0 0.00 0.00 0.00 11.20 0.00 89.60 8.00 0.03 3.00 0.07 0.08
dm-1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util
xvda 0.00 12.38 5.79 13.17 106.99 204.39 16.42 0.20 10.78 1.85 3.51
xvda1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
xvda2 0.00 12.38 5.79 13.17 106.99 204.39 16.42 0.20 10.78 1.85 3.51
dm-0 0.00 0.00 5.79 25.55 106.99 204.39 9.94 0.31 9.83 1.12 3.51
dm-1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util
xvda 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
xvda1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
xvda2 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
dm-0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
dm-1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util
xvda 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
xvda1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
xvda2 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
dm-0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
dm-1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util
xvda 0.00 0.00 0.00 0.20 0.00 1.60 8.00 0.00 4.00 4.00 0.08
xvda1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
xvda2 0.00 0.00 0.00 0.20 0.00 1.60 8.00 0.00 4.00 4.00 0.08
dm-0 0.00 0.00 0.00 0.20 0.00 1.60 8.00 0.00 4.00 4.00 0.08
dm-1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
On Mon, Jul 21, 2014 at 2:04 PM, John R Pierce <pierce@hogranch.com> wrote:
> That would suggest to me a large number of VMs sharing a single SATA drive, or similar.
On 7/25/2014 5:58 PM, Prabhjot Sheena wrote:
> We do have an I/O problem during heavy loads because we share a mount point
> from a NetApp across different machines, which we are getting rid of in a
> few weeks by moving to Amazon AWS; the I/O issues should go away with fast
> IOPS.

NetApp has to be some of the most expensive storage I've ever seen in a datacenter and is frequently a source of performance problems. NFS isn't really appropriate for storing database data on; there are way too many opportunities for fsync to get lost, more so if you've got a virtualization layer in there too.

-- john r pierce 37N 122W somewhere on the middle of the left coast
Prabhjot Sheena <prabhjot.sheena@rivalwatch.com> wrote:
> PostgreSQL 8.3.7 on x86_64-unknown-linux-gnu, compiled by GCC gcc (GCC) 3.4.6 20060404 (Red Hat 3.4.6-9)

Running anything on that version is insane. Not only has the 8.3 major release been out of support since February of 2013, but you're missing about 4 years of fixes for serious bugs and security issues since the last patch version of 8.3 before it went out of support. Any vacuum or planning problems you are seeing on that release are very likely fixed on a supported version.

> This is what I did to improve query performance: I recreated all the
> indexes on the work_unit table and have been running VACUUM ANALYZE
> through a cron job 3 times a day on the two tables that are in the
> query. The query performance is between 2 and 3 seconds now.

That's pretty solid evidence that you have been having problems with bloat. The most common cause of this is not setting autovacuum to be aggressive enough.

> autovacuum_max_workers = 1

Vacuum of one big table could starve all other tables, resulting in bloat. If anything, this should be set larger than the default. You might want to try setting it to somewhere in the 3 to 5 range, but then watch for any long periods where all workers are busy. If you see that, you probably need more workers. I would also boost autovacuum_vacuum_cost_limit to maybe 500.

> The strange thing I noticed is that just today, at one point, query
> performance came down to under 1 second and started using this query plan

There's really no point hypothesizing about what might cause that in such an old version with so many known bugs. The best way to improve performance would be to upgrade.

-- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
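Pulling these suggestions together into a postgresql.conf sketch; the numbers are illustrative starting points rather than values tested in this thread, and the work_mem figure simply applies the earlier 25%-of-RAM rule to the ~12GB of RAM and max_connections = 600 shown above (25% of 12GB is roughly 3GB, and 3GB / 600 is about 5MB):

autovacuum_max_workers = 3              # suggested range is 3 to 5 instead of 1 (change requires restart)
autovacuum_vacuum_cost_limit = 500      # boosted from the default of 200
work_mem = 5MB                          # down from 512MB; allocated per sort/hash, possibly several per query
temp_buffers = 8MB                      # assumption: back toward the default, since it is held per connection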