Re: postgresql.conf recommendations - Mailing list pgsql-performance

From Josh Krupka
Subject Re: postgresql.conf recommendations
Date
Msg-id CAB6McgWsKt5SjBPc-p8ifAySQy=9jKU8KFGCy20Cz0x+XPjKng@mail.gmail.com
In response to Re: postgresql.conf recommendations  (Johnny Tan <johnnydtan@gmail.com>)
Responses Re: postgresql.conf recommendations  (Johnny Tan <johnnydtan@gmail.com>)
List pgsql-performance
Johnny,

Sure thing, here's the SystemTap script:

#! /usr/bin/env stap

# accumulate time spent in THP compaction per pid/execname
global pauses, counts

probe begin {
  printf("%s\n", ctime(gettimeofday_s()))
}

# fires each time compaction_alloc() returns; @entry() gives the timestamp
# captured when the function was entered, so elapsed_time is the stall in us
probe kernel.function("compaction_alloc@mm/compaction.c").return {
  elapsed_time = gettimeofday_us() - @entry(gettimeofday_us())
  key = sprintf("%d-%s", pid(), execname())
  pauses[key] = pauses[key] + elapsed_time
  counts[key]++
}

probe end {
  printf("%s\n", ctime(gettimeofday_s()))
  # total compaction time (converted to ms) and pause count per pid-execname
  foreach (pid in pauses) {
    printf("pid %s : %d ms %d pauses\n", pid, pauses[pid]/1000, counts[pid])
  }
}
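
In case anyone else wants to try it: save the above to a file (I'm calling it thp-compaction.stp here, the name doesn't matter) and run it as root with stap; the per-process totals print when you stop it with Ctrl-C.  You'll also need the kernel debuginfo packages installed for the kernel.function probe to resolve.

  sudo stap thp-compaction.stp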


I was able to make some more observations in production and do some testing in the lab; here are my latest findings:
- The THP compaction delays aren't happening during queries (at least not that I've seen yet) from the middleware our legacy app uses.  The pauses during those queries are what originally got my attention.  Those queries, though, only ever insert/update/read/delete one record at a time (don't ask).  That would theoretically make sense: because of how that app works, the pg backend processes for that app don't have to ask for as much memory during a query, which is when the THP compactions would be happening.
- The THP compaction delays are impacting backend processes for other apps, as well as things like autovacuum processes - sometimes multiple seconds' worth of delay over a short period of time.
- I haven't been able to duplicate 1+ s query times for our "one record at a time" app in the lab, but I was getting some 20-30ms queries, which is still higher than it should be most of the time.  We noticed in production by looking at pg_stat_bgwriter that the backends were having to write pages out for 50% of the allocations (see the query after this list), so we started tuning checkpoint/bgwriter settings on the test system and seem to be making some progress.  See http://www.westnet.com/~gsmith/content/postgresql/chkp-bgw-83.htm
- I think you already started looking at this, but the Linux dirty memory settings may have to be tuned as well (see Greg's post http://notemagnet.blogspot.com/2008/08/linux-write-cache-mystery.html).  Ours haven't been changed from the defaults, but that's another thing to test next week (a sketch of what we'll try is below).  Have you had any luck tuning these yet?
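
For reference, the 50% figure above is roughly how I've been reading pg_stat_bgwriter, comparing buffers_backend against buffers_alloc (just a sketch; adjust for whatever columns your version has):

  SELECT buffers_checkpoint,
         buffers_clean,
         buffers_backend,
         buffers_alloc,
         round(100.0 * buffers_backend / nullif(buffers_alloc, 0), 1) AS pct_backend_writes
  FROM pg_stat_bgwriter;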
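
And for the dirty memory settings, this is the kind of thing we're planning to experiment with next week (the values here are just a starting point to test, not a recommendation):

  # see the current values
  sysctl vm.dirty_background_ratio vm.dirty_ratio
  # try lower values so the kernel starts writeback sooner
  sysctl -w vm.dirty_background_ratio=1
  sysctl -w vm.dirty_ratio=10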

Josh
