Re: slow i/o - Mailing list pgsql-performance

From Junaili Lie
Subject Re: slow i/o
Date
Msg-id 8d04ce990609261627t7321d3d4v7b8b4715e24b77a5@mail.gmail.com
In response to Re: slow i/o  ("Junaili Lie" <junaili@gmail.com>)
Responses Re: slow i/o
List pgsql-performance
Hi all,
I am still encountering this issue and have been troubleshooting further.
Here is what I found:
When I run: dtrace -s /usr/demo/dtrace/whoio.d
I see that one process is doing the majority of the i/o, but that process is not listed in pg_stat_activity.
I am also seeing more slow queries of this type:
EXECUTE <unnamed>  [PREPARE: ...
I have also seen an article recommending adding these entries to /etc/system:
segmapsize=2684354560
set ufs:freebehind=0
I haven't tried this yet; I am wondering whether it will help.

Also, here is the output of iostat -xcznmP 1 from approximately the time of the i/o spike:
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
    4.0  213.0   32.0 2089.9  0.0 17.0    0.0   78.5   0  61 c1t0d0s6 (/usr)
     cpu
 us sy wt id
 54  6  0 40
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
    0.0    0.0    0.0    0.0  0.0  0.9    0.0    0.0   0  90 c1t0d0s1 (/var)
    2.0  335.0   16.0 3341.6  0.2 73.3    0.6  217.4   4 100 c1t0d0s6 (/usr)
     cpu
 us sy wt id
 30  4  0 66
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
    0.0    1.0    0.0    4.0  0.0  0.1    0.0  102.0   0  10 c1t0d0s1 (/var)
    1.0  267.0    8.0 2729.1  0.0 117.8    0.0  439.5   0 100 c1t0d0s6 (/usr)
     cpu
 us sy wt id
 28  8  0 64
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
    1.0  270.0    8.0 2589.0  0.0 62.0    0.0  228.7   0 100 c1t0d0s6 (/usr)
     cpu
 us sy wt id
 26  2  0 72
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
    2.0  269.0   16.0 2971.5  0.0 66.6    0.0  245.7   0 100 c1t0d0s6 (/usr)
     cpu
 us sy wt id
  8  7  0 86
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
    1.0  268.0    8.0 2343.5  0.0 110.3    0.0  410.2   0 100 c1t0d0s6 (/usr)
     cpu
 us sy wt id
  4  4  0 92
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
    0.0  260.0    0.0 2494.5  0.0 63.5    0.0  244.2   0 100 c1t0d0s6 (/usr)
     cpu
 us sy wt id
 24  3  0 74
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
    1.0  286.0    8.0 2519.1 35.4 196.5  123.3  684.7  49 100 c1t0d0s6 (/usr)
     cpu
 us sy wt id
 65  4  0 30
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
    2.0  316.0   16.0 2913.8  0.0 117.2    0.0  368.7   0 100 c1t0d0s6 (/usr)
     cpu
 us sy wt id
 84  7  0  9
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
    5.0  263.0   40.0 2406.1  0.0 55.8    0.0  208.1   0 100 c1t0d0s6 (/usr)
     cpu
 us sy wt id
 77  4  0 20
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
    4.0  286.0   32.0 2750.6  0.0 75.0    0.0  258.5   0 100 c1t0d0s6 (/usr)
     cpu
 us sy wt id
 21  3  0 77
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
    2.0  273.0   16.0 2516.4  0.0 90.8    0.0  330.0   0 100 c1t0d0s6 (/usr)
     cpu
 us sy wt id
 15  6  0 78
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
    2.0  280.0   16.0 2711.6  0.0 65.6    0.0  232.6   0 100 c1t0d0s6 (/usr)
     cpu
 us sy wt id
  6  3  0 92
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
    1.0  308.0    8.0 2661.5 61.0 220.2  197.4  712.7  67 100 c1t0d0s6 (/usr)
     cpu
 us sy wt id
  7  4  0 90
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
    1.0  268.0    8.0 2839.9  0.0 97.1    0.0  360.9   0 100 c1t0d0s6 (/usr)
     cpu
 us sy wt id
 11 10  0 80
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
    0.0  309.0    0.0 3333.5 175.2 208.9  566.9  676.2  81  99 c1t0d0s6 (/usr)
     cpu
 us sy wt id
  0  0  0 100
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
    0.0  330.0    0.0 2704.0 145.6 256.0  441.1  775.7 100 100 c1t0d0s6 (/usr)
     cpu
 us sy wt id
  4  2  0 94
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
    0.0  311.0    0.0 2543.9 151.0 256.0  485.6  823.2 100 100 c1t0d0s6 (/usr)
     cpu
 us sy wt id
  2  0  0 98
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
    0.0  319.0    0.0 2576.0 147.4 256.0  462.0  802.5 100 100 c1t0d0s6 (/usr)
     cpu
 us sy wt id
  0  1  0 98
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
    0.0    0.0    0.0    0.0  0.0  0.2    0.0    0.0   2  13 c1t0d0s1 (/var)
    0.0  366.0    0.0 3088.0 124.4 255.8  339.9  698.8 100 100 c1t0d0s6 (/usr)
     cpu
 us sy wt id
  6  5  0 90
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
    0.0    2.0    0.0   16.0  0.0  1.1    0.0  533.2   0  54 c1t0d0s1 (/var)
    1.0  282.0    8.0 2849.0  1.5 129.2    5.2  456.5  10 100 c1t0d0s6 (/usr)
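
To put numbers on the spike, the device lines in the output above could be summarized per device with a short script. This is only a minimal sketch: it assumes the column order shown in the iostat header (r/s, w/s, kr/s, kw/s, wait, actv, wsvc_t, asvc_t, %w, %b, device), and the names parse_iostat, summarize and SAMPLE are made up for illustration. SAMPLE holds four device lines copied from the output above.

```python
# Hypothetical helper for summarizing `iostat -x` device lines.
# Assumption: ten numeric columns followed by the device name, in the
# order shown in the iostat header above.
FIELDS = ("r_s", "w_s", "kr_s", "kw_s", "wait", "actv",
          "wsvc_t", "asvc_t", "pct_w", "pct_b")

# Four device lines copied verbatim from the output above.
SAMPLE = """\
    4.0  213.0   32.0 2089.9  0.0 17.0    0.0   78.5   0  61 c1t0d0s6 (/usr)
    0.0    0.0    0.0    0.0  0.0  0.9    0.0    0.0   0  90 c1t0d0s1 (/var)
    2.0  335.0   16.0 3341.6  0.2 73.3    0.6  217.4   4 100 c1t0d0s6 (/usr)
    0.0  330.0    0.0 2704.0 145.6 256.0  441.1  775.7 100 100 c1t0d0s6 (/usr)
"""


def parse_iostat(text):
    """Return one dict per device line; headers and cpu lines are skipped."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 11:
            continue  # banner, cpu header, or cpu values
        try:
            values = [float(p) for p in parts[:10]]
        except ValueError:
            continue  # the "r/s w/s ..." column header
        row = dict(zip(FIELDS, values))
        row["device"] = parts[10]
        rows.append(row)
    return rows


def summarize(rows, device):
    """Summarize the samples for one device, or return None if absent."""
    picked = [r for r in rows if r["device"] == device]
    if not picked:
        return None
    return {
        "samples": len(picked),
        "saturated": sum(1 for r in picked if r["pct_b"] >= 100),
        "max_actv": max(r["actv"] for r in picked),
        "max_asvc_t": max(r["asvc_t"] for r in picked),
        "avg_kw_s": sum(r["kw_s"] for r in picked) / len(picked),
    }


if __name__ == "__main__":
    print(summarize(parse_iostat(SAMPLE), "c1t0d0s6"))
```

Feeding the full capture to parse_iostat instead of SAMPLE would show how many one-second samples had the /usr slice at 100 %b and how high the active service time (asvc_t) went.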

Thank you in advance for your help!

Jun

On 8/30/06, Junaili Lie <junaili@gmail.com> wrote:
I have tried this to no avail.
I have also tried changing the bgwriter_delay parameter to 10. The spike in i/o still occurs, although not on a consistent basis, and it only lasts for a few seconds.
 
On 8/30/06, Jignesh K. Shah <J.K.Shah@sun.com> wrote:
The bgwriter parameters changed in 8.1

Try

bgwriter_lru_maxpages=0
bgwriter_lru_percent=0

to turn off bgwriter and see if there is any change.

-Jignesh


Junaili Lie wrote:
> Hi Jignesh,
> Thank you for your reply.
> I have the setting just like what you described:
>
> wal_sync_method = fsync
> wal_buffers = 128
> checkpoint_segments = 128
> bgwriter_all_percent = 0
> bgwriter_maxpages = 0
>
>
> I ran the dtrace script and found the following:
> During the i/o busy time, there is a postgres process that has a very
> high BYTES count. During the non-busy time, this same process
> doesn't do much i/o. I checked pg_stat_activity but
> couldn't find this process. Running ps revealed that this process was
> started at the same time as postgres itself, which leads me to
> believe that it may be the background writer or some other internal process.
> This process is not autovacuum, because it doesn't disappear when I
> turn autovacuum off.
> Except for the ones mentioned above, I didn't modify the other
> background setting:
> MONSOON=# show bgwriter_delay ;
>  bgwriter_delay
> ----------------
>  200
> (1 row)
>
> MONSOON=# show bgwriter_lru_maxpages ;
>  bgwriter_lru_maxpages
> -----------------------
>  5
> (1 row)
>
> MONSOON=# show bgwriter_lru_percent ;
>  bgwriter_lru_percent
> ----------------------
>  1
> (1 row)
>
> This i/o spike only happens at minute 1 and minute 6 (i.e. 10.51, 10.56).
> If I do select * from pg_stat_activity during this time, I see a
> lot of write queries waiting to be processed. After a few seconds,
> they are all gone. Writes that do not coincide with
> the i/o jump are processed very quickly, and thus do not show up in
> pg_stat_activity.
>
> Thanks in advance for the reply,
> Best,
>
> J
>
> On 8/29/06, Jignesh K. Shah <J.K.Shah@sun.com> wrote:
>
>     Also to answer your real question:
>
>     DTrace On Solaris 10:
>
>     # dtrace -s /usr/demo/dtrace/whoio.d
>
>     It will tell you the pids doing the io activity and on which devices.
>     There are more scripts in that directory, like iosnoop.d, iotime.d
>     and others, which will also give other details such as the file
>     accessed, the time the io took, etc.
>
>     Hope this helps.
>
>     Regards,
>     Jignesh
>
>
>     Junaili Lie wrote:
>      > Hi everyone,
>      > We have PostgreSQL 8.1 installed on Solaris 10. It is running fine.
>      > However, for the past couple of days, the i/o reports have
>      > indicated that the i/o is busy most of the time. Before this, we
>      > only saw i/o being busy occasionally (very rarely). So far, there
>      > have been no performance complaints from customers, and the slow
>      > query reports don't indicate anything out of the ordinary.
>      > There have been no code changes in the application layer and no
>      > database configuration changes.
>      > I am wondering if there's a tool out there on Solaris to tell which
>      > process is doing most of the i/o activity?
>      > Thank you in advance.
>      >
>      > J
>      >
>
>
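
The quoted report above says the spike shows up at minute 1 and minute 6 (10.51, 10.56), i.e. five minutes apart. One thing that might be worth checking is whether that spacing lines up with a periodic server activity. PostgreSQL's checkpoint_timeout defaults to 300 seconds; whether that default is still in effect on this server is an assumption, as nothing in the thread states it. The sketch below only does the arithmetic, and the helper names spike_intervals and matches_period are made up for illustration:

```python
from datetime import datetime

# Assumption: checkpoint_timeout is at its PostgreSQL default of 300 s.
# The thread does not say what this server actually uses.
CHECKPOINT_TIMEOUT_S = 300


def spike_intervals(times, fmt="%H:%M"):
    """Seconds between consecutive spike times given as HH:MM strings."""
    stamps = [datetime.strptime(t, fmt) for t in times]
    return [int((b - a).total_seconds()) for a, b in zip(stamps, stamps[1:])]


def matches_period(intervals, period_s, tolerance_s=30):
    """True if every observed interval is within tolerance of the period."""
    return bool(intervals) and all(
        abs(i - period_s) <= tolerance_s for i in intervals
    )


if __name__ == "__main__":
    # Spike times taken from the quoted message (written there as 10.51, 10.56).
    intervals = spike_intervals(["10:51", "10:56"])
    print(intervals, matches_period(intervals, CHECKPOINT_TIMEOUT_S))
```

A match here would not prove anything by itself; it would only suggest comparing the spike times against the server log or the actual checkpoint settings.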

