Re: 9.4 regression - Mailing list pgsql-hackers

From	Jon Nelson
Subject	Re: 9.4 regression
Date	August 9, 2013 06:59:08
Msg-id	CAKuK5J0UUQh25HcK6cGgXi2kCdFjfQSWvhOu9th-FoVqJhSaWg@mail.gmail.com
In response to	Re: 9.4 regression (Andres Freund <andres@2ndquadrant.com>)
Responses	Re: 9.4 regression (Andres Freund <andres@2ndquadrant.com>)
List	pgsql-hackers

On Thu, Aug 8, 2013 at 9:27 PM, Andres Freund <andres@2ndquadrant.com> wrote:
> On 2013-08-08 16:12:06 -0500, Jon Nelson wrote:
...

>> At this point I'm convinced that the issue is a pathological case in
>> ext4. The performance impact disappears as soon as the unwritten
>> extent(s) are written to with real data. Thus, even though allocating
>> files with posix_fallocate is - frequently - orders of magnitude
>> quicker than doing it with write(2), the subsequent re-write can be
>> more expensive.  At least, that's what I'm gathering from the various
>> threads.
>
>
>>  Why this issue didn't crop up in earlier testing and why I
>> can't seem to make test_fallocate do it (even when I modify
>> test_fallocate to write to the newly-allocated file in a mostly-random
>> fashion) has me baffled.
>
> It might be kernel version specific and concurrency seems to play a
> role. If you reproduce the problem, could you run a "perf record -ga" to
> collect a systemwide profile?

Finally, an excuse to learn how to use 'perf'! I'll try to provide
that info when I am able.
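
For reference, a minimal sketch of the two allocation strategies compared in the quoted text above: zero-filling a file with write(2) versus reserving it with posix_fallocate. This is not the PostgreSQL code and not test_fallocate. The function names, the 16 MiB size, and the 8 kB block size are assumptions chosen for illustration, and it relies on Python's os module on a POSIX system.

```python
import os

# Assumed sizes for illustration only.
SEGMENT_SIZE = 16 * 1024 * 1024
BLOCK_SIZE = 8192


def allocate_with_write(path, size=SEGMENT_SIZE):
    """Allocate a file by writing zero-filled blocks, then fsync it."""
    zeros = bytes(BLOCK_SIZE)
    fd = os.open(path, os.O_CREAT | os.O_WRONLY | os.O_TRUNC, 0o600)
    try:
        written = 0
        while written < size:
            chunk = min(BLOCK_SIZE, size - written)
            written += os.write(fd, zeros[:chunk])
        os.fsync(fd)
    finally:
        os.close(fd)


def allocate_with_fallocate(path, size=SEGMENT_SIZE):
    """Reserve space with posix_fallocate without writing data, then fsync."""
    fd = os.open(path, os.O_CREAT | os.O_WRONLY | os.O_TRUNC, 0o600)
    try:
        os.posix_fallocate(fd, 0, size)
        os.fsync(fd)
    finally:
        os.close(fd)
```

Both functions leave a file of the requested size that reads back as zeros. The difference under discussion is what happens afterwards: with the second variant the filesystem may hold unwritten extents until real data is written over them.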

> There's some more things to test:
> - is the slowdown dependent on the scale? I.e is it visible with -j 1 -c
>   1?

scale=1 (-j 1 -c 1):
with fallocate: 685 tps
without: 727 tps

scale=20:
with fallocate: 129 tps
without: 402 tps

scale=40:
with fallocate: 163 tps
without: 511 tps
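
To make the size of the slowdown easier to see, here is a small sketch that expresses each fallocate result as a fraction of the non-fallocate result. It assumes all of the figures above are in tps; the dictionary layout and the function name are illustrative only.

```python
# (with fallocate, without) throughput per scale factor, taken from the
# numbers above and assumed to all be in tps.
results = {
    1: (685, 727),
    20: (129, 402),
    40: (163, 511),
}


def relative_throughput(with_fallocate, without):
    """Return fallocate throughput as a fraction of the non-fallocate run."""
    return with_fallocate / without


for scale, (with_fa, without) in results.items():
    ratio = relative_throughput(with_fa, without)
    print(f"scale={scale}: {ratio:.0%} of non-fallocate throughput")

# prints:
# scale=1: 94% of non-fallocate throughput
# scale=20: 32% of non-fallocate throughput
# scale=40: 32% of non-fallocate throughput
```

So at scale=1 the gap is small, while at scale=20 and scale=40 the fallocate runs reach roughly a third of the non-fallocate throughput.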

> - Does it also occur in synchronous_commit=off configurations? Those
>   don't fdatasync() from so many backends, that might play a role.

With synchronous_commit=off, the performance is vastly improved.
Interestingly, the fallocate case is (immaterially) faster than the
non-fallocate case: 3766 tps vs. 3700 tps.

I tried a few other wal_sync_methods besides the default of fdatasync,
all with scale=80.

fsync:
198 tps (with fallocate) vs. 187 tps (without)

open_sync:
195 tps (with fallocate) vs. 192 tps (without)

> - do bulkloads see it? E.g. the initial pgbench load?

time pgbench -s 200 -p 54320 -d pgb -i

with fallocate: 2m47s
without: 2m50s

Hopefully the above is useful.

-- 
Jon
