Thread: Commit(?) overhead

Commit(?) overhead

From
Duncan Kinnear
Date:
We have a very simple table, whose DDL is as follows: 

    CREATE TABLE public.next_id (
       id varchar(255) NOT NULL,
       next_value int8 NOT NULL,
       CONSTRAINT next_id_pk PRIMARY KEY (id)
    ); 

The table only has about 125 rows, and there are no indexes apart from the primary key constraint.

In DBeaver I am executing the following UPDATE query:

    UPDATE next_id SET next_value=next_value+1 WHERE id='Session';

If I point DBeaver to a server (localhost) running version:
   11.2 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 8.2.1 20181127, 64-bit
it executes on average in about 50ms.

the EXPLAIN (ANALYSE, TIMING TRUE) of this query gives:

 Update on next_id  (cost=0.14..8.16 rows=1 width=36) (actual time=0.057..0.057 rows=0 loops=1)
   ->  Index Scan using next_id_pk on next_id  (cost=0.14..8.16 rows=1 width=36) (actual time=0.039..0.040 rows=1 loops=1)
         Index Cond: ((id)::text = 'Session'::text)
 Planning Time: 0.083 ms
 Execution Time: 0.089 ms

which is significantly less than 50ms.

Now, if I point DBeaver to a VM server on the same gigabit network switch, running version: 
   9.5.3 on i386-pc-solaris2.11, compiled by cc: Sun C 5.10 SunOS_i386 Patch 142363-07 2010/12/09, 64-bit
then the same query executes in about 2-3ms.

The EXPLAIN output when executing the query on this server is:

 Update on next_id  (cost=0.27..8.29 rows=1 width=36) (actual time=0.062..0.062 rows=0 loops=1)
   ->  Index Scan using next_id_pkey on next_id  (cost=0.27..8.29 rows=1 width=36) (actual time=0.025..0.026 rows=1 loops=1)
         Index Cond: ((id)::text = 'Session'::text)
 Planning time: 0.083 ms
 Execution time: 0.096 ms

which you will see is virtually identical to the output from the slower (localhost) server.

Why is the query taking so much longer on the localhost server?

Note that the localhost machine is significantly faster in other metrics (CPU, file system, etc.).

I have also tried the query on another server on the same network switch running version: 
   10.5 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 8.2.0, 64-bit
and the timings are very similar to those for 'localhost'. That is, approx 50ms on average.

Now, if I run the following FOR LOOP query:

    do $$
    begin
    for i in 1..10000 loop
        update NEXT_ID set next_value=next_value+1 where id='Session';
    end loop;
    end;
    $$;

Then this completes in about the same time on ALL of the servers - approximately 1.7s - which makes sense, as 10,000 times the above plan/execute times is approximately 1.7s.
 

So, to me this feels like some kind of COMMIT overhead of approx 50ms that the version 10 and version 11 servers are experiencing. But I have no idea where to look to try and find where this time is being spent.
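A way to test this hypothesis would be to run the same statement as 10,000 separate transactions instead of one big one, e.g. with a pgbench custom script (sketch only; the script file name is made up):

    -- update_session.sql (hypothetical file name); run each execution as its own
    -- transaction with:  pgbench -n -f update_session.sql -t 10000 <dbname>
    UPDATE next_id SET next_value = next_value + 1 WHERE id = 'Session';

If the ~50ms really is per-commit overhead, that run should take on the order of 10,000 x 50ms on the slow servers, versus the ~1.7s seen for the single-transaction DO block.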
 

Note that the schemas of the databases on the 3 servers involved are virtually identical. The schema for this table is exactly the same.
 

Hoping that someone can give me an idea about where to go looking.


Regards, 

Duncan Kinnear 
    
Floor 1, 100 McLeod St, Hastings 4120, New Zealand 
PO Box 2006, Hastings 4153, New Zealand 
P: +64 6 871 5700 F: +64 6 871 5709 E: duncan.kinnear@mccarthy.co.nz



Re: Commit(?) overhead

From
Jeff Janes
Date:
On Thu, Apr 4, 2019 at 3:42 AM Duncan Kinnear <duncan.kinnear@mccarthy.co.nz> wrote:

the EXPLAIN (ANALYSE, TIMING TRUE) of this query gives:

 Update on next_id  (cost=0.14..8.16 rows=1 width=36) (actual time=0.057..0.057 rows=0 loops=1)
   ->  Index Scan using next_id_pk on next_id  (cost=0.14..8.16 rows=1 width=36) (actual time=0.039..0.040 rows=1 loops=1)
         Index Cond: ((id)::text = 'Session'::text)
 Planning Time: 0.083 ms
 Execution Time: 0.089 ms

which is significantly less than 50ms.

The EXPLAIN ANALYZE doesn't include the time needed to fsync the transaction logs.  It measures only the update itself, not the implicit commit at the end.  DBeaver is seeing the fsync-inclusive time.  50ms is pretty long, but some file systems and OSes seem to be pretty inefficient at this and take several disk revolutions to get the data down.
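If you want to see what a single flush costs on that filesystem in isolation, the pg_test_fsync utility that ships with PostgreSQL measures raw sync latency. A rough sketch (the path is an assumption; point it at the filesystem that holds pg_wal):

    pg_test_fsync -s 2 -f /path/to/pg_wal/pg_test_fsync.out

If the per-operation times it reports are in the tens of milliseconds, the 50ms commits are simply the storage.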
 

Now, if I point DBeaver to a VM server on the same gigabit network switch, running version:
   9.5.3 on i386-pc-solaris2.11, compiled by cc: Sun C 5.10 SunOS_i386 Patch 142363-07 2010/12/09, 64-bit
then the same query executes in about 2-3ms

That machine probably has hardware to do a fast fsync, has fsync turned off, or is lying about the safety of its data.
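A quick way to distinguish the first two possibilities from the third (a sketch, to be run on the fast 9.5 server) is to look at its durability settings:

    SHOW fsync;
    SHOW synchronous_commit;
    SHOW wal_sync_method;

If those are all at their safe defaults, the speed is coming from the storage layer (battery-backed cache, or a cache that acknowledges writes it has not persisted).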

Cheers,

Jeff

Re: Commit(?) overhead

From
Duncan Kinnear
Date:
----- On 5 Apr, 2019, at 4:14 AM, Jeff Janes <jeff.janes@gmail.com> wrote: 

> On Thu, Apr 4, 2019 at 3:42 AM Duncan Kinnear <duncan.kinnear@mccarthy.co.nz> wrote:

>> the EXPLAIN (ANALYSE, TIMING TRUE) of this query gives:

>> Update on next_id (cost=0.14..8.16 rows=1 width=36) (actual time=0.057..0.057
>> rows=0 loops=1)
>> -> Index Scan using next_id_pk on next_id (cost=0.14..8.16 rows=1 width=36)
>> (actual time=0.039..0.040 rows=1 loops=1)
>> Index Cond: ((id)::text = 'Session'::text)
>> Planning Time: 0.083 ms
>> Execution Time: 0.089 ms

>> which is significantly less than 50ms.

> The EXPLAIN ANALYZE doesn't include the time needed to fsync the transaction
> logs. It measures only the update itself, not the implicit commit at the end.
> DBeaver is seeing the fsync-inclusive time. 50ms is pretty long, but some file
> systems and OSes seem to be pretty inefficient at this and take several disk
> revolutions to get the data down.

>> Now, if I point DBeaver to a VM server on the same gigabit network switch,
>> running version:
>> 9.5.3 on i386-pc-solaris2.11, compiled by cc: Sun C 5.10 SunOS_i386 Patch
>> 142363-07 2010/12/09, 64-bit
>> then the same query executes in about 2-3ms

> That machine probably has hardware to do a fast fsync, has fsync turned off, or
> is lying about the safety of its data.

Just a quick update. I tried performing a sequence of BEGIN; UPDATE ...; COMMIT; and I got the following log entries:

Apr 10 09:02:40 duncanpc postgres[7656]: 2019-04-10 09:02:40.639 NZST [29887] LOG:  duration: 0.025 ms  parse <unnamed>: begin
Apr 10 09:02:40 duncanpc postgres[7656]: 2019-04-10 09:02:40.639 NZST [29887] LOG:  duration: 0.014 ms  bind <unnamed>: begin
Apr 10 09:02:40 duncanpc postgres[7656]: 2019-04-10 09:02:40.639 NZST [29887] LOG:  duration: 0.003 ms  execute <unnamed>: begin
Apr 10 09:02:40 duncanpc postgres[7656]: 2019-04-10 09:02:40.639 NZST [29887] LOG:  duration: 0.045 ms  parse <unnamed>: update NEXT_ID set next_value=next_value+1 where id='Session'
Apr 10 09:02:40 duncanpc postgres[7656]: 2019-04-10 09:02:40.640 NZST [29887] LOG:  duration: 0.055 ms  bind <unnamed>: update NEXT_ID set next_value=next_value+1 where id='Session'
Apr 10 09:02:40 duncanpc postgres[7656]: 2019-04-10 09:02:40.640 NZST [29887] LOG:  duration: 0.059 ms  execute <unnamed>: update NEXT_ID set next_value=next_value+1 where id='Session'
Apr 10 09:02:40 duncanpc postgres[7656]: 2019-04-10 09:02:40.640 NZST [29887] LOG:  duration: 0.004 ms  parse <unnamed>: commit
Apr 10 09:02:40 duncanpc postgres[7656]: 2019-04-10 09:02:40.640 NZST [29887] LOG:  duration: 0.003 ms  bind <unnamed>: commit
Apr 10 09:02:40 duncanpc postgres[7656]: 2019-04-10 09:02:40.690 NZST [29887] LOG:  duration: 50.237 ms  execute <unnamed>: commit

So this confirms that the overhead is indeed happening in the COMMIT part. But how do I get more detailed logging to see what it is doing?
 

Note, in a previous reply to Jeff (which I forgot to CC to the list) I explained that the slow machines are both using BTRFS as the filesystem, and a bit of googling has revealed that using PostgreSQL on BTRFS filesystems is (don't cross the streams) bad.
 

Jeff, I will try adding the wait event stuff to see if that is what it is doing.
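(I assume that means watching pg_stat_activity from a second session while the commit is running - something like the sketch below - and looking for a wait_event such as WALSync:)

    SELECT pid, state, wait_event_type, wait_event, query
      FROM pg_stat_activity
     WHERE wait_event IS NOT NULL;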


Cheers, Duncan



Re: Commit(?) overhead

From
Justin Pryzby
Date:
On Wed, Apr 10, 2019 at 09:26:22AM +1200, Duncan Kinnear wrote:
> ----- On 5 Apr, 2019, at 4:14 AM, Jeff Janes <jeff.janes@gmail.com> wrote: 
> 
> > On Thu, Apr 4, 2019 at 3:42 AM Duncan Kinnear <duncan.kinnear@mccarthy.co.nz> wrote:
> 
> >> the EXPLAIN (ANALYSE, TIMING TRUE) of this query gives:
> 
> >> Update on next_id (cost=0.14..8.16 rows=1 width=36) (actual time=0.057..0.057
> >> rows=0 loops=1)
> >> -> Index Scan using next_id_pk on next_id (cost=0.14..8.16 rows=1 width=36)
> >> (actual time=0.039..0.040 rows=1 loops=1)
> >> Index Cond: ((id)::text = 'Session'::text)
> >> Planning Time: 0.083 ms
> >> Execution Time: 0.089 ms
> 
> >> which is significantly less than 50ms.
> 
> > The EXPLAIN ANALYZE doesn't include the time needed to fsync the transaction
> > logs. It measures only the update itself, not the implicit commit at the end.
> > DBeaver is seeing the fsync-inclusive time. 50ms is pretty long, but some file
> > systems and OSes seem to be pretty inefficient at this and take several disk
> > revolutions to get the data down.
> 
> >> Now, if I point DBeaver to a VM server on the same gigabit network switch,
> >> running version:
> >> 9.5.3 on i386-pc-solaris2.11, compiled by cc: Sun C 5.10 SunOS_i386 Patch
> >> 142363-07 2010/12/09, 64-bit
> >> then the same query executes in about 2-3ms
> 
> > That machine probably has hardware to do a fast fsync, has fsync turned off, or
> > is lying about the safety of its data.
> 
> Just a quick update. I tried performing a sequence of BEGIN; UPDATE ...; COMMIT; and I got the following log entries:

> Apr 10 09:02:40 duncanpc postgres[7656]: 2019-04-10 09:02:40.640 NZST [29887] LOG:  duration: 0.003 ms  bind <unnamed>: commit
> Apr 10 09:02:40 duncanpc postgres[7656]: 2019-04-10 09:02:40.690 NZST [29887] LOG:  duration: 50.237 ms  execute <unnamed>: commit
> 
> So this confirms that the overhead is indeed happening in the COMMIT part. But how do I get more detailed logging to see what it is doing?
 

The commit is causing the fsync() Jeff mentioned.

You could test that's the issue by comparing with fsync=off (please read what
that means and don't run your production cluster like that).
https://www.postgresql.org/docs/current/runtime-config-wal.html#GUC-FSYNC
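A sketch of that test, assuming superuser access on a throwaway instance (fsync can be changed with a reload, no restart needed):

    -- TEST ONLY: never leave fsync=off on a cluster whose data you care about
    ALTER SYSTEM SET fsync = off;
    SELECT pg_reload_conf();
    -- ... re-run the UPDATE and compare the commit timing ...
    ALTER SYSTEM RESET fsync;
    SELECT pg_reload_conf();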

You could also put your XLOG on a separate FS (as a test or otherwise).
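For example (sketch only, paths made up; the directory is pg_wal on v10+, pg_xlog on older releases), with the server stopped you can relocate it and leave a symlink behind:

    pg_ctl stop -D /path/to/data
    mv /path/to/data/pg_wal /mnt/other_fs/pg_wal
    ln -s /mnt/other_fs/pg_wal /path/to/data/pg_wal
    pg_ctl start -D /path/to/data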

Justin



Re: Commit(?) overhead

From
Andres Freund
Date:
Hi,

On 2019-04-09 17:12:27 -0500, Justin Pryzby wrote:
> You could test that's the issue by comparing with fsync=off (please read what
> that means and don't run your production cluster like that).
> https://www.postgresql.org/docs/current/runtime-config-wal.html#GUC-FSYNC

I suggest testing it with synchronous_commit=off instead. That's about
as fast for this type of workload, doesn't have cluster corruption
issues, the window of a transaction not persisting in case of a crash is
very small, and it can just be set by any user in individual sessions.
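For example, any of these work without touching postgresql.conf (a sketch; the role and database names are placeholders):

    -- for the current session only
    SET synchronous_commit = off;
    -- or persistently for one role or one database
    ALTER ROLE myuser SET synchronous_commit = off;
    ALTER DATABASE mydb SET synchronous_commit = off;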

Greetings,

Andres Freund



Re: Commit(?) overhead

From
Duncan Kinnear
Date:
----- On 10 Apr, 2019, at 10:23 AM, Andres Freund <andres@anarazel.de> wrote:

> On 2019-04-09 17:12:27 -0500, Justin Pryzby wrote:
>> You could test that's the issue by comparing with fsync=off (please read what
>> that means and don't run your production cluster like that).
>> https://www.postgresql.org/docs/current/runtime-config-wal.html#GUC-FSYNC
> 
> I suggest testing it with synchronous_commit=off instead. That's about
> as fast for this type of workload, doesn't have cluster corruption
> issues, the window of a transaction not persisting in case of a crash is
> very small, and it can just be set by any user in individual sessions.

Bingo! Adding 'SET LOCAL synchronous_commit TO OFF;' to my 'BEGIN; UPDATE ....; COMMIT;' block has given me sub-1ms timings! Thanks Andres.
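(For reference, the full block looks roughly like this - a sketch of what I'm running:)

    BEGIN;
    SET LOCAL synchronous_commit TO OFF;
    UPDATE next_id SET next_value = next_value + 1 WHERE id = 'Session';
    COMMIT;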
 

I'll probably leave the setting as that on my local machine. The option appears to be relatively safe, but my machine is just a dev test machine anyway.
 


Regards, 

Duncan Kinnear 



Re: Commit(?) overhead

From
Laurenz Albe
Date:
Duncan Kinnear wrote:
> Bingo! Adding 'SET LOCAL synchronous_commit TO OFF;' to my 'BEGIN; UPDATE ....; COMMIT;'
> block has given me sub-1ms timings! Thanks Andres.

That's a pretty clear indication that your I/O subsystem was overloaded.

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com