Re: [GENERAL] disk writes within a transaction - Mailing list pgsql-general

From Jeff Janes
Subject Re: [GENERAL] disk writes within a transaction
Date
Msg-id CAMkU=1wmztpcQ7egvfCkn_UztC6-DQ0UH7-xrNWyTukiu-7u7A@mail.gmail.com
Whole thread Raw
In response to [GENERAL] disk writes within a transaction  (2xlp - ListSubscriptions <postgres@2xlp.com>)
Responses Re: [GENERAL] disk writes within a transaction  (jonathan vanasco <postgres@2xlp.com>)
List pgsql-general
On Thu, Feb 16, 2017 at 11:33 AM, 2xlp - ListSubscriptions <postgres@2xlp.com> wrote:
Can someone enlighten me to how postgres handles disk writing?  I've read some generic remarks about buffers, but that's about it.

We have a chunk of code that calls Postgres in a less-than-optimal way within a transaction block.  I'm wondering where to prioritize fixing it, as the traffic on the wire isn't an issue.

Basically the code looks like this:

        begin;
        update foo set foo.a='1' where foo.bar = 1;
        ...
        update foo set foo.b='2' where foo.bar = 1;
        ...
        update foo set foo.c='3' where foo.bar = 1;
        commit;

If the updates are likely to be a memory based operation, consolidating them can wait.  If they are likely to hit the disk, I should schedule refactoring this code sooner than later.

You are going to generate more volume of WAL data, which has to reach disk eventually.  Although it is likely they will all be consolidated into about the same number of physical writes and syncs.  

You are also likely to inhibit the Heap-only-tuple mechanism, because you will end up with 4 copies of the row which all have to fit in the same block.  If they don't, it has to migrate some of them to a different block plus do index maintenance, so you will generate more dirty blocks that way.  How many more depends on how many indexes you have, and whether the columns being updated are themselves included in indexes.

There is also a CPU issue when the same tuple is updated repeatedly in a single transaction.  Each update has to wade through all the previous row versions, so it is an N^2 operation in the number of updates.

It will probably be easier to refactor the code than to quantify just how much damage it does.

cheers,

Jeff

pgsql-general by date:

Previous
From: Adrian Klaver
Date:
Subject: Re: [GENERAL] PostgreSQL mirroring from RPM install to RPMinstall-revisited
Next
From: Arnold Somogyi
Date:
Subject: [GENERAL] Multiply ON CONFLICT ON CONSTRAINT