Re: Insert vs Update - Mailing list pgsql-performance

From Guillaume Lelarge
Subject Re: Insert vs Update
Date
Msg-id CAECtzeX8PFA50EEW601jcK-3dv4uk0c-nSRaOmd6biZuTZOm2w@mail.gmail.com
Whole thread Raw
In response to Re: Insert vs Update  ("David G. Johnston" <david.g.johnston@gmail.com>)
List pgsql-performance

Le 15 juil. 2015 11:16 PM, "David G. Johnston" <david.g.johnston@gmail.com> a écrit :
>
> On Wed, Jul 15, 2015 at 4:53 PM, Michael Nolan <htfoot@gmail.com> wrote:
>>
>> On Wed, Jul 15, 2015 at 3:16 PM, Robert DiFalco <robert.difalco@gmail.com> wrote:
>>>
>>>
>>> Thanks David, my example was a big simplification, but I appreciate your guidance. The different event types have differing amounts of related data. Query speed on this schema is not important, it's really the write speed that matters. So I was just wondering given the INSERT or UPDATE approach (with no indexed data being changed) if one is likely to be substantially faster than the other. 
>>>
>>
>> As I understand how ACID compliance is done, updating a record will require updating any indexes for that record, even if the index keys are not changing.  That's because any pending transactions still need to be able to find the 'old' data, while new transactions need to be able to find the 'new' data.  And ACID also means an update is essentially a delete-and-insert.  
>
>
> ​I might be a bit pedantic here but what you describe is a byproduct of the specific​ implementation that PostgreSQL uses to affect Consistency (the C in ACID) as opposed to a forgone outcome in being ACID compliant.
>
> http://www.postgresql.org/docs/9.4/static/mvcc-intro.html
>
> I'm out of my comfort zone here but the HOT optimization is designed to leverage the fact that an update to a row that does not affect indexed values is able to leave the index alone and instead during index lookup the index points to the old tuple, notices that there is a chain present, and walks that chain to find the currently active tuple.
>

That's true as long as the old and new tuples are stored in the same block.

> In short, if the only index is a PK an update of the row can avoid touching that index.
>
> I mentioned that going from NULL to Not NULL may disrupt this but I'm thinking I may have mis-spoken.
>
> Also, with separate tables the amount of data to write is going to be less because you'd have fewer columns on the affected tables.
>
> While an update is a delete+insert a delete is mostly just a bit-flip action - at least mid-transaction.  Depending on volume, though, the periodic impact of vaccuming may want to be taken into consideration.

--
Guillaume

pgsql-performance by date:

Previous
From: "David G. Johnston"
Date:
Subject: Re: Insert vs Update
Next
From: Julien Rouhaud
Date:
Subject: Re: intel s3500 -- hot stuff