Re: Insert vs Update - Mailing list pgsql-performance

From Robert DiFalco
Subject Re: Insert vs Update
Date
Msg-id CAAXGW-wFUXdeHOmH4sjYYm2dV9nmWw7H7xsK_NaH75OJ5fQ-3g@mail.gmail.com
Whole thread Raw
In response to Re: Insert vs Update  ("David G. Johnston" <david.g.johnston@gmail.com>)
Responses Re: Insert vs Update  ("David G. Johnston" <david.g.johnston@gmail.com>)
List pgsql-performance


On Wed, Jul 15, 2015 at 10:33 AM, David G. Johnston <david.g.johnston@gmail.com> wrote:
On Wednesday, July 15, 2015, Robert DiFalco <robert.difalco@gmail.com> wrote:
First off I apologize if this is question has been beaten to death. I've looked around for a simple answer and could not find one. 

Given a database that will not have it's PKEY or indices modified, is it generally faster to INSERT or UPDATE data. And if there is a performance difference is it substantial?

This seems odd.  If you have an option to update but choose to insert what becomes of the other record? 


Consider the two pseudo-schemas, I'm just making this up for example purposes:

SCHEMA A
=========
meal(id SEQUENCE,user_id, started DEFAULT NOW())
meal_prepared(ref_meal_id, prepared DEFAULT NOW())
meal_abandoned(ref_meal_id, abandoned ...)
meal_consumed(ref_meal_id, consumed ...)
etc.

Then in response to different meal events you always have an insert. 

aMealId = INSERT INTO meal(user_id) VALUES (aUserId);

When preparation starts:

INSERT INTO meal_prepared(ref_meal_id) VALUES (aMealId);

And so on for each event. 

Compare that to this:

SCHEMA B
=========
meal_event(id, started, prepared, abandoned, consumed, ...)

The start of the meal is an INSERT:

aMealId = INSERT INTO meal_event(user_id, started) VALUES (aUserId, NOW());

When preparation starts:

UPDATE meal_event SET prepared = NOW() WHERE id = aMealId;

And so on.

Basically the same data, in one case you always do inserts and add new tables for new events. In the other case you only insert once and then update for each state, then you add columns if you have new states. 

As I said this is just an example. But in SCHEMA A you have only inserts, lots of tables and in SCHEMA B you have a lot of updates and a lot of possibly NULL columns if certain events don't occur. 

Is that more clear?

R.


 

pgsql-performance by date:

Previous
From: "David G. Johnston"
Date:
Subject: Re: Insert vs Update
Next
From: "David G. Johnston"
Date:
Subject: Re: Insert vs Update