Thread: Best practices: MERGE

Best practices: MERGE

From
David Fetter
Date:
Folks,

Although the SQL:2003 command MERGE has not yet been implemented in
PostgreSQL, I'm guessing that there are best practices for how to
implement the MERGE functionality.

To recap, MERGE means (roughly) INSERT the tuple if no tuple matches
certain criteria, otherwise UPDATE using similar criteria.

The "correct" solution, as far as I can tell, is to acquire a LOCK on
the table IN SHARE MODE at the beginning of the transaction, but this
has (at least for many applications) unacceptable performance
characteristics.  Accepting that there is a slight risk of a race
condition when *not* locking the table at the beginning of the
transaction, what procedure minimizes this risk and recovers well from
said race condition, should it occur?

TIA for any hints, tips or pointers on this :)

Cheers,
D
-- 
David Fetter david@fetter.org http://fetter.org/
phone: +1 510 893 6100   mobile: +1 415 235 3778

Remember to vote!


Re: Best practices: MERGE

From
Christopher Kings-Lynne
Date:
> The "correct" solution, as far as I can tell, is to acquire a LOCK on
> the table IN SHARE MODE at the beginning of the transaction, but this
> has (at least for many applications) unacceptable performance
> characteristics.  Accepting that there is a slight risk of a race
> condition when *not* locking the table at the beginning of the
> transaction, what procedure minimizes this risk and recovers well from
> said race condition, should it occur?

IN SHARE MODE is not enough; you can get deadlocks.  You require IN
SHARE ROW EXCLUSIVE MODE.  Other than that, it's a sucky solution
because it breaks concurrency.  In pgsql 8, you can do it using pl/pgsql
exception handling.
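
For illustration, here is why the lock mode matters. SHARE does not conflict with itself, so two sessions can both hold it, and each then blocks on the other when it tries to write, which is a deadlock. SHARE ROW EXCLUSIVE conflicts with itself, so the second session simply waits at the LOCK statement. A minimal sketch of the locking approach follows; the table db and its columns are hypothetical, chosen only for this example.

```sql
-- Assumes a hypothetical table:
--   CREATE TABLE db (a integer PRIMARY KEY, b text);

BEGIN;

-- Only one session at a time can hold this lock, and it also blocks
-- other writers; plain SELECTs are still allowed.
LOCK TABLE db IN SHARE ROW EXCLUSIVE MODE;

-- Update the row if it is already there ...
UPDATE db SET b = 'new value' WHERE a = 1;

-- ... and insert it only if it is not.
INSERT INTO db (a, b)
SELECT 1, 'new value'
WHERE NOT EXISTS (SELECT 1 FROM db WHERE a = 1);

COMMIT;
```

The lock is held until COMMIT, so every writer to the table queues behind this transaction, which is the concurrency cost mentioned above.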

Chris


Re: Best practices: MERGE

From
David Fetter
Date:
On Tue, Mar 08, 2005 at 11:45:19AM +0800, Christopher Kings-Lynne wrote:
> >The "correct" solution, as far as I can tell, is to acquire a LOCK
> >on the table IN SHARE MODE at the beginning of the transaction, but
> >this has (at least for many applications) unacceptable performance
> >characteristics.  Accepting that there is a slight risk of a race
> >condition when *not* locking the table at the beginning of the
> >transaction, what procedure minimizes this risk and recovers well
> >from said race condition, should it occur?
> 
> IN SHARE MODE is not enough, you can get deadlocks.  You require IN
> SHARE ROW EXCLUSIVE MODE.  other than that, it's a sucky solution
> because it breaks concurrency.  In pgsql 8, you can do it using
> pl/pgsql exception handling.

Luckily, PG 8 is available for this.  Do you have a short example?

Cheers,
D


Re: Best practices: MERGE

From
Christopher Kings-Lynne
Date:
> Luckily, PG 8 is available for this.  Do you have a short example?

No, and I think it should be in the manual as an example.

You will need to enter a loop that uses exception handling to detect 
unique_violation.
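
A minimal sketch of such a loop, assuming a hypothetical table db(a, b) with a primary key on a; the function name merge_db and its parameters are likewise made up for illustration, and this is not necessarily the patch posted later in the thread. It tries the UPDATE first, falls back to INSERT, and retries if a concurrent session wins the insert.

```sql
-- Hypothetical table used only for this example.
CREATE TABLE db (a integer PRIMARY KEY, b text);

CREATE FUNCTION merge_db(key integer, data text) RETURNS void AS
$$
BEGIN
    LOOP
        -- First try to update an existing row.
        UPDATE db SET b = data WHERE a = key;
        IF FOUND THEN
            RETURN;
        END IF;

        -- No row yet, so try to insert one.  If another session
        -- inserts the same key first, the primary key raises
        -- unique_violation.
        BEGIN
            INSERT INTO db (a, b) VALUES (key, data);
            RETURN;
        EXCEPTION WHEN unique_violation THEN
            NULL;  -- do nothing; loop back and retry the UPDATE
        END;
    END LOOP;
END;
$$
LANGUAGE plpgsql;

-- Usage: the first call inserts, the second updates the same row.
SELECT merge_db(1, 'david');
SELECT merge_db(1, 'dennis');
```

The inner BEGIN ... EXCEPTION block is what confines the failed INSERT, so the function can carry on instead of aborting the whole transaction.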

Chris


Re: Best practices: MERGE

From
David Fetter
Date:
On Tue, Mar 08, 2005 at 12:27:21PM +0800, Christopher Kings-Lynne wrote:
> >Luckily, PG 8 is available for this.  Do you have a short example?
>
> No, and I think it should be in the manual as an example.
>
> You will need to enter a loop that uses exception handling to detect
> unique_violation.

Pursuant to an IRC discussion to which Dennis Bjorklund and
Christopher Kings-Lynne made most of the contributions, please find
enclosed an example patch demonstrating an UPSERT-like capability.

Cheers,
D

Re: Best practices: MERGE

From
Simon Riggs
Date:
On Mon, 2005-03-07 at 19:34 -0800, David Fetter wrote:

> Although the SQL:2003 command MERGE has not yet been implemented in
> PostgreSQL, I'm guessing that there are best practices for how to
> implement the MERGE functionality.
> 
> To recap, MERGE means (roughly) INSERT the tuple if no tuple matches
> certain criteria, otherwise UPDATE using similar criteria.

I don't understand it that way round...

I thought the logic was:
UPDATE WHERE ..... (locate row)
IF NOT FOUND THEN
INSERT (new row)

You can create a procedure to do that, but MERGE would work better.
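
As a rough sketch, the outline above could be written as a PL/pgSQL function like the one below. The table db(a, b), the function name upsert_once and the parameter names are all hypothetical. There is no lock and no retry here, so two concurrent callers can both find no row, and the second INSERT then fails with unique_violation.

```sql
-- Assumes a hypothetical table:
--   CREATE TABLE db (a integer PRIMARY KEY, b text);

CREATE FUNCTION upsert_once(p_key integer, p_data text) RETURNS void AS
$$
BEGIN
    -- UPDATE WHERE ... (locate row)
    UPDATE db SET b = p_data WHERE a = p_key;

    -- IF NOT FOUND THEN INSERT (new row)
    IF NOT FOUND THEN
        INSERT INTO db (a, b) VALUES (p_key, p_data);
    END IF;
END;
$$
LANGUAGE plpgsql;
```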

ISTM that would require writing some new low-level code that mixes
heap_update and heap_insert logic. The existing heap_update code is
most similar, since the logic is roughly

UPDATE WHERE.... (locate row)
IF FOUND THEN
INSERT (new row version)

though with various changes to row visibility stuff.

One might aim to do this in two stages:
1. initially support a single row upsert such as MySQL's REPLACE command
2. a full implementation of MERGE that used set logic as per the spec

...

Best Regards, Simon Riggs



Re: Best practices: MERGE

From
Christopher Kings-Lynne
Date:
> You can create a procedure to do that, but MERGE would work better.
> 
> ISTM that would require writing some new low-level code that mixes
> heap_update and heap_insert logic. The existing heap_update code is
> most similar, since the logic is roughly
> 
> UPDATE WHERE.... (locate row)
> IF FOUND THEN
> INSERT (new row version)
> 
> though with various changes to row visibility stuff.
> 
> One might aim to do this in two stages:
> 1. initially support a single row upsert such as MySQL's REPLACE command
> 2. a full implementation of MERGE that used set logic as per the spec
> 
> ...

The main issue is dealing with race conditions on the unique index
while merging.

Chris


Re: Best practices: MERGE

From
Bruce Momjian
Date:
Patch applied.  Thanks.  Sorry for the delay in applying.

---------------------------------------------------------------------------


David Fetter wrote:
> On Tue, Mar 08, 2005 at 12:27:21PM +0800, Christopher Kings-Lynne wrote:
> > >Luckily, PG 8 is available for this.  Do you have a short example?
> >
> > No, and I think it should be in the manual as an example.
> >
> > You will need to enter a loop that uses exception handling to detect
> > unique_violation.
>
> Pursuant to an IRC discussion to which Dennis Bjorklund and
> Christopher Kings-Lynne made most of the contributions, please find
> enclosed an example patch demonstrating an UPSERT-like capability.

[ Attachment, skipping... ]


--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

Re: [PATCHES] Best practices: MERGE

From
Christopher Kings-Lynne
Date:
Is that broken?

http://momjian.postgresql.org/main/writings/pgsql/sgml/build.html

Chris


Re: [PATCHES] Best practices: MERGE

From
Bruce Momjian
Date:
Thanks, fixed.

---------------------------------------------------------------------------

Christopher Kings-Lynne wrote:
> Is that broken?
>
> http://momjian.postgresql.org/main/writings/pgsql/sgml/build.html
>
> Chris
