Thread: Is replacing transactions with CTE a good idea?

Is replacing transactions with CTE a good idea?

From
Glen Huang
Date:
Hi all,

From application’s standpoint, it seems using CTE saves a lot work. You no longer need to parse values out only to pass
themback in, and only one round-trip to the db server. 

If I’m not wrong, CTE is equivalent to serializable transactions? So I guess the downsize is that quarries can’t be run
inparallel? 

If I decide to replace all my transaction code with CTE, will I shoot myself in the foot down the road?


Re: Is replacing transactions with CTE a good idea?

From
Brian Dunavant
Date:
On Thu, Apr 1, 2021 at 10:49 AM Glen Huang <heyhgl@gmail.com> wrote:
If I decide to replace all my transaction code with CTE, will I shoot myself in the foot down the road?

I do this all the time and makes code way cleaner.   It's very straightforward with inserts queries.  When you deal with updates/deletes, things can be trickier.  I usually leave these in a transaction if there is any concern.

They can also be hard for future programmers that may not understand SQL.  Make sure you comment your queries for maintainability long term.  

I have yet to regret replacing a transaction with a CTE over the past decade.  YMMV
 

Re: Is replacing transactions with CTE a good idea?

From
Dave Cramer
Date:



On Thu, 1 Apr 2021 at 10:50, Glen Huang <heyhgl@gmail.com> wrote:
Hi all,

From application’s standpoint, it seems using CTE saves a lot work. You no longer need to parse values out only to pass them back in, and only one round-trip to the db server.

If I’m not wrong, CTE is equivalent to serializable transactions? So I guess the downsize is that quarries can’t be run in parallel?

I do not think a CTE changes the isolation level. 

If I decide to replace all my transaction code with CTE, will I shoot myself in the foot down the road?


Dave Cramer
www.postgres.rocks 

Re: Is replacing transactions with CTE a good idea?

From
Rob Sargent
Date:
On 4/1/21 8:58 AM, Brian Dunavant wrote:
On Thu, Apr 1, 2021 at 10:49 AM Glen Huang <heyhgl@gmail.com> wrote:
If I decide to replace all my transaction code with CTE, will I shoot myself in the foot down the road?

I do this all the time and makes code way cleaner.   It's very straightforward with inserts queries.  When you deal with updates/deletes, things can be trickier.  I usually leave these in a transaction if there is any concern.

They can also be hard for future programmers that may not understand SQL.  Make sure you comment your queries for maintainability long term.  

I have yet to regret replacing a transaction with a CTE over the past decade.  YMMV
 
This must assume auto-commit mode where every single statement is "committed"?

Re: Is replacing transactions with CTE a good idea?

From
Glen Huang
Date:
> When you deal with updates/deletes, things can be trickier

Care to expand why they are tricker? I presume they run the risk of being referenced more than once?

On Apr 1, 2021, at 10:58 PM, Brian Dunavant <dunavant@gmail.com> wrote:


On Thu, Apr 1, 2021 at 10:49 AM Glen Huang <heyhgl@gmail.com> wrote:
If I decide to replace all my transaction code with CTE, will I shoot myself in the foot down the road?

I do this all the time and makes code way cleaner.   It's very straightforward with inserts queries.  When you deal with updates/deletes, things can be trickier.  I usually leave these in a transaction if there is any concern.

They can also be hard for future programmers that may not understand SQL.  Make sure you comment your queries for maintainability long term.  

I have yet to regret replacing a transaction with a CTE over the past decade.  YMMV
 

Re: Is replacing transactions with CTE a good idea?

From
Glen Huang
Date:
No, but are they equivalent to serializable transactions?

On Apr 1, 2021, at 11:04 PM, Dave Cramer <davecramer@postgres.rocks> wrote:





On Thu, 1 Apr 2021 at 10:50, Glen Huang <heyhgl@gmail.com> wrote:
Hi all,

From application’s standpoint, it seems using CTE saves a lot work. You no longer need to parse values out only to pass them back in, and only one round-trip to the db server.

If I’m not wrong, CTE is equivalent to serializable transactions? So I guess the downsize is that quarries can’t be run in parallel?

I do not think a CTE changes the isolation level. 

If I decide to replace all my transaction code with CTE, will I shoot myself in the foot down the road?


Dave Cramer
www.postgres.rocks 

Re: Is replacing transactions with CTE a good idea?

From
Dave Cramer
Date:


On Thu, 1 Apr 2021 at 11:09, Glen Huang <heyhgl@gmail.com> wrote:
No, but are they equivalent to serializable transactions?

No, they are not. 



Dave Cramer
www.postgres.rocks

On Apr 1, 2021, at 11:04 PM, Dave Cramer <davecramer@postgres.rocks> wrote:





On Thu, 1 Apr 2021 at 10:50, Glen Huang <heyhgl@gmail.com> wrote:
Hi all,

From application’s standpoint, it seems using CTE saves a lot work. You no longer need to parse values out only to pass them back in, and only one round-trip to the db server.

If I’m not wrong, CTE is equivalent to serializable transactions? So I guess the downsize is that quarries can’t be run in parallel?

I do not think a CTE changes the isolation level. 

If I decide to replace all my transaction code with CTE, will I shoot myself in the foot down the road?


Dave Cramer
www.postgres.rocks 

Re: Is replacing transactions with CTE a good idea?

From
Glen Huang
Date:
Sorry, my mistake. I misunderstood serializable. Are queries in a CTE equivalent to those in a repeatable read transaction?

On Apr 1, 2021, at 11:10 PM, Dave Cramer <davecramer@postgres.rocks> wrote:




On Thu, 1 Apr 2021 at 11:09, Glen Huang <heyhgl@gmail.com> wrote:
No, but are they equivalent to serializable transactions?

No, they are not. 



Dave Cramer
www.postgres.rocks

On Apr 1, 2021, at 11:04 PM, Dave Cramer <davecramer@postgres.rocks> wrote:





On Thu, 1 Apr 2021 at 10:50, Glen Huang <heyhgl@gmail.com> wrote:
Hi all,

From application’s standpoint, it seems using CTE saves a lot work. You no longer need to parse values out only to pass them back in, and only one round-trip to the db server.

If I’m not wrong, CTE is equivalent to serializable transactions? So I guess the downsize is that quarries can’t be run in parallel?

I do not think a CTE changes the isolation level. 

If I decide to replace all my transaction code with CTE, will I shoot myself in the foot down the road?


Dave Cramer
www.postgres.rocks 

Re: Is replacing transactions with CTE a good idea?

From
Dave Cramer
Date:
CTE's don't change the isolation level. I'm not sure what you are getting at here ?

Dave Cramer
www.postgres.rocks


On Thu, 1 Apr 2021 at 11:20, Glen Huang <heyhgl@gmail.com> wrote:
Sorry, my mistake. I misunderstood serializable. Are queries in a CTE equivalent to those in a repeatable read transaction?

On Apr 1, 2021, at 11:10 PM, Dave Cramer <davecramer@postgres.rocks> wrote:




On Thu, 1 Apr 2021 at 11:09, Glen Huang <heyhgl@gmail.com> wrote:
No, but are they equivalent to serializable transactions?

No, they are not. 



Dave Cramer
www.postgres.rocks

On Apr 1, 2021, at 11:04 PM, Dave Cramer <davecramer@postgres.rocks> wrote:





On Thu, 1 Apr 2021 at 10:50, Glen Huang <heyhgl@gmail.com> wrote:
Hi all,

From application’s standpoint, it seems using CTE saves a lot work. You no longer need to parse values out only to pass them back in, and only one round-trip to the db server.

If I’m not wrong, CTE is equivalent to serializable transactions? So I guess the downsize is that quarries can’t be run in parallel?

I do not think a CTE changes the isolation level. 

If I decide to replace all my transaction code with CTE, will I shoot myself in the foot down the road?


Dave Cramer
www.postgres.rocks 

Re: Is replacing transactions with CTE a good idea?

From
Glen Huang
Date:
Ah, I see what you mean. You still have to wrap a CTE inside a transaction to specify the isolation level? By default, queries in a CTE run with the read committed isolation level?

On Apr 1, 2021, at 11:10 PM, Dave Cramer <davecramer@postgres.rocks> wrote:




On Thu, 1 Apr 2021 at 11:09, Glen Huang <heyhgl@gmail.com> wrote:
No, but are they equivalent to serializable transactions?

No, they are not. 



Dave Cramer
www.postgres.rocks

On Apr 1, 2021, at 11:04 PM, Dave Cramer <davecramer@postgres.rocks> wrote:





On Thu, 1 Apr 2021 at 10:50, Glen Huang <heyhgl@gmail.com> wrote:
Hi all,

From application’s standpoint, it seems using CTE saves a lot work. You no longer need to parse values out only to pass them back in, and only one round-trip to the db server.

If I’m not wrong, CTE is equivalent to serializable transactions? So I guess the downsize is that quarries can’t be run in parallel?

I do not think a CTE changes the isolation level. 

If I decide to replace all my transaction code with CTE, will I shoot myself in the foot down the road?


Dave Cramer
www.postgres.rocks 

Re: Is replacing transactions with CTE a good idea?

From
Glen Huang
Date:
I had the impression that since they are chained together, somehow they run “tighter” 😂.

Thanks for pointing out that mistake.

On Apr 1, 2021, at 11:25 PM, Dave Cramer <davecramer@postgres.rocks> wrote:


CTE's don't change the isolation level. I'm not sure what you are getting at here ?

Dave Cramer
www.postgres.rocks


On Thu, 1 Apr 2021 at 11:20, Glen Huang <heyhgl@gmail.com> wrote:
Sorry, my mistake. I misunderstood serializable. Are queries in a CTE equivalent to those in a repeatable read transaction?

On Apr 1, 2021, at 11:10 PM, Dave Cramer <davecramer@postgres.rocks> wrote:




On Thu, 1 Apr 2021 at 11:09, Glen Huang <heyhgl@gmail.com> wrote:
No, but are they equivalent to serializable transactions?

No, they are not. 



Dave Cramer
www.postgres.rocks

On Apr 1, 2021, at 11:04 PM, Dave Cramer <davecramer@postgres.rocks> wrote:





On Thu, 1 Apr 2021 at 10:50, Glen Huang <heyhgl@gmail.com> wrote:
Hi all,

From application’s standpoint, it seems using CTE saves a lot work. You no longer need to parse values out only to pass them back in, and only one round-trip to the db server.

If I’m not wrong, CTE is equivalent to serializable transactions? So I guess the downsize is that quarries can’t be run in parallel?

I do not think a CTE changes the isolation level. 

If I decide to replace all my transaction code with CTE, will I shoot myself in the foot down the road?


Dave Cramer
www.postgres.rocks 

Re: Is replacing transactions with CTE a good idea?

From
Brian Dunavant
Date:
On Thu, Apr 1, 2021 at 11:06 AM Glen Huang <heyhgl@gmail.com> wrote:
Care to expand why they are tricker? I presume they run the risk of being referenced more than once?

There are lots of gotchas. It's also been a few years since I dug deep into this, so some of this may have changed in more recent versions.

* Changes in a CTE aren't visible to later CTEs since they haven't happened yet.   Often times people are updating a table and then doing further things and can hit situations they weren't expecting.  

db=> create table foo ( a integer primary key );
NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index "foo_pkey" for table "foo"
CREATE TABLE
db=> insert into foo values ( 1 );
INSERT 0 1
db=> with upt as ( update foo set a = 2 ) insert into foo values (1);
ERROR:  duplicate key value violates unique constraint "foo_pkey"
DETAIL:  Key (a)=(1) already exists.

* Unless you reference between the CTEs to force ordering, CTEs can happen in any order, which can cause things to get out of the order people expected.

* Just like you can cause deadlocks between threads in a transaction, you can do the same thing by shoving all those statements into a single CTE query. 



 

Re: Is replacing transactions with CTE a good idea?

From
Bruce Momjian
Date:
On Thu, Apr  1, 2021 at 11:24:48AM -0400, Dave Cramer wrote:
> CTE's don't change the isolation level. I'm not sure what you are getting at
> here ?

I think what he/she means here is that all queries in a CTE use a single
snapshot, meaning you don't see changes by commits that happen between
queries that are part of the same CTE.  If you were running the queries
separately in read committed mode, you would see those changes, but you
would not see them in repeatable read or serializable transaction mode.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  If only the physical world exists, free will is an illusion.




Re: Is replacing transactions with CTE a good idea?

From
Ron
Date:
On 4/1/21 10:04 AM, Rob Sargent wrote:
On 4/1/21 8:58 AM, Brian Dunavant wrote:
On Thu, Apr 1, 2021 at 10:49 AM Glen Huang <heyhgl@gmail.com> wrote:
If I decide to replace all my transaction code with CTE, will I shoot myself in the foot down the road?

I do this all the time and makes code way cleaner.   It's very straightforward with inserts queries.  When you deal with updates/deletes, things can be trickier.  I usually leave these in a transaction if there is any concern.

They can also be hard for future programmers that may not understand SQL.  Make sure you comment your queries for maintainability long term.  

I have yet to regret replacing a transaction with a CTE over the past decade.  YMMV
 
This must assume auto-commit mode where every single statement is "committed"?

That's the only way to explain an otherwise very puzzling question.  OP must not do lots of DML in each transaction.

--
Angular momentum makes the world go 'round.

Re: Is replacing transactions with CTE a good idea?

From
Dave Cramer
Date:


On Thu, 1 Apr 2021 at 15:39, Bruce Momjian <bruce@momjian.us> wrote:
On Thu, Apr  1, 2021 at 11:24:48AM -0400, Dave Cramer wrote:
> CTE's don't change the isolation level. I'm not sure what you are getting at
> here ?

I think what he/she means here is that all queries in a CTE use a single
snapshot, meaning you don't see changes by commits that happen between
queries that are part of the same CTE.  If you were running the queries
separately in read committed mode, you would see those changes, but you
would not see them in repeatable read or serializable transaction mode.

OK, that makes sense, but I think it is wrong minded to think that this absolves one of taking isolation into account.

When you make the first read you will still have to deal with all of the isolation issues 


Dave Cramer
www.postgres.rocks

Re: Is replacing transactions with CTE a good idea?

From
Bruce Momjian
Date:
On Sun, Apr  4, 2021 at 08:35:41AM -0400, Dave Cramer wrote:
> 
> 
> On Thu, 1 Apr 2021 at 15:39, Bruce Momjian <bruce@momjian.us> wrote:
> 
>     On Thu, Apr  1, 2021 at 11:24:48AM -0400, Dave Cramer wrote:
>     > CTE's don't change the isolation level. I'm not sure what you are getting
>     at
>     > here ?
> 
>     I think what he/she means here is that all queries in a CTE use a single
>     snapshot, meaning you don't see changes by commits that happen between
>     queries that are part of the same CTE.  If you were running the queries
>     separately in read committed mode, you would see those changes, but you
>     would not see them in repeatable read or serializable transaction mode.
> 
> OK, that makes sense, but I think it is wrong minded to think that this
> absolves one of taking isolation into account.
> 
> When you make the first read you will still have to deal with all of the
> isolation issues 

I have no idea what you are saying above.  Why is a SELECT-only CTE not
the same as a repeatable-read SELECT-only multi-statement transaction? 
Are you saying that a SELECT in a CTE doesn't do SELECT FOR UPDATE?  

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  If only the physical world exists, free will is an illusion.




Re: Is replacing transactions with CTE a good idea?

From
Dave Cramer
Date:


On Sun, 4 Apr 2021 at 09:12, Bruce Momjian <bruce@momjian.us> wrote:
On Sun, Apr  4, 2021 at 08:35:41AM -0400, Dave Cramer wrote:
>
>
> On Thu, 1 Apr 2021 at 15:39, Bruce Momjian <bruce@momjian.us> wrote:
>
>     On Thu, Apr  1, 2021 at 11:24:48AM -0400, Dave Cramer wrote:
>     > CTE's don't change the isolation level. I'm not sure what you are getting
>     at
>     > here ?
>
>     I think what he/she means here is that all queries in a CTE use a single
>     snapshot, meaning you don't see changes by commits that happen between
>     queries that are part of the same CTE.  If you were running the queries
>     separately in read committed mode, you would see those changes, but you
>     would not see them in repeatable read or serializable transaction mode.
>
> OK, that makes sense, but I think it is wrong minded to think that this
> absolves one of taking isolation into account.
>
> When you make the first read you will still have to deal with all of the
> isolation issues 

I have no idea what you are saying above.  Why is a SELECT-only CTE not
the same as a repeatable-read SELECT-only multi-statement transaction?
Are you saying that a SELECT in a CTE doesn't do SELECT FOR UPDATE? 

No, but where is this documented ?


Dave Cramer
www.postgres.rocks

Re: Is replacing transactions with CTE a good idea?

From
Bruce Momjian
Date:
On Sun, Apr  4, 2021 at 10:02:20AM -0400, Dave Cramer wrote:
> On Sun, 4 Apr 2021 at 09:12, Bruce Momjian <bruce@momjian.us> wrote:
>     > OK, that makes sense, but I think it is wrong minded to think that this
>     > absolves one of taking isolation into account.
>     >
>     > When you make the first read you will still have to deal with all of the
>     > isolation issues 
> 
>     I have no idea what you are saying above.  Why is a SELECT-only CTE not
>     the same as a repeatable-read SELECT-only multi-statement transaction?
>     Are you saying that a SELECT in a CTE doesn't do SELECT FOR UPDATE? 
> 
> 
> No, but where is this documented ?

Well, every query runs with a single snapshot, even WITH queries.  We do
document how non-SELECT WITH visibility is handled:

    https://www.postgresql.org/docs/13/sql-select.html

    The primary query and the WITH queries are all (notionally) executed at
    the same time. This implies that the effects of a data-modifying
    statement in WITH cannot be seen from other parts of the query, other
    than by reading its RETURNING output. If two such data-modifying
    statements attempt to modify the same row, the results are unspecified.
    
    A key property of WITH queries is that they are normally evaluated only
    once per execution of the primary query, even if the primary query
    refers to them more than once. In particular, data-modifying statements
    are guaranteed to be executed once and only once, regardless of whether
    the primary query reads all or any of their output.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  If only the physical world exists, free will is an illusion.




Re: Is replacing transactions with CTE a good idea?

From
Dave Cramer
Date:


On Mon, 5 Apr 2021 at 14:18, Bruce Momjian <bruce@momjian.us> wrote:
On Sun, Apr  4, 2021 at 10:02:20AM -0400, Dave Cramer wrote:
> On Sun, 4 Apr 2021 at 09:12, Bruce Momjian <bruce@momjian.us> wrote:
>     > OK, that makes sense, but I think it is wrong minded to think that this
>     > absolves one of taking isolation into account.
>     >
>     > When you make the first read you will still have to deal with all of the
>     > isolation issues 
>
>     I have no idea what you are saying above.  Why is a SELECT-only CTE not
>     the same as a repeatable-read SELECT-only multi-statement transaction?
>     Are you saying that a SELECT in a CTE doesn't do SELECT FOR UPDATE? 
>
>
> No, but where is this documented ?

Well, every query runs with a single snapshot, even WITH queries.  We do
document how non-SELECT WITH visibility is handled:

        https://www.postgresql.org/docs/13/sql-select.html

        The primary query and the WITH queries are all (notionally) executed at
        the same time. This implies that the effects of a data-modifying
        statement in WITH cannot be seen from other parts of the query, other
        than by reading its RETURNING output. If two such data-modifying
        statements attempt to modify the same row, the results are unspecified.

        A key property of WITH queries is that they are normally evaluated only
        once per execution of the primary query, even if the primary query
        refers to them more than once. In particular, data-modifying statements
        are guaranteed to be executed once and only once, regardless of whether
        the primary query reads all or any of their output.


I think we are in agreement. My point was that WITH queries don't change the isolation semantics. 

I was pretty sure we didn't do a SELECT FOR UPDATE which would imply a lock.


Dave Cramer
www.postgres.rocks

Re: Is replacing transactions with CTE a good idea?

From
Bruce Momjian
Date:
On Mon, Apr  5, 2021 at 02:32:36PM -0400, Dave Cramer wrote:
> On Mon, 5 Apr 2021 at 14:18, Bruce Momjian <bruce@momjian.us> wrote:
> I think we are in agreement. My point was that WITH queries don't change the
> isolation semantics. 

My point is that when you combine individual queries in a single WITH
query, those queries run together with snaphot behavior as if they were
in a repeatable-read multi-statement transaction.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  If only the physical world exists, free will is an illusion.




Re: Is replacing transactions with CTE a good idea?

From
Glen Huang
Date:
This discussion really questioned my understanding of concurrency in PostgreSQL, thanks a lot.

I gave the corresponding part of the doc some more read, and I’m now in the option that insolation level has no effect
onCTEs, but please correct me if I’m wrong. 

If notionally all queries execute at the same time, even if they are executed in read committed, they behave like
repeatableread. This should also be true for serializable, since the anomalies that isolation level tries to address
won’toccur in a CTE. 

@Bruce
The gotchas you mentions are really interesting, I have a follow up question if you don’t mind:

CREATE foo(n int);
CREATE bar(n int REFERENCES foo(n));
WITH t AS (
  INSERT INTO foo(n) VALUES(1)
)
INSERT INTO bar(n) VALUES(1);

Is the CTE guaranteed to success or it’s actually unspecified? I ran it a couple times without issues, but I can’t be
sure.If it’s unspecified any idea how should I correct it? 

> On Apr 6, 2021, at 2:41 AM, Bruce Momjian <bruce@momjian.us> wrote:
>
> On Mon, Apr  5, 2021 at 02:32:36PM -0400, Dave Cramer wrote:
>> On Mon, 5 Apr 2021 at 14:18, Bruce Momjian <bruce@momjian.us> wrote:
>> I think we are in agreement. My point was that WITH queries don't change the
>> isolation semantics.
>
> My point is that when you combine individual queries in a single WITH
> query, those queries run together with snaphot behavior as if they were
> in a repeatable-read multi-statement transaction.
>
> --
>  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
>  EDB                                      https://enterprisedb.com
>
>  If only the physical world exists, free will is an illusion.
>




Re: Is replacing transactions with CTE a good idea?

From
Bruce Momjian
Date:
On Fri, Apr 9, 2021 at 11:05:34PM +0800, Glen Huang wrote:
> This discussion really questioned my understanding of concurrency in
> PostgreSQL, thanks a lot.
>
> I gave the corresponding part of the doc some more read, and I’m now
> in the option that insolation level has no effect on CTEs, but please
> correct me if I’m wrong.

Yes, isolation only controls whether a new snapshot is computed
_between_ queries in a multi-statement transaction.  Single queries
always use a single snapshot, except for maintenance commands like
VACUUM.

> If notionally all queries execute at the same time, even if they are
> executed in read committed, they behave like repeatable read. This
> should also be true for serializable, since the anomalies that
> isolation level tries to address won’t occur in a CTE.

You mean multiple queries in a single CTE, yes, they are like repeatable
read.

> @Bruce The gotchas you mentions are really interesting, I have a
> follow up question if you don’t mind:
>
> CREATE foo(n int); CREATE bar(n int REFERENCES foo(n)); WITH t AS (
> INSERT INTO foo(n) VALUES(1) ) INSERT INTO bar(n) VALUES(1);
>
> Is the CTE guaranteed to success or it’s actually unspecified? I ran
> it a couple times without issues, but I can’t be sure. If it’s
> unspecified any idea how should I correct it?

Uh, the SELECT manual page explains that non-SELECT queries in a CTE do
behave unusually:

    The primary query and the WITH queries are all (notionally) executed at
    the same time. This implies that the effects of a data-modifying
    statement in WITH cannot be seen from other parts of the query, other
    than by reading its RETURNING output. If two such data-modifying
    statements attempt to modify the same row, the results are unspecified.

and the quoted paragraph suggests that your query should not work. 
However, you are not referencing the foo table directly, but via
referential integrity check, which I guess does work.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  If only the physical world exists, free will is an illusion.